In this 3-part series, IMVU senior engineer Bill Welden describes the means and technology behind IMVU’s web services.
Part 1: REST
REST, or Representational State Transfer, is the model on which the protocols of the World Wide Web are built. It was originally described in the year 2000 in Roy Fielding’s doctoral dissertation, and was developed in order to impose some discipline on distributed hypermedia systems.
The benefits of REST have since proven valuable in defining APIs. Everybody has them these days: Twitter, Facebook, eBay, PayPal. And IMVU has been using REST as a standard for its back-end services for many years.
Because the fit between REST (which is framed in terms of documents and links) and database-style applications (tables and keys) is not perfect, everybody means something slightly different when they talk about REST services.
Here is a rundown of the things that IMVU does under the aegis of REST, and the benefits that accrue:
- Virtual State Machine
In his dissertation, Fielding describes a REST API as a virtual state machine:
The name ‘Representational State Transfer’ is intended to evoke an image of how a well-designed Web application behaves: a network of Web pages (a virtual state-machine), where the user progresses through the application by selecting links (state transitions), resulting in the next page (representing the next state of the application) being transferred to the user and rendered for their use.
With each network request, a representation of the application state is transferred, first to the server, and then (as modified) back to the client. This is “representational state transfer” or REST.
Note that this is a two-phase thing. Once a request goes out from the client, the state of the application is “in transition”. When the response comes back from the server, the state of the client is “at REST”.
By defining a rigorous protocol which frames outgoing requests from clients in terms of following structured links, and the server’s response in terms of structured documents, REST makes it possible to build general, powerful support layers on both the client and the server. These layers then offload much of the work of implementing new services and clients.
- Separation of Concerns
Separation of concerns means clients are responsible for the interface with the user and servers are responsible for the storage and integrity of data.
This is important in building useful (and portable) services, but it is not always easy to achieve, because it is tempting to design a back-end service around one specific application. The aspect ratio, and even the resolution, of graphic images ought to be under the control of the client. Yet static images such as product thumbnails might be provided in only one specific aspect ratio, limiting the service’s usefulness for other applications. This narrow view of the service limits the ability to roll out new versions and to share the service across products.
When we designed our Server Side Rendering service (which delivers a two-dimensional snapshot of a specific three-dimensional product model), we went to some pains to place this control – height and width of desired image – in the hands of the client through a custom HTTP header included with the service request.
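As a rough illustration of placing image dimensions in the hands of the client, the sketch below builds a request that carries the desired size in a custom header. The header name `X-Image-Size`, its `WIDTHxHEIGHT` format, and the example URL are assumptions made for this sketch; the post does not say what the real header looks like.

```python
from urllib.request import Request


def build_snapshot_request(url: str, width: int, height: int) -> Request:
    """Build a GET request asking the server for an image of a given size."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Hypothetical header name and value format, chosen for illustration only.
    headers = {"X-Image-Size": f"{width}x{height}"}
    return Request(url, headers=headers, method="GET")


# The client, not the service, decides how large the snapshot should be.
request = build_snapshot_request(
    "https://api.example.com/product/123/snapshot", 320, 240
)
```

A different client (a phone app, say) could call the same function with different dimensions, and the service would need no changes.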
- Uniform Contract
A uniform contract means that our services comply with a set of standards that we publish, including a consistent URI syntax and a limited set of verbs. I will talk in more detail about these standards in a future post, but for now note that they allow much of what would otherwise be service-specific code – protocols for navigation, for manipulation of data, for security and so forth – to be implemented in generic layers. Just as important, they allow much of the documentation for our services to be consolidated in one place, so that designing a new service becomes a more streamlined and focused process.
With a couple of very specific exceptions, the documents we send and receive are JSON, and structured in a very specific way. In particular, links are gathered together in one place within the document so that the generic software layers are better able to support the server code in generating them, and the application code in following them. This stands in contrast to web browsers which must be prepared to deal with a huge variety of text, graphic and other embedded object formats, and to find links scattered randomly throughout them.
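To make the idea of gathering links in one place concrete, here is a minimal sketch of such a document and of the generic code that reads its links. The field names `links` and `data`, the relation names, and the URLs are all assumptions for illustration; the actual IMVU document layout is not described here.

```python
import json

# A hypothetical response body: every link lives under one top-level key,
# separate from the resource's own data.
raw = """{
  "links": {"self": "/product/123", "owner": "/user/45"},
  "data": {"name": "Example Hat", "price": 500}
}"""


def links_of(document_text: str) -> dict:
    """Return the relation-name -> URL mapping from a document."""
    doc = json.loads(document_text)
    links = doc.get("links", {}) if isinstance(doc, dict) else {}
    if not isinstance(links, dict):
        return {}
    # Keep only well-formed entries; generic code never inspects "data".
    return {rel: url for rel, url in links.items() if isinstance(url, str)}


product_links = links_of(raw)
```

Because the links sit in a known location, `links_of` works for any service's documents without knowing what the `data` section means.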
For the API designer, these standards limit the ability to structure data in responses as freely as we might want. Our standards are still in a bit of flux as we try to navigate this tension between freedom of API design and the needs of the generic software layers we have built.
The fact, however, that our standards are based on JSON documents and are requested and provided using HTTP protocols means that the resulting services can be implemented using varied technologies (we have back end services in PHP, Haskell, Python and C++) and accessed from a broad variety of platforms.
The uniform contract also allows ancillary agents to extract generic information from messages, without knowing the specifics of the service implementation. For example, with a single piece of code, IMVU has configured its Istatd statistics collection system to collect real-time data on the amount of time each of our REST endpoints is taking to do its work, and this data collection will now occur for every future endpoint without any initiative on the part of service implementers. The ready availability of these statistics allows for greater reliability through improved response time to outages.
In addition, the design of this uniform contract means that each service addresses a well-defined slice of functionality, allowing new services to be added in parallel with a minimum of disruption to existing code.
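The following sketch shows one way a single generic layer could time every endpoint. It is not IMVU's code and does not use Istatd's interface; the `Dispatcher` class and its in-memory `timings` dictionary are hypothetical stand-ins for a real request router and statistics collector.

```python
import time
from collections import defaultdict


class Dispatcher:
    """Routes requests to handlers and times every call in one place."""

    def __init__(self):
        self.routes = {}
        # Stand-in for a statistics collector: path -> list of durations (s).
        self.timings = defaultdict(list)

    def register(self, path, handler):
        self.routes[path] = handler

    def handle(self, path, request):
        handler = self.routes[path]
        start = time.perf_counter()
        try:
            return handler(request)
        finally:
            # Recorded even if the handler raises, so failures are timed too.
            self.timings[path].append(time.perf_counter() - start)


dispatcher = Dispatcher()
dispatcher.register("/ping", lambda request: "pong")
reply = dispatcher.handle("/ping", {})
```

Handlers registered later are timed automatically, since the measurement lives in `handle` and not in any individual service.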
HATEOAS (Hypermedia As The Engine Of Application State) is the discipline of treating all resource links as opaque to the clients that use them. Except for one well-known root URL, URLs are not hard-coded, and they’re not constructed or parsed by clients. Links are retrieved from the server, and all navigation is done by following links. This allows us to move resources dynamically around our cluster (or even outside if necessary) and to add, refactor and extend services even while they are in active use, changing back-end software only, so that new releases of client software are required less often.
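A small sketch of link-following navigation may help here. The client below knows only a root URL and a sequence of relation names; every other URL comes from a server response. The in-memory `FAKE_SERVER`, the relation names, the `links`/`data` layout, and all URLs are assumptions invented for this example.

```python
ROOT_URL = "https://api.example.com/"

# Stand-in for the network: URL -> document. Note that one resource lives on
# a different host; the client neither knows nor cares.
FAKE_SERVER = {
    "https://api.example.com/": {
        "links": {"catalog": "https://api.example.com/c/9f2"},
        "data": {},
    },
    "https://api.example.com/c/9f2": {
        "links": {"featured": "https://cdn.example.net/p/123"},
        "data": {},
    },
    "https://cdn.example.net/p/123": {
        "links": {},
        "data": {"name": "Example Hat"},
    },
}


def fetch(url):
    return FAKE_SERVER[url]


def follow(root_url, relations, fetch=fetch):
    """Start at the root and follow named links; never build or parse a URL."""
    doc = fetch(root_url)
    for rel in relations:
        links = doc["links"]
        if rel not in links:
            raise KeyError(f"no link named {rel!r} in document")
        doc = fetch(links[rel])  # the URL is used exactly as the server gave it
    return doc


featured = follow(ROOT_URL, ["catalog", "featured"])
```

If the server later moves the featured product to another URL, only the `links` entries it returns change; `follow` and its callers stay the same.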
Under the REST discipline, applications are stateless – or rather the entire state needed to process a service request is sent with the request (much of it in the form of an identity token sent as a cookie). In this way, between one request and the next, the server needs to know nothing about the client’s application state, and servers do not need to retain state for every active client, which allows us to distribute requests across our cluster according to load.
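One common way to carry identity with each request, so that no server has to remember the client, is a signed token. The sketch below is an assumption about how such a token could work, not a description of IMVU's identity token: the `user_id.signature` format, the shared `SECRET_KEY`, and both function names are hypothetical, and a real token would also need an expiry time.

```python
import hashlib
import hmac

# Hypothetical key shared by every server in the cluster.
SECRET_KEY = b"shared-cluster-secret"


def _sign(user_id: str) -> str:
    return hmac.new(SECRET_KEY, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def issue_token(user_id: str) -> str:
    """Create a token the client sends back with every request."""
    return f"{user_id}.{_sign(user_id)}"


def verify_token(token: str):
    """Return the user id if the token is genuine, otherwise None.

    Needs no per-client state, so any server holding the key can run it.
    """
    user_id, separator, signature = token.rpartition(".")
    if not separator or not user_id:
        return None
    expected = _sign(user_id)
    # Constant-time comparison; bytes are used so odd input cannot raise.
    if hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return user_id
    return None


token = issue_token("user-45")
```

Since verification depends only on the request and the shared key, consecutive requests from the same client can land on different servers.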
Because servers no longer depend on the number and states of all the clients they might be asked to service, the server code is simpler and scales well. In addition, we can cache and stage requests at different points in our cluster, keeping them close to where they will be serviced. Responses can also be cached, keyed by the URL of the request.
Finally, though it is not a stipulation of REST, we use IMVU’s real-time IMQ message queue to push notifications to clients (keyed by the service URL) when their locally cached data becomes invalid. This gives clients the information needed to update stale data, but also allows real-time updating of data that is displayed to the user. When one user changes the outfit of their avatar, for example, all of the users in that chat room will see the updated look.
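The interplay between URL-keyed caching and pushed invalidation can be sketched as follows. This is an illustration only: `ClientCache`, `on_invalidate`, the fake fetch function, and the URL are hypothetical, and nothing here reflects the actual IMQ interface.

```python
class ClientCache:
    """Caches documents by URL and drops them when told they are stale."""

    def __init__(self, fetch):
        self._fetch = fetch
        self._documents = {}

    def get(self, url):
        if url not in self._documents:
            self._documents[url] = self._fetch(url)
        return self._documents[url]

    def on_invalidate(self, url):
        # Would be called when a pushed notification names this URL.
        self._documents.pop(url, None)


# Stand-in for the server, plus a log of how often it is actually asked.
outfits = {"/avatar/7/outfit": "red jacket"}
fetch_log = []


def fake_fetch(url):
    fetch_log.append(url)
    return outfits[url]


cache = ClientCache(fake_fetch)
cache.get("/avatar/7/outfit")                # first read goes to the server
cache.get("/avatar/7/outfit")                # second read is served locally
outfits["/avatar/7/outfit"] = "blue jacket"  # another user changes the outfit
cache.on_invalidate("/avatar/7/outfit")      # notification arrives
current = cache.get("/avatar/7/outfit")      # next read refetches
```

A client could also redraw the affected view inside `on_invalidate`, which is how a pushed notification doubles as a real-time update.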
- Internet Scale
The internet comes with challenges, and REST provides us with a framework for addressing those challenges in a systematic way, but it is no panacea.
Fielding uses the term “anarchic scalability” to describe these challenges – by which he means the need to continue operating in the face of unanticipated load, or when given malformed or maliciously constructed data.
At IMVU we take the issue of load into account from the outset when designing our services, but the internet, as an interacting set of heterogeneous servers and services, often displays complex emergent behavior. Even within our own cluster we have over fifty different kinds of servers (what we call roles), each kind talking to a specific set of other roles, depending on its needs. Under load, failures can cascade and feedback loops can keep the dysfunctional behavior in place even after the broader internet has stopped stressing the system.
Every failure triggers a post-mortem inquiry, bringing together all of the relevant parties (IT, engineering, product management) to establish the history and the impact of the failure. The statistics collected and recorded by Istatd are invaluable in this process of diagnosis.
Remedies are designed to address not only the immediate dysfunctional behavior, but a range of possible similar problems in the future, and can include adding new hardware or modifying software to throttle or redirect traffic. Out of these post-mortems, our design review process continues to mature, adding checklist items and questions which require designers to think about their prospective services from many different perspectives, and particularly through the lens of past failures.
There are times when a failure cannot be fully understood, even with the diagnostic history available to us. Istatd makes it easy to add new metrics which may help in understanding future failures of a similar type. We monitor more than a hundred thousand metrics, and the number is growing.
The chaos injected by the internet includes the contents of packets. Our code, on both the client and server side, is written so that it makes no assumptions about the structure of the data it receives. Even in the case when a document is constructed by our service and consumed by a client we have written, the data may arrive damaged, or even deliberately modified. There is backbone hardware, for example, which attempts to insert advertisements into web pages.
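As an example of making no assumptions about incoming data, the sketch below checks every layer of a received document before trusting it. The expected shape (a JSON object with `links` and `data` objects) is an assumption carried over for illustration, and `parse_document` is a hypothetical helper, not IMVU's parser.

```python
import json


def parse_document(raw_bytes: bytes):
    """Return (links, data) for a well-formed document, otherwise None."""
    try:
        doc = json.loads(raw_bytes.decode("utf-8"))
    except ValueError:
        # Covers both invalid UTF-8 and invalid JSON, e.g. injected HTML.
        return None
    if not isinstance(doc, dict):
        return None
    links = doc.get("links")
    data = doc.get("data")
    if not isinstance(links, dict) or not isinstance(data, dict):
        return None
    # Every link must map a relation name to a URL string.
    if not all(isinstance(url, str) for url in links.values()):
        return None
    return links, data


good = parse_document(b'{"links": {"self": "/a"}, "data": {}}')
injected = parse_document(b"<html><body>Buy now!</body></html>")
```

Callers treat `None` as "discard and retry or report", so a damaged or modified payload never reaches application logic.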
Up next: Part 2 — Nodes and Edges
In my next post I will describe the process we go through to develop a service concept, leading up to implementation.