Demystifying REST Constraints

Who do you think the actor is in the sentence “Brenda was happy to interview the actor”, a Hollywood celebrity or a member of the local community theater? The REST constraints described in Roy Fielding’s dissertation enjoy celebrity status. I cannot mention constraints without people immediately assuming that I either talk about them, ignore them, or argue against them. Few realize that I may be talking about additional, application-specific constraints, or perhaps about strengthening one of Fielding’s constraints. I wish I could use another word, but the well-known REST constraints and application-specific constraints aren’t much different and combine to define a concrete RESTful protocol.

I want to be free to talk about constraints because by definition a protocol is a set of rules, many of them constraints. The Wikipedia definition of a communication protocol, “…a system of digital message formats and rules for exchanging those messages in or between computing systems and in telecommunications” mentions rules, but examples from the Merriam-Webster dictionary illustrate even better that the word protocol is synonymous with a set of rules:

  • The soldier’s actions constitute a breach of military protocol [rules or regulations].
  • They did not follow the proper diplomatic protocols [rules or etiquette].
  • What is the proper protocol [rules or conventions] for declining a job offer?

The Geneva Protocol and the Kyoto Protocol set forth international regulation agreed upon by the participating countries. Similarly, the rules of a communication protocol are agreed upon by participants in the communication, not imposed by an external authority. Standards and specifications like RFC 2616 merely formalize the results of the agreement.

A constraint is a rule. Exactly what kind of rule is surprisingly hard to define. Some would say a constraint is a rule about what not to do. This is close, but not close enough. Not close enough because any statement about what to do can be rephrased, however awkwardly, to say not to do the opposite. You can say “the protocol must be stateless” or “the protocol must not force the server to keep track of the client’s state”. Either way, it is the same constraint.

Constraints are like good parenting: instead of telling your kids what to do, you set safe, age-appropriate limits, but otherwise let your kids live their lives and make their own decisions. Protocol constraints are used the same way: they ensure safe, reliable, performant communication without restricting what the communication should be about. Constraints are also used to remove inessential variations which would complicate protocol implementation without any clear benefit. For example, you can make your protocol simpler if you support JSON only. This is a valid constraint as long as XML support is not essential.

Constraints should never restrict an essential freedom. This is so important that we have an expression for such a mistake: to over-constrain. I find it quite unfortunate that the noun constraint derives from the verb to constrain, which has meanings like “restrict by force” or “to secure by or as if by bonds”. These meanings suggest firmness and strength. Some of us subconsciously think of constraints as exceptionally strong and important rules. If somebody has ever reminded you in a stern voice of a REST constraint, you know what I mean. To discuss constraints rationally we have to let go of our emotions. We don’t have to enforce constraints more vigorously than other protocol rules.

Let me give you some examples of relatively strict constraints. To show you that my use of constraints is consistent with Fielding’s and that some of his REST constraints are implicit in mine, I’ll gradually relax mine until they become equivalent with two of his.

Imagine a simple HTTP interface for a key-value data store. Clients send GET, PUT, and DELETE requests using URIs as keys and message bodies as values. The following constraint expresses a one-to-one mapping between URIs and message bodies: “As a response to a GET request the server must either return the status code 404 Not Found, or 200 OK and a message body which is a copy of the message body received in the last successful PUT request to the same URI”. Just to show I can, I expressed this as a rule about what the server must do. The remaining constraints are indeed easier to express in terms of what the server must not do, such as:

  1. The server must not return the 404 status code if there was a previous successful PUT request to the same URI
  2. The server must not return other HTTP status codes, for example 302
  3. The server must not allow the byte sequence representing the value assigned to the URI to change by any other means except as a response to an explicit PUT request
  4. The server must not return a message body for any URI for which no value was yet set

… and a few more. These constraints give my protocol some nice, useful properties. I can fully test my server via the protocol, for example.
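
For instance, a minimal test sketch in Java might look like this; the store URL and payload are hypothetical, and error handling is omitted. It PUTs a value, then GETs it back and checks that the body is byte-for-byte identical, exercising the one-to-one mapping constraint above.

    // Sketch only: verify the one-to-one mapping between URI and message body.
    // The store URL and the payload are made up for illustration.
    URL url = new URL("http://example.org/store/item42");
    byte[] value = "forty-two".getBytes("UTF-8");

    HttpURLConnection put = (HttpURLConnection) url.openConnection();
    put.setRequestMethod("PUT");
    put.setDoOutput(true);
    put.getOutputStream().write(value);
    if (put.getResponseCode() / 100 != 2) {
        throw new AssertionError("PUT failed");
    }

    HttpURLConnection get = (HttpURLConnection) url.openConnection();
    get.setRequestMethod("GET");
    ByteArrayOutputStream echoed = new ByteArrayOutputStream();
    InputStream in = get.getInputStream();
    byte[] buffer = new byte[4096];
    int n;
    while ((n = in.read(buffer)) != -1) {
        echoed.write(buffer, 0, n);
    }
    if (!Arrays.equals(value, echoed.toByteArray())) {
        throw new AssertionError("GET did not return the last PUT body");
    }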

“This is not REST”, you may object, “where is the hypermedia?” Good point. The hypermedia constraint does not say you must use hypermedia everywhere.  That would be a bit too strict and would render most of the media types from the IANA registry unfit for REST. The constraint says to use hypermedia to indicate what the valid client state transitions are and to help clients discover URIs. In my protocol all client state transitions are valid and the client does not need to discover URIs because it controls the URI space. Hence there is no hypermedia. You may say that the hypermedia constraint is not applicable. I prefer to say that it is satisfied.

To extend the applicability of my protocol beyond key-value data stores I need to relax some of my constraints. To turn my data store into a simple web server I need to eliminate constraint 4 and allow URIs pre-loaded with static content. Unfortunately the loss of constraint 4 also means that my clients no longer know what URIs exist and what they mean. Now I do indeed need to satisfy the hypermedia constraint. There are still many different ways to do it; for example, I can provide a site map. Clients can still rely on their knowledge that values associated with URIs never change and, for example, use caches that never expire.

Now imagine that my URIs identify Wikipedia articles. I don’t expect articles to change much over time, but I want to allow minor edits or formatting changes by eliminating constraint 3. Without constraint 3, however, message bodies for two successive GET requests may no longer be identical and I lose my ability to programmatically verify URIs still point to valid pages. Indeed, Wikipedia needs volunteers to make sure pages were not vandalized.

I hope I’ve emphasized enough that if I relax my constraints my protocol’s applicability increases but I lose some useful protocol properties. Just how far can I go with relaxing my constraints? Let’s see how far the editors of Wikipedia go. They accept in the page identified by the URI http://en.wikipedia.org/wiki/Tutankhamun any information about the Egyptian pharaoh that meets the quality standards of an encyclopedia. They would definitely not accept information about someone’s pet just because it happens to be named Tutankhamun and they would be equally perplexed to find information about Joe DiMaggio at this URI. Sounds like common sense? I’ve told you that I’m demystifying the REST constraints.

Let me generalize a bit. We expect both URIs and message bodies to mean something. When they are used together in HTTP, we expect their meanings to be related and this relationship between their meanings to remain the same. This is the same as saying that we don’t want to receive (or send) two HTTP message bodies with entirely different meanings from the same URI. This relaxed constraint was implicit in the first version of my protocol, which asked for a strict one-to-one mapping between URI and message body. The relaxed version permits a many-to-many relationship between the two as long as the relationship between their meanings remains one-to-one.

How useful is this relaxed constraint? The full answer doesn’t fit into this post, but let’s see if we can build a SOAP-like RPC protocol without having HTTP message bodies with different meanings sent to or returned from the same URI. If our service has a single URI, the message bodies mean “invoke method A”, “invoke method B”, and so on. Even if each method has its own URI, the message body sent means “invoke the method” while the message body returned means “this is the result”. Can we say they mean the same thing? I’m not saying the constraint was explicitly formulated just to prevent us from doing RPC over HTTP, only that it has this result as well.

Roy Fielding states the same constraint when he says that message bodies should be representations of resources identified by URIs, even though the explanation some people give goes like this (I kid you not): “Two of the REST constraints are the identification of resources and the manipulation of resources by representations. An identifier identifies a resource. A resource is anything that can be identified. A representation can be anything you can send over the network. Got it?” This is not what Fielding says, although he says all these things. Each sentence is a quote from his dissertation, but taken out of context, combined into a meaningless blurb, and passed on as REST design wisdom. I won’t attempt to untangle this mess. The dissertation is available online; read at least section 5.2.

Fielding’s REST constraints are neither laws of nature nor religion. Following them does not guarantee success and violating them does not bring inevitable doom. They play the same role as “Stay on trails” signs in our national parks. If you stray from the beaten path there is a slight chance you will discover something wonderful nobody saw before, but you also risk falling into a ravine, starting an avalanche, or being killed by a grizzly bear.

This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 Canada License.

Message Encoding in REST

The basic HTTP message framing described in my last post has some performance issues. You can think of advanced message framing techniques like chunking, compression, and multipart messages as performance optimizations. The HTTP specification calls them encodings because they alter (encode) the message body for transmission and reconstruct it on arrival. All HTTP 1.1 libraries and frameworks support chunking, most support compression, and many can handle multipart messages too.

Chunking

You need to know the length of the message body to use the Content-Length header. While this is not an issue with static content like HTML files, it creates performance problems with dynamically generated REST messages. Not only does the entire message body have to be buffered in memory, but the network sits unused during message generation and the CPU is idle during transmission. Performance improves significantly if the two steps are combined and message generation or processing is done in parallel with transmission. To support this, a new message length prefixing model, called chunked transfer encoding, was introduced in HTTP 1.1. It is described in RFC 2616, section 3.6.1.

Chunked transfer encoding is length prefixing applied not to the entire message body, but to smaller parts of it (chunks). A chunk starts with the length of data in hexadecimal format separated by a CRLF (carriage return and line feed) sequence from the actual chunk data. Another CRLF pair marks the end of the chunk. After a series of chunks a zero-length chunk signals the end of the message. The presence of the Transfer-Encoding header containing the value “chunked” and the absence of the Content-Length header tell the receiver to read the message body in chunks (Figure 1).
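
For illustration, a small chunked body might look roughly like this on the wire (each line ends with CRLF, shown here as \r\n); the receiver reassembles the two chunks into the thirteen bytes “Hello, world!”:

    7\r\n
    Hello, \r\n
    6\r\n
    world!\r\n
    0\r\n
    \r\n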

Figure 1: Compressed and chunked HTTP response

You don’t need to take any action to enable chunking. Basic message framing is used only for short messages. Longer messages are automatically chunked by HTTP 1.1 implementations. Just remember not to depend on the Content-Length header for message processing because it is not always present. Structure your code so that you can begin processing before receiving the full message body. Use streaming and event-based programming techniques, for example event-based JSON and XML parsers.
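
As a rough sketch using the Apache HTTP client library (the URL is hypothetical), you can consume the body incrementally from the entity’s input stream instead of waiting for a Content-Length:

    // Process the response body as it arrives; the library transparently
    // decodes chunked transfer encoding, so no Content-Length is needed.
    HttpClient httpClient = new DefaultHttpClient();
    HttpResponse response = httpClient.execute(new HttpGet("http://example.org/large-report"));
    InputStream body = response.getEntity().getContent();
    byte[] buffer = new byte[8192];
    int read;
    while ((read = body.read(buffer)) != -1) {
        // feed the bytes read so far to a streaming (event-based) parser here
    }
    body.close();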

Compression

Compression is a data optimization technique used to reduce message sizes and thus network bandwidth usage; it also speeds up message delivery. Compression looks for repeating patterns in data streams and replaces their occurrences with shorter placeholders. How effective this is depends on the type and volume of data. XML and JSON compress quite well, by 60% or more for messages over two kilobytes in length. Experience shows that the generic HTTP compression algorithms are just as effective as specially developed JSON- and XML-aware algorithms.

HTTP compresses the message body before sending it and decompresses it on arrival. Headers are never compressed. If present, the Content-Length header shows the number of actual (compressed) bytes sent. Similarly, chunking is applied to the already compressed stream of bytes (Figure 1). Since compression is optional in HTTP 1.1, clients must list the compression algorithms they support in the Accept-Encoding header to receive compressed messages. Likewise, REST services should not send compressed content unless the client is prepared to receive it. Compressed messages are sent with a Content-Encoding header identifying the compression algorithm used.
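
As a sketch (again with a made-up URL), a client using the Apache HTTP client library can advertise gzip support and decompress the body itself when the server actually used it:

    // Advertise gzip support, then decompress the body only if the server
    // compressed it (signalled by the Content-Encoding header).
    HttpClient httpClient = new DefaultHttpClient();
    HttpGet request = new HttpGet("http://example.org/data.json");
    request.setHeader("Accept-Encoding", "gzip");
    HttpResponse response = httpClient.execute(request);
    InputStream body = response.getEntity().getContent();
    Header contentEncoding = response.getFirstHeader("Content-Encoding");
    if (contentEncoding != null && contentEncoding.getValue().contains("gzip")) {
        body = new GZIPInputStream(body);
    }
    // read the (now uncompressed) body from the stream as usual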

Client support for HTTP compression is good since well-known public domain algorithms are used. On Android, you can receive gzip compressed messages with the Spring for Android RestTemplate Module. RestKit for Apple iOS is built on top of NSURLConnection, which provides transparent gzip/deflate support for response bodies. On BlackBerry you can use HttpConnection with GZIPInputStream. When writing client-side JavaScript, the XMLHttpRequest implementation decompresses messages automatically.

Multipart messages

As the name suggests, multipart messages contain multiple parts, possibly of different types, within a single message body. This helps reduce protocol chattiness by eliminating the need to retrieve each part with a separate HTTP request. In REST this technique is called batching.

Figure 2: Multipart HTTP response

Multipart encoding was originally developed for email attachments (Multipurpose Internet Mail Extensions or MIME) and later extended to HTTP and the Web. It is described in six related RFC documents: RFC 2045, RFC 2046, RFC 2047, RFC 4288, RFC 4289 and RFC 2049. Figure 2 shows a multipart message example. To receive multipart messages, clients must list the multipart formats they support in the Accept header. The Content-Type header identifies the type of multipart message sent (related, mixed, etc.) and also contains the byte pattern used to mark the boundary between message parts. The MIME type of each message part is transmitted separately as shown in Figure 2.
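
For illustration (the boundary string and part contents are made up), a multipart response might look roughly like this on the wire:

    HTTP/1.1 200 OK
    Content-Type: multipart/mixed; boundary=sep-1a2b3c

    --sep-1a2b3c
    Content-Type: application/json

    {"id": 42, "name": "example"}
    --sep-1a2b3c
    Content-Type: image/png

    ...binary image data...
    --sep-1a2b3c--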

Building and parsing multipart messages is difficult unless supported out of the box by libraries or frameworks. You cannot expect clients to write code for it from scratch, so you must provide alternative ways of retrieving the data. Common alternatives are embedding links into messages to let clients retrieve parts separately or using JSON or XML collections for embedding multiple message parts of the same type.

REST message examples are always shown in documentation with basic message framing because it is easy to read. Nonetheless, many real-life REST protocols use a combination of chunking, compression, and multipart encoding for better performance. When you develop REST clients, remember that compression and multipart encoding are optional. REST services won’t use them unless the client sends them the proper Accept and Accept-Encoding headers. When you design REST services, consider compressing large messages to save bandwidth and using multipart messages to reduce protocol chattiness.

This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 Canada License.

Message Framing in REST

Most REST designers take message framing for granted; something they get for free from HTTP and don’t need to worry about because it just works. You are probably wondering what motivated me to write about such an obvious and unimportant topic. I wanted to show that REST exposes message framing details to clients. This can cause some issues and may influence your REST design decisions.

The need for message framing

That you cannot send messages directly over TCP is the first difficulty in application protocol design. There is no “send message” function. You read from an input stream by calling a “receive” method, and you write to an output stream by calling a “send” method. However, you cannot assume that a single “send” will result in a single “receive”. Message boundaries are not preserved in TCP.
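
This is why receivers read in a loop. As a minimal sketch (assuming a connected java.net.Socket), a single read() may return only a fragment of what the peer sent with one call to send:

    // TCP delivers a byte stream: one write by the sender may arrive split
    // across several reads, so the receiver loops and reassembles the data.
    InputStream in = socket.getInputStream();
    ByteArrayOutputStream received = new ByteArrayOutputStream();
    byte[] buffer = new byte[8192];
    int n;
    while ((n = in.read(buffer)) != -1) {
        received.write(buffer, 0, n);
        // ...apply a framing rule here to decide where a message ends...
    }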

HTTP uses a combination of message framing techniques, delimiters and prefixing, to send messages over TCP (Figure 1).

Figure 1: HTTP message framing

Delimiters are predetermined markers placed inside and around messages to help the protocol separate messages and message parts from each other when reading from an input stream. The carriage return and line feed (\r\n) pair of characters divides the ASCII character stream of the HTTP message header into lines. Another delimiter, white space, divides the request and status lines into sections. A third delimiter, the colon, separates header names from header values. An empty line marks the end of the header and the beginning of the (optional) message body.

Prefixing works by sending, in the first part of a message, information about the remaining, variable part. HTTP uses headers for prefixing, instructing the protocol implementation how to handle the message body. Length prefixing is the most important: the Content-Length header tells the protocol implementation how many bytes to read before it reaches the end of the message body.
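
For illustration, here is a hypothetical request showing both techniques: every header line ends with CRLF, an empty line closes the header section, and the Content-Length header tells the receiver to read exactly 13 more bytes as the body.

    POST /messages HTTP/1.1
    Host: example.org
    Content-Type: text/plain; charset=utf-8
    Content-Length: 13

    Hello, world!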

The message framing details are clearly visible when you look at REST messages (Figure 1) and are also partially exposed in code that generates them (Listing 1).

Not a text-based protocol

That HTTP is a text-based protocol is a widespread misconception. Only the header section is sent as ASCII characters; the message body is sent as a sequence of bytes. The consequence is that sending text in the message body is not quite as straightforward as you might expect. You need to ensure that both client and server use the same method when converting text to bytes and back.

It is much safer to be explicit about the character set used by setting and reading the character set from the Accept, Accept-Charset, and Content-Type headers than relying on defaults. Client libraries and server frameworks are partially to blame for the text-based protocol misconception because they attempt to convert text using a default character set. The Apache client library uses ISO-8859-1 by default as required by RFC2616 section 3.7.1, but this obviously can cause problems if the server is sending JSON using UTF-8.
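
As a sketch, a client can read the declared charset from the Content-Type header instead of trusting a default; the parsing below is deliberately simplistic and assumes an HttpResponse such as the one obtained in Listing 1.

    // Decode the body with the charset the server declared, not a default.
    HttpEntity entity = response.getEntity();
    Header contentType = response.getFirstHeader("Content-Type");
    String charset = "ISO-8859-1"; // the RFC 2616 default for text types
    if (contentType != null && contentType.getValue().toLowerCase().contains("charset=")) {
        String value = contentType.getValue();
        charset = value.substring(value.toLowerCase().indexOf("charset=") + 8).trim();
    }
    String text = EntityUtils.toString(entity, charset);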

Listing 1: Sending a text message to a URL in Java using the Apache HTTP client library

    /**
     * Sends a text message to the given URL
     * @param message the text to send
     * @param url where to send it to
     * @return true if the message was sent, false otherwise
     **/
    public static boolean sendTextMessage(String message, String url) {
        boolean success = false;
        HttpClient httpClient = new DefaultHttpClient();
        try {
            HttpPost request = new HttpPost(url);

            BasicHttpEntity entity = new BasicHttpEntity();
            byte[] content = message.getBytes("utf-8");
            entity.setContent(new ByteArrayInputStream(content));
            entity.setContentLength(content.length);
            request.setEntity(entity);
            request.setHeader("Content-Type", "text/plain; charset=utf-8");

            HttpResponse response = httpClient.execute(request);

            StatusLine statusLine = response.getStatusLine();
            int statusCode = statusLine.getStatusCode();
            success = (statusCode == 200);
        } catch (IOException e) {
            success = false;
        } finally {
            httpClient.getConnectionManager().shutdown();
        }
        return success;
    }

Restrictions on headers

The use of delimiters for message framing limits what data can be safely sent in HTTP headers. You will find these limitations in RFC 2616, section 2.2 and section 4.2, but here is a short summary:

  • All data needs to be represented as ASCII characters
  • The delimiter characters used for message framing cannot appear in header names or values
  • Header names are further limited to lowercase and uppercase letters and the dash character
  • There is also a maximum limit on the length of each header, typically 4 or 8 KB
  • It is a convention to start all custom header names not defined in RFC2616 with “X-”

You might occasionally encounter message framing errors because some client library implementations expose the headers without enforcing these rules. If an HTTP framework or intermediary detects a framing error, it discards the request and returns the “400 Bad Request” status code. What may be even worse, every so often a malformed message will get through, causing weird behavior or a “500 Internal Server Error” status code and some incomprehensible internal error message. To avoid such hard-to-trace errors (a defensive sketch follows the list below), do not attempt to send in HTTP headers any data which:

  • comes from user input
  • is shown to the user
  • is persisted
  • can grow in size uncapped
  • you have no full control over (it is generated by third-party libraries or services)
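
As a defensive sketch (a hypothetical helper, not part of any library), you can refuse values that would break the framing before they ever reach a header:

    // Reject header values that would corrupt HTTP message framing.
    // Hypothetical helper; tighten or relax the rules for your own protocol.
    static void setSafeHeader(HttpRequest request, String name, String value) {
        if (value.length() > 4096) {
            throw new IllegalArgumentException("Header value too long: " + name);
        }
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c < 32 || c > 126) { // control characters (CR, LF) and non-ASCII
                throw new IllegalArgumentException("Unsafe character in header: " + name);
            }
        }
        request.setHeader(name, value);
    }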

Keeping protocol data separate from application data

Notice that I did not say don’t use headers at all. Many REST protocols chose not to use them, but this may not be the wisest protocol design decision. Headers and body serve distinct roles in protocol design and both are important.

The message header carries information needed by the protocol implementation itself. The headers tell the protocol what to do, but do not necessarily show what the application is doing. If you are sniffing the headers you are not likely to capture any business information collected and stored by an application.

The message body is where the application data is sent, but it has little or no influence on how the protocol itself works. Protocols typically don’t interpret the data sent in message bodies and treat it as an opaque stream of bytes.

Sending protocol data in the message body creates strong couplings between the various parts of the application, making further evolution difficult. Once I asked someone to return the URI of a newly created resource in the Content-Location header of a POST response, a common HTTP practice. “There is no need”, he said, “the URI is already available as a link in the message body”. This was true, of course, but the generic protocol logic in which I needed this URI had until then been completely independent of any resource representations. Forcing it to parse the URI out of the representations meant that it would likely break the next time the representations changed.
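
As a sketch (assuming the service does set the header, and given a response to a POST like the one in Listing 1), the generic client logic can pick up the new URI without touching the representation at all:

    // Read the URI of the newly created resource from the response header,
    // independently of whatever representation the body happens to use.
    Header location = response.getFirstHeader("Content-Location");
    if (location != null) {
        String newResourceUri = location.getValue();
        // hand newResourceUri to the generic protocol logic
    }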

Conclusion

I hope I managed to convince you that message framing in REST is not a mere implementation detail you can safely ignore.  Becoming familiar with how it works can help you avoid some common pitfalls and design more robust REST APIs. I discussed only basic HTTP message framing so far. In my next post I’ll talk about more advanced topics like chunking, compression, and multipart messages.

This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 Canada License.

REST Design Philosophies

In my previous post I said that REST is a protocol design technique. Chances are you do not agree or at least not completely. REST can be deceptively simple and frustratingly controversial at the same time. We often find ourselves in the middle of heated debates and are quickly labeled dogmatic RESTafarians or clueless, reckless hackers when expressing an opinion. If you don’t know what I’m talking about, read this well-researched article by Ivan Zuzak or better yet, join a REST discussion group.

Why is REST controversial? Building distributed applications is hard. REST simplifies it by leveraging existing web design patterns and protocols, but many important design tradeoffs remain. Designers resolve tradeoffs based on what they consider important. What is important is not simply a matter of personal opinion, but depends on the kind of application you are building. So instead of best practices everybody agrees on, REST has a number of distinct schools of thought or design philosophies: none inherently good or bad, none obviously right or wrong, each with its champions and detractors.

I’ll attempt to describe four of these design philosophies without claiming any authority in the matter. Others would likely name and describe them differently. I decided to draw quick sketches instead of painting full portraits, capturing the distinctive features and worrying less about getting every detail right.

Hypermedia APIs

This design philosophy admires and tries to replicate the properties of the World Wide Web as a distributed application development platform. Its proponents believe that the phenomenal success of the Web is due to a number of high-level design choices which together define what they call its architectural style. Roy Fielding was the first to describe this architectural style, naming it REpresentational State Transfer, or REST. For this reason many consider it the only “true” definition of REST.

The single most distinctive feature of this design philosophy is the use of hypermedia, that is, HTTP messages with embedded hyperlinks, hence the name. The links guide clients to discover available resources and operations dynamically. While specific URI patterns are used, clients are not expected to understand them and are actively discouraged from parsing or building URIs. You can get a better understanding of how Hypermedia APIs are designed and how they work by watching Jon Moore build one in this video.
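
As a made-up illustration, a hypermedia representation embeds the links a client may follow next instead of expecting the client to construct URIs:

    {
      "order": { "id": 1234, "status": "open" },
      "links": [
        { "rel": "self",    "href": "http://example.org/orders/1234" },
        { "rel": "payment", "href": "http://example.org/orders/1234/payment" },
        { "rel": "cancel",  "href": "http://example.org/orders/1234/cancellation" }
      ]
    }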

Much of the Hypermedia API discussion focuses on the difficult task of writing complex, large-scale distributed applications and on evolving them over decades. Design attributes like scalability and loose coupling are considered much more important than simplicity and ease of use. This discourages some who claim that while they understand Hypermedia APIs in theory, they find them difficult to use in practice.

Data APIs

If Hypermedia APIs are an architectural style, Data APIs are first and foremost an application integration pattern. Its proponents consider standardization as the most important design goal, believing that exposing the application’s underlying data in a standard format permits easier and more innovative integrations than exposing the application’s complex, cumbersome, and often restrictive business logic. Data APIs are published by organizations wishing to share useful data without giving access to the application which collects it or are added to existing enterprise software to facilitate third-party integrations.

Data APIs use the HTTP methods POST/GET/PUT/DELETE strictly as the CRUD operations Create/Read/Update/Delete. If they expose any application logic at all, they expose it implicitly. For example, if you use a Data API to manipulate purchase orders you obviously expect that products will be shipped and people billed as a result, but this has no direct effect on how the API looks or works. The behavior is simply implied by our shared understanding of what a purchase order is.
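
A made-up purchase order resource illustrates the mapping; the API itself says nothing about shipping or billing:

    POST   /orders            create a new purchase order
    GET    /orders/1234       read it
    PUT    /orders/1234       update it
    DELETE /orders/1234       delete it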

The URI structure is standardized and client applications are encouraged to use URI templates for building queries similar to SQL select statements. The format in which the data is returned is also standardized, often based on the Atom format. For an example of a standardized Data API take a closer look at Microsoft’s Open Data Protocol.

Web APIs

The designers of Web APIs consider achieving large-scale developer adoption as the most important API design challenge. They try to drive adoption by aiming for the widest possible platform support and maintaining a very low barrier of entry.

It is only a slight exaggeration to say that if a feature is not guaranteed to work in Adobe Flash, on an underpowered mobile phone, on a remote Pacific island, through a web proxy, and from behind an ancient firewall from the 90’s, Web API designers won’t use it. If an interface cannot be accessed directly from HTML forms or called by writing less than five lines of JavaScript, Web API designers worry about adoption.

The result is that Web APIs manage to get by using a surprisingly small subset of features. They stick to GET and POST, largely eschew HTTP headers, strongly favor JSON over XML, and rarely use hyperlinks. Information is often passed in URLs, because “URLs are easy to type into the address bar of a browser”. I call them Web APIs because I see this pattern in services that are offered to developers at large over the Internet (often free of charge) and are primarily used in web mashups or equivalent mobile apps. Many of the smaller, simpler APIs listed in the Programmable Web API Directory are Web APIs. Perhaps the best known and most frequently imitated is the Twitter API.

Pragmatic REST APIs

Pragmatists believe that design trade-offs should be resolved based on concrete, project-specific requirements and constraints. They view the indiscriminate application of the same design patterns to widely different problems as a sign of dogmatic thinking, not as disciplined design. Absent specific requirements and constraints, pragmatists prefer to use the simplest possible design. What “simplest” means is again very context dependent. It may mean a design which fits the technologies, frameworks, or tools used the best.

You might argue that Pragmatic REST is not a separate design philosophy since its practitioners freely borrow and mix design approaches from all the other schools. The “pragmatic” label best describes APIs which do not fit neatly into any of the other categories. There are quite a few of these. If you look at the collection of the Google APIs you will discover characteristics of Hypermedia APIs, Data APIs, and Web APIs, but not in their purest forms.

In Lieu of Conclusion

If memory serves me well, the Big REST Debate reached its peak sometime around 2008. The rhetoric has cooled since, largely because the protagonists agreed to disagree, do REST their own way, and mostly ignore each other. But the issue is far from settled and I doubt it will ever be. When I decided to illustrate the principles and methods of protocol design using REST, I knew I couldn’t pretend not to notice the elephant in the room. That’s why I wrote this post. In my next post I’ll return to talk about REST and protocol design.

This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 Canada License.