The upcoming HTML5 specification includes a lot of powerful and exciting features which turn web browsers into a fully capable rich internet application (RIA) client platform. This article takes a deeper look into two new HTML5 communication standards, Server-Sent Events and WebSockets. These new standards have the potential to become the dominant server-push technologies for helping developers write real-time web applications.
1. The evolution of the web
Since its beginning in the 1990s, the World Wide Web has grown rapidly and become more and more dynamic. While the first generation of web pages were static documents, later generations of web sites have been enriched by dynamic elements, followed ultimately by the development of highly interactive browser-based rich internet applications.
The Hypertext Transfer Protocol (HTTP) was designed in the early days of the internet to transport information entities such as web pages or multimedia files between clients and servers. HTTP became a fundamental part of the World Wide Web initiative. The design of the HTTP protocol has been driven by the idea of a distributed, collaborative, hypermedia information system "to give universal access to a large universe of documents." The main design goal of the HTTP protocol was to minimize latency and network communication on the one hand, and to maximize scalability and independence of the components on the other hand.
The HTTP protocol implements a strict request-response model, which means that the client sends an HTTP request and waits until the HTTP response is received. The protocol provides no way for the server to initiate interaction with the client.
Listing 1. HTTP request on the wire
GET /html/rfc2616.html HTTP/1.1
Host: tools.ietf.org
User-Agent: xLightweb/2.11.3
For instance, a browser or other user agent will request a specific web page by performing an HTTP GET request such as the one shown in Listing 1. The server returns the requested information entity in an HTTP response such as the one shown in Listing 2. The response consists of the HTTP response header and the HTTP response body, which are separated by a blank line. The GET request, in contrast, includes an HTTP header only.
Listing 2. HTTP response on the wire
HTTP/1.1 200 OK
Server: Apache/2.2.14 (Debian)
Date: Tue, 09 Feb 2010 05:00:01 GMT
Content-Length: 510825
Content-Type: text/html; charset=UTF-8
Last-Modified: Fri, 25 Dec 2009 05:49:54 GMT
ETag: "302e72-7cb69-47b871ff82480"
Accept-Ranges: bytes

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html lang="en" xml:lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii" />
<meta name="robots" content="index,follow" />
<meta name="creator" content="rfcmarkup version 1.53" />
<link rel="icon" href="/images/rfc.png" type="image/png" />
<link rel="shortcut icon" href="/images/rfc.png" type="image/png" />
<title>RFC 2616 Hypertext Transfer Protocol -- HTTP/1.1</title>
[...]
</small></small></span>
</body></html>
To create more interactive web applications, the AJAX approach has been established as a popular solution for dynamically pulling data from the server. By using AJAX, the browser application performs HTTP requests based on the XMLHttpRequest API. The XMLHttpRequest API enables performing the HTTP request in the background in an asynchronous way without blocking the user interface. AJAX does not define new kinds of HTTP requests or anything else. It just performs an HTTP request in the background.
In the case of a web mail application, for instance, the web application could periodically perform an "AJAX request" to ask the server whether the mailbox content has changed. This polling approach causes an event latency which depends on the polling period. Increasing the polling rate reduces event latency. The downside is that frequent polling wastes system resources and scales poorly: most polling calls return empty because no new events have occurred.
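The trade-off between polling period, event latency, and wasted requests can be made concrete with a small simulation. The following is a minimal sketch only: the function name `simulate_polling`, the event times, and the polling periods are illustrative assumptions, not measurements from a real mail server.

```python
def simulate_polling(event_times, period, duration):
    """Simulate a client that polls every `period` seconds for `duration` seconds.

    `event_times` are the (hypothetical) seconds at which new mail arrives.
    An event reaches the client at the first poll at or after it occurs.
    Returns (average event latency, total polls, polls that returned empty).
    """
    poll_times = range(period, duration + 1, period)
    pending = sorted(event_times)
    latencies = []
    empty_polls = 0
    next_event = 0
    for now in poll_times:
        delivered = 0
        # Deliver every event that occurred since the previous poll.
        while next_event < len(pending) and pending[next_event] <= now:
            latencies.append(now - pending[next_event])
            next_event += 1
            delivered += 1
        if delivered == 0:
            empty_polls += 1
    average = sum(latencies) / len(latencies) if latencies else 0.0
    return average, len(poll_times), empty_polls


# Three mails arrive within five minutes (assumed values).
slow = simulate_polling([7, 95, 250], period=60, duration=300)
fast = simulate_polling([7, 95, 250], period=5, duration=300)
```

With these assumed inputs, polling every 60 seconds needs 5 requests but delays each mail by tens of seconds on average, while polling every 5 seconds brings the average delay down to about a second at the cost of 60 requests, 57 of which return nothing.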
While AJAX is a popular solution for dynamically pulling data from the server, it does nothing to help push data to the client. Sure, a server push channel could be emulated by an AJAX polling approach as described above, but this would waste resources. Comet, also known as reverse AJAX, enhances the AJAX communication pattern by defining an architecture for pushing data from the server to the client. For instance, the Comet pattern would allow pushing a 'new mail available' event from the mail server to the web mail client immediately.
Like AJAX, Comet is built on top of the existing HTTP protocol without modifying it. This is a little bit tricky because the HTTP protocol is not designed to send unrequested responses from the server to the client. An HTTP response always requires a previous HTTP request initiated by the client. The Comet approach works around this limitation by maintaining long-lived HTTP connections and "faking" notifications.
In practice, two popular strategies have been established. With long polling, the client sends an HTTP request and waits for a server event. If an event occurs on the server side, the server sends the response including the event data. After receiving the response containing the event data, the client sends a request again and waits for the next event. There is always a pending request, which allows the server to send a response at any time. With HTTP streaming, the server keeps the response message open. In contrast to long polling, the HTTP response message (body) is not closed after an event has been sent to the client. If an event occurs on the server side, the server writes this event to the open response message body. The HTTP response message body represents a unidirectional event stream to the client.
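The difference between the two strategies can be illustrated with an in-process model. This is a simplified sketch, not a real HTTP implementation: the `EventServer` class and both client functions are hypothetical, a `queue.Queue` stands in for events occurring on the server side, and each call to `long_poll` or `stream` stands in for one HTTP request.

```python
import queue


class EventServer:
    """In-process stand-in for a Comet server (no real HTTP involved)."""

    def __init__(self):
        self._events = queue.Queue()

    def publish(self, event):
        self._events.put(event)

    def long_poll(self, timeout):
        """One request: wait for a single event, then complete the response."""
        try:
            return self._events.get(timeout=timeout)
        except queue.Empty:
            return None  # the pending request timed out without an event

    def stream(self, timeout):
        """One request: keep the response open and write each event to it."""
        while True:
            try:
                yield self._events.get(timeout=timeout)
            except queue.Empty:
                return  # give up after an idle period so the sketch terminates


def long_polling_client(server, timeout):
    """Re-issue a request after every response; returns (events, request count)."""
    received, requests = [], 0
    while True:
        requests += 1
        event = server.long_poll(timeout)
        if event is None:
            return received, requests
        received.append(event)


def streaming_client(server, timeout):
    """Read all events from one open response; returns (events, request count)."""
    return list(server.stream(timeout)), 1
```

In this model, receiving three events costs the long-polling client one request per event plus the request that is still pending at the end, whereas the streaming client reads all three from a single response.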
The Comet approach allows implementing a real-time web. In contrast to the early days of the web, many of today's web applications have to receive information as soon as it has been published on the server side. For instance, a web-based chat application will be successful only if the underlying architecture supports real-time communication.
2. HTML 5
The upcoming HTML 5 extends the HTML language to better support highly interactive web applications. HTML 5 turns web browsers into a fully capable rich internet application client platform, and is a direct competitor of other client platform environments such as Adobe Flash and Microsoft Silverlight.
For instance, the HTML 5 standard includes technologies such as the WebWorkers API, which allows running scripts in the background independently, the WebStorage API to store structured data on the client side, and the new canvas element, which supports dynamic scriptable rendering of 2D bitmap images.
2.1 Server-Sent Events
HTML5 also applies the Comet communication pattern by defining Server-Sent Events (SSE), in effect standardizing Comet for all standards-compliant web browsers. The Server-Sent Events specification "defines an API for opening an HTTP connection for receiving push notifications from a server." Server-Sent Events includes the new JavaScript interface EventSource as well as a new MIME type, text/event-stream, which defines an event framing format.
EventSource represents the client-side end point to receive events. The client opens an event stream by creating an EventSource, which takes an event source URL as its constructor argument. The onmessage event handler will be called each time new data is received.
In general, browsers limit the number of connections per server. Under some circumstances, loading multiple pages that include an EventSource from the same domain can result in each EventSource creating a dedicated connection. Often the maximum number of connections is quickly exceeded in such situations. To handle the per-server connection limitation, a shared WebWorker, which shares a single EventSource object, can be used. Furthermore, by definition the browser-specific EventSource implementation is free to reuse an existing connection if the event source's absolute URL is equal to the required one. In this case, sharing connections will be managed by the browser-specific EventSource implementation.
Figure 1 shows the HTTP request which will be sent by a browser if an event stream is opened. The Accept header indicates the required format, text/event-stream. Although the new MIME type text/event-stream is defined by the Server-Sent Events spec, the specification also allows using other formats for event framing. However, a valid Server-Sent Events implementation has to support the MIME type text/event-stream at minimum.
REQUEST:
GET /Events HTTP/1.1
Host: myServer:8875
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; de-DE) AppleWebKit/532+ (KHTML, like Gecko) Version/4.0.4 Safari/531.21.10
Accept-Encoding: gzip, deflate
Referer: http://myServer:8875/
Accept: text/event-stream
Last-Event-Id: 6
Accept-Language: de-DE
Cache-Control: no-cache
Connection: keep-alive

RESPONSE:
HTTP/1.1 200 OK
Server: xLightweb/2.12-HTML5Preview6
Content-Type: text/event-stream
Expires: Fri, 01 Jan 1990 00:00:00 GMT
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Pragma: no-cache
Connection: close

: time stream
retry: 5000

id: 7
data: Thu Mar 11 07:31:30 CET 2010

id: 8
data: Thu Mar 11 07:31:35 CET 2010

[...]
Figure 1. An example event-stream
According to the MIME type text/event-stream, an event consists of one or more comment lines and/or field lines on the wire level. The event is delimited by a blank line. A comment always starts with a colon (:). Fields, on the other hand, consist of a field name and a field value separated by a colon.
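The framing rules just described are simple enough to parse in a few lines. The following is a minimal sketch, assuming the stream is already available as decoded text with `\n` line endings; the function name `parse_event_stream` and the dictionary layout are illustrative choices, and the sample text is the event stream from Figure 1.

```python
def parse_event_stream(text):
    """Split text/event-stream content into events.

    Each event is returned as a dict with its comment lines and its
    (field name, field value) pairs. A trailing block that is not yet
    closed by a blank line is ignored, because that event is incomplete.
    """
    events = []
    comments, fields = [], []
    for line in text.splitlines():
        if line == "":
            # A blank line delimits the event.
            if comments or fields:
                events.append({"comments": comments, "fields": fields})
            comments, fields = [], []
        elif line.startswith(":"):
            # A comment always starts with a colon.
            comments.append(line[1:].strip())
        else:
            # Field name and value are separated by the first colon;
            # a single space after the colon is not part of the value.
            name, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]
            fields.append((name, value))
    return events


FIGURE_1_STREAM = (
    ": time stream\n"
    "retry: 5000\n"
    "\n"
    "id: 7\n"
    "data: Thu Mar 11 07:31:30 CET 2010\n"
    "\n"
    "id: 8\n"
    "data: Thu Mar 11 07:31:35 CET 2010\n"
    "\n"
)

events = parse_event_stream(FIGURE_1_STREAM)
```

Splitting on the first colon only matters here: the data values in Figure 1 contain colons themselves (the time of day), and they must stay part of the field value.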
Figure 1 also shows an example response. To avoid caching, the response header includes cache directives which disable caching of the response. Event streams should not be cached by definition.
The example response includes three events. The first event includes a comment and a retry field; the second and third events include an id field and event data. The data field holds the event data, which is the current time in the example above. The second and third events also include an id field to track progress through the event stream. The example server application writes a new event on the wire every 5 seconds. If the EventSource receives the event, the onmessage handler will be called.
In contrast to the second and third events, the first event will not trigger the onmessage handler. The first event does not contain data. It includes a comment line and a retry field for reconnecting purposes. The retry field defines the reconnect time in milliseconds. If such a field is received, the EventSource will update its associated reconnection time property with the received one. The reconnect time plays an important role in improving reliability in cases where network errors arise. If the event source instance detects that the connection has been dropped, the connection will be re-established automatically after a delay equal to the reconnection time.
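The client-side behaviour described above can be modelled as follows. This is a rough sketch rather than a browser implementation: the class name `EventSourceState` is hypothetical, the default reconnection time of 3000 ms is an assumption chosen for illustration, and each event is assumed to arrive as a list of already-parsed (field name, field value) pairs.

```python
class EventSourceState:
    """Models how an event source reacts to the fields of one received event."""

    def __init__(self, on_message, reconnection_time_ms=3000):
        self.on_message = on_message
        # Assumed default; it is replaced when the server sends a retry field.
        self.reconnection_time_ms = reconnection_time_ms

    def handle_event(self, fields):
        data_lines = []
        for name, value in fields:
            if name == "retry":
                # Only a value made of ASCII digits updates the reconnect time.
                if value.isascii() and value.isdigit():
                    self.reconnection_time_ms = int(value)
            elif name == "data":
                data_lines.append(value)
        # An event without data does not trigger the message handler.
        if data_lines:
            self.on_message("\n".join(data_lines))


received = []
state = EventSourceState(on_message=received.append)
state.handle_event([("retry", "5000")])  # first event of Figure 1
state.handle_event([("id", "7"), ("data", "Thu Mar 11 07:31:30 CET 2010")])
```

After the first event, only the reconnection time has changed; the handler is called for the first time when the second event delivers data.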
As shown in Figure 1, the HTTP request to establish the connection can be enriched by the Last-Event-Id header. This header will be set if the EventSource's last event id property is set to a non-empty string. The EventSource's last event id property will be updated each time an event is received that contains an id, or it will be reset to an empty string if no id is present. Because of this, an event stream can be re-established without repeating or missing any events. This guarantees message delivery if lastEventId handling is implemented on the server side.
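On the server side, handling the header amounts to replaying the events the client has not seen yet. The sketch below illustrates one possible approach under stated assumptions: the server keeps a log of recent events, event ids are increasing integers as in Figure 1, and the function name `events_to_resend` is hypothetical. The specification itself does not prescribe how ids are chosen.

```python
def events_to_resend(event_log, last_event_id_header):
    """Return the logged events a reconnecting client still needs.

    `event_log` is a list of (event id, data) tuples ordered by id.
    `last_event_id_header` is the value of the Last-Event-Id request
    header, or None if the client did not send one.
    """
    if not last_event_id_header:
        # No header: treat the request as a fresh connection
        # (replaying the whole log is an assumption of this sketch).
        return list(event_log)
    try:
        last_seen = int(last_event_id_header)
    except ValueError:
        # Unknown id format: fall back to the fresh-connection behaviour.
        return list(event_log)
    return [(event_id, data) for event_id, data in event_log if event_id > last_seen]


log = [
    (6, "Thu Mar 11 07:31:25 CET 2010"),
    (7, "Thu Mar 11 07:31:30 CET 2010"),
    (8, "Thu Mar 11 07:31:35 CET 2010"),
]
missed = events_to_resend(log, "6")
```

With the request from Figure 1 (`Last-Event-Id: 6`), the server would continue the stream with events 7 and 8. The timestamp for event 6 is made up for this example.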
Listing 4 shows an example HttpServer based on the Java HTTP library xLightweb (with HTML5 preview extension), which is maintained by the author.
Listing 4. Example Java-based server delivering text/event-stream
The Server-Sent Events specification recommends sending a "keep-alive" comment event periodically if no other data event is sent. Some proxy servers are known to drop an HTTP connection after a short period of inactivity. Such proxy servers close idle connections to avoid wasting connections on unresponsive HTTP servers. Sending a comment event prevents this behaviour. Even though the EventSource will re-establish the connection automatically, sending comment events periodically avoids unnecessary reconnects.
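One way to implement this recommendation is to wait for the next event with a timeout and to write a comment whenever the timeout expires. The following is a minimal sketch, not the xLightweb implementation: the names `format_event` and `next_chunk`, the use of a `queue.Queue` as the event source, and the comment text `keep-alive` are all assumptions made for illustration.

```python
import queue


def format_event(data, event_id=None):
    """Serialize one event in text/event-stream framing."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    # Multi-line data is sent as one data field per line.
    for part in data.split("\n"):
        lines.append(f"data: {part}")
    return "\n".join(lines) + "\n\n"


def next_chunk(events, keepalive_seconds):
    """Return the text to write next on the open response.

    Waits up to `keepalive_seconds` for an (event id, data) tuple from the
    queue. If nothing arrives in time, a comment is returned instead, which
    keeps the connection active without triggering the message handler.
    """
    try:
        event_id, data = events.get(timeout=keepalive_seconds)
    except queue.Empty:
        return ": keep-alive\n\n"
    return format_event(data, event_id)
```

A server loop would call `next_chunk` repeatedly and write each returned string to the response body. The keep-alive interval should be chosen shorter than the idle timeout of the proxies involved, which has to be determined for the deployment in question.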
Server-Sent Events are based on HTTP streaming. As described above, the response stays open and event data is written as it occurs on the server side. Theoretically, HTTP streaming will cause trouble if network intermediaries such as HTTP proxies do not forward partial responses immediately. The current HTTP RFC (RFC 2616, Hypertext Transfer Protocol - HTTP/1.1) does not require partial responses to be forwarded immediately. However, a lot of popular, well-working web applications exist which are built on the HTTP streaming approach. Furthermore, production-level intermediaries typically avoid buffering large amounts of data to minimize their memory usage.
In contrast to other popular Comet protocols such as Bayeux or BOSH, Server-Sent Events support a unidirectional server-to-client channel only. The Bayeux protocol, on the other hand, supports a bidirectional communication channel. Furthermore, Bayeux can use HTTP streaming as well as long polling. Like Bayeux, the BOSH protocol is a bidirectional protocol. BOSH is based on the long polling approach.
Although Server-Sent Events offer less functionality than Bayeux or BOSH, they have the potential to become the dominant protocol for use cases where only a unidirectional server push channel is required (which is the case in many instances). The Server-Sent Events protocol is much simpler than Bayeux or BOSH. For instance, you are able to test the event stream by using telnet. No handshake protocols have to be implemented. Just send the HTTP GET request and get the event stream. Furthermore, Server-Sent Events will be supported natively by all HTML5-compatible browsers. In contrast, the Bayeux and BOSH protocols are implemented on top of the browser language environment.
Still to come...
Part 2 of this article series will cover the WebSocket API and protocol, and present concluding remarks.
- Server-Sent Events Specification - W3C Working Draft 22 December 2009
- Asynchronous HTTP and Comet architectures - Gregor Roth
- RESTful HTTP in practice - Gregor Roth
- Best Practices for the Use of Long Polling and Streaming in Bidirectional HTTP