Django is an awesome framework for building web services using the Python language. However, it is not well-suited for handling long-lived connections, which are needed for realtime data push. In this article we’ll explain how to pair Django with Fanout Cloud to reach realtime nirvana.

How it works

The Fanout Cloud gives web services realtime superpowers. The architecture is similar to a traditional caching CDN, except that Fanout Cloud is designed for pushing data rather than caching. Django applications can run behind Fanout Cloud to easily support realtime behaviors such as WebSockets, HTTP streaming, and HTTP long-polling.


The django-grip library is used to communicate with Fanout Cloud. “GRIP” is the name of Fanout’s open protocol for communication between Fanout Cloud and the backend application. For example, to hold a request open in streaming mode, the Django application passes the request object to set_hold_stream() before returning the response:

from django.http import HttpResponse
from django_grip import set_hold_stream

def endpoint(request):
    set_hold_stream(request, 'test')
    return HttpResponse('[stream open]\n')

The above code instructs Fanout Cloud to send the response to the client, but keep the request open and subscribed to a channel called test. Fanout Cloud holds the request open by stripping any Content-Length header from the response (and switching to chunked encoding if the client supports HTTP/1.1). From the perspective of the Django application, the HTTP request has completed, but the request between the client and Fanout Cloud remains open.
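
To make the hold instruction a little more concrete, here is a minimal sketch of what such a response could carry. It assumes the header-based form of GRIP instructions (Grip-Hold and Grip-Channel response headers); the helper name grip_hold_headers is hypothetical, and django-grip may encode the instruction differently on the wire.

```python
def grip_hold_headers(mode, channel):
    """Return response headers for a GRIP hold instruction (header form).

    Hypothetical helper for illustration only; not part of django-grip.
    """
    if mode not in ('stream', 'response'):
        raise ValueError('unknown hold mode: %r' % mode)
    return {
        'Grip-Hold': mode,        # 'stream' keeps the response body open
        'Grip-Channel': channel,  # channel the held request is subscribed to
    }


headers = grip_hold_headers('stream', 'test')
print(headers)
```

The proxy consumes these instructions itself, so the client only ever sees an ordinary HTTP response.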

Whenever the server has data to push to any listening clients, it calls publish():

from gripcontrol import HttpStreamFormat
from django_grip import publish
...
publish('test', HttpStreamFormat('hello world\n'))

The HttpStreamFormat encapsulates a chunk of response body to be sent. Under the hood, the publish() call performs an asynchronous HTTP POST to send the data to Fanout Cloud.
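
As a rough illustration of what that POST might contain, the sketch below builds a publish body for a single http-stream item. It assumes the JSON item layout used by GRIP publish endpoints; the function name build_publish_body is hypothetical, and authentication and the HTTP call itself are left out.

```python
import json


def build_publish_body(channel, content):
    """Serialize one item carrying an http-stream format (illustrative only)."""
    item = {
        'channel': channel,
        'formats': {
            'http-stream': {'content': content},
        },
    }
    return json.dumps({'items': [item]})


body = build_publish_body('test', 'hello world\n')
print(body)
```

In practice publish() takes care of all of this, so application code never needs to assemble the body by hand.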

The django-grip library also includes facilities for WebSockets and long-polling, which we will discuss later on in this article.

Why Fanout Cloud?

So many reasons. First, the inline proxying technique has some great benefits:

  • Code consolidation. No need to split out your realtime code into a custom application (e.g. a Tornado or Node.js app).
  • You get the simplicity of something like Faye or Pusher, but with the low-level network control of a custom application. Fanout Cloud is ideal for API creators.
  • Django’s many features such as authentication, middleware, debugging, etc. become available through Fanout Cloud.

Of course, Fanout Cloud being an external service means even more nice things:

  • Delegating realtime push through an edge network such as Fanout Cloud is the key to high scalability.
  • No additional servers to manage. Your Django application is all you need.

Headline example

To demonstrate how to use Fanout Cloud with Django, we’ll walk through the complete process of making a simple “headline” service that allows storage and retrieval of text messages. Whenever a headline is changed, updates will be pushed out to interested listeners. You can imagine this service being useful for marquees or broadcast alerts.

We’ll start out by first building a non-realtime version of the headline API. We generally recommend that all projects start out this way. Get your CRUD stuff working reliably before you delve into realtime.

Below is the data model for our app (“headlineapp”). Nothing out of the ordinary.

from django.db import models

class Headline(models.Model):
    type = models.CharField(max_length=64)
    title = models.CharField(max_length=200)
    text = models.TextField()
    date = models.DateTimeField(auto_now=True)

    def to_data(self):
        out = dict()
        out['id'] = str(self.id)
        out['type'] = self.type
        if self.title:
            out['title'] = self.title
        out['date'] = self.date.isoformat()
        out['text'] = self.text
        return out

    def __unicode__(self):
        return u'%s: %s' % (self.type, self.text[:100])

Headlines have a type, title, text, and automatic timestamp. The type field is intended to be machine-readable to determine how the headline should be displayed. The title and text fields are human-readable. The title field could be used as the title of a pop-up window, for example if the headline is used for an alert. The to_data() method is a convenience method that converts the object into a JSON-encodable data structure.

Now for the view code:

import calendar
import json
from django.http import HttpResponse, HttpResponseNotModified, \
    HttpResponseNotAllowed
from django.shortcuts import get_object_or_404
from headlineapp.models import Headline

def _json_response(data):
    body = json.dumps(data, indent=4) + '\n' # pretty print
    return HttpResponse(body, content_type='application/json')

def base(request):
    if request.method == 'POST':
        h = Headline(type='none', title='', text='')
        h.save()
        return _json_response(h.to_data())
    else:
        return HttpResponseNotAllowed(['POST'])

def item(request, headline_id):
    h = get_object_or_404(Headline, pk=headline_id)

    if request.method == 'GET':
        inm = request.META.get('HTTP_IF_NONE_MATCH')
        etag = '"%s"' % calendar.timegm(h.date.utctimetuple())
        if inm == etag:
            resp = HttpResponseNotModified()
        else:
            resp = _json_response(h.to_data())
        resp['ETag'] = etag
        return resp
    elif request.method == 'PUT':
        hdata = json.loads(request.read())
        h.type = hdata['type']
        h.title = hdata.get('title', '')
        h.text = hdata.get('text', '')
        h.save()
        hdata = h.to_data()
        etag = '"%s"' % calendar.timegm(h.date.utctimetuple())
        resp = _json_response(hdata)
        resp['ETag'] = etag
        return resp
    else:
        return HttpResponseNotAllowed(['GET', 'PUT'])

Lastly, the URL mappings:

from django.conf.urls import patterns, url
from headlineapp import views

urlpatterns = patterns('',
    url(r'^$', views.base, name='base'),
    url(r'^(?P<headline_id>\d+)/$', views.item, name='item'),
)

With this code, we have an API that lets us:

  • POST / in order to create a new empty headline object and receive its id.
  • PUT /{headline-id}/ to update a headline object.
  • GET /{headline-id}/ to retrieve a headline object (with ETag support).
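
The ETag logic in the view can be tried in isolation. The sketch below repeats the same calculation (the headline's date converted to whole seconds since the epoch, wrapped in quotes) using only the standard library and Python 3; the function names make_etag and is_not_modified are ours, and the timestamp is the one that appears in the long-polling example later in this article.

```python
import calendar
from datetime import datetime, timezone


def make_etag(dt):
    """Derive a quoted ETag from a timezone-aware datetime (second precision)."""
    return '"%s"' % calendar.timegm(dt.utctimetuple())


def is_not_modified(if_none_match, etag):
    """True when the client's If-None-Match value matches the current ETag."""
    return if_none_match == etag


date = datetime(2014, 11, 6, 9, 31, 57, 795823, tzinfo=timezone.utc)
etag = make_etag(date)
print(etag)  # "1415266317"
```

One consequence of this design is that the ETag only has one-second resolution, so two updates within the same second would share an ETag.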

Here’s how curl could be used to create a headline:

curl -X POST http://api.headlineapp.org/

Response:

{
    "date": "2014-10-30T05:47:44.666900+00:00", 
    "text": "", 
    "type": "none", 
    "id": "1"
}

Updating a headline:

curl -d '{"type":"normal", "text": "hello to the world"}' \
  -X PUT http://api.headlineapp.org/1/

Response:

{
    "date": "2014-10-30T05:48:28.601426+00:00", 
    "text": "hello to the world", 
    "type": "normal", 
    "id": "1"
}

Getting the current value of a headline:

curl http://api.headlineapp.org/1/

Response:

{
    "date": "2014-10-30T05:48:28.601426+00:00", 
    "text": "hello to the world", 
    "type": "normal", 
    "id": "1"
}

Still with us? Up until this point all we’ve done is create a conventional, non-realtime API. Next we’ll add Fanout Cloud into the mix and liven things up!

Fanout Cloud configuration

Our headline API uses the domain api.headlineapp.org, which we’ve added as a virtual host in the Fanout control panel:

[Screenshot: domain settings in the Fanout control panel]

We’ve set it to route to our backend Django application running on Heroku. Of course you can run your own backend application anywhere and you can use any domain name.

We also enable the WebSocket-over-HTTP protocol:

[Screenshot: WebSocket-over-HTTP option in the Fanout control panel]

On the Django side, we need the django-grip library installed. We can get it with pip:

pip install django-grip

In our project’s settings.py file, we include the GRIP middleware:

MIDDLEWARE_CLASSES = (
    ...
    'django_grip.GripMiddleware',
    ...
)

Finally, we need to set GRIP_PROXIES to contain the Fanout Cloud settings:

from base64 import b64decode

GRIP_PROXIES = [
    {
        'key': b64decode('{realm-key}'),
        'control_uri': 'http://api.fanout.io/realm/{realm}',
        'control_iss': '{realm}'
    }
]

Substitute {realm} and {realm-key} with your own Fanout credentials.

For a more compact GRIP configuration, you can use a URL representation instead. This can be handy if you want to use environment variables for configuration, which is what we do for our instance based on Heroku. For example:

from django_grip import parse_grip_uri

grip_url = 'http://api.fanout.io/realm/{realm}?iss={realm}' \
           '&key=base64:{realm-key}'
GRIP_PROXIES = [parse_grip_uri(grip_url)]

Note: If you decide to configure via URL, be sure that your realm-key is URL-encoded. Not all Base64 characters are URL-safe.
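
A minimal sketch of that encoding step, using Python 3's urllib.parse (in Python 2 the same functions live in urllib). The realm name and key below are made-up placeholders and build_grip_url is a hypothetical helper, not part of django-grip.

```python
from urllib.parse import quote, unquote


def build_grip_url(realm, realm_key_b64):
    """Assemble a GRIP URL, percent-encoding the Base64 key for safe transport."""
    return 'http://api.fanout.io/realm/%s?iss=%s&key=base64:%s' % (
        quote(realm, safe=''),
        quote(realm, safe=''),
        quote(realm_key_b64, safe=''),  # '+' and '/' are not query-safe
    )


# Placeholder credentials for illustration only.
url = build_grip_url('myrealm', '+//+')
print(url)
```

Without the encoding step, a "+" in the key could be read back as a space when the query string is parsed.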

Realtime!

Now that the Django application is configured to use Fanout Cloud, it’s time for the fun part. We’ll start simple and add support for Server-Sent Events (SSE). Browsers can consume an SSE endpoint using the EventSource JavaScript API. It is the easiest way to push data in realtime to modern browsers.

First, we’ll need some new imports:

...
from gripcontrol import HttpStreamFormat
from django_grip import set_hold_stream, publish
...

We can look for the Accept header to determine if the client wants SSE:

...
if request.method == 'GET':
    if request.META.get('HTTP_ACCEPT') == 'text/event-stream':
        set_hold_stream(request, 'headline-%s' % headline_id)
        return HttpResponse(content_type='text/event-stream')
    else:
        # original GET code
        ...

The set_hold_stream() call puts special instructions in the HTTP response telling Fanout Cloud to keep the request open and subscribe it to the specified channel.
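
The exact string comparison on the Accept header works for browsers' EventSource requests, but it would miss clients that list several media types or add parameters. If that matters for your API, a more tolerant check could look like the sketch below; wants_event_stream is a hypothetical helper, and it deliberately ignores q-values.

```python
def wants_event_stream(accept_header):
    """Return True if text/event-stream appears anywhere in an Accept header."""
    if not accept_header:
        return False
    for part in accept_header.split(','):
        # Drop parameters such as ";q=0.9" and normalize case and whitespace.
        media_type = part.split(';', 1)[0].strip().lower()
        if media_type == 'text/event-stream':
            return True
    return False


print(wants_event_stream('text/html, text/event-stream;q=0.9'))
```

In the view, this would replace the equality test: wants_event_stream(request.META.get('HTTP_ACCEPT')).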

All that’s left is to publish updates whenever a headline changes, which we can do in the PUT handler:

...
elif request.method == 'PUT':
    hdata = json.loads(request.read())
    h.type = hdata['type']
    h.title = hdata.get('title', '')
    h.text = hdata.get('text', '')
    h.save()
    hdata = h.to_data()
    hjson = json.dumps(hdata)  # compact JSON for the SSE data line
    etag = '"%s"' % calendar.timegm(h.date.utctimetuple())

    # publish
    formats = list()
    formats.append(HttpStreamFormat('event: update\ndata: %s\n\n' % hjson))
    publish('headline-%s' % headline_id, formats)

    resp = _json_response(hdata)
    ...

The publish call pushes data on the specified channel in one or more formats. We use the HttpStreamFormat here, which specifies a chunk of HTTP response body to send. Since we’re implementing the SSE protocol, we make sure to send data with the proper formatting. The publish() call is asynchronous and does not block the calling thread.
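
The SSE wire format is simple, but easy to get subtly wrong: each line of the payload needs its own "data:" prefix, and a blank line terminates the event. The compact JSON used above is always a single line, so the inline format string is safe there. For payloads that might contain newlines, a small helper along these lines could be used; sse_event is a hypothetical name, not part of gripcontrol.

```python
import json


def sse_event(event, data):
    """Format one Server-Sent Event, prefixing every payload line with 'data: '."""
    lines = ['event: %s' % event]
    for line in data.split('\n'):
        lines.append('data: %s' % line)
    # A blank line marks the end of the event.
    return '\n'.join(lines) + '\n\n'


chunk = sse_event('update', json.dumps({'id': '1', 'text': 'hello'}))
print(repr(chunk))
```

The resulting string is what would be wrapped in an HttpStreamFormat before publishing.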

Consuming SSE from a browser is easy:

var es = new EventSource('http://api.headlineapp.org/1/');
es.addEventListener('update', function (e) {
    console.log(e.data);
}, false);

There you have it! Just a few lines of code and we’ve got a massively scalable realtime API, completely defined and controlled by Django. Connecting clients don’t even know Fanout is there.

What about WebSockets?

Yes, with Fanout Cloud, Django can even use WebSockets! Fanout Cloud will bundle incoming WebSocket events into HTTP requests and send them to the Django application. Fanout Cloud will expect responses to contain bundled WebSocket events as well. You don’t really have to think about this, though, as the django-grip library includes a socket-like abstraction that takes care of the marshalling for you.

Let’s add some basic WebSocket handler code to the headline project. First, we need another import:

...
from gripcontrol import WebSocketMessageFormat
...

And here’s the code that handles incoming requests and messages:

def item(request, headline_id):
    h = get_object_or_404(Headline, pk=headline_id)

    # websocket handling
    if request.wscontext:
        ws = request.wscontext
        if ws.is_opening():
            ws.accept()
            ws.subscribe('headline-%s' % headline_id)
        while ws.can_recv():
            message = ws.recv()
            if message is None:
                ws.close()
                break
        return HttpResponse()

    elif request.method == 'GET':
        ...

The django-grip middleware sets a wscontext property on every request object. If the incoming request was a WebSocket-over-HTTP request, then the property will be a socket-like object. Otherwise, the property will be set to None.

The above code accepts all connections and subscribes each one to the channel for the requested headline. It also reads all incoming messages but does nothing with them. If a close event is received (indicated by a None response to the ws.recv() call), then the WebSocket is cleanly closed. Just to be clear here, the Django application is not using WebSockets directly. It is all simulated within the django-grip library and transported over normal HTTP. For more information about the WebSocket-over-HTTP protocol, see the spec.

You’ll notice that the WebSocket code path also returns an empty HttpResponse object. This is needed to satisfy middlewares that are expecting a proper response object to be returned by the view. However, this response is not sent back to the requestor. The django-grip middleware will end up hijacking and rewriting the response based on the actions taken with the request.wscontext object.
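
To give a feel for what the middleware is marshalling, here is a simplified sketch of the event framing: an event type, optionally followed by a hexadecimal content length and the content, each terminated by CRLF, much like HTTP chunked encoding. Treat the details as an assumption and defer to the spec; encode_event and decode_events are hypothetical names, and django-grip does all of this for you.

```python
def encode_event(event_type, content=None):
    """Encode one event as bytes; content, if given, must be bytes."""
    if content is None:
        return event_type.encode('ascii') + b'\r\n'
    size = format(len(content), 'x').encode('ascii')  # length in hex
    return event_type.encode('ascii') + b' ' + size + b'\r\n' + content + b'\r\n'


def decode_events(body):
    """Decode a request or response body into (type, content) tuples."""
    events = []
    pos = 0
    while pos < len(body):
        end = body.index(b'\r\n', pos)
        header = body[pos:end].decode('ascii')
        pos = end + 2
        if ' ' in header:
            event_type, size_hex = header.split(' ', 1)
            size = int(size_hex, 16)
            content = body[pos:pos + size]
            pos += size + 2  # skip the content and its trailing CRLF
        else:
            event_type, content = header, None
        events.append((event_type, content))
    return events


body = encode_event('OPEN') + encode_event('TEXT', b'hello')
print(decode_events(body))
```

Several events can be bundled into a single body, which is how one HTTP request can carry both a connection opening and its first messages.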

Now for the publishing code. We’ll just publish the headline object in JSON format as the WebSocket message:

# publish
formats = list()
formats.append(HttpStreamFormat('event: update\ndata: %s\n\n' % hjson))
formats.append(WebSocketMessageFormat(hjson)) # websocket message
publish('headline-%s' % headline_id, formats)
...

Clients can then connect via WebSocket to a URI such as ws://api.headlineapp.org/1/ and receive realtime updates of the headline object. The WebSocket.org Echo Test can be handy for testing.

Well that was easy. With just a few more lines of code, we now have a massively scalable and stateless (!) WebSocket API. As with our SSE implementation above, it is completely defined and controlled by Django, and Fanout Cloud is invisible to connecting clients.

How about long-polling?

So, you’re more of a long-polling person? We should be friends.

Holding requests open for long-polling is similar to the way we handled streaming earlier. The difference is that instead of calling set_hold_stream() and publishing HttpStreamFormat objects, you call set_hold_longpoll() and publish HttpResponseFormat objects.

For our headline service, we’ll make the long-polling work like RealCrowd’s API, where the client can perform a conditional long-polling request based on ETags. We’ll deviate slightly by using a Wait header in the request instead of a query parameter. If the client makes a request including an If-None-Match header and a Wait header, and the value of If-None-Match matches the ETag known by the server, then the server should hold the request open rather than respond right away.

The new imports:

...
from gripcontrol import HttpResponseFormat
from django_grip import set_hold_longpoll
...

And the extra code to look for the Wait header and use long-polling if necessary:

...
elif request.method == 'GET':
    if request.META.get('HTTP_ACCEPT') == 'text/event-stream':
        ...
    else:
        wait = request.META.get('HTTP_WAIT')
        if wait:
            wait = int(wait)
            if wait < 1:
                wait = None
            elif wait > 300:
                wait = 300
        inm = request.META.get('HTTP_IF_NONE_MATCH')
        etag = '"%s"' % calendar.timegm(h.date.utctimetuple())
        if inm == etag:
            resp = HttpResponseNotModified()
            if wait:
                set_hold_longpoll(request, 'headline-%s' % headline_id,
                    timeout=wait)
        else:
            resp = _json_response(h.to_data())
        resp['ETag'] = etag
        return resp
...

The set_hold_longpoll() call puts special instructions in the HTTP response telling Fanout Cloud to keep the request open and subscribe it to the specified channel. Unlike streaming, where the response is sent to the client right away, in long-polling mode nothing is sent right away. Instead, the response is held as the timeout response: we still return a 304 from the view, and Fanout Cloud sends it to the client only if the timeout elapses before anything is published.
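
The Wait header handling can also be pulled out into a small function, which makes the clamping rules easier to test and lets us ignore malformed values instead of raising an error. This is just one possible sketch; parse_wait and its max_wait parameter are hypothetical names.

```python
def parse_wait(value, max_wait=300):
    """Parse a Wait header value into seconds, or None if long-polling is off.

    Missing, non-numeric, and non-positive values disable long-polling;
    anything larger than max_wait is clamped down to it.
    """
    if not value:
        return None
    try:
        wait = int(value)
    except ValueError:
        return None
    if wait < 1:
        return None
    return min(wait, max_wait)


print(parse_wait('10'), parse_wait('1000'), parse_wait('abc'))
```

The view could then call wait = parse_wait(request.META.get('HTTP_WAIT')) and keep the rest of the logic unchanged.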

Publishing:

...
hdata = h.to_data()
hjson = json.dumps(hdata)
etag = '"%s"' % calendar.timegm(h.date.utctimetuple())
rheaders = {'ETag': etag}
hpretty = json.dumps(hdata, indent=4) + '\n'

# publish
formats = list()
formats.append(HttpResponseFormat(body=hpretty, headers=rheaders))
formats.append(HttpStreamFormat('event: update\ndata: %s\n\n' % hjson))
formats.append(WebSocketMessageFormat(hjson)) # websocket message
publish('headline-%s' % headline_id, formats)
...

When publishing HTTP responses, the entire response may be specified. For the body, we use the pretty-printed form of the JSON, and we also specify headers. This way, the client receives a consistent style of response whether long-polling or not.

Notice how we’re specifying three formats for a single item at publish time. Fanout Cloud only delivers items to listeners if there is an available representation of the item based on the type of listener. We want to be able to publish data to WebSocket clients, streaming clients, and long-polling clients all at once, so we include all of these formats.
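
The matching rule can be pictured with a toy model. This is not how Fanout Cloud is implemented; the listener kinds, format labels, and the deliverable_to function are all illustrative assumptions meant only to show why an item published with a single format reaches a single kind of listener.

```python
# Which format each kind of listener needs (labels are illustrative).
REQUIRED_FORMAT = {
    'longpoll': 'http-response',
    'stream': 'http-stream',
    'websocket': 'ws-message',
}


def deliverable_to(item_formats, listeners):
    """Return the names of listeners that the item has a representation for."""
    return [name for name, kind in listeners
            if REQUIRED_FORMAT[kind] in item_formats]


listeners = [('alice', 'stream'), ('bob', 'websocket'), ('carol', 'longpoll')]

only_stream = deliverable_to({'http-stream'}, listeners)
all_three = deliverable_to({'http-response', 'http-stream', 'ws-message'}, listeners)
print(only_stream, all_three)
```

With only a stream format, the WebSocket and long-polling listeners are skipped; with all three formats, every listener receives the update.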

Let’s try it out with curl. First, we’ll get the current value of a headline:

curl -i http://api.headlineapp.org/1/

Response:

HTTP/1.1 200 OK
Content-Type: application/json
Etag: "1415266317"
...

{
    "type": "hidden", 
    "text": "", 
    "date": "2014-11-06T09:31:57.795823+00:00", 
    "id": "1"
}

Now we’ll start a long-poll for changes:

curl -i -H 'If-None-Match: "1415266317"' \
  -H 'Wait: 10' http://api.headlineapp.org/1/

If we wait 10 seconds, we’ll eventually get:

HTTP/1.1 304 NOT MODIFIED
Etag: "1415266317"
...

If we issue the same request again, and while the request is hanging open we modify the headline object with PUT, then the original request will receive the updated object right away:

HTTP/1.1 200 OK
ETag: "1415944588"
...

{
    "type": "normal", 
    "text": "hello to the world", 
    "date": "2014-11-14T05:56:28.522877+00:00", 
    "id": "1"
}
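
From the client's point of view, the exchange above boils down to a loop: send the last known ETag, and repeat the request whenever a 304 comes back. Here is a minimal sketch of that loop. It takes the HTTP call as a fetch function so it can run without a network; poll_for_change, fetch, and the canned responses are all hypothetical stand-ins.

```python
def poll_for_change(fetch, etag, wait=10, max_attempts=5):
    """Repeat conditional requests until the server returns a new representation.

    fetch(headers) must return a (status, etag, body) tuple.
    """
    headers = {'If-None-Match': etag, 'Wait': str(wait)}
    for _ in range(max_attempts):
        status, new_etag, body = fetch(headers)
        if status == 200:
            return new_etag, body
        # 304: the timeout elapsed with no change, so ask again.
    return etag, None


# Canned responses standing in for the server: one timeout, then an update.
responses = iter([
    (304, '"1415266317"', None),
    (200, '"1415944588"', '{"text": "hello to the world"}'),
])


def fake_fetch(headers):
    return next(responses)


result = poll_for_change(fake_fetch, '"1415266317"')
print(result)
```

A real client would replace fake_fetch with an HTTP request to the headline URL, passing along the same two headers.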

There you have it! Creating your own custom long-polling API like this couldn’t be easier.

Next-generation realtime

We hope you enjoy Fanout Cloud and the django-grip library. The complete code of the headline project is on GitHub. Happy pushing!