Washington Post and Facebook, Part Two
Last time, I wrote an introduction to our development efforts around Washington Post's first app for the Facebook Platform. See that post to get an idea of what Platform is and why it's interesting. In this post, I'd like to talk more about how we used Django to serve the application. In part three, I'll talk about some performance-tuning lessons learned through the course of this development and deployment.
Callback Architecture
Facebook's Platform is based on a callback architecture. The application is hosted on Facebook, and users connect to and interact with it through Facebook, but every page of the application is returned from a callback URL running on our own servers. To help illustrate this, let's look at the process for registering an app on Facebook.
The figure below shows the first few questions on the setup page for our app, The Compass.

Notice there is a "Callback URL" and a "Canvas Page URL". The callback URL is the
base URL on the developer's server (washingtonpost.com in our case); the canvas page URL
is the base URL on Facebook's server. When you install our app on Facebook, you
are redirected to the canvas page URL, which in turn fetches content from the callback.
You can have any number of callback pages extending off the base. If you went to
apps.facebook.com/thecompass/foo/, then that page would fetch content from
specials.washingtonpost.com/politicompass/foo/.
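To make the mapping concrete, here is a minimal sketch of how a canvas page URL lines up with a callback URL. It only illustrates the path relationship described above; it is not how Facebook implements the fetch. The function name and the http scheme are assumptions added for the example, and it is written for current Python 3.

```python
from urllib.parse import urlsplit

# Base URLs from the example above; the "http://" scheme is an assumption.
CANVAS_BASE = "http://apps.facebook.com/thecompass/"
CALLBACK_BASE = "http://specials.washingtonpost.com/politicompass/"


def callback_url_for(canvas_url):
    """Return the callback URL that would serve the given canvas page URL.

    Hypothetical helper: it shows the path mapping only.
    """
    canvas_path = urlsplit(canvas_url).path
    base_path = urlsplit(CANVAS_BASE).path
    if not canvas_path.startswith(base_path):
        raise ValueError("not a canvas page for this app: %s" % canvas_url)
    # Whatever follows the canvas base is appended to the callback base.
    return CALLBACK_BASE + canvas_path[len(base_path):]
```

Calling callback_url_for with the /thecompass/foo/ canvas URL yields the matching /politicompass/foo/ URL on our server.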
Now you can't go directly to specials.washingtonpost.com/politicompass/
because without the POST data Facebook submits to the callback URL, the application won't
work. If you hit our server directly without coming through Facebook, we redirect to the
Facebook URL for our app. In fact, every time Facebook hits our callback URL there is
a little setup that has to be done for each request. To encapsulate this neatly, I've got
an init function that is called at the start of every view function.
The init function looks like this:
    from django.http import HttpResponse, HttpResponseRedirect

    # Facebook comes from the Python client library; config and
    # FACEBOOK_HOSTNAME are our own settings.

    def init_facebook(request):
        facebook = Facebook(config.API_KEY, config.SECRET)
        facebook.set_facebook_url(host=FACEBOOK_HOSTNAME)

        # Ensure we're running inside the Facebook frame.
        # All Facebook platform frame pages send a POST with fb_sig.
        if not request.POST.get('fb_sig'):
            return HttpResponseRedirect(facebook.facebook_url)

        user = request.POST.get('fb_sig_user')
        if not user:
            return HttpResponse('<fb:redirect url="%s" />' % facebook.get_install_url())

        facebook.user = user
        facebook.session_key = request.POST.get('fb_sig_session_key')
        if facebook.session_key != facebook.get_session():
            return HttpResponse('<fb:redirect url="%s" />' % facebook.get_install_url())

        return facebook
init_facebook is not a view function itself. It is called from within
a view function, even though it is passed the request object like a view function.
To see this in action, let's look at the first few lines of the "index" view function:
    def index(request):
        facebook = init_facebook(request)
        if not hasattr(facebook, 'api_key'):
            return facebook
I check whether the returned object is a Facebook object. If it isn't, it's an HttpResponse object, and I need to return that to the requesting client. Once the Facebook session has been set up (all of this through the Python client library), we go about the business of processing forms, saving to the DB, etc. Only a couple of views are used: an "index" view for the initial canvas page, which delivers the 10 questions and submits the answers to the database, and a "friend" view that builds the friend canvas page and provides XML to the Flash map on the friend page.
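Since every view starts with the same call and the same check, one alternative is to fold both into a decorator. This is not what the app does; it is only a sketch of the idea. To keep it runnable without Django or the client library, FakeResponse, FakeFacebook, and the simplified init_facebook are stand-ins, the request is a plain dict in place of request.POST, and facebook_required is a hypothetical name.

```python
import functools


class FakeResponse:
    """Stand-in for django.http.HttpResponse so the sketch runs without Django."""

    def __init__(self, content):
        self.content = content


class FakeFacebook:
    """Stand-in for the client library's Facebook object."""

    api_key = "hypothetical-key"


def init_facebook(request):
    """Simplified stand-in: return a client inside the frame, else a response."""
    if not request.get("fb_sig"):
        return FakeResponse("redirect to the canvas page")
    return FakeFacebook()


def facebook_required(view):
    """Run init_facebook first and short-circuit if it produced a response."""

    @functools.wraps(view)
    def wrapper(request, *args, **kwargs):
        facebook = init_facebook(request)
        if not hasattr(facebook, "api_key"):
            return facebook  # a response object, not a client
        return view(request, facebook, *args, **kwargs)

    return wrapper


@facebook_required
def index(request, facebook):
    """Example view: it only runs when init_facebook returned a client."""
    return FakeResponse("canvas page for key %s" % facebook.api_key)
```

The view body then receives an initialized client as its second argument and never sees the redirect cases. The trade-off is that the view signature changes, which may or may not suit an existing URLconf.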
Profile Publish Model
If it's not readily obvious, every time a user hits one of our canvas pages, they end up
submitting a request to our servers. There is also a part of our application that lives
on the user's profile page. Once you submit the 10-question survey, we place a compass image
on your profile page. Facebook doesn't call another site's URL directly from a user's profile;
it uses a publish model instead. To place something on the profile, we have to explicitly call
profile.setFBML. I'm doing this right after a successful save of the compass
questions to the DB.
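The save-then-publish order can be sketched as follows. Nothing here talks to Facebook: RecordingProfileAPI is a hypothetical stand-in that only records what would be published, the setFBML argument order is an assumption rather than the client library's confirmed signature, and save_answers, submit_survey, and the image markup are invented for illustration.

```python
class RecordingProfileAPI:
    """Hypothetical stand-in for the client's profile API; it records calls only."""

    def __init__(self):
        self.published = {}

    def setFBML(self, markup, uid):
        # Assumed argument order. The real call would send the markup to
        # Facebook, which caches it and serves it on the profile.
        self.published[uid] = markup


def save_answers(db, uid, answers):
    """Pretend database save: store the answers and report success."""
    if len(answers) != 10:
        return False
    db[uid] = list(answers)
    return True


def submit_survey(db, profile, uid, answers, image_url):
    """Save the survey, then publish profile markup only if the save worked."""
    if not save_answers(db, uid, answers):
        return False
    profile.setFBML('<img src="%s" />' % image_url, uid)
    return True
```

Because the publish call happens only inside the request that saved the answers, the profile content changes exactly when the user acts, which is the behavior described above.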
There are advantages to the publish model, both for Facebook users and developers. For us,
it means our servers don't get hit with every profile page view on Facebook.
profile.setFBML takes a string of FBML as its chief argument. That FBML is
cached and served from Facebook's servers. For users, this means that their profile is a
little safer from being hijacked by an application. The disadvantage of this is that
you have to have the user initiate the action that changes the content on the profile. This
wasn't a problem for us, but would create problems for something like
last.fm that would want their app to dynamically update a
playlist on the user's profile.
To be continued....
Even though the profile hits are cached and served from Facebook's server farm, the canvas page traffic has still been intense. Next time, I'll go over some things we learned in tuning Apache and PostgreSQL given the sudden bump in traffic.
Posted by deryck on June 6, 2007


Comments
Max Battcher on June 19, 2007 at 11:38 a.m.
» The disadvantage of this is that you have to have the user initiate the action that changes the content on the profile. This wasn't a problem for us, but would create problems for something like last.fm that would want their app to dynamically update a playlist on the user's profile. «
Which seems to be exactly why last.fm uses a flash app in the profile. I think that was an interesting compromise on their part.
Paul Smith on June 19, 2007 at 1:14 p.m.
I'll be interested to hear what you have to say about tuning Apache, but you should also take a look at deploying behind nginx. It's much lighter-weight than Apache (smaller memory footprint, faster request-response cycle), so you can devote more resources to load-balancing. I'm assuming that if you're using Apache you're also deployed with mod_python -- there's no comparable setup with nginx, but I've been happy setting up Django FastCGI instances, which nginx can readily talk to. I find this setup gives you more granular control over deployment, better isolating the web server from the app server.
deryck on June 19, 2007 at 9:41 p.m.
@Max
I agree, the Flash app is nice, but if you'll notice, it doesn't show your recently played list until you click that tab on the Flash widget. last.fm can't load that info whenever a user hits his or her profile, but that seems the obvious app for last.fm, right? A constantly updated list of what I'm listening to, either on my profile or my news feed.
@Paul
That tuning/scaling post is on the way. A new project at work sidelined my personal stuff. I should get it up late this week or on the weekend for sure.
Matt Biddulph on June 19, 2007 at 10:18 p.m.
It's also possible to update the user's profile FBML by making an API call at any time later as long as they grant you an 'Infinite Session' - see http://developers.facebook.com/docume...
We use this on the Dopplr facebook app - http://www.facebook.com/apps/applicat... - to push mini-feed items and FBML changes as nightly updates.
Eyal Pincu on August 11, 2007 at 4:37 p.m.
very interesting stuff