June archive

Washington Post and Facebook, Part Two

June 6, 2007

Last time, I wrote an introduction to our development efforts around Washington Post's first app for the Facebook Platform. See that post to get an idea of what Platform is and why it's interesting. In this post, I'd like to talk more about how we used Django to serve the application. In part three, I'll talk about some performance-tuning lessons learned through the course of this development and deployment.

Callback Architecture

Facebook's Platform is based on a callback architecture. The application is hosted on Facebook, users connect to and interact with the application through Facebook, but any page for the application is returned from a callback URL running on our own servers. To help illustrate this, let's look at the process for registering an app on Facebook.

The figure below shows the first few questions on the setup page for our app, The Compass.

Edit screen for The Compass

Notice there is a "Callback URL" and a "Canvas Page URL". The callback URL is the base URL on the developer's server (washingtonpost.com in our case); the canvas page URL is the base URL on Facebook's server. When you install our app on Facebook, you are redirected to the canvas page URL, which in turn fetches content from the callback. You can have any number of callback pages extending off the base. If you went to apps.facebook.com/thecompass/foo/, then that page would fetch content from specials.washingtonpost.com/politicompass/foo/.
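To make the relationship between the two base URLs concrete, here is a minimal sketch of the mapping in Python. It only illustrates the path-suffix rule described above. Facebook performs this mapping on its own side, and the function name canvas_to_callback is hypothetical.

```python
# Base URLs taken from the example above.
CANVAS_BASE = "apps.facebook.com/thecompass/"
CALLBACK_BASE = "specials.washingtonpost.com/politicompass/"


def canvas_to_callback(canvas_url):
    """Return the callback URL that a canvas page URL would fetch from.

    Hypothetical helper for illustration only; Facebook does this itself.
    """
    if not canvas_url.startswith(CANVAS_BASE):
        raise ValueError("not a canvas page URL for this app: %r" % canvas_url)
    # Everything after the canvas base is appended to the callback base.
    return CALLBACK_BASE + canvas_url[len(CANVAS_BASE):]
```

Calling canvas_to_callback("apps.facebook.com/thecompass/foo/") gives back the specials.washingtonpost.com/politicompass/foo/ address from the example.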

You can't go directly to specials.washingtonpost.com/politicompass/, because the application won't work without the POST data Facebook submits to the callback URL. If you hit our server directly without coming through Facebook, we redirect to the Facebook URL for our app. In fact, every time Facebook hits our callback URL, there is a little setup that has to be done for the request. To encapsulate this neatly, I've got an init function that is called at the start of every view function.

The init function looks like this:


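from django.http import HttpResponse, HttpResponseRedirect

# Facebook, config and FACEBOOK_HOSTNAME come from our own client
# library and project settings (imports not shown here).


def init_facebook(request):
    """Return a ready Facebook object, or an HttpResponse to send back."""
    facebook = Facebook(config.API_KEY, config.SECRET)
    facebook.set_facebook_url(host=FACEBOOK_HOSTNAME)

    # Ensure we're running inside the Facebook frame.
    # All Facebook platform frame pages send a POST with fb_sig.
    if not request.POST.get('fb_sig'):
        return HttpResponseRedirect(facebook.facebook_url)

    # No user ID means the app isn't installed yet; send the user
    # to the install page.
    user = request.POST.get('fb_sig_user')
    if not user:
        return HttpResponse('<fb:redirect url="%s" />' % facebook.get_install_url())

    facebook.user = user
    facebook.session_key = request.POST.get('fb_sig_session_key')

    # If the session key doesn't check out, fall back to the install page.
    if facebook.session_key != facebook.get_session():
        return HttpResponse('<fb:redirect url="%s" />' % facebook.get_install_url())
    return facebook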
  

init_facebook is not a view function itself. It is called from within a view function, even though it is passed the request object like a view function. To see this in action, let's look at the first few lines of the "index" view function:


def index(request):
    facebook = init_facebook(request)
    if not hasattr(facebook, 'api_key'):
        return facebook
  

I check the returned object to see if it's a Facebook object or not. If not, then it's an HttpResponse object, and I need to return that to the requesting client. Once the Facebook session has been set up (all of this through the Python client library), we go about the business of processing forms, saving to the DB, etc. Only a couple of views are used: an "index" view for the initial canvas page, which delivers the 10 questions and submits the answers to the database, and a "friend" view, which builds the friend canvas page and provides XML to the Flash map on the friend page.
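One way to avoid repeating that check in every view would be to wrap it in a decorator. The sketch below is not how the app was written. It uses stand-in classes (FakeResponse, FakeFacebook) and a stub init_facebook so that it runs without Django or the client library, and the decorator name with_facebook is hypothetical. The request is a plain dict here, purely to keep the example self-contained.

```python
import functools


class FakeResponse:
    """Stand-in for django.http.HttpResponse."""
    def __init__(self, content):
        self.content = content


class FakeFacebook:
    """Stand-in for the client library's Facebook object."""
    api_key = "dummy-key"


def init_facebook(request):
    """Stub: returns a Facebook object, or a response to send back early."""
    if not request.get("fb_sig"):
        return FakeResponse("redirect to the canvas page")
    return FakeFacebook()


def with_facebook(view):
    """Run init_facebook first and pass the Facebook object to the view."""
    @functools.wraps(view)
    def wrapper(request, *args, **kwargs):
        facebook = init_facebook(request)
        # Same test as in the index view: no api_key means it is a response.
        if not hasattr(facebook, "api_key"):
            return facebook
        return view(request, facebook, *args, **kwargs)
    return wrapper


@with_facebook
def index(request, facebook):
    return FakeResponse("canvas page for key %s" % facebook.api_key)
```

With this shape, each view receives the Facebook object as a second argument and never sees a request that failed the setup step.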

Profile Publish Model

If it's not readily obvious, every time a user hits one of our canvas pages, they end up submitting a request to our servers. There is also a part of our application that lives on the user's profile page. Once you submit the 10-question survey, we place a compass image on your profile page. Facebook doesn't call another site's URL directly from a user's profile; it uses a publish model instead. To place something on the profile, we have to explicitly call profile.setFBML. I'm doing this right after a successful save of the compass questions to the DB.

There are advantages to the publish model, both for Facebook users and developers. For us, it means our servers don't get hit with every profile page view on Facebook. profile.setFBML takes a string of FBML as its chief argument. That FBML is cached and served from Facebook's servers. For users, this means that their profile is a little safer from being hijacked by an application. The disadvantage of this is that you have to have the user initiate the action that changes the content on the profile. This wasn't a problem for us, but would create problems for something like last.fm that would want their app to dynamically update a playlist on the user's profile.
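The save-then-publish flow can be sketched roughly as follows. The client here is a stand-in that only records calls. The method name profile_set_fbml, the save_answers helper, and the image URL in the usage note are all assumptions made for illustration, not the real client library API.

```python
class RecordingClient:
    """Stand-in for a Facebook client; records FBML instead of sending it."""
    def __init__(self):
        self.published = []

    def profile_set_fbml(self, uid, fbml):
        # A real client would call profile.setFBML over HTTP here.
        self.published.append((uid, fbml))


def save_answers(db, uid, answers):
    """Hypothetical save step: store the answers and report success."""
    if len(answers) != 10:
        return False
    db[uid] = list(answers)
    return True


def submit_survey(client, db, uid, answers, image_url):
    """Publish to the profile only after the answers are saved."""
    if not save_answers(db, uid, answers):
        return False
    fbml = '<img src="%s" />' % image_url
    client.profile_set_fbml(uid, fbml)
    return True
```

Calling submit_survey(RecordingClient(), {}, "42", [1] * 10, "http://example.com/compass.png") would record one published FBML string for user 42, while a failed save publishes nothing.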

To be continued....

Even though the profile hits are cached and served from Facebook's server farm, the canvas page traffic has still been intense. Next time, I'll go over some things we learned in tuning Apache and PostgreSQL given the sudden bump in traffic.

Washington Post and Facebook Platform Development

June 3, 2007

A little over a week ago, my group at WPNI was involved in developing an application for the latest version of Facebook's platform. If you haven't yet heard about this, Facebook Platform now lets developers create full-blown Facebook applications just like Facebook's own Photos, Events, and Notes applications. Any developer can create any kind of application that runs on Facebook itself. Our first application is called The Compass. (You must be a Facebook user to view or use the app.)

Rob does a much better job than I ever could of explaining the ideas, the creative process, and the Compass itself. I thought I would try to add a little on the technical aspects of the app, specifically the Facebook/washingtonpost.com integration, deploying on Django, and server performance issues from the Facebook traffic.

Facebook Platform

Facebook has had a developer's API since late last year. This API gave developers outside of Facebook a way to create web or desktop applications that could integrate data from Facebook's social network. A Facebook user could go to a website I created and, given the proper Facebook credentials, log in to my site and have some or all of their social data from Facebook follow them: for example, their list of friends, groups, event information, and so on.

The current, updated version of this API contains all the methods that existed before --


facebook.auth.getSession
facebook.friends.get
facebook.groups.get
facebook.events.get
  

but also, there are new methods specific to loading an app within the context of Facebook itself --


facebook.feed.publishStoryToUser
facebook.feed.publishActionOfUser
facebook.profile.setFBML
facebook.profile.getFBML
  

The latter methods are the hooks for publishing to a section on the user's profile. Calling those methods (through the Facebook client library of your choice) would allow you to publish a string of FBML into the user's profile. FBML is where the new version of the Facebook platform gets interesting and begins to separate itself from other simple widget platforms.

FBML

FBML, or Facebook Markup Language, is essentially HTML (stripped of script and other potentially dangerous tags) with a few Facebook-specific tags. These fb tags allow the developer to access a Facebook user's data in a generic fashion. The real advantage of this approach is that with just the UID of a Facebook user, I can load related data in a Facebook page or on the user's profile without having to do any processing on my end.

For example, this --


{% for friend in friends %}
    <fb:profile-pic uid="{{ friend }}" />
    <fb:userlink uid="{{ friend }}">
      <fb:name uid="{{ friend }}" useyou="false" />
    </fb:userlink>
{% endfor %}
  

is an example in Django template syntax using FBML. For each friend, it would render the profile picture followed by the friend's name linked to their profile.

Facebook renders the FBML tags for you, which has some significant advantages. To name the most obvious one, the profile picture shown is always the user's current one.

Note: in the code example above, "friends" is a list of Facebook UIDs returned by calling facebook.friends.get.
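For readers not using Django templates, the same FBML could be built in plain Python. This is a minimal sketch: the function name friends_fbml is hypothetical, and any UIDs you pass in when trying it are your own test values.

```python
from html import escape


def friends_fbml(friends):
    """Return FBML showing a picture and linked name for each friend UID."""
    parts = []
    for uid in friends:
        # Escape the UID so it is always safe inside an attribute value.
        safe_uid = escape(str(uid), quote=True)
        parts.append(
            '<fb:profile-pic uid="%s" />\n'
            '<fb:userlink uid="%s">\n'
            '  <fb:name uid="%s" useyou="false" />\n'
            '</fb:userlink>' % (safe_uid, safe_uid, safe_uid)
        )
    return "\n".join(parts)
```

The resulting string is the kind of value that could be returned from a canvas page or handed to profile.setFBML, with Facebook filling in the pictures and names when it renders the tags.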

Python Client Libraries

Everything we do at WPNI uses the Django web framework. Since Django is a Python framework, this means everything we do is written in Python (short of several shell scripts for updating code, managing deployment, etc.). The initial example library that we received from Facebook was written in PHP. There was an existing Python version of the Facebook API hosted on Google Code, but it didn't have the new methods mentioned above, nor did it seem to work all that well for me. So to get started with Facebook, I had to rewrite that PHP client library in Python. I did borrow a couple of methods from the original Python library, but mine is largely a one-to-one copy of the current PHP library hosted on the Facebook developers page.

Since I was under a bit of deadline pressure, I didn't port every method; otherwise, I would release my version. I did notice that the current Python library on Google Code has been updated, but I haven't tried it to see if it works any better. I had hoped to spend a little time after our application's launch to finish out this library and do a little more Facebook development, but other deadlines are pressing down on our team. It looks like I may not get back to this, but if that changes, I'll post here with new info. If I ever have the chance to finish out the library, I'll certainly make it public.

To be continued....

Okay, this got to be a longer post than I first imagined it would be, so I'll break it into sections. In a day or two, I'll post on how the Facebook/Django integration actually works, and a day or two after that, I'll post on the server issues we experienced and offer some performance tips for scaling a mod_python/PostgreSQL application.