An if-substring-in-string Django Template Construction
Here's a quick tip for Django template hackers. It's a known fact of Django
templates that the syntax
is purposefully limited. I've been living with the need for an
if-substring-in-string construction. Of course, I could write a custom
template tag, but work is quite
busy. So on a
whim and a 10 minute break I tried this yesterday, and it worked well for me.
Take a look and let me know what you think.
First, the problem.
Bookmark Formatting
I have a custom app for pulling in my Delicious bookmarks and formatting them
as link posts here on this site. I include the HTML I want in the original
bookmark and have a bit of template code to insert preview images generated by
ShrinkTheWeb. When I link a YouTube
video, I do a quick cut-n-paste of the video to embed the player, and in these
cases, I don't want to include the preview image. In Python, this would be
super simple:
if 'http://www.youtube.com/' not in link_url:
    show_thumb()
However, Django templates don't have a similar syntax.
A Solution Using {% with %}
So here's what I did:
{% with link_url|slice:":23" as short_url %}
{% ifnotequal short_url "http://www.youtube.com/" %}
My ShrinkTheWeb image goes here.
{% endifnotequal %}
Other HTML common to all links goes here.
{% endwith %}
What do others think? A decent solution? There are some issues, for example
if I want to support multiple video sites. At that point, I would be better off
using link categories or a custom tag. But for a 10 minute fix, I think it's
quite nice.
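For comparison, the template above boils down to a prefix test in Python, and the multiple-site case is easy to express there. This is only a sketch: the helper name should_show_thumb and the second prefix are assumptions for illustration, and only the YouTube URL comes from the post.

```python
# Hypothetical helper: mirrors the {% with %}/slice trick as a prefix test.
VIDEO_PREFIXES = (
    "http://www.youtube.com/",  # from the post
    "http://vimeo.com/",        # illustrative assumption, not from the post
)


def should_show_thumb(link_url, prefixes=VIDEO_PREFIXES):
    """Return True when link_url does not start with any video-site prefix."""
    # str.startswith accepts a tuple, so one call covers every prefix.
    return not link_url.startswith(tuple(prefixes))


# The slice length used in the template (":23") must equal the prefix length.
assert len("http://www.youtube.com/") == 23
```

Like the template version, this is a prefix match rather than a true substring test, so a YouTube URL appearing later in the string would not be caught.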
Link | Posted by deryck on July 3, 2009 | 4 comments
A Django Auth Backend for Second Life
I've started hacking away at a personal project of mine around
Second Life. More on that in the days to
come, but I did want to share some code created last night while
playing around with Second Life logins. I've worked up a
Django authentication backend for authenticating users
on a Django-based site against Second Life's login process. I've created a
Google code project for the code, so cleverly named
slauth.
This is code of the "release early, release often" variety. There are no
docs, no tests, not even a README. I just wanted to get this up while I had
5 minutes today. I welcome feedback, and I'm certain I will be working on this
as the larger project evolves. I'm not even certain I'll use this in
the final project. I feel uncomfortable taking username and password for
another "site," but without a proper login API for site-to-site authentication,
this seems to be the only viable route. This uses the same XMLRPC auth process
of the Second Life viewer code, which seemed to legitimize it a little for me
(since this is how third party viewers have to authenticate). It's certainly
better than page scraping the response of the Second Life web site's login
form. If there were a way to register your application with the login process,
I would be totally cool with this. Then, the user could verify a site as being
legitimate -- or at least more legitimate than any joe running this auth backend. ;)
Having said all that, it's pretty easy to authenticate via this package.
Just make sure the module lives on your PYTHONPATH, and then add slauth.backends.SLAuthBackend to your Django AUTHENTICATION_BACKENDS setting. You'll even be able to log in through the
Django admin with your full SL username ("Anders Falworth" in my case, as
an example). Of course, you won't get into the admin until you make
the SL account "staff". (This app creates a stub Django user account
for each successful SL login. You can then check is_staff on that account, log in again
with the SL account, and you'll see the successful entry into the
Django admin site.)
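As a sketch, the relevant settings fragment might look like the following. The dotted path to the backend comes from the post; keeping Django's stock ModelBackend as a second entry is an assumption for illustration, not something slauth is known to require.

```python
# settings.py (fragment)
AUTHENTICATION_BACKENDS = (
    # Second Life login, as described in the post.
    "slauth.backends.SLAuthBackend",
    # Assumption: keep ordinary Django username/password logins working too.
    "django.contrib.auth.backends.ModelBackend",
)
```

Django tries the listed backends in order, so an existing staff account that logs in through the second entry could be the one used to tick is_staff on the stub SL user.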
This is the main class that does all the work:
from django.contrib.auth.models import User
from slauth.utils import valid_sl_login, get_or_create_sl_user


class SLAuthBackend:
    """
    A Second Life authentication backend for Django-based sites.
    """

    def authenticate(self, **kwargs):
        """
        Use kwargs to make the authenticate method more flexible.
        Django's admin app assumes username/password logins, so
        allow first and last in one username. For example,
        username could be 'Bob Smith' and this method will split
        that apart into the first and last names SL login expects.
        So either of the following would work:
        >>> from django.contrib.auth import authenticate
        >>> authenticate(first_name='Bob', last_name='Smith',
        ...              password='foo')
        >>> authenticate(username='Bob Smith', password='foo')
        """
        first_name = kwargs.get('first_name', '')
        last_name = kwargs.get('last_name', '')
        password = kwargs.get('password', '')
        if kwargs.get('username', ''):
            username = kwargs.get('username', '')
            if ' ' in username:
                # Split on the first space only, so a name with more
                # than one space doesn't raise ValueError.
                first_name, last_name = username.split(' ', 1)
        authenticated = valid_sl_login(first_name, last_name, password)
        if authenticated:
            user = get_or_create_sl_user(first_name, last_name)
            return user
        return None

    def get_user(self, user_id):
        try:
            return User.objects.get(pk=user_id)
        except User.DoesNotExist:
            return None
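The username handling in authenticate can be pulled out and exercised on its own, without Django or a Second Life login. A minimal sketch follows; split_sl_name is a hypothetical helper, not part of slauth, and it splits on the first space only so that names with extra spaces don't raise.

```python
def split_sl_name(username):
    """Split 'First Last' into (first, last); return ('', '') if there is no space."""
    # Hypothetical helper, not part of slauth.
    username = username.strip()
    if " " not in username:
        return "", ""
    # maxsplit=1 keeps anything after the first space together as the last name.
    first_name, last_name = username.split(" ", 1)
    return first_name, last_name.strip()
```

Returning empty strings for a single-word username matches the defaults the backend passes on to valid_sl_login when no first_name or last_name is supplied.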
You could certainly use parts of this without using Django, even though it's
written with Django in mind. There is a utils module with sl_login and valid_sl_login, which return, respectively, the
response from a login attempt and a True/False on success or failure of a
login attempt.
Please have at the code at its Google project home if you
need or want to play with Second Life logins via Django or Python.
Comments, suggestions, and of course, contributions are always welcome. This
code is released under the GNU GPL v2.
Link | Posted by deryck on March 3, 2008 | 0 comments
Facebook/Washington Post, Performance Tuning
This final post about my group's work (at Washington Post.Newsweek Interactive) on our Facebook Platform app
The Compass
is long overdue. But now the time has come! Let's talk Postgresql and Apache performance.
In the first two posts on this subject, I wrote about
the
Facebook Platform itself and
the
Compass' architecture. In this post, we'll look at some of the challenges we encountered
while serving the app and areas we focused on to improve our Postgresql and Apache performance.
NOTE: All of this is anecdotal, based on my experience with this app. I'm
no performance guru and don't hold myself up as such. I think, too, different applications
have different needs, and the requirements of something like Facebook may not be
optimal for other situations.
Caching Limitations
As I mentioned last time, all of the FBML we load into a profile is cached and served by
Facebook, but the hits to our application pages are hits to our servers as well. The first
thing that comes to mind with Django is, "well, make sure you have caching enabled." There
are a couple reasons why this doesn't work as well as one would like.
First, the caching for a Django site is bypassed when the request contains GET or POST data.
Every request from Facebook contains POST data. Each callback request has a few fb_sig*
parameters that are POSTed to your page to verify the request comes from Facebook.
This is great for security and passing data from Facebook back to your application, but
it kills the normal caching process for Django-based sites.
Second, each request can potentially be unique. In our case, the only Facebook canvas pages
we serve are the one that submits the compass survey questions and the one to display the
Flash map of your friends who have installed the compass. It's hard to do much low-level caching
of Django querysets because you don't want to inadvertently give the user someone else's data.
We do a little of this, though. See, for example, what we do here when we display the
compass based on your last answer:
cache_key = 'compass_entries_%s' % facebook.user
compass_entries = cache.get(cache_key)
if not compass_entries:
    compass_entries = Compass.objects.filter(user__exact=facebook.user).order_by('-id')[:10]
    cache.set(cache_key, compass_entries, 60 * 15)
We also reset these entries in the cache when a user resubmits the survey. So we save a few DB
hits if the same user retakes the survey a few times back to back. However, there's just not
much in common across users to really take advantage of Django's cache. We're pretty well left to
raw DB performance.
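The per-user caching described above can be sketched without Django. Everything here is a stand-in: TinyCache imitates a cache backend's get/set/delete, load_entries_from_db replaces the Compass query, and deleting the key on resubmit is just one way to "reset" it (the post doesn't say which was used).

```python
import time


class TinyCache:
    """Stand-in for a cache backend with get/set/delete and per-key timeouts."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        value, expires = self._data.get(key, (default, None))
        if expires is not None and expires < time.time():
            self._data.pop(key, None)  # expired: treat as a miss
            return default
        return value

    def set(self, key, value, timeout):
        self._data[key] = (value, time.time() + timeout)

    def delete(self, key):
        self._data.pop(key, None)


cache = TinyCache()
db_hits = []  # records each simulated database query


def load_entries_from_db(user_id):
    # Hypothetical stand-in for the Compass queryset in the post.
    db_hits.append(user_id)
    return ["entry for %s" % user_id]


def get_compass_entries(user_id):
    cache_key = "compass_entries_%s" % user_id  # unique per user
    entries = cache.get(cache_key)
    if entries is None:  # None means a miss; an empty list is a valid cached value
        entries = load_entries_from_db(user_id)
        cache.set(cache_key, entries, 60 * 15)
    return entries


def on_survey_resubmitted(user_id):
    # Drop the stale entry so the next read goes back to the database.
    cache.delete("compass_entries_%s" % user_id)
```

Because the key includes the user id, one user's entries are never handed to another, which is also why there's so little to share across users.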
Bypass the ORM
One of the first things we did to help performance was to bypass Django's ORM. We store the
user's answer to each question via a save method on the form that is submitted. Using the ORM
this would look something like:
from politicompass.models import Compass

def save(self, uid):
    # Store one row per question, reading each answer from the cleaned form data.
    for i in range(1, 11):
        answer = self.clean_data.get('q%s' % i)
        compass = Compass(user=uid, question_id=i, answer=answer)
        compass.save()
We refactored this before launch to bypass the ORM and execute the INSERTs in one connection:
from django.db import connection
from politicompass.models import Compass

def save(self, uid):
    # Build one multi-statement string so all ten INSERTs go out in a single execute().
    sql = ""
    params = []
    for i in range(1, 11):
        answer = self.clean_data.get('q%s' % i)
        sql += "INSERT INTO facebook_compasses (user, question_id, answer) VALUES (%s, %s, %s);"
        params.extend([uid, i, answer])
    cursor = connection.cursor()
    # Pass the values as parameters instead of interpolating them into the SQL,
    # so the driver handles quoting and escaping.
    cursor.execute(sql, params)
    connection._commit()
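The same idea, many INSERTs over one connection and one commit, can be tried with nothing but the standard library. This sketch uses sqlite3 instead of Postgresql so it runs anywhere. The table layout is an assumption (the column is named user_id here), and sqlite3 uses ? placeholders where the Postgresql driver uses %s.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facebook_compasses (user_id TEXT, question_id INTEGER, answer TEXT)"
)


def save_answers(conn, uid, answers):
    """Insert all answers for one user in a single transaction."""
    rows = [(uid, number, answer) for number, answer in sorted(answers.items())]
    with conn:  # commits once on success, rolls back on error
        conn.executemany(
            "INSERT INTO facebook_compasses (user_id, question_id, answer) "
            "VALUES (?, ?, ?)",
            rows,
        )


save_answers(conn, "1234", {1: "agree", 2: "disagree", 3: "agree"})
```

Passing the rows as parameters keeps the answers out of the SQL text, so a stray quote in an answer is stored as data rather than breaking the statement.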
There were other performance-conscious moves we made along these lines, and still, once the app
started to grow in popularity, we had users submitting that form in such numbers that our
DB server load stayed at a freakishly high level. (NOTE: Prior to Facebook, we
normally ran at about a .20-.35 load. Once the Facebook app launched, our load jumped up into
the 3.00-4.30 range depending on site activity.)
Tuning Postgresql
I had already tuned Postgresql once for some spikes we had encountered when some of our apps
were linked up by MSN and MSNBC. These tunings included raising the max_connections
limit and bumping up the amounts for the following settings:
shared_buffers
work_mem
maintenance_work_mem
max_stack_depth
The most significant of these for us was shared_buffers. With the hits we had received
from MSN and MSNBC, raising shared_buffers to about 1.6 GB (we have 8 GB on the box) and increasing max_connections
was enough to keep us humming along nicely. With the Facebook traffic we had to increase shared_buffers to
about half the available RAM on the box, and everything dropped back to a sane level. We are running
on Solaris, so we had to raise the amount of shared memory the kernel makes available
before we could give that much RAM to shared_buffers, but again, once this happened, the load recovered amazingly
well.
Hits Under Facebook
Just to toss out some raw numbers, when we first loaded our app to Facebook, we were doing about
5-10 hits a second during peak usage. We ended up doing about 2.5 million hits the first week, just
from Facebook alone. We run 4 other sites off the same server. This is a single Postgresql server.
We do have our two web servers behind a load balancer, and our static media is served from the normal
media.washingtonpost.com setup. Needless to say, there are certainly higher numbers that other sites
boast, but the single DB, with some tuning and planning, survived the spike pretty well.
Currently, we're doing about 10 million hits a month from this setup, and we're really at its limits now.
To do much more, we'll have to look at replicating the database. Having said that, were it not
for the Facebook traffic and a similar Newsweek app bypassing the cache for reasons outlined
above, I think we could easily do twice the traffic on the same setup. Caching really saves on DB
load, so use it all you can if possible.
Apache Tuning
Luckily, we never felt the Facebook traffic from an Apache standpoint. I will point out, for the
sake of LAMP stack completeness, that the best trick I learned for Apache is to set
MaxRequestsPerChild to something in the range of 500. This keeps Apache memory size
down while also serving a decent amount of traffic per process. And if you don't know this already,
never serve a Django-based site with DEBUG=True. Not only is it bad from a security
standpoint, but Django in DEBUG mode stores the queries run in memory, so you can quickly eat up
your RAM if you forget to turn this off.
Again, this is just my experience of tuning our stack, so YMMV, but I hope sharing this
info will prove useful.
Link | Posted by deryck on August 15, 2007 | 6 comments
Washington Post and Facebook, Part Two
Last
time, I wrote an introduction to our development efforts around Washington Post's
first app for the Facebook Platform. See
that post to get an idea of what Platform is and why it's interesting. In this post,
I'd like to talk more about how we used Django to serve the application. In part three,
I'll talk about some performance-tuning lessons learned through the course of this
development and deployment.
Callback Architecture
Facebook's Platform is based on a callback architecture. The application is hosted
on Facebook, users connect to and interact with the application through Facebook, but
any page for the application is returned from a callback URL running on our own servers.
To help illustrate this, let's look at the process for registering an app on Facebook.
The figure below shows the first few questions on the setup page for our app,
The Compass.

Notice there is a "Callback URL" and a "Canvas Page URL". The callback URL is the
base URL on the developer's server (washingtonpost.com in our case); the canvas page URL
is the base URL on Facebook's server. When you install our app on Facebook, you
are redirected to the canvas page URL, which in turn fetches content from the callback.
You can have any number of callback pages extending off the base. If you went to
apps.facebook.com/thecompass/foo/, then that page would fetch content from
specials.washingtonpost.com/politicompass/foo/.
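The canvas-to-callback mapping is just a swap of base URLs. A small sketch, using the two bases named above; the http:// scheme and the function name callback_url_for are assumptions for illustration, and Facebook does this mapping on its side, not in our code.

```python
CANVAS_BASE = "http://apps.facebook.com/thecompass/"
CALLBACK_BASE = "http://specials.washingtonpost.com/politicompass/"


def callback_url_for(canvas_url):
    """Map a canvas page URL to the callback URL that would be fetched for it."""
    if not canvas_url.startswith(CANVAS_BASE):
        raise ValueError("not a canvas URL for this app: %r" % canvas_url)
    # Everything after the canvas base is appended to the callback base unchanged.
    return CALLBACK_BASE + canvas_url[len(CANVAS_BASE):]
```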
Now you can't go directly to specials.washingtonpost.com/politicompass/
because without the POST data Facebook submits to the callback URL, the application won't
work. If you hit our server directly without coming through Facebook, we redirect to the
Facebook URL for our app. In fact, every time Facebook hits our callback URL there is
a little setup that has to be done for each request. To encapsulate this neatly, I've got
an init function that is called at the start of every view function.
The init function looks like this:
def init_facebook(request):
    facebook = Facebook(config.API_KEY, config.SECRET)
    facebook.set_facebook_url(host=FACEBOOK_HOSTNAME)
    # Ensure we're running inside the Facebook frame.
    # All Facebook platform frame pages send a POST with fb_sig.
    if not request.POST.get('fb_sig'):
        return HttpResponseRedirect(facebook.facebook_url)
    user = request.POST.get('fb_sig_user')
    if not user:
        return HttpResponse('<fb:redirect url="%s" />' % facebook.get_install_url())
    facebook.user = user
    facebook.session_key = request.POST.get('fb_sig_session_key')
    if facebook.session_key != facebook.get_session():
        return HttpResponse('<fb:redirect url="%s" />' % facebook.get_install_url())
    return facebook
init_facebook is not a view function itself. It is called from within
a view function, even though it is passed the request object like a view function.
To see this in action, let's look at the first few lines of the "index" view function:
def index(request):
    facebook = init_facebook(request)
    if not hasattr(facebook, 'api_key'):
        return facebook
I check the returned object to see if it's a Facebook object or not. If not,
then it's an HttpResponse object, and I need to return that to the requesting client. Once the
Facebook session has been set up (all of this through the Python client library), then we go
about the business of processing forms, saving to the DB, etc. There are only a couple views
used. An "index" view is used for the initial canvas page, which delivers the 10 questions and
submits the answers to the database, and a "friend" view that builds the friend canvas page and
provides XML to the Flash map on the friend page.
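One way to avoid repeating the "is it a Facebook object or a response?" check in every view would be a decorator. This is a sketch of that alternative, not how the app was written: FakeResponse and FakeFacebook are stand-ins for Django's HttpResponse and the client library's Facebook object, the request is a plain dict, and facebook_required is a hypothetical name.

```python
import functools


class FakeResponse:
    """Stand-in for Django's HttpResponse family."""

    def __init__(self, body):
        self.body = body


class FakeFacebook:
    """Stand-in for the Facebook client object."""

    def __init__(self, user):
        self.user = user


def init_facebook(request):
    # Simplified: a request without fb_sig gets a redirect-style response.
    if not request.get("fb_sig"):
        return FakeResponse("redirect to canvas page")
    return FakeFacebook(request.get("fb_sig_user"))


def facebook_required(view):
    """Run init_facebook first; short-circuit if it produced a response."""

    @functools.wraps(view)
    def wrapper(request):
        result = init_facebook(request)
        if isinstance(result, FakeResponse):
            return result  # hand the redirect straight back to the client
        return view(request, result)

    return wrapper


@facebook_required
def index(request, facebook):
    return FakeResponse("compass for user %s" % facebook.user)
```

The isinstance test does the same job as the hasattr check above, just stated in terms of the response type rather than an attribute of the Facebook object.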
Profile Publish Model
If it's not readily obvious, every time a user hits one of our canvas pages, they end up
submitting a request to our servers. There is also a part of our application that lives
on the user's profile page. Once you submit the 10 question survey, we place a compass image
on your profile page. Facebook doesn't call another site's URL directly from a user's profile.
They use a publish model. To place something on the profile, we have to explicitly call
profile.setFBML. I'm doing this right after a successful save of the compass
questions to the DB.
There are advantages to the publish model, both for Facebook users and developers. For us,
it means our servers don't get hit with every profile page view on Facebook.
profile.setFBML takes a string of FBML as its chief argument. That FBML is
cached and served from Facebook's servers. For users, this means that their profile is a
little safer from being hijacked by an application. The disadvantage of this is that
you have to have the user initiate the action that changes the content on the profile. This
wasn't a problem for us, but would create problems for something like
last.fm that would want their app to dynamically update a
playlist on the user's profile.
To be continued....
Even though the profile hits are cached and served from Facebook's server farm, the
canvas page traffic has still been intense. Next time, I'll go over some things we learned
in tuning Apache and Postgresql given the sudden bump in traffic.
Link | Posted by deryck on June 6, 2007 | 5 comments
Washington Post and Facebook Platform Development
A little over a week ago, my group at
WPNI was involved
in developing an application for Facebook's latest version of their platform. If
you haven't yet heard about this, Facebook Platform now lets developers create
full blown Facebook applications just like Facebook's own Photos, Events, and
Notes applications. Any developer can create any kind of application which can
run on Facebook itself. Our first application is called
The Compass. (You must be a
Facebook user to view or use the app.)
Rob does a much better job than I ever could of
explaining
the ideas, the creative process, and the Compass itself. I thought
I would try to add a little on the technical aspects of the app,
specifically the Facebook/washingtonpost.com integration, deploying on Django,
and server performance issues from the Facebook traffic.
Facebook Platform
Facebook has had a developer's API since late last year. This API allowed
developers outside of Facebook a way to create web or desktop applications
that could integrate data from Facebook's social network. A Facebook user
could go to a website I created, and given the proper Facebook credentials,
login to my site and have some or all of their social data from Facebook
follow them. For example, their list of friends, groups, event information,
and so on.
The current, updated version of
this API contains all the methods that existed before --
facebook.auth.getSession
facebook.friends.get
facebook.groups.get
facebook.events.get
but also, there are new methods specific to loading an app within the context
of Facebook itself --
facebook.feed.publishStoryToUser
facebook.feed.publishActionOfUser
facebook.profile.setFBML
facebook.profile.getFBML
The latter methods are the hooks for publishing to a section on the user's
profile. Calling those methods (through the Facebook
client library of your
choice) would allow you to publish a string of FBML into the user's
profile. FBML is where the new version of the Facebook platform gets interesting
and begins to separate itself from other simple widget platforms.
FBML
FBML, or Facebook Markup Language, is essentially html (stripped of script and other
potentially dangerous tags) with a few Facebook specific tags.
These fb tags allow the developer to access a Facebook user's data in a generic
fashion. The real advantage of this approach is that with just the UID of a
Facebook user, I can load related data in a Facebook page or on the user's profile
without having to do any processing on my end.
For example, this --
{% for friend in friends %}
<fb:profile-pic uid="{{ friend }}" />
<fb:userlink uid="{{ friend }}">
<fb:name uid="{{ friend }}" useyou="false" />
</fb:userlink>
{% endfor %}
is an example in Django template syntax using FBML that would produce
each friend's profile picture followed by their linked name.
Facebook renders the FBML tags for you, which has some significant advantages.
To throw out the most obvious one, the user's profile picture always matches
their current profile pic.
Note: in the code example above, "friends" is a list of Facebook UIDs returned by calling
facebook.friends.get.
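The same markup can be built as a plain Python string, which is the form needed when handing FBML to a call like facebook.profile.setFBML. A minimal sketch; friends_fbml is a hypothetical helper and the UIDs in any call to it would be made up.

```python
from xml.sax.saxutils import quoteattr


def friends_fbml(friend_uids):
    """Build the FBML for a list of friend UIDs: one picture and linked name each."""
    parts = []
    for uid in friend_uids:
        # quoteattr adds the surrounding quotes and escapes special characters.
        uid_attr = quoteattr(str(uid))
        parts.append(
            "<fb:profile-pic uid=%s />\n"
            "<fb:userlink uid=%s>\n"
            '<fb:name uid=%s useyou="false" />\n'
            "</fb:userlink>" % (uid_attr, uid_attr, uid_attr)
        )
    return "\n".join(parts)
```

Only the UIDs are needed on our side. Names and pictures are filled in when Facebook renders the tags.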
Python Client Libraries
Everything we do at WPNI uses the Django
web framework. Being a Python framework, this means everything we do is written in
Python (short of several shell scripts for updating code, managing deploying, etc.).
The initial example library that we received from Facebook was written in PHP. There was
an existing Python version of the Facebook API hosted on Google Code, but it didn't have
the new methods mentioned above nor did it seem to work all that well for me. So to get
started with Facebook, I had to rewrite that PHP client library in Python. I did borrow
a couple methods from the original Python library, but mine is largely a one to one
copy of the current PHP library hosted from the Facebook developers page.
Since I was under a bit of deadline pressure, I didn't port every method; otherwise
I would release my version. I did notice that the current Python library on Google Code
has been updated. I haven't tried it to see if it works any better. I had hoped
to spend a little time after our application's launch to finish out this library and
do a little more Facebook development, but other deadlines are pressing down on our team.
It looks like I may not get back to this, but if that changes, I'll post here with
new info. If I ever have the chance to finish out the library, I'll certainly make it
public.
To be continued....
Okay, this got to be a longer post than I first imagined it would be. I'll break this
into sections. In a day or two, I'll post on how the Facebook/Django integration actually
works, and a day or two after that, I'll post on the server issues we experienced and offer
some performance tips for scaling a mod_python/Postgresql application.
Link | Posted by deryck on June 3, 2007 | 4 comments
Any Django People Coming To LinuxWorld?
Anyone working on/with or just interested in Django coming to LinuxWorld? If so,
let me know and we'll see about an informal meetup.
Link | Posted by deryck on August 11, 2006 | 0 comments
Django .95 Is Here!
Yes,
Django .95 has just
been released. For those who don't run off SVN trunk, there will be some upgrade
issues. Read Removing
The Magic on the Django wiki carefully. The changes to Django are well worth
any upgrade inconvenience. And some things are just plain easier to get done post
magic removal, as I found out this week back-porting a Google site map generator app
I wrote while running from SVN.
Congratulations to Adrian, Jacob, and all who contribute code for the release!
Link | Posted by deryck on July 29, 2006 | 0 comments
Django/AJAX Beating
James Bennett is taking a bit of a beating for his
Django/AJAX
suggestions.
A lot of the criticism is unmerited Rails envy, I imagine.
Rails has RJS — great! Django is not Rails.
If you want to build more than toy apps, you'll need something more sophisticated
than these little server-side helper functions. And if you just want partial page updates
or DHTML UI tricks, any JavaScript toolkit can make this quick and painless for you.
I also don't see why when someone says "Django's AJAX support shouldn't look like RJS,"
people hear, "Django isn't going to include AJAX support." AJAX, for all its usefulness as a term,
is used in many different ways. I think the confusion in this case is due to the same word being
used to mean two completely different things.
Link | Posted by deryck on July 4, 2006 | 0 comments
New Job with Naples Daily News
I've taken a new job with Naples Daily
News. I'll be a developer in the new media department building cool stuff
with Django. I'm excited about the work
and about working with Rob Curley,
Eric Moritz, and the rest of the team.
I'll still be in Alabama working from home, given that I just bought a house 3 months
ago. I'll travel a bit more now, making monthly trips to Naples, which is a really beautiful
and interesting city.
I think this is one of the best moves I've ever made. When I first started working
with Django, I was so impressed by what I heard about
World Online and really
wanted to be doing the same kind of work and be in the same kind of environment.
Now, I get that chance. How cool is this!
I finish at the library on Tuesday, July 11,
and start work with Naples News the next day.
Link | Posted by deryck on July 4, 2006 | 0 comments
Now Django Powered
I finally got this site converted to Django, more or less. There are a few
static pages lying around, but on the whole, I'm Django powered now. It
was quite a hack job, which I'll (maybe) relate later.
My hosting service Jump Domain doesn't
really advertise Python or Django support. It can be done, though it's probably not
for the faint of heart. Scott with Jump Domain has been ultra helpful and I'll
try to talk with him more about what can be done to improve support for Django and Python.
Link | Posted by deryck on July 1, 2006 | 0 comments