Here lie the last 25 entries in my weblog. Follow the yearly archives to the right for more.

Why Not?

I saw Eric do a me too today, so why not join in:


naminanu:~/projects/wpni deryck$ history|awk '{a[$2]++} END{for(i in a){printf "%5d\t%s\n",a[i],i}}'|sort -rn|head
  128   vi
   76   python
   52   ls
   46   svn
   27   cd
   21   psql
   17   grep
   10   mv
   10   clear
    9   sudo

Link | Posted by deryck on April 16, 2008 | 0 comments

How Do We "take the paper" in Virtual Worlds

It's no secret that the news business has been dramatically changed by the Web. My generation and younger rarely "take the paper" the way my grandparents' or parents' generations did or continue to do. As virtual worlds like Second Life grow in popularity, yet another avenue for delivering news appears. I started thinking about this recently, and especially about how metaphors work in online media. We read Web pages and visit Web sites, which aren't really made of paper or bound to a location.

So does the virtual world offer us a way to "take the paper" again? Will this prove useful and interesting for this or the next generation? I don't know, but I want to play with the idea a bit and see what I come up with. With that in mind, I've started my own personal development project to test out some ideas.

Reading the News "Reading" the paper in Second Life.

I'm doing all my work out in the open and being transparent about the process, even showing my newbie attempts at coding LSL. I've added a Flickr set to keep track of news-related in world work.

Link | Posted by deryck on March 27, 2008 | 0 comments

Democratic Primary Predictions

In the spirit of all those "my predictions for 2008" posts that we see at the beginning of every year, here are my predictions for the rest of this Democratic primary season. I'm really just writing them down here as an exercise in fun. I want to be able to look back and see how I did when the primary is over.

My predictions are:

  • The primary will not go all the way to the convention as everyone is predicting.
  • Sometime in the next four weeks, between now and the Pennsylvania primary, an agreement will be reached to seat the delegates from Michigan and Florida.
  • The delegates will not be seated according to the election results, but rather as some sort of arrangement between the candidates, with Hillary taking a slight advantage in delegates. Something like a 60/40 split, but don't hold me to the exact percentage. This keeps both candidates more or less happy with the agreement.
  • Obama is trailing Hillary by 12 points in Pennsylvania as of this posting. By the Pennsylvania primary, he will cut the lead to somewhere close to the polling margin of error. Let's say if the margin of error is 4% on the last poll taken, he will be 4 or 5 points under her by the time Pennsylvanians vote.
  • Obama then wins Pennsylvania by a very small margin. I'll guess 51% of the vote.
  • Hillary then reluctantly withdraws from the race at the insistence of party leadership.

I'll go on record as an Obama supporter in the interest of full disclosure. Feel free to accuse me of wishful thinking. However, I have reasons for each of the points above, having thought about this a lot lately. But again, I'm just posting this for fun to see if I get close.

Link | Posted by deryck on March 19, 2008 | 0 comments

Status Update on Google-driven Web Development Book

After posting a link to a conference on building web apps with Google this morning, I'm reminded that I haven't written here much lately about my book. So here's the latest news for those who might be wondering, "Hey, what ever happened to Deryck's Google book project?"

The short story is that a complete book is just not a possibility for me. While I lack no dedication to my work, this just wasn't the season for me to do a book. My career took off right about the time I signed the book agreement, and now two job changes, several projects, and two years of aggressive development while also trying to write a book have taken their toll on my enthusiasm to see this book completed. I am also at a different place now, with different technical and intellectual interests, so turning back to the topic of Google-driven development is a little like turning back in time.

I have done quite a bit of work on the book in the last couple years, though. To keep this from being lost work, I'm working on turning the book material into 3 different shortcut chapters as part of Prentice Hall's digital short cuts program. I am really excited about this. The material will see some published form, and the work is not a complete loss.

I'm finishing the first of these this week, and expect to move fairly quickly on the other two (on the order of a couple months, I hope.) I'll certainly update here as these shortcuts are released.

Link | Posted by deryck on March 12, 2008 | 0 comments

Lazy Man's Twitter Updates via CLI

While the various twitter apps around are nice for reading tweets, I'm too lazy to want to fire up the browser or reach for a desktop menu when I just want to twitter. After a pointer from coworker Sean Stoops to a Lifehacker article, Send Twitters from Command Line in Any OS, I kicked up this little shell script to make it even that much easier.


#!/bin/bash

# Expect exactly one argument: the status text, in quotes.
if [ $# -ne 1 ]; then
	echo 'Usage: twit "STATUS IN QUOTES"' >&2
	exit 1
fi

STATUS=$1
TWITUSER=yourusername
TWITPASS=yourpassword

# Quote the credentials and URL-encode the status so spaces and "&" survive.
curl -u "$TWITUSER:$TWITPASS" --data-urlencode "status=$STATUS" http://twitter.com/statuses/update.xml
  

Now it's just -- twit "MY STATUS UPDATES" -- and I'm done.

Link | Posted by deryck on March 7, 2008 | 2 comments

Quick Python Tip: Socket Timeouts for Page Scrapers

It's not well documented, but there is a way to set a timeout for urllib, urllib2, and the like. This is done by setting the default timeout on the global socket. So if you're constantly hanging cron scripts because some resource you want to scrape is never responding, add the following to your script:


import socket                                             

# "timeout" is a float and 
# is the value you want in seconds.
timeout = 2.5
socket.setdefaulttimeout(timeout)

Any subsequent calls to urllib or any other module based off socket will now generate an IOError if the response is not returned before reaching timeout.

How you handle IOError is up to you. :-)
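
One way to handle it, as a minimal sketch: wrap the fetch in a small function that gives up and returns None. This assumes Python 3, where IOError is an alias of OSError and urllib2 has become urllib.request; the fetch name is just something I made up for illustration.

```python
import socket
import urllib.request

# Same global default as above, in seconds.
socket.setdefaulttimeout(2.5)

def fetch(url):
    """Return the response body as bytes, or None on failure or timeout.

    Hypothetical helper for illustration only.
    """
    try:
        with urllib.request.urlopen(url) as response:
            return response.read()
    except OSError as err:
        # URLError and socket timeouts are both subclasses of OSError,
        # so one except clause covers a refused connection and a hang.
        print("giving up on %s: %s" % (url, err))
        return None
```

A cron script can then skip the resource when fetch returns None rather than hanging forever.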

Link | Posted by deryck on March 4, 2008 | 0 comments

A Django Auth Backend for Second Life

I've started hacking away at a personal project of mine around Second Life. More on that in the days to come, but I did want to share some code created last night while playing around with Second Life logins. I've worked up a Django authentication backend for authenticating users on a Django-based site against Second Life's login process. I've created a Google code project for the code, so cleverly named slauth.

This is code of the "release early, release often" variety. There are no docs, no tests, not even a README. I just wanted to get this up while I had 5 minutes today. I welcome feedback, and I'm certain I will be working on this as the larger project evolves. I'm not even certain I'll use this in the final project. I feel uncomfortable taking username and password for another "site," but without a proper login API for site-to-site authentication, this seems to be the only viable route. This uses the same XMLRPC auth process of the Second Life viewer code, which seemed to legitimize it a little for me (since this is how third party viewers have to authenticate). It's certainly better than page scraping the response of the Second Life web site's login form. If there were a way to register your application with the login process, I would be totally cool with this. Then, the user could verify a site as being legitimate -- or at least more legitimate than any joe running this auth backend. ;)

Having said all that, it's pretty easy to authenticate via this package. Just make sure the module lives on your PythonPath, and then add slauth.backends.SLAuthBackend to your Django AUTHENTICATION_BACKENDS setting. You'll even be able to log in through the Django admin with your full SL username ("Anders Falworth" in my case, as an example). Of course, you won't get into the admin until you make the SL account "staff". (This app creates a stub Django user account for each successful SL login, so you can check is_staff on that account, then log in again with the SL account, and you'll see the successful entry into the Django admin site.)
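
As a rough sketch, the settings change might look like the following. Keeping Django's stock ModelBackend as a fallback is my assumption here, not something slauth requires; it just means ordinary Django accounts keep working alongside SL logins.

```python
# settings.py (fragment)
AUTHENTICATION_BACKENDS = (
    # Try Second Life credentials first.
    'slauth.backends.SLAuthBackend',
    # Assumed fallback: Django's default username/password backend.
    'django.contrib.auth.backends.ModelBackend',
)
```

Django tries each backend in order until one returns a user.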

This is the main class that does all the work:


from django.contrib.auth.models import User

from slauth.utils import valid_sl_login, get_or_create_sl_user

class SLAuthBackend:
    """
    A Second Life authentication backend for Django-based sites.
    """
    def authenticate(self, **kwargs):
        """
        Use kwargs to make the authenticate method more flexible.

        Django's admin app assumes username/password logins, so
        allow first and last in one username.  For example,
        username could be 'Bob Smith' and this method will split
        that apart into the first and last names SL login expects.

        So either of the following would work:

            >>> from django.contrib.auth import authenticate
            >>> authenticate(first_name='Bob', last_name='Smith', 
                                               password='foo')
            >>> authenticate(username='Bob Smith', password='foo')
        """
        first_name = kwargs.get('first_name', '')
        last_name = kwargs.get('last_name', '')
        password = kwargs.get('password', '')

        username = kwargs.get('username', '')
        if ' ' in username:
            # Split on the first space only, so a username with
            # more than one space doesn't raise ValueError.
            first_name, last_name = username.split(' ', 1)

        authenticated = valid_sl_login(first_name, last_name, password)

        if authenticated:
            user = get_or_create_sl_user(first_name, last_name)
            return user
        return None

    def get_user(self, user_id):
        try:
            return User.objects.get(pk=user_id)
        except User.DoesNotExist:
            return None

You could certainly use parts of this without using Django, even though it's written with Django in mind. There is a utils module with two functions: sl_login, which returns the response from a login attempt, and valid_sl_login, which returns True or False depending on whether the login attempt succeeded.
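
The username handling in authenticate is one of the parts that works fine outside Django. Here is a minimal standalone sketch of it; split_sl_username is a hypothetical name and not part of slauth, and returning a pair of empty strings when there is no space is my own choice for the example.

```python
def split_sl_username(username):
    """Split 'First Last' into (first_name, last_name).

    Hypothetical helper, not part of slauth. Returns ('', '')
    when the username contains no space.
    """
    first_name, sep, last_name = username.strip().partition(' ')
    if not sep:
        return '', ''
    # Tolerate extra spaces between the two names.
    return first_name, last_name.strip()
```

Calling split_sl_username('Anders Falworth') gives back the first and last names that the SL login expects.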

Please have at the code at its Google project home if you have need for or want to play with Second Life logins via Django or Python. Comments, suggestions, and of course, contributions are always welcome. This code is released under the GNU GPL v2.

Link | Posted by deryck on March 3, 2008 | 0 comments

Why Worry About the Future of the Web?

I know this may seem odd for a web developer to say, but I don't get all the worry about the future of the Web. Seriously, why worry so much about it? Do a quick Google for "the future of the Web." Now take it all in -- semantic web! browser wars! web standards! file sharing! My two personal favorites among the page titles in the search results are:

  • Is XAML the future of the web?
  • Yahoo!: The Web's Future Is Not In Search

The first one makes me just snicker, and the second one makes me feel a little sad for Yahoo. Actually, the second one illustrates a good point, too. Worry about the future of the Web too much, and you can miss its present.

Link | Posted by deryck on February 22, 2008 | 0 comments

Authentication, Identity, and Data Portability

I attended Zero Linden's office hours today, and there was a very interesting discussion about identity in Second Life. Maybe it's better to say there was a large discussion about a whole host of things related to virtual world interoperability, which began with (and continued to loop back to) talk about identity. I'm not trying to be too wordy with my description here, but really, everyone returned to identity as if that was the central issue of the discussion -- and in some ways, I can see that it was -- but in other ways, there was a lot being discussed and certain areas being lumped together with others. I would like to go over some of the things that came up today, just to clarify my own thinking on these things, and also to separate out the issues into distinct domains. So let's look at authentication, identity, and data portability, all of which came up today.

Authentication

Authentication and identity were lumped together pretty heavily today, especially when we got around to talking about OpenID. The two are tied together in a system, there's no doubt, but to equate one with the other is inaccurate. Authentication is the act of securing permission for an identity, but not the identity itself. Usually, this means giving a token (a cookie on the web) to someone, or something more accurately, to allow that agent to act on behalf of a given identity. Or to act as that identity. But authenticating an identity is not the same thing as presenting the identity, or describing the identity. (And this point will make more sense down the page, I hope.)

I get the confusion or the conflation of the two, especially where OpenID is concerned. OpenID even defines itself as "a place to store your digital identity." While this may be the goal of OpenID, that's not how it's used today. OpenID is used in authentication as a means of matching an end user, or an end user's computer, with an online identity. OpenID is not the identity itself.

Don't believe me? I can log into ma.gnolia with an OpenID of my choosing. I am known as deryck on ma.gnolia. Most people would identify me by that profile link. That is more closely my identity than the OpenID. In fact, I can drop the current OpenID I have with ma.gnolia, authenticate another one, and now I have a new means by which to authenticate. OpenID is just authentication, at least at this point in the game. It is just the means by which one system is matched with another (the me sitting, typing at this computer with the virtual me stored on a server). The authentication system is wholly separate from my identity itself.

Identity

So what is identity?

This is a great philosophical question, and really was the question at the heart of our discussion at Zero's office today. How we answer that question -- i.e. who am I? -- is a tough one. Clearly with social networks, or web sites more generally even, and with virtual worlds like Second Life, we invest a lot of time in carefully crafting an identity. For Second Life, the identity and login name are synonymous, but for other systems this might not be true. See the OpenID discussion above, or how we can use an email for login at ma.gnolia or Facebook. So our identity is that which projects ourselves through the system. In Second Life, this is very "physical" in nature, even if a virtual physicality. We have a body, a shape, wear certain clothes, call ourselves by a certain name, wear certain group tags. All of these things taken together identify us.

In some ways, we are the things we collect virtually. In SL, these are virtual things, like real life stuff. On ma.gnolia, the things I present to represent myself are the few bits of data I write about myself on the profile and my collection of links. Take a look at my top tags, and you'll learn a bit about me. These things are not really me, so we'll leave that really large discussion behind, but they do -- when taken together -- identify me, at least on the given system.

Which brings us to....

Data Portability

If I could take a few basic things with me from web site to web site, or a few basic "objects" in Second Life from one server to the next, I can recreate the identity I've made for myself. This is the real issue, and the tougher to solve because it involves multiple entities working together for a single end. I'm optimistic -- based on what I read is happening at Linden Lab, IBM, Google, and other places -- that companies these days are more interested in keeping you as a user than keeping your data. There are still those who don't feel this way, though, and as much as I dig Facebook and think the people working there are smart and wonderful, Facebook is one of the worst about locking up my data. (And trust me, I know the FB platform very well! For all the good things that it is, and all its coolness, it ain't about data portability.)

I like that Linden Lab is working hard to make data portability a real possibility. My understanding from what Zero said today is that ultimately that is the real goal. That's what he means by "interoperability" -- the ability to carry my data about myself from place to place virtually. Sure, we all have to worry about the mechanics like authentication and usernames, but these aren't really identity. Just logging in to play WoW with my Second Life account -- if all I do is create my identity new on WoW -- isn't really interoperability or my identity or data portability. It's the data that matters, not the mechanics of authentication or establishing a user name. This point seemed lost today, and may be why people got bogged down and needlessly worried about UUIDs and RFIDs, which really seem irrelevant to a system that allows me to take my data with me. Of course, I don't have to take my data if I want to create a new account each time for each system, but right now, I have no option of carrying data with me.

So returning to my optimism about the very real discussion going on around this in virtual worlds... my gut (and experience building for/on the web) says Second Life and other virtual worlds can get to this point quicker than the web. This is likely the path to the 3D space supplanting the 2D web. If for no other reason than that the system is being built (or rebuilt) at these early stages with data portability in mind.

Link | Posted by deryck on February 7, 2008 | 1 comment

Devurandom on the Planet

I mentioned recently that I had migrated this site to a new server. I'm pretty happy about the move and just wanted to mention it here. This site now sits on a dedicated server hosted at The Planet. So far, so good. And (keeping in mind my experience with them is limited to this year only) I would recommend their servers.

Jerry and I went in together on the server, and that has kept the cost low. We're hosting both of our personal sites and the two or three other sites we have for play. We were both on very limited shared hosting before this -- and man! the work I had to do to get Django running under that (no shell access, fcgi hacks, etc.) -- and now this feels like my site is getting the attention it deserves.

I'm running the site on Django still, but now I'm doing it properly with mod_python. I kept with MySQL, which is what my earlier host had, because it saved me migrating to PostgreSQL (a trivial matter, I know) and also MySQL seems to run better with less required attention on a small site like mine. I've already fixed a number of bugs with my site because it's easier to do so now. And I've got a number of little changes and new features that I'll be working on as time allows over the next couple of months.

Since I work on web sites all day long, my site never really gets the love it deserves, but with the new hardware I feel like I'm loving on you again Devurandom.org! And look, I've blogged 5 times in the last 4 weeks, a new record!

Link | Posted by deryck on February 6, 2008 | 0 comments

Any Given Super Bowl Sunday

A reminder to us all: it's not the fastest or the strongest that wins the game. It's the fastest and the strongest that day that wins.

Link | Posted by deryck on February 3, 2008 | 0 comments

Lots of Changes Since the New Year, and Without Any Resolutions to Speak of

Wow, it's been a crazy start to 2008. We're at the one month mark, and already I've seen substantial things happen in my life. I've decided to change career paths, migrated this site to a new server, reconfigured my workflow and technology to get serious about GTD, and even (sigh, yes it's true) started using Twitter.

The best part of all this is that it's all been positive change that happened naturally rather than via some artificial new year resolutions. I don't mean to suggest that I didn't take stock of my life at the end of 2007 and make plans according to what I would like to change this year, but I think I have had a more serious assessment of myself, rather than the "I want to lose weight" kinds of resolutions.

Oh, but hey, I have lost 6 pounds this month, too. :-)

I'll do a post on each change over the next week, but to start with the most life changing one first....

Career Changes

I've been working in the news industry about two years now. Before that, I worked for a large academic library for a couple years. All of this has been web development, but it's been working within a non-technical domain. There's nothing inherently tech about news and libraries, though each industry has similar issues to deal with given the age we live in. Both are trying to "save" themselves as is sometimes said of them.

There are really cool things about working in that kind of industry. In fact, I'd say mostly positive things about each. Each field has to really depend on programmers and technologists now to plan and move going forward, so it's nice to be a developer in that kind of environment. I have friends who never want to leave. There are, however, a couple things recently that have made me want to look for new work.

One, I started a Second Life for myself. Seriously, this has been a first life altering experience for me, both personally and in terms of the kind of issues I'm interested in technically and intellectually. I also see there's a lot of work to be done marrying the web and virtual worlds as the two grow closer together in the future, and I believe my skills as a web developer might be put to good use in some way as related to Second Life or other virtual world development endeavors. I also see growth for myself there, having lots to learn about metaverse development.

Two, I've just grown tired of being the sole developer, or one of a handful of developers working on projects. It's very lonely work in many ways, made more so by working in industries that don't totally get or know how to support telecommuters, and the number of software engineers -- or developers with engineer-like tendencies :-) -- is very low. Again, I have friends that love the non-engineering quality of our work environment, but I'm at the point in my life where I want to be around more software developers or software engineers, working on interesting and compelling problems. In other words, something like virtual worlds and virtual world development are interesting because the issues themselves are interesting and complex, rather than having to come up with interesting solutions to normal or common problems.

So What's Next?

The short answer is "I have no idea."

The not-as-short-but-not-long answer is... I let Rob, my wonderfully cool boss (though he hates being called the boss), know that I'm going to start looking for other work. I have indeed applied at a company or two related to virtual worlds, and I'll report here if something works out. I will still be at WPNI until I find something, and everyone at WPNI has been gracious and awesome to not replace me until I find something new and exciting for me to do.

So nothing has really changed for now, but making up my mind to jump out and do something new has been pretty rewarding personally. I have a great new sense of purpose to my work, and focus follows from that purpose. And I'm enjoying all that I'm learning about Second Life, virtual worlds, and all that goes with building virtual worlds. Those who know me well know how aggressive I can be about learning when I get excited about something new, so each day seems new and interesting and fresh.

I'll try to share some of the things I'm learning here, and also keep this blog updated as I make progress on the job hunt. It's been quite a year so far, even if only a month has passed. I can't wait to see what the other 11 months hold.

Link | Posted by deryck on January 31, 2008 | 0 comments

You Can Call Me Anders

What's in a name? According to Shakespeare not much, but I had a hard time picking my avatar's name when I first signed up for a Second Life account. Googling for "choosing a second life name" turns up various suggestions about picking a name for yourself in SL, but I couldn't find much writing about why someone chose their own particular name. So I thought I would write about what led me to choose the name Anders Falworth.

The first thing someone has to do when signing up for a Second Life account is choose a name. This name is unlike the usernames you have to choose for any other web site account in that it must be a first and last name. I almost always use some form of deryck when I sign up for a service depending on what's available -- I'm known variously as deryck, deryckh, and deryckhodge across Google, Flickr, Facebook, et al. Since I had to choose a last name other than my own, and given that deryck insert-other-last-name-here didn't seem to have the ring of my own name, I decided to just embrace a new world, a new me, and create a completely unrelated name from my own.

I literally spent 30 minutes at the account creation screen trying to think of a new name. At the time that I signed up, I wanted Second Life to be a diversion from work over the holidays. Something fun and immersive, like going on vacation digitally, but also a way to learn new things while playing. So the thoughts going through my mind were of all the things I had immersed myself in over the last 10 years -- music, comics, and sci-fi TV and film. And for some reason -- maybe given what the virtual world reminded me of -- I couldn't get the agent from the Matrix out of my head. All I could hear was "Mr. Anderson" with that snide, chiding tone. Then my mind jumped over to Battlestar Galactica and thought of the character Anders from that show. And Anders became the first name.

The last name was not a big choice for me. I scrolled the list trying various names until I liked the combination. Falworth seemed mysterious, adventurous, even a little dark maybe.

An email later to activate the account and here we are. It's funny... I've been called "Anders" so much in world, I'm starting to think of myself by that name somewhat. I guess it's a bit like IRC nicks. I've seen people on Samba Team answer in real life to either their given names or their nicks. If someone came up to me in real life and said, "Hey, Anders," I'd probably say hey back and start talking before it dawned on me that wasn't my real name. So it's okay... call me Deryck or Anders. I'll likely answer to either.

Link | Posted by deryck on January 24, 2008 | 0 comments

Developing a Second Life

Over the month of December, I signed up for a Second Life account. Okay, and yeah, I'm hooked. To say I'm hooked is an understatement of an understatement. Not that I mean to wax too dramatic about it -- those who know me well know I can be dramatic at times (!) -- but discovering Second Life has been like buying my first album, reading Flannery O'Connor for the first time, or even meeting my wife and knowing instantly I wanted to marry. That is to say, I know this is an experience that will change my life, and quite possibly my career. At the very least, I have a serious new interest that is shaping the kinds of technical questions and curiosities that I'm sure will spill over into my coding, writing, and general geeky endeavors.

And Second Life is just plain fun, too. Don't mean to forget that part.

So what does this mean? There will likely be posts cropping up here on my blog related to Second Life. I've started a Second Life Flickr page for my in-world photos (not much there yet, but I hope to get more interesting stuff up). And I most certainly will play with some ways to merge my web development interests with SL. What this means or what will come of this blending of the web and SL for me is still to be determined, but I think there's some interesting work still to be done in more closely marrying the web and Second Life.

Anyway, we'll see what comes of it all. I'm enjoying being in world, and excited about playing with SL related code, sites, and ideas. Look me up in world if you're an SL resident. I'm known as Anders Falworth in SL.

Hmmmmmm, I'm thinking there's also a post around the corner about why I chose that name.

Link | Posted by deryck on January 16, 2008 | 0 comments

The Oldness of New Media

I've never really cared for the term "new media." It has some usefulness for describing writing, video, and other publishing taking place online, but here's my big gripe with the term -- no one uses the term but people in "old" media. You never hear people from YouTube describe their site as a "new media site for sharing user-generated video." For me, "new media" seems to be a way to make publishing online seem more important than it really is. I hate to break it to fans of "new media," but there's nothing special about putting your writing or videos or what have you online. Everyone's doing it these days.

So with that out of the way, here are a few things that point out your oldness, you new media publishers you:

putting the .com in .com

Only old media publishers brand their sites with the .com actually in the logo. Look at Slashdot or Digg. In fact, just notice that we refer to those sites as "Slashdot" or "Digg," not "Slashdot.org" or "Digg.com." Now, look at NBC.com or CNN.com. I understand that you're trying to distinguish your online presence from your traditional venue of the TV, but no one else thinks of you like this.

Maybe in a few years when the TV is just another computer screen, these sites will finally drop the .com from the branding. Let's hope they're not irrelevant by then.

User-generated content

I've been working on a project for work that has been called our "user-generated content" project. I think the term is habit more than that anyone I work with really thinks of it like this, but the phrase -- which is batted around way too much -- is a clear sign that you're old media trying to be new media. Here's the problem I have with the phrase:

There is nothing on the Web that doesn't originate with the user.

Facebook and MySpace and other social sites are the most obviously purely user-based sites, but even Google wouldn't exist without a user's search terms. The very act of linking to your site is a user act, not a publisher one. If you're thinking of content as an us and them proposition, your days are numbered. Sorry to break it to you. :-)

The Author Complex

This is a hard one to explain succinctly, but we'll call it the "author complex." I'll reach back to my "Introduction to Creative Writing" class to explain.

In fiction, the author is God. The author decides what happens and to whom it happens and why it happens. The author controls the experience. The author guides the reader through a narrative, controlling pace, setting, point of view, and access or lack thereof to a character's thoughts. In fiction, the author has complete control over the reader's experience of the text.

It's all too common these days that old media publishers want to carry this control forward to the Internet. DRM is the most obvious and heinous way in which publishers try to control the use of their content. The political and ethical debates run deep on this one, so I won't even get into DRM itself. Both iTunes and NBC Direct have DRM, but once I buy something on iTunes I can basically do what I want with it. With NBC Direct, I need a proprietary, Windows-only player and the download only works for 48 hours. Yes, I can redownload after 48 hours, so long as it's within 7 days of when the show last aired.

gah!

That's complicated to understand, much less use, and stands as a clear example of where NBC has an author complex and wants to control my viewing experience. My cable company's video on demand service is more flexible.

But DRM is not the only example of this. JavaScript ads that take control of the web browser, opening new windows to external sites, and even navigation elements themselves can be used to try to control the user's experience of the content.

If you look at features of a web site as a way to direct or control a user's experience rather than as a way to enable your users to do what they want to do, you're old media trying to be new.

Link | Posted by deryck on November 29, 2007 | 0 comments

Do You Trust Google?

I was visiting recently with a former colleague at Auburn University Libraries and when the discussion turned to my Google book, this person said something to the effect of, "So what about Google now? They want to take over the world, don't they?" I had a similar conversation with another person when the topic of Google's open source program came up. This other person floated the idea that Google's open source efforts are little more than a veiled recruiting effort. This seems to be a recurring theme for me lately, and since I spend a lot of time working with Google APIs and code, working on my book, and thinking about Google generally, I wanted to get down some questions I had here.

Certainly, Google is growing rapidly these days, that's true. I'm also sure Google has hired open source developers it found through the Summer of Code or its code hosting service. But is that cause for alarm? And do those two things alone offer enough now to make Google -- once the poster child for corporate good citizens -- the next great, greedy tech company?

In defense of my friends' comments, I don't think anyone really argues that Google is doing evil things with their power these days. But there does seem to be this growing concern, a concern that too much influence over our daily lives will ultimately lead a company to do bad things. So my question: is that true? Does power corrupt? Or can Google ultimately do good things with all its services, acquisitions, growing user base, etc? When is too much finally too much? And is it Google we fear, or is this really just a fear of technology's encroachment on our lives?

My gut feeling is that even those of us who live on the Internet are getting a little scared of what all this tech might add up to one day. But I'd love to hear what someone else thinks about this.

Link | Posted by deryck on November 27, 2007 | 0 comments

One Year with the Washington Post

This past October 24 marked my first year anniversary at the Washington Post. It's been a crazy, insane, interesting, complex, fun, frightening, rewarding year. To celebrate, here's a link dump of everything I've had some hand in over the last year:

And there were a ton of apps we added to Ellington for Loudoun Extra, which I won't list individually, but they include a church directory, school directory, and a deal of the day business directory. I've got stuff in the queue, too, for Loudoun that just hasn't seen a public launch yet.

All in all, a pretty busy year. If anyone I work with wanders by, did I forget anything?

Link | Posted by deryck on November 8, 2007 | 0 comments

Heroes: More Than A Monday Night Well Spent

Heroes Season 2 premiered last night. Anyone who knows me at all knows I am seriously into Heroes. To say that I was excited about the new season is an understatement. Overall, I felt exactly as I felt when the first season premiered — not blown away by anything particular but interested in all the places the story could go. It was the same feeling last night, which makes me pretty excited about the coming season. That's the attraction of the show to me: the way it gets better over the course of the year.

Not only is Heroes a great TV show, but NBC did an incredible job last year of extending that story to the Web. Granted they don't have a clue with regard to how to handle video downloads online — see the recent iTunes fiasco — but otherwise, the Heroes 360 Experience, as they called it last year, was pretty sweet. (This year it looks like they're calling it Heroes Evolutions.)

Henry Jenkins argues that "what we are calling Web 2.0 is fandom without the stigma". I totally agree with that statement, even as I tire of the abuse of the term "Web 2.0." And Heroes certainly allows some fandom. Last year, not only did I watch every episode on TV, I got each from iTunes (not this year!), read the graphic novel every Tuesday, received SMS and email from Wireless (a character on the show), read her blog, applied for a job at Primatech Paper, and visited sites from other characters on the show.

Oh, and then there was that time we helped stop an early plan of Linderman's for getting Petrelli elected. :-)

So you're beginning to get the idea of how deeply I'm into this show, right?

It's not that you miss anything if you just watch the TV show, but the Web is a nice way to deliver additional content for those who want more. I hope the network continues with the online experience this year. The sites from last year are still around, and a new character is featured in a faux documentary. Heroes certainly seems to be off to a great start both online and off. And really, I think this kind of cross-platform storytelling is just beginning to get interesting. There are so many more interesting things yet to be done. It's a great time to be a web developer with an interest in telling great stories.

Link | Posted by deryck on September 25, 2007 | 1 comment

Facebook/Washington Post, Performance Tuning

This final post about my group's work (at Washington Post.Newsweek Interactive) on our Facebook Platform app The Compass is long overdue. But now the time has come! Let's talk Postgresql and Apache performance.

In the first two posts on this subject, I wrote about the Facebook Platform itself and the Compass' architecture. In this post, we'll look at some of the challenges we encountered while serving the app and areas we focused on to improve our Postgresql and Apache performance.

NOTE: All of this is anecdotal, based on my experience with this app. I'm no performance guru and don't hold myself up as such. I think, too, different applications have different needs, and what works for something like a Facebook app might not be optimal for other situations.

Caching Limitations

As I mentioned last time, all of the FBML we load into a profile is cached and served by Facebook, but the hits to our application pages are hits to our servers as well. The first thing that comes to mind with Django is, "well, make sure you have caching enabled." There are a couple reasons why this doesn't work as well as one would like.

First, the caching for a Django site is bypassed when the request contains GET or POST data. Every request from Facebook contains POST data. Each callback request has a few fb_sig* parameters that are POSTed to your page to verify the request comes from Facebook. This is great for security and passing data from Facebook back to your application, but it kills the normal caching process for Django-based sites.
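To make the fb_sig check concrete, here is a rough sketch of how such a signature could be verified. It assumes the scheme as the Platform documentation of the time described it, as best I recall: strip the fb_sig_ prefix from the signed parameters, join the sorted key=value pairs, append the application secret, and take the MD5. The function names and the secret are made up for illustration; this is not our production code.

```python
import hashlib


def expected_fb_sig(post_data, secret):
    """Recompute the signature from the fb_sig_* parameters (assumed scheme)."""
    prefix = 'fb_sig_'
    # Keep only the signed parameters, with the prefix stripped off.
    signed = dict((key[len(prefix):], value)
                  for key, value in post_data.items()
                  if key.startswith(prefix))
    # Join the sorted key=value pairs, then append the app secret.
    payload = ''.join('%s=%s' % (key, signed[key]) for key in sorted(signed))
    return hashlib.md5((payload + secret).encode('utf-8')).hexdigest()


def is_from_facebook(post_data, secret):
    """True when the posted fb_sig matches the recomputed signature."""
    return post_data.get('fb_sig') == expected_fb_sig(post_data, secret)
```

Because the signature covers every fb_sig_* value, changing any one of them (the user id, say) makes the check fail.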

Second, each request can potentially be unique. In our case, the only Facebook canvas pages we serve are the one that submits the compass survey questions and the one to display the Flash map of your friends who have installed the compass. It's hard to do much low-level caching of Django querysets because you don't want to inadvertently give the user someone else's data. We do a little of this, though. See, for example, what we do here when we display the compass based on your last answer:


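from django.core.cache import cache

# One cache entry per Facebook user, so nobody is served someone else's data.
cache_key = 'compass_entries_%s' % facebook.user
compass_entries = cache.get(cache_key)
if compass_entries is None:
    # list() runs the query now, so the results (not a lazy queryset) get cached.
    compass_entries = list(Compass.objects.filter(user__exact=facebook.user).order_by('-id')[:10])
    cache.set(cache_key, compass_entries, 60 * 15)  # keep for 15 minutes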

We also reset these entries in the cache when a user resubmits the survey. So we save a few DB hits if the same user retakes the survey a few times back to back. However, there's just not much in common across users to really take advantage of Django's cache. We're pretty well left to raw DB performance.
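Here is a minimal sketch of that per-user cache with the reset on resubmit. Everything in it is a stand-in so it runs without Django: DictCache only mimics the get/set/delete shape of Django's cache, load_entries_from_db replaces the real query, and the reset is done by deleting the key (overwriting it with fresh entries would work just as well).

```python
class DictCache(object):
    """Hypothetical in-memory stand-in for Django's cache object."""

    def __init__(self):
        self._store = {}

    def get(self, key, default=None):
        return self._store.get(key, default)

    def set(self, key, value, timeout=None):
        self._store[key] = value  # the timeout is ignored in this sketch

    def delete(self, key):
        self._store.pop(key, None)


cache = DictCache()
db_hits = []  # records every simulated database query


def compass_cache_key(uid):
    return 'compass_entries_%s' % uid


def load_entries_from_db(uid):
    # Stand-in for the Compass.objects.filter(...) query.
    db_hits.append(uid)
    return ['entry-for-%s' % uid]


def get_compass_entries(uid):
    """Return the user's entries, hitting the database only on a cache miss."""
    key = compass_cache_key(uid)
    entries = cache.get(key)
    if entries is None:
        entries = load_entries_from_db(uid)
        cache.set(key, entries, 60 * 15)
    return entries


def on_survey_resubmitted(uid):
    """Drop the stale entries so the next page view rebuilds them."""
    cache.delete(compass_cache_key(uid))
```

Two page views in a row cost one query; a resubmit in between costs one more.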

Bypass the ORM

One of the first things we did to help performance was to bypass Django's ORM. We store the user's answer to each question via a save method on the form that is submitted. Using the ORM this would look something like:


from politicompass.models import Compass

def save(self, uid):
    # Store one Compass row per question (q1 through q10).
    for i in range(1, 11):
        # Look up the submitted answer, not the field name itself.
        answer = self.clean_data.get('q%s' % i)
        compass = Compass(user=uid, question_id=i, answer=answer)
        compass.save()

We refactored this before launch to bypass the ORM and execute the INSERTs in one connection:


from django.db import connection

def save(self, uid):
    # "user" is quoted because it is a reserved word in PostgreSQL.
    insert = 'INSERT INTO facebook_compasses ("user", question_id, answer) VALUES (%s, %s, %s);'

    # Build one batch of INSERTs with placeholders; the driver escapes the values.
    sql = ""
    params = []
    for i in range(1, 11):
        sql += insert
        params.extend([uid, i, self.clean_data.get('q%s' % i)])

    # Send all ten statements in a single execute call, then commit.
    cursor = connection.cursor()
    cursor.execute(sql, params)
    connection._commit()

There were other performance-conscious moves we made along these lines, but still, once the app started to grow in popularity, we had users submitting that form in such numbers that our DB server load stayed at a freakishly high level. (NOTE: Prior to Facebook, we normally ran at about a .20-.35 load. Once the Facebook app launched, our load jumped up into the 3.00-4.30 range depending on site activity.)

Tuning Postgresql

I had already tuned Postgresql once for some spikes we had encountered when some of our apps were linked up by MSN and MSNBC. These tunings included raising the max_connections limit and bumping up the amounts for the following settings:


shared_buffers
work_mem
maintenance_work_mem
max_stack_depth

The most significant of these for us was shared_buffers. With the hits we had received from MSN and MSNBC, raising shared_buffers to about 1.6 GB (we have 8 GB on the box) and increasing max_connections was enough to keep us humming along nicely. With the Facebook traffic we had to increase shared_buffers to about half the available RAM on the box, and everything dropped back to a sane level. We are running on Solaris, so we had to have the kernel's shared memory limit raised on the box before we could give that much RAM to shared_buffers, but again, once this happened, the load recovered amazingly well.

Hits Under Facebook

Just to toss out some raw numbers, when we first loaded our app to Facebook, we were doing about 5-10 hits a second during peak usage. We ended up doing about 2.5 million hits the first week, just from Facebook alone. We run 4 other sites off the same server. This is a single Postgresql server. We do have our two web servers behind a load balancer, and our static media is served from the normal media.washingtonpost.com setup. Needless to say, there are certainly higher numbers that other sites boast, but the single DB, with some tuning and planning, survived the spike pretty well.

Currently, we're doing about 10 million hits a month from this setup, and we're really at its limits now. To do much more, we'll have to look at replicating the database. Having said that, were it not for the Facebook traffic and a similar Newsweek app bypassing the cache for reasons outlined above, I think we could easily do twice the traffic on the same setup. Caching really saves on DB load, so use it wherever you can.
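To put the numbers above in per-second terms, here is a quick back-of-the-envelope calculation. The only assumption is a 30-day month; the hit counts are the rounded figures quoted above.

```python
SECONDS_PER_WEEK = 7 * 24 * 60 * 60
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumes a 30-day month

# 2.5 million hits in the first week, averaged over the whole week.
first_week_avg = 2.5e6 / SECONDS_PER_WEEK

# 10 million hits a month, averaged over the whole month.
monthly_avg = 10e6 / SECONDS_PER_MONTH

print('first week: %.1f hits/sec on average' % first_week_avg)
print('per month:  %.1f hits/sec on average' % monthly_avg)
```

Both averages come out at roughly 4 hits a second around the clock, which sits just under the 5-10 hits a second we saw at peak, since traffic is much lighter overnight.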

Apache Tuning

Luckily, we never felt the Facebook traffic from an Apache standpoint. I will point out, for the sake of LAMP stack completeness, that the best trick I learned for Apache is to set MaxRequestsPerChild to something in the range of 500. This keeps Apache memory size down while also serving a decent amount of traffic per process. And if you don't know this already, never serve a Django-based site with DEBUG=True. Not only is it bad from a security standpoint, but Django in DEBUG mode stores the queries run in memory, so you can quickly eat up your RAM if you forget to turn this off.

Again, this is just my experience of tuning our stack, so YMMV, but I hope sharing this info will prove useful.

Link | Posted by deryck on August 15, 2007 | 177 comments

Washington Post and Facebook, Part Two

Last time, I wrote an introduction to our development efforts around Washington Post's first app for the Facebook Platform. See that post to get an idea of what Platform is and why it's interesting. In this post, I'd like to talk more about how we used Django to serve the application. In part three, I'll talk about some performance-tuning lessons learned through the course of this development and deployment.

Callback Architecture

Facebook's Platform is based on a callback architecture. The application is hosted on Facebook, users connect to and interact with the application through Facebook, but any page for the application is returned from a callback URL running on our own servers. To help illustrate this, let's look at the process for registering an app on Facebook.

The figure below shows the first few questions for the setup page for our app, The Compass.

Edit screen for The Compass

Notice there is a "Callback URL" and a "Canvas Page URL". The callback URL is the base URL on the developer's server (washingtonpost.com in our case); the canvas page URL is the base URL on Facebook's server. When you install our app on Facebook, you are redirected to the canvas page URL, which in turn fetches content from the callback. You can have any number of callback pages extending off the base. If you went to apps.facebook.com/thecompass/foo/, then that page would fetch content from specials.washingtonpost.com/politicompass/foo/.
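The canvas-to-callback mapping can be written down in a few lines. This is only an illustration of the rule described above: the two base URLs come from the example, and the helper name is made up.

```python
# Base URLs taken from the example above; the helper name is hypothetical.
CANVAS_BASE = 'apps.facebook.com/thecompass/'
CALLBACK_BASE = 'specials.washingtonpost.com/politicompass/'


def callback_for(canvas_url):
    """Return the callback URL that a canvas page URL fetches its content from."""
    if not canvas_url.startswith(CANVAS_BASE):
        raise ValueError('not a canvas page for this app: %r' % canvas_url)
    # Whatever follows the canvas base is appended to the callback base.
    return CALLBACK_BASE + canvas_url[len(CANVAS_BASE):]
```

Calling callback_for('apps.facebook.com/thecompass/foo/') gives 'specials.washingtonpost.com/politicompass/foo/', the same pair as in the example.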

Now you can't go directly to specials.washingtonpost.com/politicompass/ because without the POST data Facebook submits to the callback URL, the application won't work. If you hit our server directly without coming through Facebook, we redirect to the Facebook URL for our app. In fact, every time Facebook hits our callback URL there is a little setup that has to be done for each request. To encapsulate this neatly, I've got an init function that is called at the start of every view function.

The init function looks like this:


def init_facebook(request):
    facebook = Facebook(config.API_KEY, config.SECRET)
    facebook.set_facebook_url(host=FACEBOOK_HOSTNAME)

    # Ensure we're running inside the Facebook frame.
    # All Facebook platform frame pages send a POST with fb_sig.
    if not request.POST.get('fb_sig'):
        return HttpResponseRedirect(facebook.facebook_url)

    user = request.POST.get('fb_sig_user')
    if not user:
        return HttpResponse('<fb:redirect url="%s" />' % facebook.get_install_url())

    facebook.user = user
    facebook.session_key = request.POST.get('fb_sig_session_key')
        
    if facebook.session_key != facebook.get_session():
        return HttpResponse('<fb:redirect url="%s" />' % facebook.get_install_url())
    return facebook
  

init_facebook is not a view function itself. It is called from within a view function, even though it is passed the request object like a view function. To see this in action, let's look at the first few lines of the "index" view function:


def index(request):
    facebook = init_facebook(request)
    if not hasattr(facebook, 'api_key'):
        return facebook
  

I check the returned object to see if it's a Facebook object or not. If not, then it's an HttpResponse object, and I need to return that to the requesting client. Once the Facebook session has been set up (all of this through the Python client library), then we go about the business of processing forms, saving to the DB, etc. There are only a couple views used. An "index" view is used for the initial canvas page, which delivers the 10 questions and submits the answers to the database, and a "friend" view builds the friend canvas page and provides XML to the Flash map on the friend page.
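One way to avoid repeating that check in every view is to fold it into a decorator. This is a sketch of an alternative, not what our code does. The Facebook and HttpResponse classes are bare stand-ins so the example runs on its own, init_facebook is cut down to a single check, and the request is a plain dict standing in for request.POST.

```python
from functools import wraps


class Facebook(object):
    """Hypothetical stand-in for the client library's Facebook object."""
    api_key = 'hypothetical-key'


class HttpResponse(object):
    """Hypothetical stand-in for django.http.HttpResponse."""

    def __init__(self, content):
        self.content = content


def init_facebook(request):
    # Simplified: the real function also checks the user and session key.
    if not request.get('fb_sig'):
        return HttpResponse('redirect to Facebook')
    facebook = Facebook()
    facebook.user = request.get('fb_sig_user')
    return facebook


def facebook_required(view):
    """Run init_facebook first and short-circuit if it returned a response."""
    @wraps(view)
    def wrapper(request, *args, **kwargs):
        facebook = init_facebook(request)
        if not isinstance(facebook, Facebook):
            return facebook  # a redirect response, not a session
        return view(request, facebook, *args, **kwargs)
    return wrapper


@facebook_required
def index(request, facebook):
    return HttpResponse('compass for user %s' % facebook.user)
```

The view body then only ever sees a working Facebook object, and the "is this a response or a session?" test lives in one place.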

Profile Publish Model

If it's not readily obvious, every time a user hits one of our canvas pages, they end up submitting a request to our servers. There is also a part of our application that lives on the user's profile page. Once you submit the 10 question survey, we place a compass image on your profile page. Facebook doesn't call another site's URL directly from a user's profile. They use a publish model. To place something on the profile, we have to explicitly call profile.setFBML. I'm doing this right after a successful save of the compass questions to the DB.

There are advantages to the publish model, both for Facebook users and developers. For us, it means our servers don't get hit with every profile page view on Facebook. profile.setFBML takes a string of FBML as its chief argument. That FBML is cached and served from Facebook's servers. For users, this means that their profile is a little safer from being hijacked by an application. The disadvantage of this is that you have to have the user initiate the action that changes the content on the profile. This wasn't a problem for us, but would create problems for something like last.fm that would want their app to dynamically update a playlist on the user's profile.

To be continued....

Even though the profile hits are cached and served from Facebook's server farm, the canvas page traffic has still been intense. Next time, I'll go over some things we learned in tuning Apache and Postgresql given the sudden bump in traffic.

Link | Posted by deryck on June 6, 2007 | 136 comments

Washington Post and Facebook Platform Development

A little over a week ago, my group at WPNI was involved in developing an application for Facebook's latest version of their platform. If you haven't yet heard about this, Facebook Platform now lets developers create full blown Facebook applications just like Facebook's own Photos, Events, and Notes applications. Any developer can create any kind of application which can run on Facebook itself. Our first application is called The Compass. (You must be a Facebook user to view or use the app.)

Rob does a much better job than I ever could of explaining the ideas, the creative process, and the Compass itself. I thought I would try to add a little on the technical aspects of the app, specifically the Facebook/washingtonpost.com integration, deploying on Django, and server performance issues from the Facebook traffic.

Facebook Platform

Facebook has had a developer's API since late last year. This API allowed developers outside of Facebook a way to create web or desktop applications that could integrate data from Facebook's social network. A Facebook user could go to a website I created, and given the proper Facebook credentials, log in to my site and have some or all of their social data from Facebook follow them: for example, their list of friends, groups, event information, and so on.

The current, updated version of this API contains all the methods that existed before --


facebook.auth.getSession
facebook.friends.get
facebook.groups.get
facebook.events.get
  

but also, there are new methods specific to loading an app within the context of Facebook itself --


facebook.feed.publishStoryToUser
facebook.feed.publishActionOfUser
facebook.profile.setFBML
facebook.profile.getFBML
  

The latter methods are the hooks for publishing to a section on the user's profile. Calling those methods (through the Facebook client library of your choice) would allow you to publish a string of FBML into the user's profile. FBML is where the new version of the Facebook platform gets interesting and begins to separate itself from other simple widget platforms.

FBML

FBML, or Facebook Markup Language, is essentially HTML (stripped of script and other potentially dangerous tags) with a few Facebook-specific tags. These fb tags allow the developer to access a Facebook user's data in a generic fashion. The real advantage of this approach is that with just the UID of a Facebook user, I can load related data in a Facebook page or on the user's profile without having to do any processing on my end.

For example, this --


{% for friend in friends %}
    <fb:profile-pic uid="{{ friend }}" />
    <fb:userlink uid="{{ friend }}">
      <fb:name uid="{{ friend }}" useyou="false" />
    </fb:userlink>
{% endfor %}
  

is an example in Django template syntax using FBML that would produce the following --

Facebook renders the FBML tags for you, which has some significant advantages. To throw out the most obvious one: the user's profile picture is always the current profile pic.

Note: in the code example above, "friends" is a list of Facebook UIDs returned by calling facebook.friends.get.
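For anyone not using Django templates, the same markup can be built with plain string formatting. This is just a sketch: friends_fbml is a made-up helper, and it assumes the UIDs are the numeric ids that facebook.friends.get returns, so nothing is escaped.

```python
def friends_fbml(friend_uids):
    """Build the FBML from the template above for a list of Facebook UIDs."""
    parts = []
    for uid in friend_uids:
        # One profile picture and one linked name per friend.
        parts.append(
            '<fb:profile-pic uid="%(uid)s" />\n'
            '<fb:userlink uid="%(uid)s">\n'
            '  <fb:name uid="%(uid)s" useyou="false" />\n'
            '</fb:userlink>' % {'uid': uid}
        )
    return '\n'.join(parts)
```

The resulting string is what would be handed to Facebook, which expands each fb tag when the page is rendered.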

Python Client Libraries

Everything we do at WPNI uses the Django web framework. Since Django is a Python framework, this means everything we do is written in Python (short of several shell scripts for updating code, managing deployments, etc.). The initial example library that we received from Facebook was written in PHP. There was an existing Python version of the Facebook API hosted on Google Code, but it didn't have the new methods mentioned above nor did it seem to work all that well for me. So to get started with Facebook, I had to rewrite that PHP client library in Python. I did borrow a couple methods from the original Python library, but mine is largely a one-to-one copy of the current PHP library hosted from the Facebook developers page.

Since I was under a bit of deadline pressure, I didn't port every method; otherwise I would release my version. I did notice that the current Python library on Google Code has been updated. I haven't tried it to see if it works any better. I had hoped to spend a little time after our application's launch to finish out this library and do a little more Facebook development, but other deadlines are pressing down on our team. It looks like I may not get back to this, but if that changes, I'll post here with new info. If I ever have the chance to finish out the library, I'll certainly make it public.

To be continued....

Okay, this got to be a longer post than I first imagined it would be. I'll break this into sections. In a day or two, I'll post on how the Facebook/Django integration actually works, and a day or two after that, I'll post on the server issues we experienced and offer some performance tips for scaling a mod_python/Postgresql application.

Link | Posted by deryck on June 3, 2007 | 5 comments

Dark Night of the Coder's Soul

So where has Deryck been, I hear you ask? Coding, coding, coding. The last few weeks have been an intense period of work for me. The end result is Sprig, a product site for people interested in green-friendly products. My group at WPNI did the development work building the site, but the idea for the site was from the "New Ventures" group at WPNI, headed by Mark Whitaker. I'm still cleaning out a couple annoying bugs on the site, but I'm close to moving on to the next thing.

Anyway, now that the site is behind me, I actually feel a little refreshed and ready for something new. In the next couple weeks, I'll post a little on our use of Django at the Post, hopefully catch up on some Samba work, and maybe even get back into my book a little. That's still up in the air, but (/me crosses fingers) it looks like the day-job work might allow enough of a life after work to allow this now.

That reminds me, I should ping my editor... so until next time. Cheers!

Link | Posted by deryck on May 4, 2007 | 0 comments

ma.gnolia feeds

I'm wondering what's up with the broken RSS feeds from ma.gnolia. I've been a happy user of the bookmarking site so far, but my links look terribly out of date with the feed being broken. And it's not consistently broken. Sometimes it has my first 4 bookmarks, sometimes any random 2 bookmarks. I'm still bookmarking through them, but the links page here looks really out of date.

Hopefully they'll have this fixed soon.

Link | Posted by deryck on April 3, 2007 | 0 comments

Definitions

In my last post, I was thinking about how routines can shift over time, gradually changing us in the process. This idea has really affected how I view myself and my work lately. Defining oneself by one's occupation is the norm, right? If I meet you for the first time and we strike up a conversation, inevitably someone will ask, "So what do you do?" As if what a person does is somehow indicative of who that person is.

I think for people working in Web-related businesses, it's a dangerous thing to define people by their job title. This is the day of the mashup, right? No site is built on a single idea anymore. Google even does more than search now. The web is constantly changing, too, so how we deal with the web, work on the web, has to constantly change. And really, there's no term that accurately describes what I do. Am I a web developer? Yes. Am I a programmer? Yes. Do I do a whole bunch of other stuff that doesn't fit neatly within those two categories? Yes, a whole lot of stuff.

I participate in design and architecture decisions for our sites, am chiefly responsible if the servers go offline or don't serve the traffic properly, have a voice in setting our development/design schedule, and just generally have a hand in most every area of our work at WPNI. As does everyone else I work with. Our editors, Tim and Cara, write, shoot video, take photos, and a whole lot of other stuff. Levi is one part project manager, one part journalist, and several doses of HTML/CSS developer. Jesse does video crunching, Flash, JavaScript, HTML, and CSS. And these are just the quick and easy examples.

Rob is all over the place, too, in terms of what he does. He's really the one we take our cues from and who is largely responsible for this work environment. In fact, it's one of the things I like most about working with Rob. No one on our team is cornered into a certain definition of his or her role. Everyone has a voice in the process and is actually able to work without worrying about artificial boundaries like definitions of "what we do."

I'm probably not explaining this properly, or doing the idea justice, in this quick post, but I think it's key to having a successful and fun time on the web. This is the nature of the web itself. The thing just isn't bound to a single person's notion of what it is. Each blog reflects the personality of its writer. Each site -- or any good site -- is infused with the life of its creators. And people really are more than any single definition can do justice to. My dad is much more than an accountant, my wife more than a massage therapist. And I'm sure who they are is nothing like the personality you just ascribed to them when you read what they do in that last sentence.

So what do I do? A whole bunch of stuff. Oh yeah, and it's all fun.

Link | Posted by deryck on March 23, 2007 | 0 comments

Routines

My four-year-old, Zoe, just went to bed. Every night we watch movies or cartoons until she falls asleep on my shoulder. Then, I carry her up to her bed, tuck her in, and get some work done before I lie down beside her in her princess bed. She calls it this because she has Disney princess sheets and pillow cases.

My wife, Wendy, sleeps with our younger daughter, Waverly. Wavy, as we call her, will be six months old next month. They start to bed around seven, but their routine takes a little longer. Wavy likes to eat her solids, then her formula, and then play awhile before calling it a night. Playing, for her, consists of lying on her back, laughing at her mommy, and trying to catch her toes. She puts up a fight every night, but somewhere around 8:30 or 9:00 she and Wendy fall asleep.

This hasn't always been our routine. Before Wavy was born, we all slept in the same bed in Wendy's and my room. Me, Wendy, Zoe, and our dog Macy. Macy is a 100 lb German Shepherd and Lab mix. Zoe loved sleeping with her mommy, and it was hard on her when Wavy came along. But one night, I'm not sure when, we started watching movies together to buy some time for Wendy to get Wavy to sleep. Now it's our routine. Funny how routines change over time.

For the longest time, I considered myself a musician. I lived to play and actually hacked out a living at it. Then, I wanted to be a writer and scholar, and worked hard to get through an English degree. I taught high school for a little while and somewhere along the way picked up web development and programming. Now I consider myself a programmer, but really it's only some small part of who I am.

I don't make a living playing music anymore, but I still love to play when I can. I never managed to finish my Masters in English, but I still read a lot of serious fiction and poetry. I don't know how long I'll write code, but I'm sure I always will no matter what my next career. Routines are funny that way. One gives way to another, each one leaving its impression, small shifts in behavior that over time inform who we are.

Link | Posted by deryck on March 19, 2007 | 0 comments