WSGI talk code now online

At the September 2009 Detroit Perlmongers / dynamic language meetup I gave a talk on Python and WSGI.

I walked through six different examples showing what WSGI is and some parts of the WSGI web development ecosystem. The six examples were:

  1. the simplest WSGI app, and serving that same app under many different WSGI servers
def application(environ, start_response):
    '''The simplest WSGI app.'''
    start_response('200 OK', [('content-type', 'text/html')])
    yield '<h1>Hello world</h1>'
  2. request and response wrappers, introduction to middleware
  3. a slightly fuller example of middleware
  4. a fleshed out app with templates, URL routing, and some more middleware
http://lost-theory.org/images/wsgi-middleware.png

Great diagram explaining middleware from the Pylons documentation.

  5. form generation and validation
  6. using an existing app with authn / authz middleware
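
To make the first few examples concrete, here is a minimal sketch of a WSGI app wrapped in one piece of middleware. It is written for Python 3, where response bodies must be bytes (the snippet from the talk above is Python 2). The names `hello_app`, `PoweredByMiddleware`, and `call_app` are made up for illustration and are not taken from the talk's repository.

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """The simplest WSGI app: one status line, one header, one body chunk."""
    start_response('200 OK', [('Content-Type', 'text/html')])
    return [b'<h1>Hello world</h1>']


class PoweredByMiddleware:
    """Hypothetical middleware: wraps any WSGI app and adds one response header."""

    def __init__(self, app, value='WSGI'):
        self.app = app
        self.value = value

    def __call__(self, environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            # Append our header, then hand off to the server's start_response.
            headers = list(headers) + [('X-Powered-By', self.value)]
            return start_response(status, headers, exc_info)

        return self.app(environ, patched_start_response)


def call_app(app, path='/'):
    """Call a WSGI app in-process (no server) and collect what it produced."""
    environ = {'PATH_INFO': path}
    setup_testing_defaults(environ)  # fills in the remaining required WSGI keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured['status'] = status
        captured['headers'] = headers

    body = b''.join(app(environ, start_response))
    return captured['status'], captured['headers'], body


# The wrapped app is itself a WSGI app, so middleware can be stacked.
application = PoweredByMiddleware(hello_app)
```

The same `application` object can be handed to any WSGI server, for example `wsgiref.simple_server.make_server('', 8000, application).serve_forever()` from the standard library.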

You can find the code and instructions for running the examples on Bitbucket.

Ltchinese 0.1 release

I've given one of my old projects, ltchinese, an official release on PyPI, the Python Package Index.

I followed this excellent guide to make the package and publish it to PyPI. This is my first real open source Python package.

ltchinese is a small library of tools I built up when creating some of my Chinese language learning pages (Ocrat mirror site, Mandarin phonetics table, etc.). It would be useful for developers who are building tools or web apps that deal with the Chinese language. It also includes a programmatic interface to some of the data on my site.

There is also documentation available (which is hosted by PyPI, cool).

I also got my first bit of feedback that someone was able to use the library for something useful. Thank you Vathanan!

Psychological randomness

If you ask a person to give you a 'random' number from 1 to 20, you are more likely to hear some numbers than others. Experiments suggest that numbers like 13 or 17 sound more random to people than numbers like 10 or 20. For example:

Results revealed that for the entire sample the greatest percentage of tickets chosen for the first four selections were "random" tickets. Further, the most commonly cited reason for selecting and changing a lottery ticket was perceived randomness. -- Underlying cognitions in the selection of lottery tickets

Despite having an equal chance of being picked from a hat, certain combinations of numbers are perceived as more random than others.

Aside: Also see The Secret Lives of Numbers for something related, and awesome. It shows how popular numbers are in search results. Phone numbers, tax forms, zip codes, famous dates, etc. show up more often than other more 'random' numbers and create some interesting patterns.

So, sometimes psychological randomness is more important than true randomness when dealing with perception.

One of the things I dislike about the program I use to play music (Foobar 2000) is that the "Random song" button will sometimes pick the same song twice in a row. For a random number generator, that is a fine result: all numbers should have an equal chance of being picked. However, for the purpose of a music player, we don't want a "real" random number. Most people use a "Random" or "Shuffle" feature to listen to new songs. How often do you want to hear the same song twice in a row? If you want to hear the same song twice, why would you even use the "Random" feature in the first place?

I think "psychological randomness" should be one of the primary goals for any shuffle / random feature.

The easiest thing to do would be to keep a list of recently played tracks (at least ten), and when picking the next song, re-pick if the song is on that list. Another thing you could do is keep a shuffled version of the original list (a random permutation), then pick songs in order from that list.
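
Both ideas fit in a few lines of Python. This is only a sketch: the function names are made up, tracks are assumed to be plain strings, and instead of literally re-picking in a loop the first function filters out the recent tracks before choosing, which gives the same result without retries.

```python
import random
from collections import deque


def pick_avoiding_recent(tracks, recent, rng=random):
    """Pick a random track that is not in `recent`, then record it there.

    `recent` is a deque with a maxlen, so old entries fall off automatically.
    """
    candidates = [t for t in tracks if t not in recent]
    if not candidates:
        # Library smaller than the history window: fall back to any track.
        candidates = list(tracks)
    choice = rng.choice(candidates)
    recent.append(choice)
    return choice


def shuffled_playlist(tracks, rng=random):
    """Yield every track exactly once, in a random order (a random permutation)."""
    order = list(tracks)
    rng.shuffle(order)
    for track in order:
        yield track


if __name__ == '__main__':
    library = ['song%d' % i for i in range(30)]
    history = deque(maxlen=10)  # remember the last ten plays
    for _ in range(5):
        print(pick_avoiding_recent(library, history))
```

One caveat with the permutation approach: if the list is reshuffled once it runs out, the last song of one pass can still be the first song of the next, so it may be worth combining it with a short history.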

But once you start going down this path, it could be hard to stop! Would it be 'random' for two songs from the same artist to play? Same album? The more data you look at about the song (artist, album, title, user rating, genre, lyrics, mood), the smarter your randomization could be. You could build your own recommendation system based on that song metadata, like Pandora.
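
As a first step down that path, here is a sketch of a picker that avoids playing the same artist twice in a row. It assumes each track is a dict with an 'artist' key; that layout and the function name are hypothetical, chosen only for this example.

```python
import random


def pick_different_artist(tracks, last_track=None, rng=random):
    """Pick a random track, preferring one by a different artist than last_track.

    Each track is assumed to be a dict with at least an 'artist' key.
    """
    if last_track is None:
        return rng.choice(tracks)
    candidates = [t for t in tracks if t['artist'] != last_track['artist']]
    if not candidates:
        # Every track is by the same artist: at least avoid the same track.
        candidates = [t for t in tracks if t is not last_track] or list(tracks)
    return rng.choice(candidates)
```

The same filtering idea extends to album, genre, or rating, at the cost of the pick becoming less and less uniform.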

Running zine on Dreamhost PS

I started using Dreamhost PS in July 2009 and after a few hours had a Zine blog running. That was one of the quickest turnarounds I've ever experienced for getting a Python web app up and running and exposed to the web.

It was better than starting from scratch on a new VPS, but there was some weirdness. Dreamhost PS is like a VPS where you can run what you want, but without root. It's like using your friend's server and he doesn't trust you very much. :)

Here are my steps for getting Zine running on Dreamhost PS.

Administrivia

  • First, I cranked the resources down to the minimum in the Dreamhost PS admin panel. They start you off at the highest setting so you can monitor how many resources your server actually needs, but if you're going to have little traffic to start I don't see the point.
  • Next, I created a new shell user on my Dreamhost PS instance.
  • Finally, I made a new subdomain (blog.lost-theory.org) belonging to that user.

virtualenv & pip

Dreamhost PS has a good version of Python (2.4.4) (update for 2011: this qualified as good in 2009) and easy_install, so you can dive right in. I started by setting up my virtualenv:

$ cd ~
$ mkdir -p zine/lib
$ easy_install --install-dir=~/zine/lib virtualenv
$ easy_install --install-dir=~/zine/lib pip
$ cd zine
$ virtualenv .

Install Zine and its packages

You can then start installing Zine and the packages it depends on into that virtualenv.

$ cd ~/zine
$ wget http://zine.pocoo.org/releases/Zine-0.1.2.zip
$ unzip Zine-0.1.2.zip
$ pip -E . install Werkzeug
$ pip -E . install Jinja2
$ pip -E . install MySQL-python
$ pip -E . install SQLAlchemy
$ pip -E . install simplejson
$ pip -E . install pytz
$ pip -E . install Babel
$ pip -E . install html5lib
$ (try to install lxml...)
(over 9000 compilation errors)

Here you will run into a problem, since lxml requires the libxml2 and libxslt packages. On Dreamhost PS we don't have root, so we can't install these packages with apt-get install. I took a peek at how Zine uses lxml and it seemed like I might be able to get away without installing it:

./importers/wordpress.py:15:    from lxml import etree
./importers/feed.py:13:    from lxml import etree
./zxa.py:21:    from lxml import etree

I tried wrapping those lines with try/except like so:

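try:
    from lxml import etree
except ImportError:
    # Only swallow a missing lxml; any other error should still surface.
    # (Python 2 print statement, matching the Python 2.4 on the server.)
    print "Skipping lxml import... will die later"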

That will let you start serving up your Zine instance. I haven't had a problem so far with skipping the lxml import, because I haven't yet used the features that require it (the WordPress and feed importers and zxa.py). It might be possible to use elementtree instead, but it's working fine for now.
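
For reference, the usual shape of an lxml-to-ElementTree fallback looks like the sketch below (written for Python 3). It is not a tested patch for Zine: it only works for code that sticks to the ElementTree-compatible subset of lxml, and I have not checked that those three modules do. `feed_titles` and `HAVE_LXML` are made-up names for the example.

```python
try:
    from lxml import etree  # preferred when libxml2/libxslt are available
    HAVE_LXML = True
except ImportError:
    # Standard-library fallback; covers basic parsing but not lxml-only features.
    import xml.etree.ElementTree as etree
    HAVE_LXML = False


def feed_titles(xml_text):
    """Return the text of every <title> element found in an XML document."""
    root = etree.fromstring(xml_text)
    return [el.text for el in root.iter('title')]
```

Code that needs lxml-only features can check `HAVE_LXML` and fail with a clear message instead of a NameError deep inside an importer.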

Install and quickstart

Install the Zine package:

$ cd ~/zine/Zine-0.1.2/
$ ./configure --prefix=~/zine
$ make install

After that you can create and start an instance:

$ cd ~/zine
$ mkdir instance
$ ./Zine-0.1.2/scripts/server -I instance

This will start the install wizard on port 4000. Go check it out!

Database

You won't get very far without a database though. One important thing to remember is that you do not need to use the Dreamhost PS MySQL service. You can use your existing DH shared hosting MySQL. All I had to do was set up a new user and database on the existing MySQL from my regular shared hosting service.

After that's set up, put the DB URI in the install wizard and you're pretty much done.

At this point you'll have Zine fully functional on port 4000. You can start writing entries and checking out the themes and all that. But we can do better than running on port 4000 with a development server, can't we?

Serve Zine using cherrypy and lighttpd

I want to run Zine on port 80 and serve it with something a little more powerful, so I checked out what the Dreamhost admin panel offered. There are settings for proxying and Mongrel and FCGI, but those don't really apply.

DH gives you two choices for serving on port 80: Apache (the default) or lighttpd. You can run your own long-running processes, but they have to serve through Apache/lighty (using CGI, FCGI, and I think Phusion Passenger and maybe a few other options). You can't run your own server on 80 since you don't have root.

I chose lighty for the smaller footprint and because I find its configuration a bit easier. To proxy all requests at the root of the domain from port 80 to port 4000 you can use the following:

$HTTP["host"] == "blog.lost-theory.org"
{
  proxy.server = (
    "" => (
      "blog" => (
        "host" => "127.0.0.1",
        "port" => 4000
      )
    )
  )
}

The configs are stored per-host here: /usr/local/dh/lighttpd/servers/your-user/

Once that was set up I decided to swap out the Werkzeug development server for CherryPy's WSGI server. You can keep most of the Zine-0.1.2/scripts/server script the same: just pip install cherrypy and switch Werkzeug's server to CherryPyWSGIServer (yay WSGI).

One more important change: Zine will probably still think that the address of the blog is http://example.com:4000/. This will make all the links point to that site, which is ugly. To fix this just drop the port number off of the blog_url setting in zine.ini.

Conclusion

That's all it took! I am pretty happy with how easy it was.

Python web apps are not as easy as something like PHP to get up-and-running, but this process was pretty ideal in my opinion. I hated using a bare bones VPS because you end up spending more time on sysadmin and thinking about security holes than on building and deploying cool apps. Dreamhost PS seems like a good middle ground between bare bones VPS / dedicated servers and shared hosting. It has a lot of good defaults, polished administration (via the panel), decent price, and the ability to scale upwards.

Google App Engine is a similar service, but I haven't tried it yet. I am a little worried that existing Python apps / code aren't portable to GAE.

As far as resources go, the 150MB of memory has been enough so far. I'll keep monitoring, and if I do need more resources I'll probably increase them, since they make it so easy and the price is reasonable. :)

http://lost-theory.org/images/dreamhostps-usage.png

One interesting thing is that after switching to lighttpd+cherrypy (after the spike in the graph) my memory usage went down. I've heard that Dreamhost PS has about a 100MB footprint when idle. After switching to lighty+cherrypy my memory usage is ~50MB when idle.

Hope this was helpful if you're interested in running Python apps on Dreamhost PS. Happy serving to you!

Update: Dreamhost PS now gives you root access by setting up a new account under "Manage Admin Users" in your control panel. This makes things a lot easier and gets rid of all my whinging about not having root.

Update 2012: I moved away from Dreamhost PS and Zine for a static blog (using Blohg). It is funny re-reading this now and seeing that I thought this was "easy". Now I am spoiled by ep.io and dotcloud.

Test post

Welcome to my blog :)

Heading 2

  • List
  • List
  • List
  1. Number
  2. Number
  3. Number
import something

#comment
def f(x):
    '''Just testing...'''
    return x**2

class C(object):
    def __init__(self, name=None):
        if name is None:
            name = "Anonymous"
        self.name = name

'''
>>> f(8)
64
'''

Heading 3

Heading 4
Heading 5
Illustration of a grassy knoll