lost-theory

The blog of Steven Kryskalla.


 

Perl - how to get the keys of a constant hashref

written by stevek, on Jul 12, 2010 10:59 PM.

How do you get the keys of a constant hashref?

$> use constant A => { 1 => 100, 2 => 200, 3 => 300}
()
$> A
{ 1 => 100, 2 => 200, 3 => 300 }

$> keys A
ERROR: Type of arg 1 to keys must be hash (not constant item) at (eval 13) line 3, at EOF
$> keys %A
()
$> (keys %{A})
Ambiguous use of %{A} resolved to %A at (eval 18) line 1, <IN> line 8.
()

All seems lost until you learn that a constant is actually a function. Perl lets you leave off the parentheses when calling a function, but in this case you have to call it explicitly as A() and dereference the result, so Perl knows you mean the function and not a hash named %A.

$> (keys A())
ERROR: Type of arg 1 to keys must be hash (not constant item) at (eval 16) line 3, near "A())
$> (keys %{A()})
(1, 3, 2)

Missing "Pause / Break" Key On Dell Studio Laptops

written by stevek, on Jul 6, 2010 11:29:26 PM.

I have a Dell Studio XPS laptop and the keyboard is missing the all-important (not really, but sometimes it's needed) Pause / Break key.

Here's the key combination to trigger Pause/Break: Ctrl + Fn + F12.

If you use Windows, here's another little twist: Windows + Pause/Break brings up the System Properties on a standard keyboard, but Windows+Ctrl+Fn+F12 doesn't work. Instead, you have to use Windows + Fn + F12.

More info from Wikipedia on the Pause/Break key.

Uses of toothpicks, in order of popularity

written by stevek, on Jun 12, 2010 11:44:28 AM.

  1. Testing if brownies are cooked
  2. Holding a sandwich together
  3. Picking food out of your teeth

Spec runner using withhacks

written by stevek, on Mar 13, 2010 2:16 PM.

Here is an interesting video (and blog post) on Ruby vs. Python by Gary Bernhardt. One of the advantages Gary gives to Ruby is the ability to develop and use tools like rspec or cucumber, which use Ruby's block syntax for really nice looking unit tests, runnable specs, etc.

In the talk, Gary shows that he was able to create a spec runner with syntax similar to rspec by using "really nasty" and "ugly" hacks (sys.settrace, I believe). But even if these techniques are ugly, they are becoming more accessible and easier to work with thanks to tools like withhacks, which abstracts the ugliness away.

I'm sure this does much less than what Gary's mote does, but here is a simple spec runner using withhacks:

from __future__ import with_statement

from withhacks import CaptureOrderedLocals, CaptureBytecode

class specs(CaptureOrderedLocals,CaptureBytecode):
    def __init__(self,what, *args, **kwargs):
        self.__what = what
        self.__args = args
        self.__kwargs = kwargs
        self.results = []
        super(specs,self).__init__()

    def __exit__(self,*args):
        retcode = super(specs,self).__exit__(*args)
        results = self.run_specs(self.locals)
        self.results = results

    def run_specs(self, cases):
        results = []
        num_pass, num_fail = 0,0
        print "Testing %s specs for %s:" % (len(cases), repr(self.__what))
        for (name, func) in cases:
            if not callable(func): continue
            name = name.replace('_', ' ')
            name = name.capitalize() + '.'
            print "->",
            try:
                func()
                error = None
                num_pass += 1
            except BaseException, e:
                error = repr(e)
                num_fail += 1
            if error:
                print "[FAIL]", name
                print "--->", error
                results.append((name, False, error))
            else:
                print "[pass]", name
                results.append((name, True, None))
        print "Result: %s/%s passed, %s/%s failed" % (num_pass, len(cases), num_fail, len(cases))
        print "-"*20
        return results

And here is an example spec:

class MyClass(object):
    def add(self, a, b):
        return a+b

with specs(MyClass):
    def it_adds_two_and_two():
        c = MyClass()
        assert c.add(2,2) == 4

    def it_adds_negatives():
        c = MyClass()
        assert c.add(10,-10) == 0

    def it_fails_adding_int_and_string():
        c = MyClass()
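        try:
            c.add(10, 'foo')
        except TypeError:
            pass #correct!
        else:
            # without this, the spec would pass even if no error was raised
            raise AssertionError("expected a TypeError")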

    def testing_what_a_spec_failure_looks_like():
        c = MyClass()
        c.thisdoesntexist()

And here is the output:

Testing 4 specs for <class '__main__.MyClass'>:
-> [pass] It adds two and two.
-> [pass] It adds negatives.
-> [pass] It fails adding int and string.
-> [FAIL] Testing what a spec failure looks like.
---> AttributeError("'MyClass' object has no attribute 'thisdoesntexist'",)
Result: 3/4 passed, 1/4 failed
--------------------

I put the code on Bitbucket here.

Decorator for preventing recursion

written by stevek, on Mar 9, 2010 7:53 AM.

Here's a decorator that will prevent a recursive function from calling itself:

def norecursion(default=None):
    '''Prevents recursion into the wrapped function.'''
    def entangle(f):
        def inner(*args, **kwds):
            if not hasattr(f, 'callcount'):
                f.callcount = 0
            if f.callcount >= 1:
                print "recursion detected %s calls deep. exiting." % f.callcount
                return default
            else:
                f.callcount += 1
                try:
                    return f(*args, **kwds)
                finally:
                    # always undo the increment, even if f raises
                    f.callcount -= 1
        return inner
    return entangle

It's based on this recipe. The function in that recipe relies on keeping track of which arguments were passed into the function, which means that it cannot work on a function that takes no arguments. The decorator above instead attaches a counter attribute to the wrapped function that tracks how many calls are currently in progress, and it returns the default value as soon as a nested call is attempted.

Here's how you use it:

@norecursion(default=1)
def fact(x):
    if x <= 1:
        return 1
    else:
        return x*fact(x-1)

Now when you call fact it won't make the recursive call. Instead, the nested call returns the default value of 1:

>>> fact(0)
1
>>> fact(1)
1
>>> fact(2)
recursion detected 1 calls deep. exiting.
2
>>> fact(3)
recursion detected 1 calls deep. exiting.
3

Why I needed this: I have a function in a Jinja2 template which builds a list of all pages and their metadata (a bunch of variables defined at the top of the template). Let's say I use the function on index.html. When it iterates over all the pages, it comes to index.html and then tries to get the list of all pages again. This causes infinite recursion. On the second call deep, I don't need the whole page list, I only need the template metadata, so I can safely wrap the function in @norecursion(default=[]) to prevent it from running subsequent times.

Note that this is probably not threadsafe. I think there would need to be a lock around the increment-call-decrement part.
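Here's a minimal sketch of one way to make the guard safe across threads. Note that it swaps the lock idea for something different: it keeps the "call in progress" flag in threading.local(), so each thread gets its own guard instead of all threads sharing one counter. That changes the semantics slightly (two threads can both be inside the function at once), which may or may not be what you want. It is written for Python 3, unlike the Python 2 listings above, and the silent return (no print) is my choice.

```python
import functools
import threading


def norecursion(default=None):
    """Return `default` instead of re-entering the wrapped function
    on the same thread."""
    def entangle(f):
        state = threading.local()  # each thread sees its own `active` flag

        @functools.wraps(f)
        def inner(*args, **kwds):
            if getattr(state, 'active', False):
                # a call is already in progress on this thread
                return default
            state.active = True
            try:
                return f(*args, **kwds)
            finally:
                state.active = False  # reset even if f raises
        return inner
    return entangle


@norecursion(default=1)
def fact(x):
    return 1 if x <= 1 else x * fact(x - 1)
```

As with the original decorator, fact(3) makes one real call and the nested call returns the default.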

Writing Mercurial plugins

written by stevek, on Mar 1, 2010 10:44 PM.

Getting my feet wet with writing some Mercurial plugins... My first impression is that the API is very low-level, but I guess that makes sense since HG (and its plugins) have to be low-level to perform well.

#!/usr/bin/env python

from mercurial import hg
from binascii import hexlify
from mercurial import util

def interact(ui, repo, **opts):
    """poke around the mercurial API for this repo in a python interpreter"""
    print "Locals are:", dir()
    import code; code.interact(local=locals())

def short_incoming(ui, repo, **opts):
    """Shows a shortened form of 'hg incoming'"""
    default = hg.repository(ui, ui.expandpath('default'))
    inc = repo.findincoming(default)
    nodes = default.changelog.nodesbetween(inc, None)[0]
    for node in nodes:
        cs = default.changelog.read(node)
        print hexlify(cs[0])[:6], '|', cs[1], '|', util.datestr(cs[2]), \
              '|', len(cs[3]), 'files', '|', cs[5], '|', cs[4]

cmdtable = {
    "interact": (
        interact,
        [],
        interact.__doc__
    ),
    "short": (
        short_incoming,
        [],
        short_incoming.__doc__
    ),
}

Part of me just wants to scrape the text of the different subcommands.

WSGI talk code now online

written by stevek, on Dec 20, 2009 6:25 PM.

At the September 2009 Detroit Perlmongers / dynamic language meetup I gave a talk on Python and WSGI.

I walked through six different examples showing what WSGI is and some parts of the WSGI web development ecosystem. The six examples were:

  1. the simplest WSGI app, serving the same app under different servers

def application(environ, start_response):
    '''The simplest WSGI app.'''
    start_response('200 OK', [('content-type', 'text/html')])
    return ['<h1>Hello world</h1>']

  2. request and response wrapping, introduction to middleware
  3. an example of middleware
  4. a fleshed out app with templates, URL routing, and some more middleware

Great diagram explaining middleware from the Pylons documentation.

  5. form generation and validation
  6. using an existing app with authentication / authorization middleware
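To give a flavor of the middleware examples, here's a small sketch (not the code from the talk) of a middleware that wraps the hello-world app and appends a header to every response. The AddHeaderMiddleware class, the call_app helper, and the X-Example header are made up for illustration. It is written for Python 3, so the body is bytes rather than str.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    '''The simplest WSGI app (Python 3 flavor: the body is bytes).'''
    start_response('200 OK', [('Content-Type', 'text/html')])
    return [b'<h1>Hello world</h1>']


class AddHeaderMiddleware(object):
    '''Hypothetical middleware: wraps a WSGI app and appends one header.'''
    def __init__(self, app, name, value):
        self.app = app
        self.header = (name, value)

    def __call__(self, environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            # pass everything through, plus our extra header
            return start_response(status, list(headers) + [self.header],
                                  exc_info)
        return self.app(environ, patched_start_response)


def call_app(app, path='/'):
    '''Call a WSGI app directly, no server needed.
    Returns (status, headers, body).'''
    environ = {}
    setup_testing_defaults(environ)
    environ['PATH_INFO'] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured['status'] = status
        captured['headers'] = headers

    body = b''.join(app(environ, start_response))
    return captured['status'], captured['headers'], body
```

Since middleware is itself just a WSGI callable, you can stack several around the same app, and call_app(AddHeaderMiddleware(application, 'X-Example', 'demo')) exercises the whole stack without starting a server.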

You can find the code and instructions for running the examples on Bitbucket.

ltchinese 0.1 released

written by stevek, on Nov 13, 2009 12:23 AM.

I've given one of my old projects, ltchinese, an official release on PyPI, the Python Package Index.

It took a bit of figuring out (following this excellent guide), but it looks like it worked. This is my first real Python package and first open source release.

ltchinese is a small library of tools I built up when creating some of my Chinese language learning pages (Ocrat mirror, Mandarin phonetics table, etc.). It would be useful for developers building tools or web sites that deal with the Chinese language. It also includes a programmatic interface to some of the data on my site.

There is also documentation available (now hosted by PyPI, which is very cool).

I also got my first bit of feedback that someone was able to use the library for something useful. Thank you Vathanan!

Psychological Randomness

written by stevek, on Aug 28, 2009 10:04 AM.

If you ask a person to give you a 'random' number from 1 to 20, you are more likely to hear some numbers than others. There have been experiments suggesting that numbers like 13 or 17 sound more random than numbers like 10 or 20. For example:

Results revealed that for the entire sample the greatest percentage of tickets chosen for the first four selections were "random" tickets. Further, the most commonly cited reason for selecting and changing a lottery ticket was perceived randomness. -- Underlying cognitions in the selection of lottery tickets

Despite having an equal chance of being picked from a hat, certain combinations of numbers are perceived as more random than others.

Note: Also see The Secret Lives of Numbers for something related, and awesome. It's a Java applet showing how popular numbers are in search results. Phone numbers, tax forms, zip codes, famous dates, etc. show up more often than other more 'random' numbers and create some interesting patterns.

So, sometimes psychological randomness is more important than true randomness when dealing with perception.

One of the things I dislike about the program I use to play music (Foobar 2000) is that the "Random song" feature will pick the same song twice in a row. For a random number generator, that is a fine result: all numbers should have an equal chance of being picked. However, for the purpose of a music player, this is not a psychologically random result. Most people use a "Random" or "Shuffle" feature to listen to a new, random arrangement of their songs (like a radio station). How often do you want to hear the same song twice in a row? If you want to hear the same song twice, why would you even use the "Random" feature in the first place?

I think "psychological randomness" should be one of the primary goals for any shuffle / random feature.

The easiest thing to do would be to keep a list of recently played tracks (at least ten), and when picking the next song, re-pick if the song is on that list. Another thing you could do is keep a shuffled version of the original list (a random permutation), then pick songs in order from that list.
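Both ideas are easy to sketch in Python. This is just an illustration with made-up function names, not how any real player does it. Two liberties: the first function filters out recently played tracks up front instead of literally re-picking in a loop (same result, no retries), and the second reshuffles once the list runs out, which is my assumption about what should happen next.

```python
import random
from collections import deque


def pick_avoiding_recent(tracks, recent, rng=random):
    """Pick a random track that is not in the recently played window.

    `recent` is a deque(maxlen=N) holding the last N picks.
    """
    candidates = [t for t in tracks if t not in recent]
    if not candidates:
        # library is smaller than the window: fall back to any track
        candidates = list(tracks)
    choice = rng.choice(candidates)
    recent.append(choice)  # the deque drops the oldest entry by itself
    return choice


def shuffled_playlist(tracks, rng=random):
    """Yield every track once per pass, reshuffling before each pass."""
    while True:
        order = list(tracks)
        rng.shuffle(order)
        for track in order:
            yield track
```

You'd use the first one with something like recent = deque(maxlen=10). One caveat with the permutation approach: the last song of one pass can still be the first song of the next, so you might want to combine the two.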

But, once you start going down this path, it could be hard to stop :) Would it be 'random' for two songs from the same artist to play? Same album? The more data you look at about the song (artist, album, title, user rating, genre, lyrics, mood), the smarter your randomization could be. You could build your own recommendation system based on that song metadata. Pandora does a good job of that as an example. They even call themselves "a new kind of radio station."

Running Zine on DreamHost PS

written by stevek, on Jul 4, 2009 1:46 PM.

I started using DreamHost PS yesterday and after a few hours had a Zine blog running. That's the quickest turnaround I've ever experienced for getting a Python web app up and running and exposed to the web.

It was better than starting from scratch on a new VPS, but there was some weirdness. DreamHost PS is like a VPS where you can run what you want, but without root. It's like using your friend's server and he doesn't trust you very much. :)

Here are my steps for getting Zine running on DreamHost PS.

Administrivia

  • First, I cranked the resources down to the minimum in the DreamHost PS admin panel. They start you off at the highest setting so you can monitor how many resources your server actually needs, but if you're going to have little traffic to start I don't see the point.
  • Next, I created a new shell user on my DreamHost PS instance.
  • Finally, I made a new subdomain (blog.lost-theory.org) belonging to that user.

virtualenv & pip

DreamHost is nice enough to give you a good version of Python (2.4.4) and easy_install, so you can dive right in. I started by setting up my virtualenv:

$ cd ~
$ mkdir -p zine/lib
$ easy_install --install-dir=~/zine/lib virtualenv
$ easy_install --install-dir=~/zine/lib pip
$ cd zine
$ virtualenv .

Install Zine and its packages

You can then start installing Zine and the packages it depends on into that virtualenv.

$ cd ~/zine
$ wget http://zine.pocoo.org/releases/Zine-0.1.2.zip
$ unzip Zine-0.1.2.zip
$ pip -E . install Werkzeug
$ pip -E . install Jinja2
$ pip -E . install MySQL-python
$ pip -E . install SQLAlchemy
$ pip -E . install simplejson
$ pip -E . install pytz
$ pip -E . install Babel
$ pip -E . install html5lib
$ (try to install lxml...)
(over 9000 compilation errors)

Now here you will run into a problem, since lxml requires the libxml2 and libxslt libraries. On DreamHost PS we don't have root, so we can't install these packages with apt-get install. I took a peek at how Zine uses lxml and it seemed like I might be able to get away without installing it:

./importers/wordpress.py:15:    from lxml import etree
./importers/feed.py:13:    from lxml import etree
./zxa.py:21:    from lxml import etree

I tried wrapping those lines with try/except like so:

try:
    from lxml import etree
except ImportError:
    print "Skipping lxml import... will die later"

And sure enough, that works when you start serving up your Zine instance. I haven't had a problem so far with skipping the lxml import (because I haven't yet used the features that require it). It might be possible to use elementtree instead, but it's working fine for now.
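For what it's worth, the elementtree idea could look something like the sketch below. I haven't checked whether Zine's importers only use the parts of the API that lxml and ElementTree share, so treat the fallback as an assumption. The feed_titles helper is made up just to show that basic parsing works either way. Also note xml.etree only ships with Python 2.5 and later, so on 2.4 you would need the separate elementtree package.

```python
try:
    # preferred when the libxml2/libxslt libraries are available
    from lxml import etree
    HAVE_LXML = True
except ImportError:
    # standard library fallback with a similar (but smaller) API
    import xml.etree.ElementTree as etree
    HAVE_LXML = False


def feed_titles(xml_text):
    """Hypothetical helper: return the text of every <title> element."""
    root = etree.fromstring(xml_text)
    return [element.text for element in root.findall('.//title')]
```

Anything lxml-specific (XSLT, full XPath) would still need a guard on HAVE_LXML.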

Install and quickstart

Install the Zine package:

$ cd ~/zine/Zine-0.1.2/
$ ./configure --prefix=~/zine
$ make install

After that you can create and start an instance:

$ cd ~/zine
$ mkdir instance
$ ./Zine-0.1.2/scripts/server -I instance

This will start the install wizard on port 4000. Go check it out!

Database

You won't get very far without a database though. One important thing to remember is that you do not need to use the DreamHost PS MySQL service. You can use your existing DH MySQL. All I had to do was set up a new user and database on the existing MySQL from my regular shared hosting service.

After that's set up, put the DB URI in the install wizard and you're pretty much done.
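If you're not sure what the URI should look like, here's a tiny sketch that builds one, assuming the wizard takes a SQLAlchemy-style URI (dialect://user:password@host/database). The user, password, host, and database names below are placeholders, and the mysql_uri helper is made up for illustration. It is written for Python 3.

```python
from urllib.parse import quote_plus


def mysql_uri(user, password, host, database):
    """Build a SQLAlchemy-style MySQL URI.

    The user and password are URL-quoted so characters like '@' or '/'
    in a password don't break the URI.
    """
    return 'mysql://%s:%s@%s/%s' % (
        quote_plus(user), quote_plus(password), host, database)


# placeholder values, not real credentials
example = mysql_uri('zine_user', 'p@ss/word', 'mysql.example.com', 'zine_blog')
```

The quoting step matters mostly for generated passwords, which tend to contain punctuation.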

At this point you'll have Zine fully functional on port 4000. You can start writing entries and checking out the themes and all that. But we can do better than running on port 4000 with a development server, can't we?

Serve Zine using cherrypy and lighttpd

I want to run Zine on port 80 and serve it with something a little more powerful, so I checked out what the DreamHost admin panel offered. There are settings for proxying and Mongrel and FCGI, but those don't really apply. I checked out Private Servers > Configure Server and saw that you can switch between Apache serving on port 80 or port 81. I switched it over to 81 thinking I could run whatever server I want on port 80.

Unfortunately, you'll get a socket.error: (13, 'Permission denied') error when firing up your own server on port 80. I first thought it was because Apache was still bound to 80, but it's actually because unprivileged users can't bind to ports under 1024. Not having root bites us again. :(
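You can check this from Python without starting a whole server. Here's a small sketch (the can_bind helper is made up for illustration) that just tries to bind a socket and reports whether it worked. On a typical Linux setup an unprivileged user should get False for port 80, but the exact cutoff is a system setting, so don't treat 1024 as guaranteed. It is written for Python 3.

```python
import errno
import socket


def can_bind(port, host='127.0.0.1'):
    """Return True if this process can bind a TCP socket to (host, port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        return True
    except socket.error as exc:
        # EACCES: not allowed (privileged port); EADDRINUSE: already taken
        if exc.errno in (errno.EACCES, errno.EADDRINUSE):
            return False
        raise
    finally:
        sock.close()
```

Comparing can_bind(80) with can_bind(4000) tells you quickly whether the problem is permissions or just a busy port. Passing port 0 asks the OS for any free port.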

DH gives you two choices for serving on port 80: Apache (the default) or lighttpd. You can run your own long-running processes, but they have to serve through Apache/lighty (using CGI, FCGI, and I think Phusion Passenger and maybe a few other options). You can't run your own server on 80.

I chose lighty for the smaller footprint and because I find its configuration a bit easier. To proxy all requests at the root of the domain from port 80 to port 4000 you can use the following:

$HTTP["host"] == "blog.lost-theory.org"
{
  proxy.server = (
    "" => (
      "blog" => (
        "host" => "127.0.0.1",
        "port" => 4000
      )
    )
  )
}

The configs are stored per-host here: /usr/local/dh/lighttpd/servers/your-user/

We can't mess with the main lighty config file, again, because we don't have root. You only have write access to the individual config files for each host.

Once that was set up I decided to swap out the Werkzeug development server for cherrypy's WSGI server. You can keep most of the Zine-0.1.2/scripts/server script the same, just pip install cherrypy and make this change:

#change this...
run_simple(options.hostname, options.port, app, threaded=options.threaded,
           use_reloader=options.reloader, use_debugger=options.debugger)

#to this:
from cherrypy import wsgiserver
import threading
s = wsgiserver.CherryPyWSGIServer((options.hostname, options.port), app)
threading.Thread(target=s.start).start()

Note: This is one of the reasons I <3 WSGI.
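The swap is that easy because a WSGI server only needs the app callable. As a sketch of the same idea using only the standard library (so not cherrypy, and not Zine's actual app), here is a tiny app served by wsgiref on a free port in a background thread, then fetched once over HTTP. The names app, QuietHandler, and fetch_once are made up for illustration, and it is written for Python 3.

```python
import threading
from http.client import HTTPConnection
from wsgiref.simple_server import WSGIRequestHandler, make_server


def app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'same app, different server']


class QuietHandler(WSGIRequestHandler):
    def log_message(self, format, *args):
        pass  # keep the demo from writing access logs to stderr


def fetch_once(wsgi_app, path='/'):
    """Serve `wsgi_app` for a single request and return the response body."""
    # port 0 lets the OS pick a free port
    server = make_server('127.0.0.1', 0, wsgi_app, handler_class=QuietHandler)
    server.timeout = 5  # give up if no request arrives
    port = server.server_address[1]
    worker = threading.Thread(target=server.handle_request)
    worker.start()
    try:
        conn = HTTPConnection('127.0.0.1', port, timeout=5)
        conn.request('GET', path)
        body = conn.getresponse().read()
        conn.close()
    finally:
        worker.join()
        server.server_close()
    return body
```

Replacing make_server with another WSGI server's constructor is the only part that would change; the app function stays exactly the same.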

One more important change: Zine will probably still think that the address of the blog is http://example.com:4000/. This will make all the links point to that site, which is ugly. To fix this just drop the port number off of the blog_url setting in zine.ini.

Conclusion

That's all it took! I am pretty happy with how easy it was.

Python web apps are not as easy as something like PHP to get up-and-running, but this process was pretty ideal in my opinion. I hated using a bare bones VPS because you end up spending more time on sysadmin and thinking about security holes than on building and deploying cool apps. DreamHost PS seems like a good middle ground between bare bones VPS / dedicated servers and shared hosting. It has a lot of good defaults, polished administration (via the panel), decent price, and the ability to scale upwards.

Google App Engine is a similar service, but I haven't tried it yet. I am a little worried that existing Python apps / code aren't portable to GAE.

As far as resources go, the 150MB of memory has been smooth so far. I'll keep monitoring, and if I do need more I'll probably bump it up, since they make it so easy and the price is reasonable. :)

My DreamHost PS memory/cpu usage

One interesting thing is that after switching to lighttpd+cherrypy (after the spike in the graph) my memory usage went down. I've heard that DreamHost PS has about a 100MB footprint when idle. After switching to lighty+cherrypy my memory usage is ~50MB when idle.

Hope this was helpful if you're interested in running Python apps on DreamHost PS. Happy zine-ing to you!

Update: DreamHost PS now gives you root access if you set up a new account under "Manage Admin Users" in your control panel. This should make things much easier!