Hosting a naked (no-www) domain on dotCloud, GAE, etc.

Some platform-as-a-service providers like Google App Engine, dotCloud, etc. don't allow you to host a naked domain (example.com), and only serve requests from the www subdomain (www.example.com).

However, there is a neat hosted service called wwwizer which will redirect your no-www domain to your www domain, for free.

Here are my steps for hosting a naked domain with wwwizer on dotCloud. I also threw Cloudflare into the mix because they have a nice interface for editing DNS records and some other cool features (CDN, DoS protection, etc.). Using Cloudflare is not required for wwwizer to work.

  1. Add the domain to Cloudflare
  2. Cloudflare will copy your existing DNS records
  3. Cloudflare will provide you with two new name servers (x.ns.cloudflare.com and y.ns.cloudflare.com). Delete all your existing NS records and replace them with the two Cloudflare name servers.
  4. Wait until the name servers switch over. You will get a message in the Cloudflare admin area when this happens. You can also check an HTTP response from your server: the "Server" HTTP header should say "cloudflare".

To host the site on dotCloud you then make these two changes from within Cloudflare:

  1. Set the A record for example.com (the naked domain) to 174.129.25.170 (wwwizer.com)
  2. Set the A record for www to the IP address of gateway.dotcloud.com

Now your naked domain example.com will redirect to www.example.com, which will be hosted by dotCloud. The second change (pointing the A record for www at gateway.dotcloud.com) is the only part that is specific to dotCloud. Just change it to whatever your PaaS provider tells you to use, and you should be all set.
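Once the DNS changes have propagated, the redirect can be checked from a script. Here is a minimal Python 3 sketch, assuming the naked domain answers plain HTTP with a redirect status and a Location header. The helper names fetch_redirect and redirects_to_www are made up for this example, and the exact status code the redirect service returns is not assumed.

```python
import http.client
from urllib.parse import urlsplit


def fetch_redirect(host, timeout=10):
    """Send one HEAD request to http://host/ and return
    (status, Location header, Server header)."""
    conn = http.client.HTTPConnection(host, timeout=timeout)
    try:
        conn.request("HEAD", "/")
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location"), resp.getheader("Server")
    finally:
        conn.close()


def redirects_to_www(status, location, domain):
    """True if the response is an HTTP redirect pointing at www.<domain>."""
    if status not in (301, 302, 307, 308) or not location:
        return False
    # hostname is None for relative Location values, which counts as a failure
    return urlsplit(location).hostname == "www." + domain
```

To use it, call fetch_redirect("example.com") and pass the returned status and Location value to redirects_to_www along with "example.com". The third returned value is the "Server" header mentioned in the Cloudflare steps.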

How to parse the output of git log

Here is how to get the output of "git log" in an easy-to-parse format and build a Python dict from the result. You could then convert the dict to JSON, XML, HTML, etc.

First, look at the git-log man page and find the section on "Pretty Formats." It lists printf-style placeholder codes for the commit metadata (e.g. %an for author name).

Store these codes, along with the corresponding field names in two lists:

GIT_COMMIT_FIELDS = ['id', 'author_name', 'author_email', 'date', 'message']
GIT_LOG_FORMAT = ['%H', '%an', '%ae', '%ad', '%s']

Then, join the format fields together with "\x1f" (the ASCII unit separator) and delimit the records with "\x1e" (the ASCII record separator). In the format string these are written as %x1f and %x1e, which git expands to the raw bytes. These characters are not likely to appear in your commit data, so they are pretty safe to use for parsing.

GIT_LOG_FORMAT = '%x1f'.join(GIT_LOG_FORMAT) + '%x1e'

Then run git log --format="..." with your format string, split the fields, and make a dict from them:

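from subprocess import Popen, PIPE

# Python 2: communicate() returns the log as a plain byte string
p = Popen('git log --format="%s"' % GIT_LOG_FORMAT, shell=True, stdout=PIPE)
(log, _) = p.communicate()
# drop the trailing separator, then split into records and fields
log = log.strip('\n\x1e').split("\x1e")
log = [row.strip().split("\x1f") for row in log]
log = [dict(zip(GIT_COMMIT_FIELDS, row)) for row in log]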

Output:

$ python commits.py
[{'author_email': 'skryskalla@gmail.com',
  'author_name': 'stevek',
  'date': 'Sat Feb 18 12:58:00 2012 -0800',
  'id': 'f1dc488e092e5e725c2ec3b7afc3962f0ba707d3',
  'message': 'third commit'},
 {'author_email': 'skryskalla@gmail.com',
  'author_name': 'stevek',
  'date': 'Sat Feb 18 12:57:54 2012 -0800',
  'id': '1bf26e9aa0cb8c9b95b579695c6af349319a88ab',
  'message': 'second commit'},
 {'author_email': 'skryskalla@gmail.com',
  'author_name': 'stevek',
  'date': 'Sat Feb 18 12:57:47 2012 -0800',
  'id': '9c2db5dffa7c70358ab78b6092539ce26006775b',
  'message': 'this is the first commit'}]

Full working example.
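For reference, here is a sketch of the same approach on Python 3. It is a port, not the original script: the names parse_git_log and read_git_log are invented for this example, and git is called with an argument list instead of a shell string. One difference matters: on Python 3 text strings, a bare strip() treats "\x1f" as whitespace and would eat an empty trailing field, so the sketch strips only newlines.

```python
import json
import subprocess

GIT_COMMIT_FIELDS = ["id", "author_name", "author_email", "date", "message"]
GIT_LOG_FORMAT = "%x1f".join(["%H", "%an", "%ae", "%ad", "%s"]) + "%x1e"


def parse_git_log(raw):
    """Turn the raw separator-delimited log text into a list of dicts."""
    records = raw.strip("\n\x1e").split("\x1e")
    # strip("\n") only: a bare strip() would also remove "\x1f" on Python 3
    rows = [r.strip("\n").split("\x1f") for r in records if r.strip("\n")]
    return [dict(zip(GIT_COMMIT_FIELDS, row)) for row in rows]


def read_git_log(repo_path="."):
    """Run git log in repo_path (assumes git is on PATH) and parse it."""
    result = subprocess.run(
        ["git", "log", "--format=" + GIT_LOG_FORMAT],
        cwd=repo_path, check=True, capture_output=True, text=True,
    )
    return parse_git_log(result.stdout)


def to_json(commits):
    """Serialize the parsed commits, e.g. for an API response."""
    return json.dumps(commits, indent=2)
```

Calling to_json(read_git_log()) inside a repository should print the same fields as the output above, as JSON.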

New blog

I decided to migrate my old Zine blog to blohg. Blohg is a static site generator that uses reStructuredText, Mercurial, Flask, and Frozen-Flask for the backend. This should be easier to keep up to date.

I am also shutting down my DreamHost PS. The service was alright, but recently it kept getting restarted for going over its quota, even though it was very under-utilized. Also, I have been using ep.io and dotCloud for all my new Python web app experiments.

The new blog is also using a design made by my buddy Joe. Thank you Joe!

American Dreamers series

Interesting series on eccentric folks, the first few on artists are very good:

Perl - how to get the keys of a constant hashref

How do you get the keys of a constant hashref in Perl?

$> use constant A => { 1 => 100, 2 => 200, 3 => 300}
()
$> A
{ 1 => 100, 2 => 200, 3 => 300 }

$> keys A
ERROR: Type of arg 1 to keys must be hash (not constant item) at (eval
13) line 3, at EOF
$> keys %A
()
$> (keys %{A})
Ambiguous use of %{A} resolved to %A at (eval 18) line 1, <IN> line 8.
()

To get the keys of a constant hashref you first need to learn that a constant is actually a function. You can usually leave off the parentheses when calling a function in Perl, but inside %{...} a bare A is read as the name of the hash %A. You have to call the function explicitly, with parentheses, to make Perl happy.

$> (keys %{A()})
(1, 3, 2)

Missing Pause/Break key on Dell Studio laptops

I have a Dell Studio XPS laptop and the keyboard is missing the all-important (not really, but sometimes it's needed) Pause / Break key.

Here's the key combination to trigger Pause/Break: Ctrl + Fn + F12.

If you use Windows, here's another little twist: Windows + Pause/Break brings up the System Properties on a standard keyboard, but Windows+Ctrl+Fn+F12 doesn't work. Instead, you have to use Windows + Fn + F12.

Also, here's the history of the Pause/Break key. Fascinating...!

Uses of toothpicks, in order of popularity

  1. Testing if brownies are cooked
  2. Holding a sandwich together
  3. Picking food out of your teeth

Spec runner using withhacks

Here is an interesting video (and blog post) on Ruby vs. Python by Gary Bernhardt. One of the advantages Gary gives to Ruby is the ability to develop and use tools like rspec or cucumber, which use Ruby's block syntax for really nice looking unit tests, runnable specs, etc.

In the talk Gary shows that he was able to create a spec runner with syntax similar to rspec by using "really nasty" and "ugly" hacks (sys.settrace I believe). But even if these techniques are ugly, they are becoming more accessible and easier to work with thanks to tools like withhacks, which abstracts the ugliness away.

I'm sure this does much less than what Gary's mote does, but here is a simple spec runner using withhacks:

from __future__ import with_statement

from withhacks import CaptureOrderedLocals, CaptureBytecode

class specs(CaptureOrderedLocals,CaptureBytecode):
    def __init__(self,what, *args, **kwargs):
        self.__what = what
        self.__args = args
        self.__kwargs = kwargs
        self.results = []
        super(specs,self).__init__()

    def __exit__(self,*args):
        retcode = super(specs,self).__exit__(*args)
        results = self.run_specs(self.locals)
        self.results = results

    def run_specs(self, cases):
        results = []
        num_pass, num_fail = 0,0
        print "Testing %s specs for %s:" % (len(cases), repr(self.__what))
        for (name, func) in cases:
            if not callable(func): continue
            name = name.replace('_', ' ')
            name = name.capitalize() + '.'
            print "->",
            try:
                func()
                error = None
                num_pass += 1
            except BaseException, e:
                error = repr(e)
                num_fail += 1
            if error:
                print "[FAIL]", name
                print "--->", error
                results.append((name, False, error))
            else:
                print "[pass]", name
                results.append((name, True, None))
        print "Result: %s/%s passed, %s/%s failed" % (num_pass, len(cases), num_fail, len(cases))
        print "-"*20
        return results

And here is an example spec:

class MyClass(object):
    def add(self, a, b):
        return a+b

with specs(MyClass):
    def it_adds_two_and_two():
        c = MyClass()
        assert c.add(2,2) == 4

    def it_adds_negatives():
        c = MyClass()
        assert c.add(10,-10) == 0

    def it_fails_adding_int_and_string():
        c = MyClass()
        try:
            c.add(10, 'foo')
        except TypeError:
            pass #correct!

    def testing_what_a_spec_failure_looks_like():
        c = MyClass()
        c.thisdoesntexist()

And here is the output:

Testing 4 specs for <class '__main__.MyClass'>:
-> [pass] It adds two and two.
-> [pass] It adds negatives.
-> [pass] It fails adding int and string.
-> [FAIL] Testing what a spec failure looks like.
---> AttributeError("'MyClass' object has no attribute 'thisdoesntexist'",)
Result: 3/4 passed, 1/4 failed
--------------------

I put the code on bitbucket here.
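For comparison, here is a rough Python 3 sketch of the same reporting loop that does not depend on withhacks. This is a swapped-in technique, not the code above: the specs are collected from a plain class body instead of a with block, relying on class attributes keeping their definition order. The names run_specs and my_class_specs are made up for this example.

```python
def run_specs(what, container):
    """Run every function defined on `container` and report pass/fail."""
    cases = [(name, func) for name, func in vars(container).items()
             if callable(func) and not name.startswith("__")]
    results = []
    print("Testing %s specs for %r:" % (len(cases), what))
    for name, func in cases:
        label = name.replace("_", " ").capitalize() + "."
        try:
            func()
            error = None
        except Exception as e:
            error = repr(e)
        print("->", "[FAIL]" if error else "[pass]", label)
        if error:
            print("--->", error)
        results.append((label, error is None, error))
    return results


class MyClass:
    def add(self, a, b):
        return a + b


class my_class_specs:
    # Spec functions take no arguments; they are never called as methods.
    def it_adds_two_and_two():
        assert MyClass().add(2, 2) == 4

    def testing_what_a_spec_failure_looks_like():
        MyClass().thisdoesntexist()


results = run_specs(MyClass, my_class_specs)
```

The trade-off is that the specs live in a throwaway class rather than an indented with block, so it reads a little less like rspec, but nothing has to inspect bytecode or frames.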

Decorator for preventing recursion

Here's a decorator that will prevent a recursive function from calling itself:

def norecursion(default=None):
    '''Prevents recursion into the wrapped function.'''
    def entangle(f):
        def inner(*args, **kwds):
            if not hasattr(f, 'callcount'):
                f.callcount = 0
            if f.callcount >= 1:
                print "recursion detected %s calls deep. exiting." % f.callcount
                return default
            else:
                f.callcount += 1
                try:
                    return f(*args, **kwds)
                finally:
                    # always reset, even if f raises, so later calls still work
                    f.callcount -= 1
        return inner
    return entangle

It's based on this recipe. The function in that recipe relies on keeping track of which arguments were passed into the function, which means that it could not work on a function without any arguments. The decorator above works by attaching a counter attribute to the wrapped function. The counter tracks how many calls are currently in progress, and any call made while another one is still running returns the default value instead.

Here's how you use it:

@norecursion(default=1)
def fact(x):
  if x <= 1:
    return 1
  else:
    return x*fact(x-1)

Now when you call fact it won't make the recursive call. Instead, the nested call returns the default value of 1:

>>> fact(0)
1
>>> fact(1)
1
>>> fact(2)
recursion detected 1 calls deep. exiting.
2
>>> fact(3)
recursion detected 1 calls deep. exiting.
3

Why I needed this: I have a function on a Jinja2 template which builds a list of all pages and their metadata (a bunch of variables defined at the top of the template). Let's say I use the function on index.html. When it iterates over all the pages, it comes to index.html and then tries to get the list of all pages again, which causes infinite recursion. In the nested call I don't need the whole page list, only the template metadata, so I can safely wrap the function in @norecursion(default=[]) to prevent it from running again while it is already in progress.
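Here is a small Python 3 sketch of that situation, using a function with no arguments. It is a simplified stand-in, not the real template code: PAGES, render_metadata and list_pages are hypothetical names. This variant of the decorator also keeps its state in a closure flag rather than in an attribute on the function.

```python
import functools


def norecursion(default=None):
    """Return `default` instead of re-entering the wrapped function."""
    def entangle(f):
        active = False  # True while a call to f is in progress

        @functools.wraps(f)
        def inner(*args, **kwds):
            nonlocal active
            if active:
                return default
            active = True
            try:
                return f(*args, **kwds)
            finally:
                active = False  # reset even if f raises
        return inner
    return entangle


PAGES = ["index.html", "about.html"]  # hypothetical site


def render_metadata(name):
    # Stand-in for rendering a template: evaluating the page
    # calls list_pages() again, which is where the loop came from.
    nested = list_pages()
    return {"name": name, "nested_pages": len(nested)}


@norecursion(default=[])
def list_pages():
    return [render_metadata(name) for name in PAGES]
```

Without the decorator, list_pages() would recurse until Python raises RecursionError. With it, each nested call gets the empty list and the outer call completes. Note that a single flag like this is not thread-safe.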

Update: Reading this post again I think I could have just used a memoization decorator instead. At the time preventing recursion with a decorator seemed like an okay solution, but memoization would have been a little less weird and probably worked fine.

Writing mercurial plugins

Getting my feet wet with writing some Mercurial plugins... First impression is that the API is very low-level, but I guess that makes sense since HG (and its plugins) have to be low-level to perform well.

#!/usr/bin/env python

from mercurial import hg
from binascii import hexlify
from mercurial import util

def interact(ui, repo, **opts):
    """poke around the mercurial API for this repo in a python interpreter"""
    print "Locals are:", dir()
    import code; code.interact(local=locals())

def short_incoming(ui, repo, **opts):
    """Shows a shortened form of 'hg incoming'"""
    default = hg.repository(ui, ui.expandpath('default'))
    inc = repo.findincoming(default)
    nodes = default.changelog.nodesbetween(inc, None)[0]
    for node in nodes:
        cs = default.changelog.read(node)
        print hexlify(cs[0])[:6], '|', cs[1], '|', util.datestr(cs[2]), \
              '|', len(cs[3]), 'files', '|', cs[5], '|', cs[4]

cmdtable = {
    "interact": (
        interact,
        [],
        interact.__doc__
    ),
    "short": (
        short_incoming,
        [],
        short_incoming.__doc__
    ),
}

Part of me just wants to scrape the text of the different subcommands.

Illustration of a grassy knoll