Pylons with Tokyo Cabinet Beaker Sessions

Posted by Jack Hsu Wed, 27 May 2009 20:20:00 GMT

Maintaining a session is something most web apps need to do, and all web frameworks implement some sort of session management system.

Pylons allows you to specify the type of session your web app should use: database, file, memcached, etc. The great thing about it is that you can easily implement your own session manager and plug it right in.

Here I am implementing a session manager that uses Tokyo Cabinet and JSON (instead of cPickle). The simplejson module now has speedups that use C extensions, so load and dump are more or less on par with cPickle. With JSON serialization you’ll be able to share the session data with apps written in other languages (such as Java, Perl, Ruby).
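One caveat worth keeping in mind when swapping cPickle for JSON: only JSON-friendly types survive the round trip. The sketch below uses the standard-library json module (an assumption; simplejson exposes the same dumps/loads calls) and a made-up session dictionary. The encode_extra helper is a hypothetical name for illustration, not part of Beaker.

```python
import json
from datetime import datetime


def encode_extra(obj):
    """Fallback for types json cannot handle natively (hypothetical helper)."""
    if isinstance(obj, datetime):
        return obj.isoformat()
    raise TypeError("not JSON serializable: %r" % (obj,))


# A made-up session payload, purely for illustration.
session = {
    "user_id": 42,
    "cart": [1, 2, 3],
    "last_seen": datetime(2009, 5, 27, 20, 20),
}

try:
    json.dumps(session)  # a datetime is not a JSON type, so this raises
    plain_dump_failed = False
except TypeError:
    plain_dump_failed = True

# With a fallback encoder the dump succeeds.
encoded = json.dumps(session, default=encode_extra)
decoded = json.loads(encoded)
# The datetime comes back as a plain string; converting it back is up to the app.
```

In other words, anything you put in the session should either be a dict, list, string, number, boolean or None, or be converted to one of those before it is stored.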

Tokyo Cabinet is a DBM-style key-value store. It differs from an RDBMS in that data is stored solely as key-value pairs, with no relations between them. The advantage over an RDBMS is that Tokyo Cabinet is lightning fast in comparison. If you’re familiar with memcached, you can think of it as memcached with persistent data. Since sessions need to be accessed frequently, and sessions are essentially key-value pairs – a key/namespace mapping to a session object – I thought Tokyo Cabinet would be a good candidate.

Assuming you already have a Pylons project set up, let’s create a file under [project path]/lib/ called tcjson.py (Tokyo Cabinet JSON… get it?), and fill it with the following.

import logging

from beaker.container import NamespaceManager, Container
from beaker.exceptions import InvalidCacheBackendError, MissingCacheParameter
from beaker.synchronization import file_synchronizer
from beaker.util import verify_directory

try:
    from pytyrant import PyTyrant
except ImportError:
    raise InvalidCacheBackendError("PyTyrant cache backend requires the 'pytyrant' library")

try:
    import json
except ImportError:
    try: 
        import simplejson as json
    except ImportError:
        raise InvalidCacheBackendError("PyTyrant cache backend requires the 'simplejson' library")

log = logging.getLogger(__name__)

class TokyoCabinetManager(NamespaceManager):
    def __init__(self, namespace, url=None, data_dir=None, lock_dir=None, **params):
        NamespaceManager.__init__(self, namespace)

        if not url:
            raise MissingCacheParameter("url is required") 

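        # Default to no lock directory so the check below never hits an unset attribute
        self.lock_dir = None
        if lock_dir:
            self.lock_dir = lock_dir
        elif data_dir:
            self.lock_dir = data_dir + "/container_tcd_lock"
        if self.lock_dir:
            verify_directory(self.lock_dir)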

        host, port = url.split(':')
        self.tc = PyTyrant.open(host, int(port))

    def get_creation_lock(self, key):
        return file_synchronizer(
            identifier ="tccontainer/funclock/%s" % self.namespace,
            lock_dir = self.lock_dir)

    def _format_key(self, key):
        # Prefix each key with the namespace so entries from different sessions don't collide
        return self.namespace + '_' + key

    def __getitem__(self, key):
        return json.loads(self.tc.get(self._format_key(key)))

    def __contains__(self, key):
        return self.tc.has_key(self._format_key(key))

    def has_key(self, key):
        return key in self

    def set_value(self, key, value):
        self.tc[self._format_key(key)] =  json.dumps(value)

    def __setitem__(self, key, value):
        self.set_value(key, value)

    def __delitem__(self, key):
        del self.tc[self._format_key(key)]

    def do_remove(self):
        self.tc.clear()

    def keys(self):
        return self.tc.keys()


class TokyoCabinetContainer(Container):
    namespace_manager = TokyoCabinetManager
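To see what actually ends up in the store, here is a small stand-alone sketch of the same key formatting and JSON round trip. It swaps the Tokyo Tyrant connection for a plain dict, so the DictBackedManager class and the sample namespace are assumptions for illustration only, and it assumes the key is appended to the namespace prefix.

```python
import json


class DictBackedManager(object):
    """Mimics the get/set logic of the manager, with a dict standing in for the server."""

    def __init__(self, namespace, store=None):
        self.namespace = namespace
        self.tc = {} if store is None else store

    def _format_key(self, key):
        # namespace + '_' + key keeps different sessions apart in one flat store
        return self.namespace + '_' + key

    def __getitem__(self, key):
        return json.loads(self.tc[self._format_key(key)])

    def __setitem__(self, key, value):
        self.tc[self._format_key(key)] = json.dumps(value)

    def __contains__(self, key):
        return self._format_key(key) in self.tc


store = {}
manager = DictBackedManager('abc123', store)  # 'abc123' plays the role of a session id
manager['session'] = {'user': 'jack', 'visits': 3}

raw_keys = list(store.keys())   # what the key-value store sees
loaded = manager['session']     # what the application gets back
```

The stored value is just a JSON string, which is what makes it readable from an app written in another language.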

Now, put this in your [project path]/config/environment.py file.

import investor.lib.tcjson as tcjson
import beaker.cache
beaker.cache.clsmap['ext:tcjson'] = tcjson.TokyoCabinetManager

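You can then specify ext:tcjson as your Beaker session type in your INI file (e.g. development.ini). The URL points at a running Tokyo Tyrant server, which is what pytyrant talks to; 1978 is its default port.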

beaker.session.type = ext:tcjson
beaker.session.url = 127.0.0.1:1978

Fire up your app and it should be using Tokyo Cabinet for your sessions! :)

Playing around with Pylons

Posted by Jack Hsu Wed, 27 May 2009 15:39:00 GMT

I started playing with a Python web framework called Pylons. I’ve been using Django for the last little while, but after hearing a lot of positive feedback about Pylons, I decided it’s time to get my hands dirty.

Although I’ve barely scratched the surface of Pylons, here are some things I like about it.

  1. I love Python Paste. Deploying and dispatching your web app suddenly becomes very easy. You can even register your application at PyPI.
  2. The default Mako templating engine is very, very fast. It has a newline filter (backslash) to consume the newline character before moving to the next line – which is useful if you want nicely formatted HTML. Comments can be multi-lined – something I wish was possible in Django. And it’s also nice to have arbitrary Python code embedded in the page… should be used sparingly, but really useful sometimes.
  3. SQLAlchemy is powerful. Arguably the best ORM in any language. Doesn’t get in your way when designing your database and application architecture. You can map objects to any arbitrary joins or selects.
  4. Really gives you full control over your application. Pylons doesn’t give you out-of-the-box admin or user authentication (like Django), but it does allow you to build your application exactly the way you want without having to work around the framework.

Don’t get me wrong. I like Django, and will continue to use it where it makes sense. Forms in Django 1.0 are really awesome, and I especially like ModelForm for creating forms from models. The admin view that comes with Django is great, and saves a lot of development time.

If you do use SQLAlchemy with Pylons, you can always check out formalchemy, which offers much of the same functionality as Django’s ModelForm, but with SQLAlchemy mapped objects. As a bonus, you can even use a Pylons extension with formalchemy that gives you an automagically created admin interface for your objects.

And some more thoughts about Django.

  1. You can use SQLAlchemy in Django. That is, if you’re willing to give up the built-in admin and generic views, and ModelForms.
  2. There are ways to get other more powerful Python templating engines into Django (e.g. django-mako), but it does require you to change your view code.

So here are my not-so-final thoughts.

I’ll continue to use Django for the many web applications that are basically just CRUD applications. For anything that you need to have fine-grained control over, Pylons seems to be the better choice. It also allows you to use

Pythonistas Rejoice! Curly Braces Are In

Posted by Jack Hsu Wed, 01 Apr 2009 16:09:00 GMT

Finally, we can do without the annoying "whitespace is significant" crap. You can import a new feature called braces from the __future__ module. For those who don’t know, the __future__ module contains new features that are not yet included in the current Python version.

Now, instead of doing this:

def foo():
    # Ugh, I hate whitespace indents
    print 'Hello Braces!'

You can do this:

from __future__ import braces
def foo() { print 'Hello Braces!' } # Only one line!!!11!!11

April Fools!
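For the curious, the joke is built into the interpreter itself. Here is a small sketch (the variable names are just for illustration, and this assumes CPython) that compiles the import from a string so the resulting error can be caught and inspected rather than stopping the script:

```python
try:
    # Compile from a string: written directly in a file, the import would
    # fail at compile time, before any try/except could run.
    compile("from __future__ import braces", "<demo>", "exec")
    message = None
except SyntaxError as exc:
    message = exc.msg

print(message)
```

On CPython the captured message is "not a chance".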

Programming Language Trends (Java==Cobol?)

Posted by Jack Hsu Fri, 13 Mar 2009 01:13:00 GMT

I ran some queries through Google Insights, just for fun. One of the queries I ran was python vs. java vs. ruby vs. erlang vs. cobol, under the Programming category. Here’s the resulting graph of growth relative to category.

Ignore the dark blue line; it’s a plot of the Programming category (which is actually sliding... hmm).

Seems like Erlang and Ruby have generated a lot of interest over the last three years, with 242% and 124% growth in interest since January 2004. Python has remained rather steady, losing 5%. Java, on the other hand, has dropped quite significantly, at -53% growth since January 2004. And Cobol is way down at -62%.

It gets more interesting when you view the interest by region. The most interesting data I think is that Java and Cobol are both pretty cold world-wide except for one region: India. Take a look at these regional maps.

Search volume for Java:

Search volume for Cobol:

My guess is that a lot of companies now out-source Java and Cobol projects to India. They are both languages that people need to continue to support, probably for legacy systems, but they just aren’t fun and definitely kill innovation. As Brett McLaughlin (author of several O’Reilly Java books) said, Java isn’t unimportant, but it’s big business in many cases, and that tends to suck experimentation and fun out in sneaky ways.

A few more (completely useless?) things to note:

  1. Russia loves Erlang.
  2. Ruby is obviously hot in Japan, but also in Belarus (who whudda thunk it).
  3. Python and Ruby are gaining traction in the US, but Canada is lagging still (c’mon Canucks!).
  4. Japan is pretty lukewarm to Python, and it has a pretty high search volume index for Cobol compared to most every other country.

 

Java Is Just Too Slow

Posted by Jack Hsu Tue, 17 Feb 2009 04:13:00 GMT

Java is slow. And by slow I don’t mean its execution speed. I’m talking about development speed.

Java forces you to plan too much. Don’t get me wrong, there’s nothing inherently bad about planning. The problem is that innovation and planning don’t always go hand-in-hand. When you’re making yet another boring spreadsheet application that’s been done a million times before, you can plan out exactly what needs to be done before writing code because you know exactly what you need. However, when you’re innovating you need to code as you go. Your idea may just work, it might need some tweaking, or it may turn out to be horrible. The point is, you won’t know until you start hacking away at the code.

Java is so verbose that in order to really work with it you need to plan ahead. You need to know what objects you need to create, what methods and variables (private or public?) they should have, and how they need to interact with each other. Oh, and after all that you need to compile your code before seeing the result. As a simple demonstration of this, let’s work with a JSON object in Java (using Gson), then in Python (using simplejson).

Let’s use the following JSON object:

{
    "foo": {
        "bar": 10,
        "baz": 20
    },
    "phrase": "Hello World!",
    "myArray": [1, 2, 3, 4, 5]
}

Let’s access some data in Java. Wait, first we need to create some Objects to deserialize the JSON string to.

class MyObj1 {
    // Package-private fields so the demo code can read them directly
    int bar;
    int baz;
    MyObj1() {}
}

class MyObj2 {
    MyObj1 foo;
    String phrase;
    int[] myArray;
    MyObj2() {}
}

Now we can deserialize the JSON string and get some properties from it.

import com.google.gson.Gson;

Gson gson = new Gson();
String jsonString = "{'foo': {'bar': 10, 'baz': 20}," +
        "'phrase': 'Hello World!'," +
        "'myArray': [1, 2, 3, 4, 5]}";
MyObj2 obj = gson.fromJson(jsonString, MyObj2.class);

System.out.println(obj.foo.bar); //--> 10
System.out.println(obj.phrase); //--> Hello World!
System.out.println(obj.myArray[1]); //--> 2

Now let’s try the same thing in Python.

import simplejson

json = simplejson.loads('{"foo": {"bar": 10,' +
        '"baz": 20}, "phrase": "Hello World!",' +
        '"myArray": [1, 2, 3, 4, 5]}')
print(json['foo']['bar']) #--> 10
print(json['phrase']) #--> Hello World!
print(json['myArray'][1]) #--> 2

By the way, the above Python code is 100% executable in the interactive shell, assuming simplejson is installed (easy_install simplejson). The Java code will need a main method and whatever else; I just didn’t bother – usually my IDE at work writes all the skeleton code for me.

So which do you think is simpler? If you knew you’d be dealing with a lot of JSON, would you go with Python or Java? Another thing to keep in mind is that if I change the JSON string, the Python code would require virtually no effort, whereas the Java code will need class changes, recompiling, etc. Not to mention if I changed the array to something like ["I'm a string", 1234, {"a": 1, "b": 2}, 3, 4, 5]. Sure you can do it in Java, but WHY? You can probably see why Java developers are forced into so many hour-long planning meetings. If you get it wrong the first time you really pay the price.
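To make the mixed-array point concrete, here is a quick sketch of the Python side. It uses the standard-library json module rather than simplejson (an assumption; both expose the same loads call), and the variable names are made up for the example.

```python
import json

# The same kind of document, but with the array changed to mixed types.
text = '{"myArray": ["I\'m a string", 1234, {"a": 1, "b": 2}, 3, 4, 5]}'
data = json.loads(text)

# No class changes needed: each element simply comes back as the matching Python type.
kinds = [type(item).__name__ for item in data["myArray"]]
nested = data["myArray"][2]["b"]

print(kinds)
print(nested)
```

The parsing code is the same single loads call as before; only the code that reads the values has to know what shape to expect.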

To be successful on the Web you need a platform that allows you to be innovative. Traditionally, software development involves a long period of planning before any code is written. You go through iterations of UML diagrams before typing a single line of code. This just won’t cut it on the Web.

Traditional software is like cartoon drawing. You usually start with some storyboard sketches of the overall story. This gives the artists a way to previsualize the motion graphic before starting on the real thing. Web applications, on the other hand, are more like a bunch of sketches that are never quite finished. That is, web applications are always evolving, and this evolution happens in smaller steps, and more frequently. You need a language that can allow you to take these smaller steps, and I don’t believe Java is the right tool for this.

Hello Interweb!

Posted by Jack Hsu Thu, 12 Feb 2009 02:03:00 GMT

This is my first attempt at a dedicated programming blog. More interesting posts are on their way, but for now here’s a little info about me!

In my short professional career (which includes several co-op jobs when I attended the University of Waterloo) I’ve mostly been in web development. Actually, I’ve been doing the whole web stuff since about the mid-90s, starting with HTML, JavaScript and Perl, just on my personal/hobby websites.

I’ve worked with many languages, but my current interests are JavaScript and Python. I am a huge fan of Django, and have used it on several projects, including GlobeCampus. I expect the majority of my posts in the near-future to be on JavaScript and Python. ;)

Currently, I’m working as a web developer at The Globe and Mail.