Pylons with Tokyo Cabinet Beaker Sessions

Posted by Jack Hsu Wed, 27 May 2009 20:20:00 GMT

Maintaining a session is something most web apps need to do, and all web frameworks implement some sort of session management system.

Pylons allows you to specify the type of session your web app should use: database, file, memcached, etc. The great thing about it is that you can easily implement your own session manager and plug it right in.

Here I am implementing a session manager that uses Tokyo Cabinet and JSON (instead of cPickle). The simplejson module now has speedups that use C extensions so load and dump are more or less on par with cPickle. With JSON serialization you’ll be able to share the session data in apps written in other languages (such as Java, Perl, Ruby).
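
To make the serialization trade-off concrete, here is a minimal sketch that round-trips a session-like dictionary through JSON and through pickle. The payload and its keys are made up for illustration, and it uses the standard library json and pickle modules (cPickle is the Python 2 C implementation of pickle).

```python
import json
import pickle

# Hypothetical session payload; the keys are invented for this example.
session = {"user_id": 42, "cart": ["sku-1", "sku-2"], "logged_in": True}

# JSON produces plain text that any language with a JSON parser can read.
as_json = json.dumps(session, sort_keys=True)
from_json = json.loads(as_json)

# pickle produces a Python-specific byte stream.
as_pickle = pickle.dumps(session)
from_pickle = pickle.loads(as_pickle)

# JSON only knows a few basic types, so a tuple comes back as a list.
tuple_round_trip = json.loads(json.dumps({"point": (1, 2)}))
```

The catch is that JSON only handles basic types (strings, numbers, booleans, lists, dictionaries and None), so anything else stored in the session has to be converted first.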

Tokyo Cabinet is a DBM-style key-value store. It differs from an RDBMS in that data are stored solely as key-value pairs, with no relations to each other. The advantage over an RDBMS is that simple key lookups in Tokyo Cabinet are lightning fast in comparison. If you’re familiar with memcached, you can think of it as memcached with persistent data. Since sessions need to be accessed frequently, and sessions are essentially key-value pairs (a key/namespace mapping to a session object), I thought Tokyo Cabinet would be a good fit.

Assuming you already have a Pylons project set up, let’s create a file under [project path]/lib/ called tcjson.py (Tokyo Cabinet JSON… get it?), and fill it with the following.

import logging

from beaker.container import NamespaceManager, Container
from beaker.exceptions import InvalidCacheBackendError, MissingCacheParameter
from beaker.synchronization import file_synchronizer
from beaker.util import verify_directory

try:
    from pytyrant import PyTyrant
except ImportError:
    raise InvalidCacheBackendError("PyTyrant cache backend requires the 'pytyrant' library")

try:
    import json
except ImportError:
    try: 
        import simplejson as json
    except ImportError:
        raise InvalidCacheBackendError("PyTyrant cache backend requires the 'simplejson' library")

log = logging.getLogger(__name__)

class TokyoCabinetManager(NamespaceManager):
    def __init__(self, namespace, url=None, data_dir=None, lock_dir=None, **params):
        NamespaceManager.__init__(self, namespace)

        if not url:
            raise MissingCacheParameter("url is required") 

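        # Default to None so the attribute exists even when neither
        # lock_dir nor data_dir is given.
        self.lock_dir = None
        if lock_dir:
            self.lock_dir = lock_dir
        elif data_dir:
            self.lock_dir = data_dir + "/container_tcd_lock"
        if self.lock_dir:
            verify_directory(self.lock_dir)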

        host, port = url.split(':')
        self.tc = PyTyrant.open(host, int(port))

    def get_creation_lock(self, key):
        return file_synchronizer(
            identifier="tccontainer/funclock/%s" % self.namespace,
            lock_dir=self.lock_dir)

    def _format_key(self, key):
        # Prefix the key with the namespace so each session gets its own keys.
        return self.namespace + '_' + key

    def __getitem__(self, key):
        return json.loads(self.tc.get(self._format_key(key)))

    def __contains__(self, key):
        return self.tc.has_key(self._format_key(key))

    def has_key(self, key):
        return key in self

    def set_value(self, key, value):
        self.tc[self._format_key(key)] = json.dumps(value)

    def __setitem__(self, key, value):
        self.set_value(key, value)

    def __delitem__(self, key):
        del self.tc[self._format_key(key)]

    def do_remove(self):
        # Note: this clears the whole store, not only this namespace.
        self.tc.clear()

    def keys(self):
        return self.tc.keys()


class TokyoCabinetContainer(Container):
    namespace_manager = TokyoCabinetManager
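
If you want to see the key handling without a running server, here is a minimal sketch that swaps the connection for a plain dictionary. DictBackedStore is a hypothetical stand-in, not part of Beaker or pytyrant, and it assumes the stored key is meant to be the namespace joined to the key with an underscore.

```python
import json


class DictBackedStore(object):
    """Hypothetical stand-in for the manager: a dict plays the server."""

    def __init__(self, namespace, backend):
        self.namespace = namespace
        self.backend = backend  # shared dict standing in for the key-value store

    def _format_key(self, key):
        # Assumed scheme: "<namespace>_<key>"
        return self.namespace + "_" + key

    def __getitem__(self, key):
        return json.loads(self.backend[self._format_key(key)])

    def __setitem__(self, key, value):
        self.backend[self._format_key(key)] = json.dumps(value)

    def __contains__(self, key):
        return self._format_key(key) in self.backend

    def __delitem__(self, key):
        del self.backend[self._format_key(key)]


# Two sessions sharing one backend keep separate entries.
backend = {}
alice = DictBackedStore("session-a", backend)
bob = DictBackedStore("session-b", backend)
alice["session"] = {"user": "alice"}
bob["session"] = {"user": "bob"}
```

Because every value is stored as a JSON string under a namespaced key, two sessions can share the same store without overwriting each other.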

Now, put this in your [project path]/config/environment.py file.

# "investor" is my project's package name; use your own here.
import investor.lib.tcjson as tcjson
import beaker.cache
beaker.cache.clsmap['ext:tcjson'] = tcjson.TokyoCabinetManager

You can then specify ext:tcjson as your Beaker session type in your INI file (e.g. development.ini).

beaker.session.type = ext:tcjson
beaker.session.url = 127.0.0.1:1978

Fire up your app and it should be using Tokyo Cabinet for your sessions! :)

Playing around with Pylons

Posted by Jack Hsu Wed, 27 May 2009 15:39:00 GMT

I started playing with a Python web framework called Pylons. I’ve been using Django for the last little while, but after hearing a lot of positive feedback about Pylons, I decided it’s time to get my hands dirty.

Although I’ve barely scratched the surface of Pylons, here are some things I like about it.

  1. I love Python Paste. Deploying and dispatching your web app suddenly becomes very easy. You can even register your application at PyPI.
  2. The default Mako templating engine is very, very fast. It has a newline filter (backslash) to consume the newline character before moving to the next line – which is useful if you want nicely formatted HTML. Comments can be multi-lined – something I wish was possible in Django. And it’s also nice to have arbitrary Python code embedded in the page… should be used sparingly, but really useful sometimes.
  3. SQLAlchemy is powerful. Arguably the best ORM in any language. Doesn’t get in your way when designing your database and application architecture. You can map objects to any arbitrary joins or selects.
  4. Really gives you full control over your application. Pylons doesn’t give you out-of-the-box admin or user authentication (like Django), but it does allow you to build your application exactly the way you want without having to work around the framework.

Don’t get me wrong. I like Django, and will continue to use it where it makes sense. Forms in Django 1.0 are really awesome, and I especially like ModelForm for creating forms from models. The admin view that comes with Django is great, and saves a lot of development time.

If you do use SQLAlchemy with Pylons, you can always check out formalchemy, which offers much of the same functionality as Django’s ModelForm, but with SQLAlchemy mapped objects. As a bonus, you can even use a Pylons extension with formalchemy that gives you an automagically created admin interface for your objects.

And some more thoughts about Django.

  1. You can use SQLAlchemy in Django. That is, if you’re willing to give up the built-in admin and generic views, and ModelForms.
  2. There are ways to get other more powerful Python templating engines into Django (e.g. django-mako), but it does require you to change your view code.

So here are my not-so-final thoughts.

I’ll continue to use Django for the many web applications that are basically just CRUD applications. For anything that you need to have fine-grained control over, Pylons seems to be the better choice. It also allows you to use

Cheating on Speed Sudoku -- How to Prevent Greasemonkey Scripts

Posted by Jack Hsu Thu, 07 May 2009 15:24:00 GMT

I recently read about Speed Sudoku in a Globe and Mail article. It’s a website that allows players to compete against each other in a race to solve Sudoku puzzles. Players are given points based on their performance in a game (I’m unsure how the point system works), and these points are used to rank them on the scoreboard. As of this writing, the top player is named "WaterlooMathie" (woot for Waterloo).

Anyway. I decided to write a Greasemonkey script to automatically solve these puzzles. The technique I chose is a simple backtracking algorithm that always fills the most constrained cell next, i.e. the empty cell with the fewest legal digits left. *Disclaimer: I’m not planning on climbing to the top of the scoreboard using this script; it’s just for fun.

[See script here]

Nothing too fancy. In fact, the code can be tuned further to perform better – meh. But the results are pretty good. I’m able to solve the Very Hard puzzles on my Mac instantly on page load.
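
For readers who want to see the technique itself, here is a minimal Python sketch of backtracking that always picks the empty cell with the fewest legal digits. It is not the Greasemonkey script linked above; the function names and the grid representation (a 9x9 list of lists with 0 for empty cells) are assumptions made for this example.

```python
def candidates(grid, r, c):
    """Return the digits that can legally go in the empty cell (r, c)."""
    used = set(grid[r]) | {grid[i][c] for i in range(9)}
    br, bc = 3 * (r // 3), 3 * (c // 3)
    used |= {grid[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
    return [d for d in range(1, 10) if d not in used]


def solve(grid):
    """Solve the grid in place (0 = empty). Return True on success."""
    best = None
    # Find the most constrained empty cell: the one with the fewest options.
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                opts = candidates(grid, r, c)
                if best is None or len(opts) < len(best[2]):
                    best = (r, c, opts)
    if best is None:
        return True  # no empty cells left, so the grid is solved
    r, c, opts = best
    for d in opts:
        grid[r][c] = d
        if solve(grid):
            return True
    grid[r][c] = 0  # nothing worked here: undo and backtrack
    return False


puzzle = [
    [5, 3, 0, 0, 7, 0, 0, 0, 0],
    [6, 0, 0, 1, 9, 5, 0, 0, 0],
    [0, 9, 8, 0, 0, 0, 0, 6, 0],
    [8, 0, 0, 0, 6, 0, 0, 0, 3],
    [4, 0, 0, 8, 0, 3, 0, 0, 1],
    [7, 0, 0, 0, 2, 0, 0, 0, 6],
    [0, 6, 0, 0, 0, 0, 2, 8, 0],
    [0, 0, 0, 4, 1, 9, 0, 0, 5],
    [0, 0, 0, 0, 8, 0, 0, 7, 9],
]
solved = [row[:] for row in puzzle]  # keep the original givens untouched
ok = solve(solved)
```

Picking the most constrained cell first keeps the search tree narrow: a cell with no options fails immediately, and a cell with a single option is effectively a forced move.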

Even though I’m not going to cheat on Speed Sudoku, I’m sure others will, or already have. In fact, you can report users you suspect of cheating to the site admins.

This raises a question for me: How do you prevent Greasemonkey scripts from executing on your website?

The answer lies in how Greasemonkey scripts are fired: Greasemonkey listens for the DOMContentLoaded event. To prevent script execution you simply put this piece of code at the top of your page.

document.addEventListener("DOMContentLoaded", function(ev) {
    ev.stopPropagation();
}, false);

This attaches your anonymous function as a listener for the same event. Because it is registered at the top of the page, it executes first, and all it needs to do is stop the event from propagating.

You might run into cases where you do want your own functions (attached to DOMContentLoaded event) to fire. In these cases, you could create your own custom events, listen on those, and fire them in the anonymous function after.

Of course, users can still cheat by other means, but at least this can prevent Greasemonkey cheats. That said, I’m against preventing Greasemonkey scripts. I love the addon, and it adds functionality to websites that I can’t live without.