alex gaynor's blago-blog

Posts tagged with internals

The State of MultiDB (in Django)

Posted November 10th, 2009. Tagged with django, orm, models, python, internals, gsoc.

As you, the reader, may know this summer I worked for the Django Software Foundation via the Google Summer of Code program. My task was to implement multiple database support for Django. Assisting me in this task were my mentors Russell Keith-Magee and Nicolas Lara (you may recognize them as the people responsible for aggregates in Django). By the standards of the Google Summer of Code project my work was considered a success, however, it's not yet merged into Django's trunk, so I'm going to outline what happened, and what needs to happen before this work is considered complete.

Most of the major things happened, settings were changed from a series of DATABASE_* to a DATABASES setting that's keyed by DB aliases and who's values are dictionaries containing the usual DATABASE* options, QuerySets grew a using() method which takes a DB alias and says what DB the QuerySet should be evaluated against, save() and delete() grew similar using keyword arguments, a using option was added to the inner Meta class for models, transaction support was expanded to include support for multiple databases, as did the testing framework. In terms of internals almost every internal DB related function grew explicit passing of the connection or DB alias around, rather than assuming the global connection object as they used to. As I blogged previously ManyToMany relations were completely refactored. If it sounds like an awful lot got done, that's because it did, I knew going in that multi-db was a big project and it might not all happen within the confines of the summer.

So if all of that stuff got done, what's left? Right before the end of the GSOC time frame Russ and I decided that a fairly radical rearchitecting of the Query class (the internal datastructure that both tracks the state of an operation and forms its SQL) was needed. Specifically the issue was that database backends come in two varieties. One is something like a backend for App Engine, or CouchDB. These have a totally different design than SQL, they need different datastructures to track the relevant information, and they need different code generation. The second type of database backend is one for a SQL database. By contrast these all share the same philosophies and basic structure, in most cases their implementation just involves changing the names of database column types or the law LIMIT/OFFSET is handled. The problem is Django treated all the backends equally. For SQL backends this meant that they got their own Query classes even though they only needed to overide half of the Query functionality, the SQL generation half, as the datastructure part was identical since the underlying model is the same. What this means is that if you make a call to using() on a QuerySet half way through it's construction you need to change the class of the Query representation if you switch to a database with a different backend. This is obviously a poor architecture since the Query class doesn't need to be changed, just the bit at the end that actually constructs the SQL. To solve this problem Russ and I decided that the Query class should be split into two parts, a Query class that stores bits about the current query, and a SQLCompiler which generated SQL at the end of the process. And this is the refactoring that's holding up the merger of my multi-db work primarily.

This work is largely done, however the API needs to be finalized and the Oracle backend ported to the new system. In terms of other work that needs to be done, GeoDjango needs to be shown to shown to still work (or fixed). In my opinion everything else on the TODO list (available here, please don't deface) is optional for multi-db to be merge ready, with the exception of more example documentation.

There are already people using the multi-db branch (some even in production), so I'm confident about it's stability. For the next 6 weeks or so (until the 1.2 feature deadline), my biggest priority is going to be getting this branch into a merge ready state. If this is something that interests you please feel free to get in contact with me (although if you don't come bearing a patch I might tell you that I'll see you in 6 weeks ;)), if you happen to find bugs they can be filed on the Django trac, with version "soc2009/multidb". As always contributors are welcome, you can find the absolute latest work on my Github and a relatively stable version in my SVN branch (this doesn't contain the latest, in progress, refactoring). Have fun.

You can find the rest here. There are view comments.

Another Unladen Swallow Optimization

Posted November 8th, 2009. Tagged with python, internals, unladen-swallow.

This past week I described a few optimizations that the Unladen Swallow team have done in order to speed up CPython. In particular one of the optimizations I described was to emit direct calls to C functions that took either zero or one argument. This improves the speed of Python when calling functions like len() or repr(), who only take one argument. However, there are plenty of builtin functions that take a fixed number of arguments that is greater than one. This is the source of the latest optimization.

As I discussed previously there were two relevant flags, METH_O and METH_NOARGS. These described functions that take either one or zero arguments. However, this doesn't cover a wide gamut of functions. Therefore the first stage of these optimizations was to replace these two flags with METH_FIXED, which indicates that the function takes a fixed number of arguments. There was also an additional slot added to the struct that holds C functions to store the arity of the function (the number of arguments it takes). Therefore something like:

{"id", builtin_id, METH_O, id_doc}

Which is what the struct for a C function looks like would be replaced with:

{"id", builtin_id, METH_FIXED, id_doc, 1}

This allows Unladen Swallow to emit direct calls to functions that take more than 1 argument, specifically up to 3 arguments. This results in functions like hasattr() and setattr() to be better optimized. This change ultimately results in a 7% speed increase in Unladen Swallow's Django benchmark. Here the speed gains will largely come from avoiding allocating a tuple for the arguments, as Python used to have to do since the functions were defined as METH_VARARGS (which results in it receiving it's arguments as a tuple), as well as avoiding parsing that tuple.

This change isn't as powerful as it could be, specifically it requires that the function always take the same number of arguments. This prevents optimizing calls to getattr() for example, which can take either 2 or 3 arguments. This optimization doesn't hold because C doesn't have any way of expressing default arguments for the function, therefore the CPython runtime must pass all of the needed arguments to a function, which means C functions need to have a way to encode their defaults in a way that CPython can understand. One of the proposed solutions to this problem is to have functions be able to provide the minimum number of arguments they take and then CPython could pad the provided arguments with NULLs to achieve the correct number of arguments to the function (interestingly the C standard allows more arguments to be passed to a function than it takes). This type of optimization would speed up calls to things like dict.get() and getattr().

As you can see the speed of a Python application can be fairly sensitive to how various internal things are handled, in this case the speed increase can be shown to come exclusively from eliminating a tuple allocation and some extra logic on certain function calls. If you're interested in seeing the full changeset it's available on the internet.

You can find the rest here. There are view comments.

Django's ManyToMany Refactoring

Posted November 4th, 2009. Tagged with django, orm, models, python, internals, gsoc.

If you follow Django's development, or caught next week's DjangoDose Tracking Trunk episode (what? that's not how time flows you say? too bad) you've seen the recent ManyToManyField refactoring that Russell Keith-Magee committed. This refactoring was one of the results of my work as a Google Summer of Code student this summer. The aim of that work was to bring multiple database support to Django's ORM, however, along the way I ran into refactor the way ManyToManyField's were handled, the exact changes I made are the subject of tonight's post.

If you've looked at django.db.models.fields.related you may have come away asking how code that messy could possibly underlie Django's amazing API for handling related objects, indeed the mess so is so bad that there's a comment which says:

# HACK

which applies to an entire class. However, one of the real travesties of this module was that it contained a large swath of raw SQL in the manager for ManyToMany relations, for example the clear() method's implementation looks like:

cursor = connection.cursor()
cursor.execute("DELETE FROM %s WHERE %s = %%s" % \
    (self.join_table, source_col_name),
    [self._pk_val])
transaction.commit_unless_managed()

As you can see this hits the trifecta, raw SQL, manual transaction handling, and the use of a global connection object. From my perspective the last of these was the biggest issue. One of the tasks in my multiple database branch was to remove all uses of the global connection object, and since this uses it it was a major target for refactoring. However, I really didn't want to rewrite any of the connection logic I'd already implemented in QuerySets. This desire to avoid any new code duplication, coupled with a desire to remove the existing duplication (and flat out ugliness), led me to the simple solution: use the existing machinery.

Since Django 1.0 developers have been able to use a full on model for the intermediary table of a ManyToMany relation, thanks to the work of Eric Florenzano and Russell Keith-Magee. However, that support was only used when the user explicitly provided a through model. This of course leads to a lot of methods that basically have two implementation: one for the through model provided case, and one for the normal case -- which is yet another case of code bloat that I was now looking to eliminate. After reviewing these items my conclusion was that the best course was to use the provided intermediary model if it was there, otherwise create a full fledged model with the same fields (and everything else) as the table that would normally be specially created for the ManyToManyField.

The end result was dynamic class generation for the intermediary model, and simple QuerySet methods for the methods on the Manager, for example the clear() method I showed earlier now looks like this:

self.through._default_manager.filter(**{
    source_field_name: self._pk_val
}).delete()

Short, simple, and totally readable to anyone with familiarity with Python and Django. In addition this move allowed Russell to fix another ticket with just two lines of code. All in all this switch made for cleaner, smaller code and fewer bugs.

Tomorrow I'm going to be writing about both the talk I'm going to be giving at PyCon, as well as my experience as a member of the PyCon program committee. See you then.

You can find the rest here. There are view comments.

Diving into Unladen Swallow's Optimizations

Posted November 3rd, 2009. Tagged with python, compile, internals, compiler, unladen-swallow.

Yesterday I described the general architecture of Unladen Swallow, and I said that just by switching to a JIT compiler and removing the interpretation overhead Unladen Swallow was able to get a performance gain. However, that gain is nowhere near what the engineers at Google are hoping to accomplish, and as such they've been working on building various optimizations into their JIT. Here I'm going to describe two particularly interesting ones they implemented during the 3rd quarter (they're doing quarterly releases).

Before diving into the optimizations themselves I should note there's one piece of the Unladen Swallow architecture I didn't discuss in yesterday's post. The nature of dynamic languages is that given code can do nearly anything depending on the types of the variables present, however in practice usually very few types are seen. Therefore it is necessary to collect information about the types seen in practice in order to perform optimizations. Therefore what Unladen Swallow has done is added data collection to the interpreter while it is executing bytecode. For example the BINARY_ADD opcode records the types of both of it's operands, the CALL_FUNCTION opcode records the function it is calling, and the UNPACK_SEQUENCE opcode records the type of the sequence it's unpacking. This data is then used when the function is compiled to generate optimal code for the most likely scenarios.

The first optimization I'm going to look at is one for the CALL_FUNCTION opcode. Python has a number of flags that functions defined in C can have, the two relevant to this optimization are METH_NOARGS and METHO_O. These flags indicate that the function (or method) in question take either 0 or 1 argument respectively (this is excluding the self argument on methods). Normally when Python calls a function it builds up a tuple of the arguments, and a dictionary for keyword arguments. For functions defined in Python CPython lines up the arguments with those the function takes and then sets them as local variables for the new function. For C functions they are given the tuple and dictionary directly and are responsible for parsing them themselves. By contrast functions with METH_NOARGS or METH_O receive their arguments (or nothing in the case of METH_NOARGS) directly.

Because calling METH_NOARGS and METH_O functions is so much easier than the alternate case (which involves several allocations and complex logic) when possible it is best to special case them in the generated assembly. Therefore, when compiling a CALL_FUNCTION opcode if using the data recorded there is only ever one function called (imagine a call to len, it is going to be the same len function every time), and that function is METH_NOARGS or METH_O then instead of generating a call to the usual function call machinery Unladen Swallow instead emits a check to make sure the function is actually the expected one and if it passes emits a call directly to the function with the correct arguments. If this guard fails then Unladen Swallow jumps back to the regular interpreter, leaving the optimized assembly. The reason for this is that the ultimately generated assembly can be more efficient when it only has to consider one best case scenario, as opposed to needing to deal with a large series of if else statements, which catalogue every single best case and the corresponding normal case. Ultimately, this results in more efficient code for calls to functions like len(), which are basically never redefined.

The next optimization we're going to look at is one for the LOAD_GLOBAL function. The LOAD_GLOBAL opcode is used for getting the value of a global variable, such as a builtin function, an imported class, or a global variable in the same module. In the interpreter the code for this opcode looks something like:

name = OPARG()
try:
    value = globals[name]
except KeyError:
    try:
        value = builtins[name]
    except KeyError:
        raise_exception(KeyError, name)
PUSH(value)

As you can see in the case of a builtin object (something like len, str, or dict) there are two dictionary lookups. While the Python dictionary is an exceptionally optimized datastructure it still isn't fast compared to a lookup of a local value (which is a single lookup in a C array). Therefore the goal of this optimization is to reduce the number of dictionary lookups needed to find the value for a global or builtin.

The way this was done was for code objects (the datastructures that hold the opcodes and various other internal details for functions) to register themselves with the globals and builtin dictionaries. By registering themselves the dictionaries are able to notify the code objects (similar to Django signals) whenever they are modified. The result of this is that the generated assembly for a LOAD_GLOBAL can perform the dictionary lookup once at compilation time and then the resulting assembly will be valid until the globals or builtins dictionary notifies the code object that they have been modified, thus rendering the assembly invalid. In practice this is very efficient because globals and builtins are very rarely modified.

Hopefully you've gotten a sense of the type of work that the people behind Unladen Swallow are doing. If you're interested in reading more on this type of work I'd highly recommend taking a look at the literature listed on the Unladen Swallow wiki, as they note that there is no attempt to do any original research, all the work being done is simply the application of existing, proven techniques to the CPython interpreter.

For the rest of this month I'm going to try to give a preview of the next day's post with each post, that way I can start thinking about it well in advance. Tomorrow I'm going to shift gears a little bit and write about the ManyToManyField refactoring I did over the summer and which was just committed to Django.

You can find the rest here. There are view comments.

Introduction to Unladen Swallow

Posted November 2nd, 2009. Tagged with python, compile, internals, compiler, unladen-swallow.

Unless you've been living under a rock for the past year (or have zero interest in either Python or dynamic languages, it which case why are you here?) you've probably heard of Unladen Swallow. Unladen Swallow is a Google funded branch of the CPython interpreter, with a goal of making CPython significantly faster while retaining both API and ABI compatibility. In this post I'm going to try to explain what it is Unladen Swallow is doing to bring a new burst of speed to the Python world.

In terms of virtual machines there are a few levels of complexity, which roughly correspond to their speed. The simplest type of interpreter is an AST evaluator, these are more or less the lowest of the low on the speed totem pole, up until YARV was merged into the main Ruby interpreter, MRI (Matz Ruby Interpreter) was this type of virtual machine. The next level of VM is a bytecode interpreter, this means that the language is compiled to an intermediary format (bytecode) which is then executed. Strictly speaking this is an exceptionally broad category which encompasses most virtual machines today, however for the purposes of this article I'm going to exclude any VMs with a Just-In-Time compiler from this section (more on them later). The current CPython VM is this type of interpreter. The most complex (and fastest) type of virtual machine is one with a Just-In-Time compiler, this means that the bytecode that the virtual machine interprets is also dynamically compiled into assembly and executed. This type of VM includes modern Javascript interpreters such as V8, Tracemonkey, and Squirellfish, as well as other VMs like the Hotspot Java virtual machine.

Now that we know where CPython is, and what the top of the totem pole looks like it's probably clear what Unladen Swallow is looking to accomplish, however there is a bit of prior art here that's worthy of taking a look. There is actually currently a JIT for CPython, named Psyco. Psyco is pretty commonly used to speed up numerical code, as that's what it's best at, but it can speed up most of the Python language. However, Psyco is extremely difficult to maintain and update. It only recently gained support for modern Python language features like generators, and it still only supports x86 CPUs. For these reasons the developers at Google chose to build their JIT rather than work to improve the existing solution (they also chose not to use one of the alternative Python VMs, I'll be discussing these in another post).

I just said that Unladen Swallow looked to build their own JIT, but that's not entirely true. The developers have chosen not to develop their own JIT (meaning their own assembly generator, and register allocator, and optimizer, and everything else that goes along with a JIT), they have instead chosen to utilize the LLVM (Low Level Virtual Machine) JIT for all the code generation. What this means is that instead of doing all the work I've alluded the devs can instead translate the CPython bytecode into LLVM IR (intermediate representation) and then use LLVM's existing JIT infrastructure to do some of the heavy lifting. This gives the devs more time to focus on the interesting work of how to optimize the Python language.

Now that I've layed out the background I'm going to dive into what exactly it is that Unladen Swallow does. Right now the CPython virtual machine looks something like this:

for opcode in opcodes:
    if opcode == BINARY_ADD:
        x, y = POP(), POP()
        z = x + y
        PUSH(z)
    elif opcode == JUMP_ABSOLUTE:
        pc = OPARG()
    # ...

This is both hugely simplified and translated into a Pythonesque psuedocode, but hopefully it makes the point clear, right now the CPython VM runs through the opcodes and based on what the opcode is executes some C code. This is particularly inefficient because there is a fairly substantial overhead to actually doing the dispatch on the opcode. What Unladen Swallow does is count the number of times a given Python function is called (the heuristic is actually slightly more complicated than this, but it's a good approximation of what happens), and when it reaches 10000 (the same value the JVM uses) it stops to compile the function using LLVM. Here what it does is essentially unrolls the interpreter loop, into the LLVM IR. So if you had the bytecode:

BINARY_ADD

Unladen Swallow would generate code like:

x, y = POP(), POP()
z = x + y
PUSH(z)

This eliminates all of the overhead of the large loop in the interpreter. Unladen Swallow also performs a number of optimizations based on Python's semantics, but I'll be getting into those in another post, for now LLVM run it's optimizers, which can improve the generated code somewhat, and then CPython executes the generated function. Now whenever this function is called in the future the optimized, assembly version of it is called.

This concludes the introduction to Unladen Swallow. Hopefully you've learned something about the CPython VM, Unladen Swallow, or virtual machines in general. In future posts I'm going to be diving in to some of the optimizations Unladen Swallow does, as well as what other players are doing in this space (particularly PyPy).

You can find the rest here. There are view comments.

A Second Look at Inheritance and Polymorphism with Django

Posted February 10th, 2009. Tagged with django, orm, models, python, metaclass, internals.

Previously I wrote about ways to handle polymorphism with inheritance in Django's ORM in a way that didn't require any changes to your model at all(besides adding in a mixin), today we're going to look at a way to do this that is a little more invasive and involved, but also can provide much better performance. As we saw previously with no other information we could get the correct subclass for a given object in O(k) queries, where k is the number of subclasses. This means for a queryset with n items, we would need to do O(nk) queries, not great performance, for a queryset with 10 items, and 3 subclasses we'd need to do 30 queries, which isn't really acceptable for most websites. The major problem here is that for each object we simply guess as to which subclass a given object is. However, that's a piece of information we could know concretely if we cached it for later usage, so let's start off there, we're going to be building a mixin class just like we did last time:

from django.db import models

class InheritanceMixIn(models.Model):
    _class = models.CharField(max_length=100)

    class Meta:
        abstract = True

So now we have a simple abstract model that the base of our inheritance trees can subclass that has a field for caching which subclass we are. Now let's add a method to actually cache it and retrieve the subclass:

from django.db import models
from django.db.models.fields import FieldDoesNotExist
from django.db.models.related import RelatedObject

class InheritanceMixIn(models.Model):
    ...
    def save(self, *args, **kwargs):
        if not self.id:
            parent = self._meta.parents.keys()[0]
            subclasses = parent._meta.get_all_related_objects()
            for klass in subclasses:
                if isinstance(klass, RelatedObject) and klass.field.primary_key \
                    and klass.opts == self._meta:
                    self._class = klass.get_accessor_name()
                    break
        return super(InheritanceMixIn, self).save(*args, **kwargs)

    def get_object(self):
        try:
            if self._class and self._meta.get_field_by_name(self._class)[0].opts != self._meta:
                return getattr(self, self._class)
        except FieldDoesNotExist:
            pass
        return self

Our save method is where all the magic really happens. First, we make sure we're only doing this caching if it's the first time a model is being saved. Then we get the first parent class we have (this means this probably won't play nicely with multiple inheritance, that's unfortunate, but not as common a usecase), then we get all the related objects this class has(this includes the reverse relationship the subclasses have). Then for each of the subclasses, if it is a RelatedObject, and it is a primary key on it's model, and the class it points to is the same as us then we cache the accessor name on the model, break out, and do the normal save procedure.

Our get_object function is pretty simple, if we have our class cached, and the model we are cached as isn't of the same type as ourselves we get the attribute with the subclass and return it, otherwise we are the last descendent and just return ourselves. There is one(possible quite large) caveat here, if our inheritance chain is more than one level deep(that is to say our subclasses have subclasses) then this won't return those objects correctly. The class is actually cached correctly, but since the top level object doesn't have an attribute by the name of the 2nd level subclass it doesn't return anything. I believe this can be worked around, but I haven't found a way yet. One idea would be to actually store the full ancestor chain in the CharField, comma separated, and then just traverse it.

There is one thing we can do to make this even easier, which is to have instances automatically become the correct subclass when they are pulled in from the DB. This does have an overhead, pulling in a queryset with n items guarantees O(n) queries. This can be improved(just as it was for the previous solution) by ticket #7270 which allows select_related to traverse reverse relationships. In any event, we can write a metaclass to handle this for us automatically:

from django.db import models
from django.db.models.base import ModelBase
from django.db.models.fields import FieldDoesNotExist
from django.db.models.related import RelatedObject

class InheritanceMetaclass(ModelBase):
    def __call__(cls, *args, **kwargs):
        obj = super(InheritanceMetaclass, cls).__call__(*args, **kwargs)
        return obj.get_object()

class InheritanceMixIn(models.Model):
    __metaclass__ = InheritanceMetaclass
    ...

Here we've created a fairly trivial metaclass that subclasses the default one Django uses for it's models. The only method we've written is __call__, on a metalcass what __call__ does is handle the instantiation of an object, so it would call __init__. What we do is do whatever the default __call__ does, so that we get an instances as normal, and then we call the get_object() method we wrote earlier and return it, and that's all.

We've now looked at 2 ways to handle polymorphism, with this way being more efficient in all cases(ignoring the overhead of having the extra charfield). However, it still isn't totally efficient, and it fails in several edge cases. Whether automating the handling of something like this is a good idea is something that needs to be considered on a project by project basis, as the extra queries can be a large overhead, however, they may not be avoidable in which case automating it is probably advantages.

You can find the rest here. There are view comments.

New Admin URLs

Posted January 14th, 2009. Tagged with django, internals.

I'm very happy to announce that with revision 9739 of Django the admin now uses normal URL resolvers and its URLs can be reversed. This is a tremendous improvement over the previous ad hoc system and it gives users the distinct advantage of being able to reverse the admin's URLs. However, in order to make this work a new feature went into the URL resolving system that any user can use in their own code.

Specifically users can now have objects which provide URLs. Basically this is because include() can now accept any iterable, rather than just a string which points to a urlconf. To get an idea of what this looks like you can look at the way the admin now does it.

There are going to be a few more great additions to Django going in as we move towards the 1.1 alpha so keep an eye out for them.

You can find the rest here. There are view comments.

Playing with Polymorphism in Django

Posted December 5th, 2008. Tagged with django, orm, models, python, internals.

One of the most common requests of people using inheritance in Django, is to have the a queryset from the baseclass return instances of the derives model, instead of those of the baseclass, as you might see with polymorphism in other languages. This is a leaky abstraction of the fact that our Python classes are actually representing rows in separate tables in a database. Django itself doesn't do this, because it would require expensive joins across all derived tables, which the user probably doesn't want in all situations. For now, however, we can create a function that given an instance of the baseclass returns an instance of the appropriate subclass, be aware that this will preform up to k queries, where k is the number of subclasses we have.

First let's set up some test models to work with:

from django.db import models

class Place(models.Model):
    name = models.CharField(max_length=50)

    def __unicode__(self):
        return u"%s the place" % self.name


class Restaurant(Place):
    serves_pizza = models.BooleanField()

    def __unicode__(self):
        return "%s the restaurant" % self.name

class Bar(Place):
    serves_wings = models.BooleanField()

    def __unicode__(self):
        return "%s the bar" % self.name

These are some fairly simple models that represents a common inheritance pattern. Now what we want to do is be able to get an instance of the correct subclass for a given instance of Place. To do this we'll create a mixin class, so that we can use this with other classes.

class InheritanceMixIn(object):
    def get_object(self):
        ...

class Place(models.Model, InheritanceMixIn):
    ...

So what do we need to do in our get_object method? Basically we need to loop each of the subclasses, try to get the correct attribute and return it if it's there, if none of them are there, we should just return ourself. We start by looping over the fields:

class InheritanceMixIn(object):
    def get_object(self):
        for f in self._meta.get_all_field_names():
            field = self._meta.get_field_by_name(f)[0]

_meta is where Django stores lots of the internal data about a mode, so we get all of the field names, this includes the names of the reverse descriptors that related models provide. Then we get the actual field for each of these names. Now that we have each of the fields we need to test if it's one of the reverse descriptors for the subclasses:

from django.db.models.related import RelatedObject

class InheritanceMixIn(object):
    def get_object(self):
        for f in self._meta.get_all_field_names():
            field = self._meta.get_field_by_name(f)[0]
            if isinstance(field, RelatedObject) and field.field.primary_key:

We first test if the field is a RelatedObject, and if it we see if the field on the other model is a primary key, which it will be if it's a subclass(or technically any one to one that is a primary key). Lastly we need to find what the name of that attribute is on our model and to try to return it:

class InheritanceMixIn(object):
    def get_object(self):
        for f in self._meta.get_all_field_names():
            field = self._meta.get_field_by_name(f)[0]
            if isinstance(field, RelatedObject) and field.field.primary_key:
                try:
                    return getattr(self, field.get_accessor_name())
                except field.model.DoesNotExist:
                    pass
        return self

We try to return the attribute, and if it raises a DoesNotExist exception we move on to the next one, if none of them return anything, we just return ourself.

And that's all it takes. This won't be super efficient, since for a queryset of n objects, this will take O(n*k) given k subclasses. Ticket 7270 deals with allowing select_related() to work across reverse one to one relations as well, which will allow one to optimise this, since the subclasses would already be gotten from the database.

You can find the rest here. There are view comments.

Fixing up our identity mapper

Posted December 1st, 2008. Tagged with django, foreignkey, orm, models, internals.

The past two days we've been looking at building an identity mapper in Django. Today we're going to implement some of the improvements I mentioned yesterday.

The first improvement we're going to do is it have it execute the query as usual and just cache the results, to prevent needing to execute additional queries. This means changing the __iter__ method on our queryset class:

def __iter__(self):
    for obj in self.iterator():
        try:
            yield get_from_cache(self.model, obj.pk)
        except KeyError:
            cache_instance(obj)
            yield obj

Now we just iterate over self.iterator() which is a slightly lower level interface to a querysets iteration, it bypasses all the caching that occurs(this means that for now at least, if we iterate over our queryset twice we actually execute two queries, whereas Django would normally do just one), however overall this will be a big win, since before if an item wasn't in the cache we would do an extra query for it.

The next improvement I proposed was to use Django's built in caching interfaces. However, this won't work, this is because the built in locmem cache backend pickles and unpickles everything before caching and retrieving everything from the cache, so we'd end up with different objects(which defeats the point of this).

The last improvement we can make is to have this work on related objects for which we already know the primary key. The obvious route to do this is to start hacking in django.db.models.fields.related, however as I've mentioned in a previous post this area of the code is a bit complex, however if we know a little bit about how this query is executed we can do the optimisation in a far simpler way. As it turns out the related object descriptor simply tries to do the query using the default manager's get method. Therefore, we can simply special case this occurrence in order to optimise this. We also have to make a slight chance to our manager, as by default the manager won't be used on related object queries:

class CachingManager(Manager):
    use_for_related_fields = True
    def get_query_set(self):
        return CachingQuerySet(self.model)

class CachingQueryset(QuerySet):
    ...
    def get(self, *args, **kwargs):
        if len(kwargs) == 1:
            k = kwargs.keys()[0]
            if k in ('pk', 'pk__exact', '%s' % self.model._meta.pk.attname, '%s__exact' % self.model._meta.pk.attname):
                try:
                    return get_from_cache(self.model, kwargs[k])
                except KeyError:
                    pass
        clone = self.filter(*args, **kwargs)
        objs = list(clone[:2])
        if len(objs) == 1:
            return objs[0]
        if not objs:
            raise self.model.DoesNotExist("%s matching query does not exist."
                             % self.model._meta.object_name)
        raise self.model.MultipleObjectsReturned("get() returned more than one %s -- it returned %s! Lookup parameters were %s"
                % (self.model._meta.object_name, len(objs), kwargs))

As you can see we just add one line to the manager, and a few lines to the begging of the get() method. Basically our logic is if there is only one kwarg to the get() method, and it is a query on the primary key of the model, we try to return our cached instance. Otherwise we fall back to executing the query.

And with this we've improved the efficiency of our identity map, there are almost definitely more places for optimisations, but now we have an identity map in very few lines of code.

You can find the rest here. There are view comments.

A Few More Thoughts on the Identity Mapper

Posted December 1st, 2008. Tagged with django, foreignkey, orm, models, internals.

It's late, and I've my flight was delayed for several hours so today is going to be another quick post. With that note here are a few thoughts on the identity mapper:
  • We can optimize it to actually execute fewer queries by having it run the query as usual, and then use the primary key to check the cache, else cache the instance we already have.
  • As Doug points out in the comments, there are built in caching utilities in Django we should probably be taking advantage of. The only qualification is that whatever cache we use needs to be in memory and in process.
  • The cache is actually going to be more efficient than I originally thought. On a review of the source the default manager is used for some related queries, so our manager will actually be used for those.
  • The next place to optimize will actually be on single related objects(foreign keys and one to ones). That's because we already have their primary key and so we can check for them in the cache without executing any SQL queries.

And lastly a small note. As you may have noticed I've been doing the National Blog Everyday for a Month Month, since I started two days late, I'm going to be continueing on for another two days.

You can find the rest here. There are view comments.

And now for the disclaimer

Posted November 14th, 2008. Tagged with django, python, disclaimer, internals.

I've been discussing portions of how the Django internals work, and this is powerful knowledge for a Django user. However, it's also internals, and unless they are documented internals are not guaranteed to continue to work. That doesn't mean they break very frequently, they don't, but you should be aware that it has no guarantee of compatibility going forward.

Having said that, I've already discussed ways you can use this to do powerful things by using these, and you've probably seen other ways to use these in your own code. In my development I don't really balk at the idea of using the internals, because I track Django's development very aggressively, and I can update my code as necessary, but for a lot of developers that isn't really an options. Once you deploy something you need it to work, so your options are to either lock your code at a specific version of Django, or not using these internals. What happens if you want to update to Django 1.1 for aggregation support, but 1.1 also removed some internal helper function you were using. Something similar to this happened to django-tagging, before the queryset-refactor branch was merged into trunk there was a helper function to parse portions of the query, and django-tagging made use of this. However, queryset-refactor obsoleted this function and removed, and so django-tagging had to update in order to work going forward, needing to either handle this situation in the code itself, or to maintain two separate branches.

In my opinion, while these things may break, they are worth using if you need them, because they let you do very powerful things. This may not be the answer for everyone though. In any event I'm going to continue writing about them, and if they interest you Marty Alchin has a book coming out, named Pro Django, that looks like it will cover a lot of these.

You can find the rest here. There are view comments.