DerekSchaefer.NET I do stuff, you read about it!

7Jun/110

Nginx + Django = Yay!

A couple of weeks ago I replaced Apache with Nginx on all our servers hosting Django apps. The first thing I noticed was how much simpler the setup and configuration process was. The installation was easier, the configuration syntax is easier, and getting it to run in an existing Django environment was easier. So, pretty much everything was easier.

The next thing I noticed was the lower memory footprint and CPU usage under load. I don't have hard numbers yet, but the difference is significant. It also just feels faster. I can't substantiate that myself yet, although other, more technical reviews and comparisons point the same way. Nevertheless, I plan on doing a more scientific evaluation, mainly of the number of requests per second that each server can sustain on the same hardware and configuration.

Another piece of software that becomes useful when deploying Django on Nginx is a FastCGI server, conveniently available in the python-flup package (sudo apt-get install python-flup). It requires essentially zero configuration, although a control script like the one below makes it more convenient to manage.

Here is my shell script for managing the FastCGI server:

#!/bin/bash

CWD=$(cd `dirname $0` && pwd)

MYAPP=test_app
PIDFILE=/tmp/${MYAPP}_fcgi.pid
HOST=127.0.0.1
PORT=8080

# Associate it with the settings file
#SETTINGS=
# Use a socket instead of host/port
#SOCKET=
# Maximum requests for a child to service before expiring
#MAXREQ=
# Spawning method - prefork or threaded
METHOD=prefork
# Maximum number of children to have idle
MAXSPARE=5
# Minimum number of children to have idle
MINSPARE=5
# Maximum number of children to spawn
MAXCHILDREN=10

cd "`dirname $0`"

function failure () {
  STATUS=$?;
  echo; echo "Failed $1 (exit code ${STATUS}).";
  exit ${STATUS};
}

function start_server () {
  ./manage.py runfcgi pidfile=$PIDFILE \
    ${HOST:+host=$HOST} \
    ${PORT:+port=$PORT} \
    ${SOCKET:+socket=$SOCKET} \
    ${SETTINGS:+--settings=$SETTINGS} \
    ${MAXREQ:+maxrequests=$MAXREQ} \
    ${METHOD:+method=$METHOD} \
    ${MAXSPARE:+maxspare=$MAXSPARE} \
    ${MINSPARE:+minspare=$MINSPARE} \
    ${MAXCHILDREN:+maxchildren=$MAXCHILDREN} \
    ${DAEMONISE:+daemonize=True}
}

function stop_server () {
  kill `cat $PIDFILE` || failure "stopping fcgi"
  rm $PIDFILE
}

DAEMONISE=$2

case "$1" in
  start)
    echo -n "Starting fcgi: "
    [ -e $PIDFILE ] && { echo "PID file exists."; exit; }
    start_server || failure "starting fcgi"
    echo "Done."
    ;;
  stop)
    echo -n "Stopping fcgi: "
    [ -e $PIDFILE ] || { echo "No PID file found."; exit; }
    stop_server
    echo "Done."
    ;;
  poll)
    [ -e $PIDFILE ] && exit;
    start_server || failure "starting fcgi"
    ;;
  restart)
    echo -n "Restarting fcgi: "
    # Only try to stop the server if there is a PID file to read
    if [ -e $PIDFILE ]; then stop_server; else echo -n "No PID file found..."; fi
    start_server || failure "restarting fcgi"
    echo "Done."
    ;;
  *)
    echo "Usage: $0 {start|stop|poll|restart} [--daemonise]"
    ;;
esac

exit 0
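
One thing worth noting about the poll action above: it only checks whether the PID file exists, not whether the process behind it is still alive. Below is a minimal Python sketch of a stricter check, assuming a Unix-like system. The helper names pid_is_running and server_is_running are my own for illustration and are not part of Django or flup.

```python
import errno
import os


def pid_is_running(pid):
    """Return True if a process with this PID currently exists."""
    if pid <= 0:
        # 0 and negative values address process groups, not one process
        return False
    try:
        os.kill(pid, 0)  # signal 0 only checks; nothing is delivered
    except OSError as exc:
        # EPERM: the process exists but belongs to another user
        return exc.errno == errno.EPERM
    return True


def server_is_running(pidfile):
    """Return True only if the PID file exists and its process is alive."""
    try:
        with open(pidfile) as handle:
            pid = int(handle.read().strip())
    except (IOError, ValueError):
        # No PID file, or contents that are not a number
        return False
    return pid_is_running(pid)
```

A cron job could call server_is_running('/tmp/test_app_fcgi.pid') and only start the server when it returns False, which also covers a stale PID file left behind by a crash.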

And here is a basic Nginx config that hooks into FastCGI (so much simpler than Apache!):

server {
        listen 80 default;
        server_name localhost;

        access_log /var/log/nginx/localhost.access.log;

        location / {
                root /var/www/nginx-default;
                index index.html index.htm;
        }

        location /django {
                fastcgi_pass 127.0.0.1:8080;
                fastcgi_param PATH_INFO $fastcgi_script_name;
                fastcgi_param REQUEST_METHOD $request_method;
                fastcgi_param QUERY_STRING $query_string;
                fastcgi_param CONTENT_TYPE $content_type;
                fastcgi_param CONTENT_LENGTH $content_length;
                fastcgi_param REMOTE_ADDR $remote_addr;
                fastcgi_param SERVER_PORT $server_port;
                fastcgi_param SERVER_PROTOCOL $server_protocol;
                fastcgi_pass_header Authorization;
                fastcgi_intercept_errors off;
        }

        location /site_media/ {
                alias /home/ubuntu/cbot/media/;
                access_log off;
                expires modified +24h;
        }

        location /admin_media/ {
                alias /usr/share/pyshared/django/contrib/admin/media/;
                access_log off;
                expires modified +24h;
        }
}
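
To get a feel for what the fastcgi_param lines amount to on the Python side, here is a rough sketch of a tiny WSGI app that echoes those variables back. It assumes, as is my understanding of flup, that the FastCGI parameters end up as keys in the WSGI environ. The names echo_app and call_app are made up for this example, and the app is called directly rather than through Nginx.

```python
from wsgiref.util import setup_testing_defaults

# The variables the Nginx config above passes along with fastcgi_param
FORWARDED_KEYS = (
    "PATH_INFO", "REQUEST_METHOD", "QUERY_STRING", "CONTENT_TYPE",
    "CONTENT_LENGTH", "REMOTE_ADDR", "SERVER_PORT", "SERVER_PROTOCOL",
)


def echo_app(environ, start_response):
    """WSGI app that lists the forwarded variables as plain text."""
    lines = ["%s=%s" % (key, environ.get(key, "")) for key in FORWARDED_KEYS]
    body = "\n".join(lines).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(app, environ):
    """Call a WSGI app in-process and return (status, body text)."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    setup_testing_defaults(environ)  # fills in only the keys that are missing
    body = b"".join(app(environ, start_response))
    return captured["status"], body.decode("utf-8")
```

Calling call_app(echo_app, {"PATH_INFO": "/django/polls/"}) is a quick way to see which values a view would receive for a given request.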

22Apr/110

Django Caching

Caching is a great way to dramatically improve performance, and Django makes it wonderfully straightforward.

All of my examples are going to use memcached, as it is by far the most efficient option and also one of the easiest to set up. I'm also going to assume that you're running your Django site on Linux (if not, why not!?), specifically Debian or any variant thereof.

First, you need to install memcached:

sudo apt-get install memcached

That will also take care of most of the setup for you. Now let's run it:

memcached -d -m 2048 -u root

The -d option tells it to run as a daemon (in the background), -m sets how many megabytes of memory to allocate for the cache, and -u sets which user to run as. There are many more options than this, and you should probably create a dedicated user for it rather than running as root, but this is the easiest way to get it up and going.

Now, we need to install a Python library to interface with memcached. The best option that I've found thus far is python-memcache. Time for a little apt-get:

sudo apt-get install python-memcache

...and now you're good to go! Let's get to it.

First things first: you've got to add a single entry to your Django settings file (so difficult!):

# Use memcache on the server (much more efficient), local memory caching in dev
CACHE_BACKEND = 'memcached://127.0.0.1:11211/' if SERVER else 'locmem://'

This directive tells Django to use the memcached instance running on localhost (change if you need to) at the default port, 11211. I have the cache backend set dynamically because I don't use memcached on my development machine. There I use basic local-memory caching instead, which is not nearly as efficient (you shouldn't use it on a server) but works just the same. Great for testing.
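
The SERVER flag has to come from somewhere. Here is one possible sketch, assuming you mark production machines with an environment variable. The variable name DJANGO_ON_SERVER and the helper choose_cache_backend are hypothetical, so pick whatever fits your deployment.

```python
import os


def choose_cache_backend(on_server):
    """Return the cache backend URI for the given environment."""
    if on_server:
        # memcached on localhost at its default port
        return 'memcached://127.0.0.1:11211/'
    # Local-memory cache, fine for development only
    return 'locmem://'


# Hypothetical convention: production machines export DJANGO_ON_SERVER=1
SERVER = os.environ.get('DJANGO_ON_SERVER') == '1'
CACHE_BACKEND = choose_cache_backend(SERVER)
```

Keeping the decision in one small function makes it easy to check both branches without touching the environment.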

Now that the easy stuff is out of the way, it's time to take a look at how caching in Django actually functions. There are four different methods of caching, listed here in order of increasing complexity:

  1. Site-wide
  2. Per-view
  3. Template fragment
  4. Low-level

Let's take a look at the details...

Site-Wide Caching

Site-wide caching will automagically cache EVERY page that doesn't have GET or POST variables. Be careful! This can have adverse effects in many cases, and I find it to be a bit heavy-handed, but I suppose there are some cases where this would be very useful.

In order to enable site-wide caching, you must add a few more entries to your settings:

# This tells the cache middleware how long (in seconds) to hold each page in the cache
CACHE_MIDDLEWARE_SECONDS = 60
# And this sets the cache key prefix, which is useful if you are
# running many things on the same memcached instance
CACHE_MIDDLEWARE_KEY_PREFIX = ''

# You know what this is...
MIDDLEWARE_CLASSES = (
    ...
    'django.middleware.cache.UpdateCacheMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.middleware.cache.FetchFromCacheMiddleware',
    ...
)

Doing this also has the added benefit of setting the HTTP caching headers (such as Expires and Cache-Control) automagically using your CACHE_MIDDLEWARE_SECONDS value.
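
To give a rough idea of what those headers look like, here is a small sketch that derives Expires and Cache-Control from a timeout in seconds. This is only an illustration using the standard library, not Django's own code, and the function name cache_headers is made up. The real middleware sets other headers as well.

```python
import time
from email.utils import formatdate


def cache_headers(timeout_seconds, now=None):
    """Build Expires and Cache-Control values for a cache timeout."""
    if now is None:
        now = time.time()
    return {
        # HTTP dates are always expressed in GMT
        'Expires': formatdate(now + timeout_seconds, usegmt=True),
        'Cache-Control': 'max-age=%d' % timeout_seconds,
    }
```

With CACHE_MIDDLEWARE_SECONDS = 60, a browser or proxy that honors these headers would keep the page for up to a minute before asking the server again.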

And now your website has caching!

Per-View Caching

This option is quite a bit more flexible than site-wide caching. It will also cache based on the URL, so a call to "/my_super_cached_view/1/" will have a different cache entry than "/my_super_cached_view/2/".

You can use it as a decorator:

from django.views.decorators.cache import cache_page

@cache_page(5 * 60) # cache for 5 minutes
def my_super_cached_view(request):
    ...

...or you can set it in your URLs file with the same import:

urlpatterns = patterns('',
    ...
    (r'^my_super_cached_view/(\d{4})/$', cache_page(5 * 60)(my_super_cached_view)),
    ...
)
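
If the per-URL behaviour sounds abstract, this toy decorator shows the idea in plain Python. It is not how Django implements cache_page, just a minimal sketch that keeps one entry per path in a dictionary. The names cache_by_path, my_toy_view and calls are made up for the example.

```python
import functools
import time


def cache_by_path(timeout_seconds, clock=time.time):
    """Cache a view's result separately for each path it is called with."""
    def decorator(view):
        store = {}  # path -> (expires_at, response)

        @functools.wraps(view)
        def wrapper(path):
            entry = store.get(path)
            if entry is not None and entry[0] > clock():
                return entry[1]  # still fresh, skip the view entirely
            response = view(path)
            store[path] = (clock() + timeout_seconds, response)
            return response
        return wrapper
    return decorator


calls = []  # records which paths actually reached the view


@cache_by_path(5 * 60)  # cache for 5 minutes
def my_toy_view(path):
    calls.append(path)
    return 'rendered %s' % path
```

Requesting the same path twice within five minutes runs the view once, while a different path gets its own entry.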

Template Fragment Caching

This is the caching option that I have gotten the least mileage out of. Other people may have more use for it, but my development style doesn't generally favor it.

Within your template, import the cache library:

{% load cache %}
{% cache 120 footer request.user.username %}
    ...
{% endcache %}

In this example, 120 is the number of seconds, footer is the name of the cache fragment, and the optional third argument makes whatever is inside the block cache uniquely for each user.
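
The reason the extra argument gives each user their own copy is that it becomes part of the cache key. Here is a minimal sketch of that idea. The key format and the function name fragment_cache_key are my own for illustration and are not Django's exact scheme.

```python
import hashlib


def fragment_cache_key(fragment_name, vary_on=()):
    """Combine a fragment name and its vary-on values into one key."""
    joined = ':'.join(str(value) for value in vary_on)
    # Hash the values so the key stays short and free of odd characters
    digest = hashlib.sha256(joined.encode('utf-8')).hexdigest()
    return 'fragment.%s.%s' % (fragment_name, digest)
```

Two different usernames produce two different keys for the same footer fragment, while the same username always maps back to the same entry.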

Low-Level Caching

This method is the one that I've used the most, as it is by far the most flexible. It is still incredibly simple and not very "low-level" in my opinion, but I suppose everything is relative! It allows you to completely specify under which conditions to save to and pull from the cache, unlike the previous methods, which vary based on the URL or a single variable, like the active user.

Here's an adapted example of how I've used this method to great success:

from django.core.cache import cache
from django.http import HttpResponse
from django.template.loader import render_to_string

def my_view(request):
    # Cache per user and account
    account = request.session['account']
    user = account.holder.username
    key = 'key_prefix-%s-%s' % (user, account.code)
    # Is it in the cache? cache.get() returns None on a miss
    cached_html = cache.get(key)
    if cached_html is None:
        # If not, then let's render the HTML and save it
        cached_html = render_to_string(
            'templates/my_view.html',
            {'foo': expensive_per_user_function(user)}
        )
        cache.set(key, cached_html, time_until_midnight())
    return HttpResponse(cached_html)

In this example, the rendering of a certain section of HTML was very expensive but varied in such a way that using template fragment caching was inconvenient. As you can see, there are really only two methods, cache.get() and cache.set(). Quite simple!
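
The get-then-set pattern can be pulled out into a small helper. Below is a minimal sketch that runs without Django: DictCache is a made-up stand-in that mimics only the get() and set() calls used here (and ignores timeouts), and get_or_compute is a hypothetical helper name.

```python
class DictCache(object):
    """Stand-in for django.core.cache.cache; ignores timeouts."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value, timeout=None):
        self._data[key] = value


def get_or_compute(cache, key, compute, timeout):
    """Return the cached value for key, computing and storing it on a miss."""
    value = cache.get(key)
    if value is None:  # a miss; note that a cached None is also recomputed
        value = compute()
        cache.set(key, value, timeout)
    return value
```

With the real Django cache object passed in, the view body would shrink to a single get_or_compute call wrapped around the render_to_string step.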

You control the cache by specifying the key, which here is unique for each user viewing a custom account object. It just so happened that the data I cached here updated at midnight each night, so I used a utility function called time_until_midnight() that returns the number of seconds until midnight, which you can see here:

import datetime

def time_until_midnight():
    """ Returns the number of whole seconds until the next midnight """
    # Read the clock once so the date and the time cannot disagree
    now = datetime.datetime.now()
    tomorrow = now.date() + datetime.timedelta(1)
    midnight = datetime.datetime(tomorrow.year, tomorrow.month, tomorrow.day)
    difference = midnight - now
    return int(difference.total_seconds())
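
A helper like this is easier to sanity-check if the current time is passed in as an argument. Here is a small sketch of that variation. The name seconds_until_midnight and the now parameter are my own additions for illustration.

```python
import datetime


def seconds_until_midnight(now):
    """Return the whole seconds between now and the next midnight."""
    tomorrow = now.date() + datetime.timedelta(days=1)
    midnight = datetime.datetime.combine(tomorrow, datetime.time.min)
    return int((midnight - now).total_seconds())


# Thirty seconds before midnight
late = datetime.datetime(2011, 4, 22, 23, 59, 30)
print(seconds_until_midnight(late))  # prints 30
```

At exactly midnight it returns 86400, a full day, rather than 0, which matters if your cache backend treats a timeout of 0 specially.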

Done and done!

Caching is your friend.

30Jul/102

Web Development Using Django, a Python-Based Framework

Recently, I've been using Django at work to develop a corporate web application. This is the first time that I have used Django for anything this extensive, and it has been a positively pleasant and, what is probably more important, incredibly productive experience.

Some descriptive excerpts from Wikipedia:

Django is an open source web application framework, written in Python, which follows the model-view-controller (MVC) architectural pattern.

Django's primary goal is to ease the creation of complex, database-driven websites. Django emphasizes reusability and "pluggability" of components, rapid development, and the principle of DRY (Don't Repeat Yourself). Python is used throughout, even for settings, files, and data models.

...and Django is quite good at those things. And of course, being written in Python (and using the mod_wsgi Apache module), it is also very fast and efficient.

Additionally, if you are looking for a good CMS for Django, you will likely find, unsurprisingly, Django CMS. It is a very elegant solution that is very extensible and easy to set up. I highly recommend it.