Wellfire Interactive // Expertise for established Django SaaS applications

Create Human Readable URLs in Django That Don't Break

You can use Django's URL routing to create easy-to-read URLs that keep up with content changes.

You probably noticed that the URL of this blog post is made up of a few easy-to-read components. There’s the domain, the /learn/ route, and then the post’s slug: the lowercase title with dashes.

In a typical Django setup, this tells the server to use the blog’s URL configuration, and then to look up a blog post whose slug matches the text that follows. You could use something like blog.asp?post=12 (if you were running ASP… ahem) and the script running there would pull up the blog post with row ID 12. It works, but it’s ugly and not terribly user-friendly. Hence the ‘beautiful’ URLs you see here.

Slugs and slug combinations, like the date components used by many blog engines, provide a unique lookup for the content, which is a tremendous advantage over IDs passed as query parameters. However, and this is key, they’re fragile to content changes. Presuming you care that the slug matches the content in some way, you run the risk of one of two scenarios: a poorly descriptive slug after a content update, or a broken link.

The former has plagued even major publishers after an editorial change to an article, and the latter is pretty self-explanatory.

This is not a problem for sites using plain old IDs as lookups. Thankfully you can have the best of both worlds and guarantee content retrieval without horrible URLs.

Area man reads non-breaking slug-based URLs

When The Onion launched their Django-based refresh back in 2010, I noticed something funky with the article URLs. Instead of the usual slug, like this:

http://www.theonion.com/articles/chimp-in-cocaine-study-starts-lying-to-friends/

the new URLs looked like this:

http://www.theonion.com/articles/chimp-in-cocaine-study-starts-lying-to-friends,17176/

It looks like they had added the ID of the article to the slug. I wasn’t sure why until I started messing around with both parameters and realized that the slug wasn’t used at all to retrieve the articles. You could change the URL from this:

http://www.theonion.com/articles/chimp-in-cocaine-study-starts-lying-to-friends,17176/

to this:

http://www.theonion.com/articles/rich-guy-feeling-left-out-of-recession,17176/

and still get the original article about the coke-addled chimp.

The article ID is what Django uses to retrieve the article. The slug is there for human eyes, but it’s just window dressing. It makes sense to you if you see the address bar, it makes it easier to access the page through your location bar history, and it’s got more search result mojo (or so the SEO mavens keep telling us).

This has become a fairly commonplace strategy in the years since.

Combining auto primary keys with descriptive slugs

The Django answer is to use both. Set up your URL to take a slug and an integer as parameters, but in your view use only the integer primary key to access the article (or whatever object it is you want). The object’s get_absolute_url method inserts both slug and primary key into the URL and voilà: fast, readable URLs.

urls.py

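# url() was removed in Django 4.0; re_path from django.urls takes the same arguments
url(r'^(?P<slug>[-\w]+),(?P<id>\d+)/$', views.article_detail, name='article'),

Or if you’re using Django’s path syntax (Django 2.0 and later):

path('<slug:slug>,<int:id>/', views.article_detail, name='article'),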
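
To see what the routing step hands to the view, here is a small sketch that uses only Python's re module rather than Django itself. The regular expression is equivalent to the one in the url() pattern above (\w already covers digits); the parse_article_path helper is a hypothetical name used for illustration, and the paths are written the way Django would see them after the /articles/ prefix has been stripped.

```python
import re

# Equivalent to the regular expression in the url() pattern above.
ARTICLE_RE = re.compile(r'^(?P<slug>[-\w]+),(?P<id>\d+)/$')


def parse_article_path(path):
    """Return (slug, id) for a matching path, or None if it does not match."""
    match = ARTICLE_RE.match(path)
    if match is None:
        return None
    return match.group('slug'), int(match.group('id'))


# Both slugs parse fine; what the view does with them is a separate question.
print(parse_article_path('chimp-in-cocaine-study-starts-lying-to-friends,17176/'))
print(parse_article_path('rich-guy-feeling-left-out-of-recession,17176/'))
# No comma and ID, so this pattern does not match at all.
print(parse_article_path('chimp-in-cocaine-study-starts-lying-to-friends/'))
```

The slug and the ID arrive as two independent values, which is what lets the view ignore or validate the slug as it sees fit.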

views.py

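from django.shortcuts import get_object_or_404, redirect

from .models import Article


def get_redirected(queryset_or_class, lookups, validators):
    """
    Fetch the object (or raise a 404) and return it together with its
    canonical URL if any validator field differs from the requested value.
    """
    obj = get_object_or_404(queryset_or_class, **lookups)
    for key, value in validators.items():
        # A stale or mistyped slug means the request should be redirected
        if value != getattr(obj, key):
            return obj, obj.get_absolute_url()
    return obj, None


def article_detail(request, slug, id):
    # Only the primary key is used for the lookup; the slug is just validated
    article, article_url = get_redirected(Article, {'pk': id}, {'slug': slug})
    if article_url:
        return redirect(article_url)
    # everything else in your view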
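
The decision the view makes can be sketched without Django at all. In the snippet below, the ARTICLES dict is a hypothetical stand-in for the database table, the /articles/ prefix is borrowed from the Onion example, and resolve and canonical_url are illustrative names rather than anything Django provides. It is a rough model of the behavior, not a replacement for the view.

```python
# Hypothetical stand-in for the Article table: primary key -> current slug.
ARTICLES = {17176: 'chimp-in-cocaine-study-starts-lying-to-friends'}


def canonical_url(article_id):
    """Build the URL the way get_absolute_url would, from the stored slug."""
    return '/articles/{},{}/'.format(ARTICLES[article_id], article_id)


def resolve(slug, article_id):
    """Return (status, location) for a requested slug and ID."""
    if article_id not in ARTICLES:
        return 404, None  # unknown ID: same outcome as get_object_or_404
    if slug != ARTICLES[article_id]:
        return 302, canonical_url(article_id)  # stale or wrong slug: redirect
    return 200, None  # slug is current: render the page


print(resolve('chimp-in-cocaine-study-starts-lying-to-friends', 17176))
print(resolve('rich-guy-feeling-left-out-of-recession', 17176))
print(resolve('anything', 99999))
```

Django's redirect() issues a temporary (302) redirect by default. If old slugs should be retired for good after a title change, passing permanent=True to redirect() may be the better fit.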

models.py

from django.db import models
from django.urls import reverse

class Article(models.Model):  # fields such as slug omitted
    def get_absolute_url(self):
        return reverse('article', kwargs={'slug': self.slug, 'id': self.id})

There are more elegant URL patterns (Stack Overflow uses a standard hierarchical pattern with directory levels, and the ID can also be appended to the slug with a dash), but the concept remains.
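
As a rough illustration of those two alternatives, the sketch below again uses plain re instead of Django's URL resolver. Both patterns, the articles/ prefix, and the extract_id helper are assumptions made for the example; neither is taken from Stack Overflow's or anyone else's actual configuration.

```python
import re

# Hypothetical patterns for the two alternatives; only the ID is meant for lookup.
HIERARCHICAL_RE = re.compile(r'^articles/(?P<id>\d+)/(?P<slug>[-\w]+)/$')
DASH_SUFFIX_RE = re.compile(r'^articles/(?P<slug>[-\w]+)-(?P<id>\d+)/$')


def extract_id(pattern, path):
    """Return the integer ID captured by the pattern, or None if no match."""
    match = pattern.match(path)
    return int(match.group('id')) if match else None


print(extract_id(HIERARCHICAL_RE, 'articles/17176/chimp-in-cocaine-study/'))
print(extract_id(DASH_SUFFIX_RE, 'articles/chimp-in-cocaine-study-17176/'))
# A slug that itself contains digits still splits at the last dash.
print(DASH_SUFFIX_RE.match('articles/top-10-lists-42/').group('slug'))
```

One thing to watch with the dash variant: a slug that ends in digits and arrives without an ID (say articles/top-10/) would be read as slug "top" with ID 10. The comma separator sidesteps that ambiguity because a comma cannot appear in a slug.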