Wednesday, February 07, 2007

RESTful TurboGears

Over the last 6 months, I've been involved with two large projects building web applications in the TurboGears framework.

We've put together some rather complex systems very quickly, and I believe part of our success came from a CherryPy controller that let us map elegant, RESTful URLs onto a sensible controller structure. We made sure the controller code did a minimum of work: most of the time it only manipulated the model via method calls on the model objects.

For the uninitiated, a URL can usually be considered RESTful when its path identifies the resource being fetched (with a GET request) or modified (with a POST request).

An excellent side effect of a RESTful URL scheme is that it encourages clear thinking about your application, and with the right controller class it can accelerate your normal web development cycle beyond rapid! At least, that's the way it happened to me :-)

The Resource class presented below is derived from a class developed for the Scouta project, and the fellows over there have kindly given permission for me to release it. It provides integration with the TurboGears validation framework, and also works with web and browser caches by doing last-modified-date checks when requested.
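To make the last-modified check concrete, here is a minimal, framework-free sketch in modern Python 3 of the decision a server makes for a conditional GET. It is not the code from the Scouta project: the function name conditional_get_status is made up for illustration, and it assumes last_modified is a timezone-aware UTC datetime.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_get_status(if_modified_since, last_modified):
    """Return 304 if the client's cached copy is still current, else 200.

    Hypothetical helper: `last_modified` is assumed to be an aware UTC datetime.
    """
    if if_modified_since is None or last_modified is None:
        return 200
    try:
        client_copy = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        # An unparseable header is treated as if it had not been sent.
        return 200
    if client_copy.tzinfo is None:
        # Headers without a usable zone are read as UTC.
        client_copy = client_copy.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds first.
    if last_modified.replace(microsecond=0) <= client_copy:
        return 304
    return 200


changed = datetime(2007, 2, 7, 9, 30, tzinfo=timezone.utc)
header = format_datetime(changed, usegmt=True)  # what a browser would echo back
print(conditional_get_status(header, changed))
print(conditional_get_status(header, changed + timedelta(seconds=1)))
```

A resource that has not changed since the date the client sends back can be answered with an empty 304, which is what lets browser and proxy caches skip the download.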

The original inspiration for this controller comes from a recipe in the Python Cookbook.



import cherrypy
from turbogears import redirect, expose, error_handler
from datetime import datetime
from time import gmtime, strptime


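def parse_http_date(timestamp_string):
    """Parse any of the three HTTP date formats into a naive UTC datetime."""
    if timestamp_string is None:
        return None
    # The fourth character tells the three formats apart.
    test = timestamp_string[3]
    if test == ',':
        # RFC 1123, e.g. "Sun, 06 Nov 1994 08:49:37 GMT"
        fmt = "%a, %d %b %Y %H:%M:%S GMT"
    elif test == ' ':
        # ANSI C asctime(), e.g. "Sun Nov  6 08:49:37 1994" (month before day)
        fmt = "%a %b %d %H:%M:%S %Y"
    else:
        # RFC 850, e.g. "Sunday, 06-Nov-94 08:49:37 GMT"
        fmt = "%A, %d-%b-%y %H:%M:%S GMT"
    return datetime(*strptime(timestamp_string, fmt)[:6])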


class Resource(object):
    children = {}

    def __init__(self):
        error_function = getattr(self.__class__, 'error', None)
        if error_function is not None:
            # If this class defines an error handling method (self.error),
            # then we should decorate our methods with the TG error_handler.
            self.get = error_handler(error_function)(self.get)
            self.modify = error_handler(error_function)(self.modify)
            self.new = error_handler(error_function)(self.new)

    @classmethod
    def get_child(cls, token):
        return cls.children.get(token, None)

    @expose()
    def default(self, *path, **kw):
        request = cherrypy.request
        path = list(path)
        resource = None
        http_method = request.method.lower()
        # Check the HTTP method is supported.
        try:
            method_name = dict(get='get', post='modify')[http_method]
        except KeyError:
            raise cherrypy.HTTPError(501)

        if not path:  # The request path is to a collection.
            if http_method == 'post':
                # If the method is a post, we call self.create, which returns
                # a class which is passed into the self.new method.
                resource = self.create(**kw)
                assert resource is not None
                method_name = 'new'
            elif http_method == 'get':
                # If the method is a get, call the self.index method, which
                # should list the contents of the collection.
                return self.index(**kw)
            else:
                # Any other methods get rejected.
                raise cherrypy.HTTPError(501)

        if resource is None:
            # If we don't have a resource by now (it wasn't created),
            # then try and load one.
            token = path.pop(0)
            resource = self.load(token)
            if resource is None:
                # No resource found?
                raise cherrypy.HTTPError(404)

        # If we have a path, check if the first token matches one of this
        # class's children.
        if path:
            token = path.pop(0)
            child = self.get_child(token)
            if child is not None:
                child.parent = resource
                # Call down into the child resource.
                return child.default(*path, **kw)
            else:
                raise cherrypy.HTTPError(404)

        if http_method == 'get':
            # If this resource has children, make sure it has a '/'
            # on the end of the URL.
            if getattr(self, 'children', None) is not None:
                if request.path[-1:] != '/':
                    redirect(request.path + "/")
            # If the client already has the response in its cache, check
            # whether we have a newer version; if not, tell the client not
            # to bother.
            modified_check = request.headers.get('If-Modified-Since', None)
            modified_check = parse_http_date(modified_check)
            if modified_check is not None:
                last_modified = self.get_last_modified_date(resource)
                if last_modified is not None:
                    if last_modified <= modified_check:
                        raise cherrypy.HTTPRedirect("", 304)

        # Run the requested method, passing it the resource.
        method = getattr(self, method_name)
        response = method(resource, **kw)
        # Set the last modified date header for the response.
        last_modified = self.get_last_modified_date(resource)
        if last_modified is None:
            last_modified = datetime(*gmtime()[:6])

        cherrypy.response.headers['Last-Modified'] = (
            last_modified.strftime("%a, %d %b %Y %H:%M:%S GMT")
        )

        return response

    def get_last_modified_date(self, resource):
        """
        returns the last modified date of the resource.
        """
        return None

    def index(self, **kw):
        """
        returns the representation of a collection of resources.
        """
        raise cherrypy.HTTPError(403)

    def load(self, token):
        """
        loads and returns a resource identified by the token.
        """
        return None

    def create(self, **kw):
        """
        returns a class or function which will be passed into the self.new
        method.
        """
        raise cherrypy.HTTPError(501)

    def new(self, resource_factory, **kw):
        """
        uses the resource factory to create a resource and commit it to the
        database.
        """
        raise cherrypy.HTTPError(501)

    def modify(self, resource, **kw):
        """
        uses kw to modify the resource.
        """
        raise cherrypy.HTTPError(501)

    def get(self, resource, **kw):
        """
        fetches the resource, and returns a representation of the resource.
        """
        raise cherrypy.HTTPError(501)
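As a quick sanity check of the date handling, here is a standalone Python 3 sketch of parse_http_date run against the three example dates from the HTTP/1.1 specification. It assumes an English (C) locale for the day and month names, and it uses the asctime() layout with the month before the day, which is how that format is defined.

```python
from datetime import datetime
from time import strptime


def parse_http_date(timestamp_string):
    """Standalone sketch: parse an HTTP date into a naive UTC datetime."""
    if timestamp_string is None:
        return None
    test = timestamp_string[3]  # fourth character distinguishes the formats
    if test == ',':
        fmt = "%a, %d %b %Y %H:%M:%S GMT"   # RFC 1123
    elif test == ' ':
        fmt = "%a %b %d %H:%M:%S %Y"        # ANSI C asctime()
    else:
        fmt = "%A, %d-%b-%y %H:%M:%S GMT"   # RFC 850
    return datetime(*strptime(timestamp_string, fmt)[:6])


samples = [
    "Sun, 06 Nov 1994 08:49:37 GMT",
    "Sunday, 06-Nov-94 08:49:37 GMT",
    "Sun Nov  6 08:49:37 1994",
]
parsed = [parse_http_date(sample) for sample in samples]
for sample, value in zip(samples, parsed):
    print(sample, "->", value.isoformat())
```

All three strings describe the same instant, so the three parsed values should compare equal.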


This Resource class looks complicated, but it really makes writing nice URL systems in TurboGears a piece of cake. A contrived example will illustrate best. :-)

The code below demonstrates how to set up two classes that allow users to be listed, individual users viewed, user posts listed, and individual posts viewed, using these URLs:

/users
/users/simon
/users/simon/posts
/users/simon/posts/post_1

Notice how the classes integrate quite nicely with TurboGears validators. You only need to define one error function, and the Resource controller makes sure it gets called if validation fails on any of your get, modify or new method calls. This example uses SQLAlchemy for its model.



from turbogears import expose, validate, validators, identity
# "model" is the application's own SQLAlchemy model module.


class Posts(Resource):
    def load(self, post_id):
        return model.Post.get_by(user_id=self.parent.user_id, post_id=post_id)

    @expose('scouta.templates.post_list')
    def index(self):
        return dict(posts=model.Post.select_by(user_id=self.parent.user_id))

    @expose('scouta.templates.post_get')
    def get(self, post):
        return dict(post=post)


class Users(Resource):
    children = dict(posts=Posts())

    # Resource.default calls self.index for a GET on the collection.
    @expose('scouta.templates.user_list')
    def index(self):
        return dict(users=model.User.select())

    def load(self, user_name):
        return model.User.get_by(user_name=user_name)

    def create(self, **kw):
        return model.User

    def error(self, tg_errors=None):
        return tg_errors

    @validate(validators=dict(
        user_name=validators.PlainText(not_empty=True),
        display_name=validators.UnicodeString(not_empty=True),
        email_address=validators.Email(not_empty=True)
    ))
    @identity.require(identity.not_anonymous())
    def new(self, User, **kw):
        new_user = User(**kw)
        model.session.flush()
        return dict(user=new_user)

    # Overrides the hook Resource uses for the Last-Modified checks.
    def get_last_modified_date(self, user):
        return user.last_modified_date

    @expose('scouta.templates.user_get')
    def get(self, user):
        return dict(user=user)

    @validate(validators=dict(
        display_name=validators.UnicodeString(length=255, if_empty=None),
        email_address=validators.Email(if_empty=None)
    ))
    @identity.require(identity.not_anonymous())
    def modify(self, user, **kw):
        user.display_name = kw['display_name']
        user.email_address = kw['email_address']
        model.session.flush()
        return dict(user=user)
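To see how the four URLs reach these two classes, here is a rough, framework-free sketch of the walk that Resource.default performs for a GET request: path segments alternate between a token that identifies one resource and the name of a child collection. The TREE structure and the resolve function are hypothetical stand-ins written for this illustration and are not part of the controller.

```python
# Hypothetical description of the controller tree:
# collection name -> (controller class name, child collections).
TREE = {"users": ("Users", {"posts": ("Posts", {})})}


def resolve(path, tree=TREE):
    """Return the controller call a GET request for `path` would end in."""
    segments = [s for s in path.split("/") if s]
    controller, children = tree[segments.pop(0)]
    while segments:
        token = segments.pop(0)      # identifies one resource, via load()
        if not segments:
            return "%s.get(%r)" % (controller, token)
        child = segments.pop(0)      # names a child collection
        if child not in children:
            raise LookupError("404: no child %r" % child)
        controller, children = children[child]
    # Nothing left after a collection name: list the collection.
    return "%s.index()" % controller


for url in ("/users", "/users/simon", "/users/simon/posts",
            "/users/simon/posts/post_1"):
    print(url, "->", resolve(url))
```

A path that ends on a collection name lists the collection, and a path that ends on a token fetches a single resource, which is why only load, index and get need to be written for each class.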


You may notice that the Resource class has no support for PUT or DELETE requests. I've intentionally left these out, as they are not well supported across all browsers. Fortunately, we don't need them.

To insert a new user in the above example, simply post to /users/ and the controller will call the new method. If you want to delete, you need to treat your deletes as modify operations, e.g. post to /users/simon and set a delete flag. This is not very elegant, but it is a good compromise, as I've rarely (never!) needed to do a real delete call on a resource.
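One way the delete flag might look is sketched below with plain Python objects instead of a real model. The delete form field, the deleted attribute and the in-memory USERS store are all assumptions made for this illustration.

```python
class User:
    """Stand-in for a model object; `deleted` is a hypothetical flag column."""

    def __init__(self, user_name, display_name):
        self.user_name = user_name
        self.display_name = display_name
        self.deleted = False


# Hypothetical in-memory store standing in for the database.
USERS = {"simon": User("simon", "Simon")}


def load(user_name):
    """Return the user, or None once flagged (Resource.default then sends a 404)."""
    user = USERS.get(user_name)
    if user is None or user.deleted:
        return None
    return user


def modify(user, **kw):
    """Handle POST /users/<name>: a truthy `delete` field flags the user."""
    if kw.get("delete"):
        user.deleted = True
    elif "display_name" in kw:
        user.display_name = kw["display_name"]
    return dict(user=user)


simon = load("simon")
modify(simon, display_name="Si")   # an ordinary edit
modify(simon, delete="1")          # the "delete" is just another modify
print(load("simon"))
```

Because the row is only flagged, the data stays in the database, while load stops returning it and the URL starts answering with a 404.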

1 comment:

Anonymous said...

"An origin server SHOULD return the status code 405 (Method Not Allowed) if the method is known by the origin server but not allowed for the requested resource, and 501 (Not Implemented) if the method is unrecognized or not implemented by the origin server."

You should probably also allow HEAD (CherryPy will delete the body for you).

Also, you should have a look at cherrypy.lib.cptools.validate_since, which should replace all your modified_check code (you just have to set the Last-Modified response header first, then validate, instead of the other way around).

HTH
