Friday, June 15, 2007

Safer Serialization

The need for a secure / safe serialization module for the built-in Python types has reared its head again.

After looking around at the alternatives, I decided I should simply update my 2005-era gherkin module and make it an easily installable package. When the cheeseshop comes back online I'll do the upload.

I've added support for sets and complex numbers. I even tried to make it faster, but ended up in defeat. I had forgotten how much time I had already spent optimizing the thing... My 2007 brain could not best my 2005 brain... hmmm, must be getting old.

In other news, the new Super Ajax-ified Media Widget (Scouta Play) went live earlier this week on the front page of It doesn't use gherkin, it uses json. :-) Band of None have also released a new tune, Hoffburger.


Seo Sanghyeon said...

How does it compare with Cerealizer?

Simon Wittber said...

Cerealizer handles circular references (unlike marshal, gherkin, xmlrpclib), and appears to also handle instances, if the class is first registered.

gherkin uses a binary format, cerealizer uses ascii.

Speed and the size of the output vary considerably between the two modules, depending on the sort of objects being serialized.

The big problem for cerealizer though, is that it fails to work accurately with floats:

>>> from cerealizer import dumps, loads
>>> test = 3.3492934923942394
>>> s = dumps(test)
>>> o = loads(s)
>>> o == test
False

Richard Jones said...

I presume the cerealizer people know about the float issue?

Simon Wittber said...

Yep, emailed this morning. I'm not sure how to solve the problem exactly... because cerealizer is an ascii protocol. Can python output all 8 bytes of a float in string repr?

Richard Jones said...

One of the developers here emailed the cerealizer folk a patch. They were str()ing instead of repr()ing. For those not aware of the difference, try str(1.1) vs. repr(1.1)
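
For anyone reading this on a newer Python: a rough sketch of why the str()/repr() difference matters for round-tripping. It assumes Python 3, where str() and repr() of a float give the same text, so the old 12-significant-digit str() behaviour is imitated with format(x, ".12g"). The helper names roundtrip_short and roundtrip_repr are made up for this illustration and are not part of cerealizer or gherkin.

```python
# Sketch only: compare a lossy 12-digit text form with repr() for floats.
# format(x, ".12g") stands in for the old Python 2 str() of a float.

def roundtrip_short(x):
    """Write the float with 12 significant digits, then parse it back."""
    return float(format(x, ".12g"))

def roundtrip_repr(x):
    """Write the float with repr(), then parse it back."""
    return float(repr(x))

test = 3.3492934923942394
lossy = roundtrip_short(test)
exact = roundtrip_repr(test)

print(format(test, ".12g"))  # the truncated text form
print(lossy == test)         # False: digits were dropped on the way out
print(exact == test)         # True: repr() keeps enough digits to round-trip
```

So an ascii protocol can carry floats exactly, as long as it writes them with repr() rather than a shortened form.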

Simon Wittber said...

*smacks forehead*
