Monday, February 7, 2011

Serialization Performance

Last week I stuck my neck out in a meeting and declared that XML is verbose and slow to parse, and that we should move to something like Google's Protocol Buffers, or to something readable such as JSON or YAML, which are easier to parse, and so on. Well, is this really true? The statement seems logical considering how verbose XML can be. Still, after the meeting, some questions stayed on my mind, so I thought I would run some tests. I used a FIX Globex (CME) swap trade confirmation message to test my theory.
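
A comparison along these lines could be set up roughly as follows. This is only a sketch and not the script behind the numbers in this post: it assumes Python 3 and uses the standard-library json, pickle and xml.etree.ElementTree modules in place of cjson, cPickle and cElementTree. The TRADE record, the to_xml/from_xml helpers and the repeats parameter are made up for illustration and are not the actual FIX message or test setup.

```python
import json
import pickle
import timeit
import xml.etree.ElementTree as ET

# Hypothetical stand-in record; NOT the real FIX Globex confirmation message.
TRADE = {
    "msg_type": "confirmation",
    "trade_id": "T-0001",
    "symbol": "EXAMPLE",
    "quantity": "100",
    "price": "1.25",
}


def to_json(record):
    """Serialize to UTF-8 encoded JSON bytes."""
    return json.dumps(record).encode("utf-8")


def from_json(data):
    """Parse JSON bytes back into a dict."""
    return json.loads(data.decode("utf-8"))


def to_xml(record):
    """Serialize a flat dict of strings as <trade><key>value</key>...</trade>."""
    root = ET.Element("trade")
    for key, value in record.items():
        ET.SubElement(root, key).text = value
    return ET.tostring(root)


def from_xml(data):
    """Parse the XML produced by to_xml back into a flat dict."""
    return {child.tag: child.text for child in ET.fromstring(data)}


# Each codec is a (serialize, parse) pair working on bytes.
CODECS = {
    "json": (to_json, from_json),
    "pickle": (pickle.dumps, pickle.loads),
    "xml": (to_xml, from_xml),
}


def benchmark(record, repeats=10000):
    """Return {name: (size_in_bytes, dump_seconds, load_seconds)} per codec."""
    results = {}
    for name, (dump, load) in CODECS.items():
        encoded = dump(record)
        assert load(encoded) == record  # the round trip must be lossless
        dump_time = timeit.timeit(lambda: dump(record), number=repeats)
        load_time = timeit.timeit(lambda: load(encoded), number=repeats)
        results[name] = (len(encoded), dump_time, load_time)
    return results


if __name__ == "__main__":
    for name, (size, dump_time, load_time) in benchmark(TRADE).items():
        print("%-8s %6d %10.4f %10.4f" % (name, size, dump_time, load_time))
```

Each output row has the serialized size, the total time to serialize from Python, and the total time to parse back to Python. Absolute timings will vary with the machine, the library versions and, above all, the shape of the message being tested.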


Format  Library       Size  from Python     to Python
json    cjson         2332  0.222238063812  0.0943419933319
pickle  cPickle       1778  0.233518123627  0.128826141357
XML     cElementTree  2083  0.407706975937  2.77832698822
json    simplejson    2332  3.37723612785   5.11316084862






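So this simple test shows that XML with the cElementTree parser is not as slow as I claimed: it beat simplejson in both directions, although parsing it was still much slower than with cjson or cPickle. cjson wins on speed. The conclusion must be that your performance will ultimately depend on your data and on the quality of the libraries you have available.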

I'll try to continue with these tests, and maybe find a better YAML library.