Last week I stuck my neck out in a meeting and declared that XML is verbose and slow to parse, and that we should move to something like Google's Protocol Buffers, or to a readable format such as JSON or YAML, which are easier to parse. But is this really true? The statement seems logical considering how verbose XML can be. Still, after the meeting, some questions stayed in my mind, so I decided to run some tests. I used a FIX Globex (CME) swap trade confirmation message to test my theory.
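For anyone who wants to try a similar comparison, here is a minimal sketch of a timing harness. It is not the script I used for the numbers in this post: it assumes Python 3 and sticks to the standard library (`json`, `pickle`, `xml.etree.ElementTree`) instead of cjson, cPickle and cElementTree, and the `record` dictionary is a made-up stand-in for the real trade confirmation message. The names `to_xml`, `from_xml`, `codecs` and `benchmark` are mine, chosen for illustration.

```python
import json
import pickle
import timeit
import xml.etree.ElementTree as ET

# Hypothetical stand-in data; the real test used a CME swap trade confirmation.
record = {
    "TradeID": "T-0001",
    "Product": "IRS",
    "Notional": "1000000",
    "Currency": "USD",
    "Side": "Buy",
}


def to_xml(rec):
    """Serialize a flat dict of strings as <Trade><key>value</key>...</Trade>."""
    root = ET.Element("Trade")
    for key, value in rec.items():
        ET.SubElement(root, key).text = value
    return ET.tostring(root)  # bytes


def from_xml(data):
    """Parse the XML produced by to_xml back into a flat dict."""
    return {child.tag: child.text for child in ET.fromstring(data)}


# name -> (encode: dict -> bytes, decode: bytes -> dict)
codecs = {
    "json": (lambda rec: json.dumps(rec).encode("utf-8"), json.loads),
    "pickle": (pickle.dumps, pickle.loads),
    "xml": (to_xml, from_xml),
}


def benchmark(rec, runs=1000):
    """Return payload size and total encode/decode time (seconds) per format."""
    results = {}
    for name, (encode, decode) in codecs.items():
        payload = encode(rec)
        # Sanity check: a round trip must give back the original record.
        assert decode(payload) == rec
        results[name] = {
            "size": len(payload),
            "from_python": timeit.timeit(lambda: encode(rec), number=runs),
            "to_python": timeit.timeit(lambda: decode(payload), number=runs),
        }
    return results


if __name__ == "__main__":
    for name, row in benchmark(record).items():
        print(f"{name:8s} {row['size']:6d} "
              f"{row['from_python']:.6f} {row['to_python']:.6f}")
```

Absolute timings from a sketch like this will differ from the table, since the libraries, the Python version and the message are all different; only the relative ordering on your own data is worth reading.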
Format | Library | Size | from Python | to Python |
---|---|---|---|---|
json | cjson | 2332 | 0.222238063812 | 0.0943419933319 |
pickle | cPickle | 1778 | 0.233518123627 | 0.128826141357 |
XML | cElementTree | 2083 | 0.407706975937 | 2.77832698822 |
json | simplejson | 2332 | 3.37723612785 | 5.11316084862 |
So this simple test shows that XML with the cElementTree parser is not so slow, although cjson wins on speed. The conclusion must be: your performance will ultimately depend on your data and on the quality of the libraries you have available.
I'll try to continue with these tests and maybe find a better YAML library.