Wednesday, March 04, 2009

Scalability of Stackless, Fibra and Kamaelia

After my last post, I decided to benchmark the scaling properties of Stackless, Kamaelia, and Fibra using the same hackysack algorithm.

Left axis is milliseconds.
Bottom axis is number of tasks * 100.
Green line is Kamaelia.
Blue line is Fibra.
Red line is Stackless.



These are the results, using Python 2.6.1 to run Fibra and Kamaelia, and Stackless 2.6.1 to run the Stackless test:




These are the results when using Stackless 2.6.1 to run all the tests:




It's quite interesting to see that Fibra copes with 600000 tasks better than 500000 tasks in both sets of results. Strange.

15 comments:

Jesse said...

Need some code, Simon. I'd like to monkey with this too.

Richard Tew said...

Ditto the code thing. Without it, all you have is a series of coloured lines ;-)

Richard Tew said...

Ah, it's linked in the previous post. Cheers.

Simon Wittber said...

OK Chaps, I've placed Fibra source and tests into SVN, which is a little more accessible than my BZR repo.

The tests/benchmarks/chart.py file is used to generate the graph.

chrism said...

Awesome work! Would love to see Circuits thrown into the mix too. http://pypi.python.org/pypi/circuits

michael said...

Simon, based on this:

for lib in "stacklessb", "fibrab", "threadingb", "kamaeliab":
    print lib
    for i in RANGE:
        print i
        t = timeit.Timer(setup="import %s.hackysack"%lib, stmt="%s.hackysack.runit(%d, 1000, dbg=0)"%(lib, i))

I think this is potentially unfair to stackless. I've noted that Kamaelia runs slightly faster on stackless than it does on standard python (fringe benefit of using stackless), and it's possible that the other non-stackless examples are being made to look better too, albeit inadvertently.

Whilst it will make kamaelia look even worse :-(, it may be useful to try the stacklessb with stackless and the others with regular python. It would also probably be a slightly more realistic benchmark...
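
For anyone wanting to try the same kind of timing loop outside the Fibra repository, here is a rough sketch using only the standard `timeit` module. The `runit` function below is a hypothetical stand-in workload, not the real hackysack benchmark, and `benchmark` is a helper name made up for this example. To compare interpreters, the same script would be run once under each one.

```python
import timeit


def runit(task_count, turns):
    # Hypothetical stand-in for a hackysack run: just burns CPU
    # in proportion to the number of tasks.
    total = 0
    for i in range(task_count):
        total += i % turns
    return total


def benchmark(sizes, turns=1000, repeat=3):
    # Time one run per size, several times, and keep the fastest,
    # which is the least disturbed by other activity on the machine.
    results = {}
    for n in sizes:
        timer = timeit.Timer(lambda: runit(n, turns))
        results[n] = min(timer.repeat(repeat=repeat, number=1))
    return results


if __name__ == "__main__":
    for size, seconds in sorted(benchmark([1000, 2000, 4000]).items()):
        print("%d tasks: %.6f s" % (size, seconds))
```

Taking the minimum of a few repeats is a common habit with `timeit`; whether the original chart did this is not something the snippet above shows.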

Simon Wittber said...

Hmm. I figured it would be an exactly level playing field if Stackless ran all the tests as well, though I guess that is just my point of view, as I can see your point also.

Unfortunately Stackless wants to overwrite my system Python on my Mac, so I cannot run both Pythons. Arrgh. This is the only bad thing about Stackless... the need for a custom interpreter.

Simon Wittber said...

@chrism:

Circuits doesn't explicitly offer any concurrency, but it does offer event dispatch which could be used to communicate amongst tasks which are supplied from some other library. In Fibra, event dispatch is done with:

msg = yield fibra.RecvMsg('HELLO')

and:

yield fibra.SendMsg('HELLO')

Richard Tew said...

Stackless modifies the generator mechanics. I was actually wondering about this, and assumed that the non-Stackless tests would have been run on standard Python.

Richard Tew said...

Note this comment from the Stackless source: "Generators are quite a bit slower in Stackless, because we are jumping in and out so much."

Simon Wittber said...

Hmm OK, you have convinced me. I'll rerun the tests and update the posts. :-)

Simon Wittber said...

I've updated the post to show test results for Python and Stackless. I've removed the threading example, as it was just tooo slow.

michael said...

Being honest, these results leave me with mixed feelings. On one hand I'm very pleased for you, and pleased to see you & stackless getting the results you do.

On the flip side the Kamaelia results are downright depressing and damning.

Simon Wittber said...

Michael, there are a few things to remember here.

Firstly, success is not measured by some contrived benchmark!

Secondly, the libraries are doing very different things. Kamaelia is effectively doing a busy loop while waiting for data to arrive in an inbox. In Fibra, the task is removed from the schedule until data arrives. This limits the way tubes can be used in Fibra, whereas Kamaelia remains more flexible in its use of inboxes.
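
To make that difference concrete, here is a minimal sketch using plain Python generators. It is not the actual code of either library: `Inbox`, `producer`, `polling_consumer`, `parked_consumer`, `run` and `measure` are all hypothetical names invented for this illustration. It counts how often a consumer task is resumed while it waits for a single message.

```python
from collections import deque


class Inbox(object):
    """Hypothetical mailbox that can wake one parked task when data arrives."""

    def __init__(self, ready):
        self.items = deque()
        self.ready = ready    # the scheduler's run queue
        self.waiter = None    # task parked on this inbox, if any

    def put(self, item):
        self.items.append(item)
        if self.waiter is not None:
            # Data arrived: put the parked task back on the schedule.
            self.ready.append(self.waiter)
            self.waiter = None


def producer(inbox, delay):
    for _ in range(delay):
        yield                 # unrelated work, one scheduler turn each
    inbox.put("HELLO")


def polling_consumer(inbox, out):
    # Busy-loop style: stay scheduled and re-check the inbox every turn.
    while not inbox.items:
        yield
    out.append(inbox.items.popleft())


def parked_consumer(inbox, out):
    # Blocking style: hand the inbox to the scheduler and wait to be woken.
    yield inbox
    out.append(inbox.items.popleft())


def run(ready, watched):
    # Round-robin scheduler; returns how many times `watched` was resumed.
    resumes = 0
    while ready:
        task = ready.popleft()
        if task is watched:
            resumes += 1
        try:
            request = next(task)
        except StopIteration:
            continue
        if isinstance(request, Inbox) and not request.items:
            request.waiter = task    # park until Inbox.put() wakes it
        else:
            ready.append(task)
    return resumes


def measure(make_consumer, delay=1000):
    ready = deque()
    inbox = Inbox(ready)
    out = []
    consumer = make_consumer(inbox, out)
    ready.append(consumer)
    ready.append(producer(inbox, delay))
    return run(ready, consumer), out


if __name__ == "__main__":
    print("polling: %d resumes" % measure(polling_consumer)[0])
    print("parked:  %d resumes" % measure(parked_consumer)[0])
```

In this toy setup the polling consumer is resumed once per scheduler pass until the message shows up, so its cost grows with the wait, while the parked consumer is resumed only twice no matter how long the producer takes. The price is that a parked task can do nothing else while it waits, which is the loss of flexibility mentioned above.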

Tartley said...

Hey,

I think there's a typo - something somewhere is out by a factor of ten. The horizontal axis, 'tasks * 100', ranges from 100 to 900, which I interpret to mean it ranges over 10,000 to 90,000. Then in the final sentence you reference positions on the graph at 600,000 and 500,000.

Shine on.

Jonathan
