Fibra results:
python timeit.py -n 3 -s "import hackysack" "hackysack.runit(10000,1000, dbg=0)"
3 loops, best of 3: 227 msec per loop
Kamaelia results: Using this code.
python timeit.py -n 3 -s "import hackysack" "hackysack.runit(10000,1000)
3 loops, best of 3: 2.41 sec per loop
Stackless results: Using this code.
python timeit.py -n 3 -s "import hackysack" "hackysack.runit(10000,1000, dbg=0)"
3 loops, best of 3: 31.5 msec per loop
These benchmarks show that Stackless is roughly 7x faster than Fibra, and Fibra is roughly 10x faster than Kamaelia.
This is the code I used to benchmark Fibra.
import fibra
import random
import sys

scheduler = fibra.schedule()

class hackysacker:
    counter = 0

    def __init__(self, name, circle):
        self.name = name
        self.circle = circle
        circle.append(self)
        # Each player receives the hackysack through its own tube.
        self.messageQueue = fibra.Tube()
        scheduler.install(self.messageLoop())

    def incrementCounter(self):
        hackysacker.counter += 1
        if hackysacker.counter >= turns:
            # Game over: pop players off the circle, telling the first
            # other player to go home; exit once the circle is empty.
            while self.circle:
                hs = self.circle.pop()
                if hs is not self:
                    return hs.messageQueue.push('exit')
            sys.exit()

    def messageLoop(self):
        while 1:
            # Block until the hackysack (the kicking player) arrives.
            message = yield self.messageQueue.pop()
            if message == "exit":
                debugPrint("%s is going home" % self.name)
                return
            debugPrint("%s got hackysack from %s" % (self.name, message.name))
            kickTo = self.circle[random.randint(0, len(self.circle) - 1)]
            debugPrint("%s kicking hackysack to %s" % (self.name, kickTo.name))
            yield self.incrementCounter()
            yield kickTo.messageQueue.push(self)

def debugPrint(x):
    if debug:
        print x

debug = 1
hackysackers = 5
turns = 5

def runit(hs=10000, ts=1000, dbg=1):
    global hackysackers, turns, debug
    hackysackers = hs
    turns = ts
    debug = dbg
    hackysacker.counter = 0
    circle = []
    one = hackysacker('1', circle)
    for i in range(hackysackers):
        hackysacker(str(i), circle)

    def main():
        # Start the game by kicking the hackysack to the first player.
        yield one.messageQueue.push(one)

    scheduler.install(main())
    scheduler.run()

if __name__ == "__main__":
    runit()
15 comments:
Hi
Your Kamaelia link requires a www. in front.
Cheers
Thanks.
Shouldn't you disable debugging for the Kamaelia test as well?
The Kamaelia code was modified and has no debugging statements in it, so there is nothing to turn off.
sounds pretty good :)
I don't know how fibra works... so to help understand your code... could you please add some comments?
I wonder how it would compare to a threaded version?
cu.
I'm not really sure how the hackysack thing works either; I just copied the stackless example and replaced the threaded pieces with fibra pieces.
A threaded version will crash your computer. 10,000 OS-level threads is simply too many! :-)
ah oh well. I'm sure python threads would be waaay slower... it'd just be interesting to compare them :)
well, you can use 10,000 on some systems if configured correctly... even 100,000
e.g. 10,000 Python threads on Ubuntu...
ulimit -s 300;python -c "import pygame.threads;pygame.threads.init(10000);pygame.threads.tmap(lambda x:x, range(10000))";ulimit -s 8096
Cool! I didn't know you could do that.
On my Mac, I had to limit it to 1000 threads.
Threads = 830 msec
Fibra = 39.8 msec
Kamaelia = 324 msec
Stackless = 9.8 msec
Hmm this is interesting. I might graph the scaling properties of each library.
Hi Simon,
FYI, I've blogged about this, and why I think Fibra is so much quicker in this case. Hackysack is a pretty unnatural scenario for any concurrency system IMO, but that doesn't stop it being an interesting benchmark, and doesn't stop your results being really cool :-)
I think the key difference really is that you've focussed on a better scheduler, and schedule generally passive things, whereas we have a (deliberately) simple scheduler and schedule generally active things.
Whatever the reason though, nice work! :)
Regards, Michael
Greetings Michael.
Thanks for the response post, but I can't seem to leave a comment on your blog from Safari.
The behavior you observed in the Fibra scheduler comes from the handler class in handlers/tube.py. The core scheduler degenerates into a round-robin scheduler when no handlers are used. Using a different handler for each yielded type lets Fibra implement specialised scheduling rules when a task yields control.
For example, if a task is waiting for a value to arrive via its tube, the tube handler removes the task from the main schedule (swaps the task out) until a value arrives in the tube. The sleep handler does something similar. This keeps the number of active tasks to a minimum.
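In case it helps, here is a rough sketch of that technique, written from scratch for this comment - it is not Fibra's actual source, and the Pop/Push/Tube/Scheduler names are purely illustrative:

from collections import deque

class Pop(object):
    # Yielded by a task that wants the next value from a tube.
    def __init__(self, tube):
        self.tube = tube

class Push(object):
    # Yielded by a task that wants to send a value into a tube.
    def __init__(self, tube, value):
        self.tube = tube
        self.value = value

class Tube(object):
    def __init__(self):
        self.values = deque()   # buffered values
        self.waiters = deque()  # tasks parked on an empty tube

    def pop(self):
        return Pop(self)

    def push(self, value):
        return Push(self, value)

class Scheduler(object):
    def __init__(self):
        self.runnable = deque()

    def install(self, task, send=None):
        self.runnable.append((task, send))

    def run(self):
        while self.runnable:
            task, send = self.runnable.popleft()
            try:
                op = task.send(send)
            except StopIteration:
                continue  # task finished, drop it
            if isinstance(op, Pop):
                if op.tube.values:
                    # A value is ready: resume immediately with it.
                    self.install(task, op.tube.values.popleft())
                else:
                    # Swap the task out of the schedule until a push arrives.
                    op.tube.waiters.append(task)
            elif isinstance(op, Push):
                if op.tube.waiters:
                    # Wake a parked consumer and hand it the value directly.
                    self.install(op.tube.waiters.popleft(), op.value)
                else:
                    op.tube.values.append(op.value)
                self.install(task)
            else:
                # Anything else degenerates to plain round-robin.
                self.install(task)

The point is that a task parked in a tube's waiters list never appears in the runnable queue, so blocked tasks cost nothing per scheduler pass - which is exactly what keeps the number of active tasks to a minimum.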
I'm not sure what's up with comments on my blog - I use Dojo to enable WYSIWYG editing, and the version used there doesn't always do the right thing, so rather than discriminate against Safari users I've switched comments off completely for the moment.
I'll have to delve into why Kamaelia is so much slower here in more detail, but overhead inside the scheduler is my biggest suspect. Since you're running 10x faster, that really shows Kamaelia is doing __really__ badly, and that really sucks ... and thanks :) for the heads-up on just how badly it sucks. :-/
Your graphs show that as well. Regarding the debugging comment above, I doubt that's a major factor (could be, but I doubt it's *that* much of a factor).
Right, in case I wasn't being exactly clear, I think the difference in performance is likely due to the implementation of the Tube, rather than the core scheduler algorithm. Though, in the case of Fibra, the Tube wouldn't work without the scheduler...
As far as the debugging flag is concerned, I think all the debugging code was stripped from the Kamaelia test case, so the Kamaelia test has an extremely tiny advantage in this regard.
Hi,
Do you plan on integrating the old network module into the new version of Fibra?
Thanks for running these tests. I was wondering how Parallel Python compares when added to the mix, understanding that it's more oriented towards SMP than coroutines.
http://www.parallelpython.com/
self.circle[random.randint(0,len(self.circle)-1)]
should be written as
random.choice(self.circle)