Comparing IronPython and CPython

First, a little background to help explain some of the terms. "Python" is a language, much as "Java" is a language; but whereas Java the language is more or less synonymous with its implementation, Python has multiple implementations. If you've run python(1) from the command line, you're most likely running the CPython implementation of the Python language, that is, Python implemented in C. Other implementations of Python exist, such as Jython (implemented on top of the Java virtual machine), PyPy (Python implemented in Python), and IronPython (Python implemented on top of the .NET CLR).
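As a quick illustration of the language-versus-implementation distinction, here is a minimal sketch that asks the running interpreter which implementation it is. The helper name describe_interpreter is made up for this example, and platform.python_implementation() only exists in Python 2.6 and later, so this would not have run on the 2.4-era interpreters benchmarked below.

```python
import platform
import sys


def describe_interpreter():
    """Return (implementation name, language version) for the running interpreter.

    describe_interpreter is a hypothetical helper name used only for this sketch.
    """
    # Reports the implementation, e.g. 'CPython', 'IronPython', 'Jython' or 'PyPy'.
    name = platform.python_implementation()
    # sys.version_info describes the *language* version the implementation targets.
    version = "%d.%d.%d" % sys.version_info[:3]
    return name, version


if __name__ == "__main__":
    print("%s implementing Python %s" % describe_interpreter())
```

Running the same file under different interpreters should report a different implementation name while the language stays "Python".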

I was talking with some of the guys from the #mono channel on GIMPNet about IronPython versus CPython as far as performance is concerned, and I decided to refine my testing (using pybench) by comparing more similar versions of the respective implementations, in as controlled an environment as possible.

I ran pybench.py on a "quiet" (i.e. not busy) machine sitting in a remote datacenter not too far from Novell; the machine is a Pentium III (i386) box running openSUSE 10.3. Since IronPython reports its "implementation version" as Python 2.4.0, I decided to build CPython 2.4 and run it against IronPython. IronPython is running on top of the recently released Mono 1.2.6, which I also built from source (I got IronPython itself from the IPCE package in YaST, however). pybench reported the implementation details for each as follows:

CPython

       Implementation: 2.4.4
       Executable:     /home/tyler/basket/bin/python
       Version:        2.4.4
       Compiler:       GCC 4.2.1 (SUSE Linux)
       Bits:           32bit
       Build:          Dec 18 2007 23:00:48 (#1)
       Unicode:        UCS2

IronPython

       Implementation: 2.4.0
       Executable:     /usr/lib/IPCE/ipy.exe
       Version:        2.4.0
       Compiler:       .NET 2.0.50727.42
       Bits:           32bit
       Build:           (#)
       Unicode:        UCS2

IronPython did alright, but it got pretty thrashed on a lot of the benchmarks. Unfortunately, it's hard to tell whether it's Mono getting beaten up or IronPython itself that's losing the battle here. Running similar tests on the .NET 2.0 CLR would be beneficial, but it's not something I am curious enough to boot a Windows virtual machine for. Regardless, here are the results; I've highlighted the rows where IronPython performs better than CPython (those with a positive Diff).

Test                      Minimum run-time              Average run-time
                          CPython IronPython      Diff  CPython IronPython      Diff
BuiltinFunctionCalls:       448ms      357ms    +25.4%    450ms      405ms    +11.0%
BuiltinMethodLookup:        530ms     1329ms    -60.1%    536ms     1390ms    -61.4%
CompareFloats:              380ms      129ms   +194.3%    381ms      132ms   +187.7%
CompareFloatsIntegers:      377ms       93ms   +306.1%    378ms       97ms   +291.2%
CompareIntegers:            436ms      160ms   +172.5%    437ms      161ms   +170.6%
CompareInternedStrings:     425ms      443ms     -4.1%    426ms      445ms     -4.3%
CompareLongs:               360ms      292ms    +23.3%    361ms      293ms    +23.0%
CompareStrings:             423ms      330ms    +28.0%    423ms      337ms    +25.6%
CompareUnicode:             377ms      243ms    +54.7%    377ms      245ms    +54.2%
ConcatStrings:              726ms     9452ms    -92.3%    823ms    10071ms    -91.8%
ConcatUnicode:              711ms     5687ms    -87.5%    756ms     6039ms    -87.5%
CreateInstances:            508ms      761ms    -33.2%    518ms      815ms    -36.4%
CreateNewInstances:         451ms     3475ms    -87.0%    458ms     3581ms    -87.2%
CreateStringsWithConcat:    473ms     2650ms    -82.1%    475ms     2833ms    -83.2%
CreateUnicodeWithConcat:    482ms     1008ms    -52.1%    508ms     1092ms    -53.4%
DictCreation:               405ms     2944ms    -86.2%    407ms     3057ms    -86.7%
DictWithFloatKeys:          552ms      934ms    -40.9%    553ms      944ms    -41.5%
DictWithIntegerKeys:        423ms     1118ms    -62.2%    426ms     1137ms    -62.5%
DictWithStringKeys:         413ms     1186ms    -65.1%    414ms     1317ms    -68.6%
ForLoops:                   412ms      189ms   +118.5%    413ms      217ms    +90.7%
IfThenElse:                 372ms      128ms   +191.8%    374ms      141ms   +165.8%
ListSlicing:                311ms     4033ms    -92.3%    315ms     4230ms    -92.6%
NestedForLoops:             488ms      349ms    +39.7%    489ms      382ms    +28.1%
NormalClassAttribute:       430ms     1080ms    -60.2%    432ms     1104ms    -60.9%
NormalInstanceAttribute:    401ms      427ms     -6.1%    404ms      442ms     -8.7%
PythonFunctionCalls:        393ms      302ms    +30.1%    402ms      352ms    +14.3%
PythonMethodCalls:          478ms      643ms    -25.7%    536ms      673ms    -20.3%
Recursion:                  547ms      158ms   +245.9%    659ms      159ms   +313.6%
SecondImport:               476ms     1383ms    -65.6%    481ms     1432ms    -66.4%
SecondPackageImport:        501ms     1425ms    -64.8%    503ms     1482ms    -66.1%
SecondSubmoduleImport:      589ms     1916ms    -69.3%    592ms     1990ms    -70.2%
SimpleComplexArithmetic:    475ms      729ms    -34.9%    476ms      758ms    -37.3%
SimpleDictManipulation:     424ms     1009ms    -58.0%    427ms     1020ms    -58.2%
SimpleFloatArithmetic:      416ms      455ms     -8.7%    422ms      480ms    -12.0%
SimpleIntFloatArithmetic:   345ms      161ms   +113.8%    346ms      162ms   +112.9%
SimpleIntegerArithmetic:    345ms      161ms   +114.7%    345ms      161ms   +113.9%
SimpleListManipulation:     346ms      497ms    -30.4%    350ms      501ms    -30.1%
SimpleLongArithmetic:       402ms     1120ms    -64.1%    403ms     1130ms    -64.3%
SmallLists:                 417ms     1693ms    -75.4%    421ms     1717ms    -75.5%
SmallTuples:                450ms     3839ms    -88.3%    453ms     3915ms    -88.4%
SpecialClassAttribute:      431ms     1104ms    -60.9%    432ms     1133ms    -61.8%
SpecialInstanceAttribute:   608ms      423ms    +43.8%    610ms      437ms    +39.5%
StringMappings:             443ms     2255ms    -80.3%    448ms     2311ms    -80.6%
StringPredicates:           503ms     1058ms    -52.5%    504ms     1066ms    -52.7%
StringSlicing:              527ms     2880ms    -81.7%    562ms     3008ms    -81.3%
TryExcept:                  418ms       21ms  +1905.2%    418ms       39ms   +985.6%
TryRaiseExcept:             587ms     6670ms    -91.2%    591ms     6733ms    -91.2%
TupleSlicing:               390ms     1817ms    -78.5%    397ms     1863ms    -78.7%
UnicodeMappings:            362ms     1323ms    -72.7%    365ms     1347ms    -72.9%
UnicodePredicates:          438ms      860ms    -49.0%    439ms      912ms    -51.8%
UnicodeProperties:          400ms        0ms       n/a    401ms        0ms       n/a
UnicodeSlicing:             624ms     2491ms    -75.0%    666ms     2638ms    -74.7%
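For anyone wondering how to read the Diff column: the figures above appear consistent with (CPython time / IronPython time - 1) x 100, so a positive number means CPython took longer. The sketch below is my own reconstruction under that assumption, not code lifted from pybench, and percent_diff is a made-up name. Because the table shows times rounded to whole milliseconds, recomputing from them only approximately reproduces the printed percentages.

```python
def percent_diff(cpython_ms, ironpython_ms):
    """Relative difference, in percent, of CPython's time versus IronPython's.

    Assumed formula (inferred from the table, not taken from pybench's source):
        (cpython / ironpython - 1) * 100
    Positive means CPython was slower, i.e. IronPython performed better.
    """
    if ironpython_ms == 0:
        # No meaningful ratio; the table shows n/a (see UnicodeProperties).
        return None
    return (float(cpython_ms) / ironpython_ms - 1.0) * 100.0


if __name__ == "__main__":
    # (test name, CPython minimum ms, IronPython minimum ms) copied from the table.
    rows = [
        ("BuiltinMethodLookup", 530, 1329),
        ("CompareFloatsIntegers", 377, 93),
        ("UnicodeProperties", 400, 0),
    ]
    for name, cpython_ms, ironpython_ms in rows:
        diff = percent_diff(cpython_ms, ironpython_ms)
        shown = "n/a" if diff is None else "%+.1f%%" % diff
        print("%-24s %s" % (name + ":", shown))
```

Read this way, ConcatStrings at -92.3% means CPython needed less than a tenth of the time IronPython did, which is a far bigger gap than the percentage alone might suggest.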



The results are disappointing but not all that surprising, especially with regard to string manipulation. I attempted to run the same pybench.py tool on top of Jython, but Jython doesn't appear to support the "platform" module, so I don't have a really good baseline for "managed/virtual machine-based Python implementations" right now. However, given the lack of evidence otherwise, I'll just go ahead and assume IronPython blew the doors off of Jython :). In general, though, this isn't the be-all and end-all benchmark for IronPython, especially on Mono, but it does give a nice hint of where some improvements could be made in both the Mono runtime and IronPython. I'll have to run the benchmarks again with newer versions of both implementations of Python to see where they're improving or degrading, but by all means don't let this deter you from checking out IronPython! I'll be writing up a few code samples over the next couple of weeks that I hope will be helpful to those "unenlightened" among us; dynamic languages on the CLR, what has the world come to?

Comments

Care to try this again?

With IronPython 2.0.1 and Mono 2.4 out, perhaps the benchmarks would be quite different?

Some problems with your post

For starters, using mono to make this benchmark, as you noted, is flawed. Given that the official CLR is really the benchmark of .NET performance this comparison is pretty useless as a gauge of IronPython.

Also you failed to mention that CPython is not reentrant and cannot actually run 2 concurrent python threads unless both threads are IO bound at that given moment. If they are computationally bound, as is probably the case most of the time, the Global Interpreter Lock is held and only one thread executes, even on a multi-core or multi-proc box.

IronPython doesn't suffer from this problem as it's a more robust and reentrant implementation.

I would go back to the drawing board and make the effort to do a proper benchmark.

"Given that the official

"Given that the official CLR"
It is not so easy. At the very least, it requires Windows. I've got a workstation, my wife's Mac, and 3 laptops. Of course the laptops came with Windows preinstalled, plus some vendor-specific bonuses of spyware, adware, DRM and sometimes even the StarForce virus; it could be tricky to clean everything out, and anyway its EULA only allows you to run it on real hardware, which is unacceptable. You're left to boot from a CD and create a new partition table.
Anyway, testing on the "official CLR" is useless, because I don't see any real-world application for such an environment.

Erm, i don't see any

Erm, i don't see any problems with his benchmark. He explained how he got the software, he said what versions he used, he stated the platform being tested on... it's all there.

He wasn't running an MS.NET versus CPython benchmark here, he never claimed that. Also, calling the MS.NET CLR the 'official CLR' is like calling Internet Explorer the 'official browser'. In other words - a stupid statement.

Finally, this is a benchmark. It shows the relative performance of CPython against IronPython on the mono platform. If IronPython performs better in threaded applications because it has true threading support, then that's a valid benchmark. If you're doing heavily threaded work in python, you would appreciate knowing that it'll perform a hellofalot better on mono.

Re: Erm, i don't see any

"Erm, i don't see any problems with his benchmark. He explained how he got the software, he said what versions he used, he stated the platform being tested on... it's all there."

Not really. If you are going to be using IronPython you are most likely using Windows. Using Linux on a Mono platform is a huge difference in both speed and implementation. If you don't understand that you need to be educated more.

This is like trying to benchmark a Windows application using Wine; it's stupid.

> If you're doing heavily

> If you're doing heavily threaded work in python, you would appreciate knowing that it'll perform a hellofalot better on mono.

The irony is that you only have to do "threaded work" on win32; on *nix the better idea would be to run several processes. There's even a "processing" module which uses syntax similar to threading's.

Threads not necessary on *NIX?

"Irony is that you only have to do "threaded work" in win32"

So no-one uses threads on *nix? I think you are seriously mistaken... Inter-process communication is only useful if you don't need to share a lot of data between them. Marshalling data between processes is expensive.

.NET Benchmark

Back in April Jim Hugunin posted the results of pybench for CPython 2.5 and IronPython 1.1:

http://lists.ironpython.com/pipermail/users-ironpython.com/2007-April/00...

There was a fair amount of debate about it at the time (including comparing .NET to Mono performance) which I tried to summarise:

http://www.voidspace.org.uk/python/weblog/arch_d7_2007_04_21.shtml#e688

Michael
http://www.manning.com/foord