Why is my Python version slower than my Perl version? [closed]
I've been a Perl guy for over 10 years, but a friend convinced me to try Python and told me how much faster it is than Perl. So just for kicks I ported an app I wrote in Perl to Python and found that it runs about 3x slower. Initially my friend told me that I must have done it wrong, so I rewrote and refactored until I could rewrite and refactor no more and... it's still a lot slower. So I did a simple test:
i = 0
j = 0
while (i < 100000000):
    i = i + 1
    j = j + 1
print j
$ time python python.py
100000000

real    0m48.100s
user    0m45.633s
sys     0m0.043s
my $i = 0;
my $j = 0;
while ($i < 100000000) {
    ++$i; # also tested $i = $i + 1 to be fair, same result
    ++$j;
}
print $j;
$ time perl perl.pl
100000000

real    0m24.757s
user    0m22.341s
sys     0m0.029s
Just under twice as slow, which doesn't seem to reflect any of the benchmarks I've seen ... is there a problem with my installation or is Python really that much slower than Perl?
The nit-picking answer is that you should compare it to idiomatic Python:
- The original code takes 34 seconds on my machine.
- A for loop (FlorianH's answer) with += and xrange() takes 21 seconds.
- Putting the whole thing in a function reduces it to 9 seconds!
That's much faster than Perl (15 seconds on my machine)!
Explanation: Python local vars are much faster than globals. (For fairness, I also tried a function in Perl - no change.)

Getting rid of the j variable reduced it to 8 seconds:
print sum(1 for i in xrange(100000000))
Python has the strange property that higher-level shorter code tends to be fastest :-)
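As an illustration, the "whole thing in a function" variant mentioned above can be sketched like this (a sketch only - the loop bound is made a parameter for convenience; the post used 100000000):

```python
def loop(n):
    # Inside a function, i and j are local variables. CPython stores
    # locals in a fixed-size array and accesses them by index, which is
    # much cheaper than the dictionary lookup used for module globals.
    i = 0
    j = 0
    while i < n:
        i = i + 1
        j = j + 1
    return j

result = loop(1000)  # small bound for a quick check; returns 1000
```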
But the real answer is that your "micro-benchmark" is meaningless. The real question of language speed is: what's the performance of an average real application? To know that, you should take into account:
- Typical mix of operations in complex code. Your code doesn't contain any data structures, function calls, or OOP operations.
- A large enough codebase to feel cache effects - many interpreter optimizations trade memory for speed, which is not measured fairly by any tiny benchmark.
- Optimization opportunities: after you write your code, IF it's not fast enough, how much faster can you easily make it? E.g. how hard is it to offload the heavy lifting to efficient C libraries?
PyPy's benchmarks and Octane are good examples of what realistic language speed benchmarks look like.
If you want to talk number crunching, Python IS surprisingly popular with scientists. They love it for the simple pseudo-math syntax and short learning curve, but also for the excellent numpy library for array crunching and the ease of wrapping other existing C code.
And then there is the Psyco JIT which would probably run your toy example well under 1 second, but I can't check it now because it only works on 32-bit x86.
EDIT: Nowadays, skip Psyco and use PyPy, which is a cross-platform, actively improving JIT.
All this micro-benchmarking can get a bit silly!

E.g. just switching to a for loop in both Python & Perl provides a hefty speed bump. The original Perl example would be twice as quick if for was used:
my $j = 0;
for my $i (1..100000000) {
    ++$j;
}
print $j;
And I can shave off a bit more with this:
++$j for 1..100000000;
print $j;
And getting even sillier we can get it down to 1 second here ;-)
print {STDOUT} (1..100000000)[-1];
/I3az/
ref: Perl 5.10.1 used.
Python runs very fast if you use the correct, idiomatic syntax of the language - roughly described as "pythonic" code.

If you restructure your code like this, it will run at least twice as fast (well, it does on my machine):
j = 0
for i in range(10000000):
    j = j + 1
print j
Whenever you use a while loop in Python, check whether you could use "for X in range()" instead.
Python is not particularly fast at numeric computations, and I'm sure it's slower than Perl when it comes to text processing.
Since you're an experienced Perl hand, I don't know if this applies to you but Python programs in the long run tend to be more maintainable and are quicker to develop. The speed is 'enough' for most situations and you have the flexibility to drop down into C when you really need a performance boost.
Update
Okay. I just created a large file (1GB) with random data in it (mostly ascii) and broke it into lines of equal lengths. This was supposed to simulate a log file.
I then ran simple perl and python programs that search the file line by line for an existing pattern.
With Python 2.6.2, the results were
real 0m18.364s
user 0m9.209s
sys 0m0.956s
and with Perl 5.10.0
real 0m17.639s
user 0m5.692s
sys 0m0.844s
The programs are as follows (please let me know if I'm doing something stupid)
import re

regexp = re.compile("p06c")

def search():
    with open("/home/arif/f") as f:
        for i in f:
            if regexp.search(i):
                print "Found : %s" % i

search()
and
sub search() {
    open FOO, "/home/arif/f" or die $!;
    while (<FOO>) {
        print "Found : $_\n" if /p06c/o;
    }
}
search();
The results are pretty close, and tweaking things one way or another doesn't seem to alter them much. I don't know if this is a true benchmark, but since it's how I'd search log files in either language, I stand corrected about the relative performance.
Thanks Chris.
To OP, in Python this piece of code:
j = 0
for i in range(10000000):
    j = j + 1
print j
produces the same output as
print range(10000001)[-1]
which, on my machine,
$ time python test.py
10000000
real 0m1.138s
user 0m0.761s
sys 0m0.357s
runs for approximately 1 second. range() (or xrange()) is built into Python, and it already generates the sequence of numbers for you internally, so you don't have to build the iteration with your own loop. Now, go and find a Perl equivalent that can run in 1 second and produce the same result.
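The claimed equivalence is easy to check on a smaller bound (the helper name below is just for illustration):

```python
def count_with_loop(n):
    # Count up by explicit increments, as in the original snippet.
    j = 0
    for i in range(n):
        j = j + 1
    return j

n = 10000  # far smaller than the post's 10000000, but the same idea
# The loop result equals the last element of range(n + 1).
assert count_with_loop(n) == range(n + 1)[-1]
```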
Python maintains global variables in a dictionary. Therefore, each time there is an assignment, the interpreter performs a lookup in the module dictionary, which is somewhat expensive; this is why you found your example so slow.
To improve performance, you should use local variables, e.g. by moving the code into a function. The Python interpreter stores local variables in an array, with much faster access.
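The difference is visible directly in the bytecode; here is a minimal sketch using the standard dis module (the function names are mine):

```python
import dis

counter = 0

def bump_global():
    # Accesses go through the module dictionary:
    # compiled to LOAD_GLOBAL / STORE_GLOBAL opcodes.
    global counter
    counter = counter + 1

def bump_local():
    # Accesses use the fast locals array:
    # compiled to LOAD_FAST / STORE_FAST opcodes.
    c = 0
    c = c + 1
    return c

dis.dis(bump_global)  # look for LOAD_GLOBAL / STORE_GLOBAL
dis.dis(bump_local)   # look for LOAD_FAST / STORE_FAST
```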
However, it should be noted that this is an implementation detail of CPython; I suspect IronPython, for instance, would lead to a completely different result.
Finally, for more information on this topic, I suggest an interesting essay by GvR about optimization in Python: Python Patterns - An Optimization Anecdote.
Python is slower than Perl. It may be faster to develop with, but it doesn't execute faster. Here is one benchmark: http://xodian.net/serendipity/index.php?/archives/27-Benchmark-PHP-vs.-Python-vs.-Perl-vs.-Ruby.html Edit: a terrible benchmark, but it is at least a real benchmark with numbers and not some guess. Too bad there's no source or tests other than a loop.
I'm not up-to-date on everything with Python, but my first idea about this benchmark was the difference between Perl and Python numbers. In Perl, we have numbers. They aren't objects, and their precision is limited to the sizes imposed by the architecture. In Python, we have objects with arbitrary precision. For small numbers (those that fit in 32-bit), I'd expect Perl to be faster. If we go over the integer size of the architecture, the Perl script won't even work without some modification.
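A quick sketch of what that difference means in practice: Python integers are arbitrary-precision objects, so exceeding the machine word size just produces a bigger integer rather than an overflow:

```python
# 2**64 is one past the largest 64-bit unsigned value; multiplying two
# such values is far beyond any native machine integer, yet the result
# stays exact - no wrapping, no floating-point rounding.
word = 2 ** 64
big = word * word
assert big == 2 ** 128
assert big.bit_length() == 129
```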
I see similar results for the original benchmark on my MacBook Air (32-bit) using a Perl 5.10.1 that I compiled myself and the Python 2.5.1 that came with Leopard:
However, I added arbitrary precision to the Perl program with the bignum pragma:
use bignum;
Now I wonder if the Perl version is ever going to finish. :) I'll post some results when it finishes, but it looks like it's going to be an order of magnitude difference.
Some of you may have seen my question "What are five things you hate about your favorite language?". Perl's default numbers are one of the things that I hate. I should never have to think about them, and they shouldn't be slow. In Perl, I lose on both counts. Note, however, that if I needed numeric processing in Perl, I could use PDL.
is Python really that much slower than Perl?
Look at the Computer Language Benchmarks Game - "Compare the performance of ≈30 programming languages using ≈12 flawed benchmarks and ≈1100 programs".
They are only tiny benchmark programs but they still do a lot more than the code snippet you have timed -
http://shootout.alioth.debian.org/u32/python.php