When is StringIO used, as opposed to joining a list of strings?
Using StringIO as a string buffer is slower than using a list as a buffer.
When is StringIO used?
from io import StringIO

def meth1(string):
    # Collect the pieces in a list and join them once at the end.
    a = []
    for i in range(100):
        a.append(string)
    return ''.join(a)

def meth2(string):
    # Write the pieces to an in-memory text stream.
    a = StringIO()
    for i in range(100):
        a.write(string)
    return a.getvalue()

if __name__ == '__main__':
    from timeit import Timer
    string = "This is test string"
    print(Timer("meth1(string)", "from __main__ import meth1, string").timeit())
    print(Timer("meth2(string)", "from __main__ import meth2, string").timeit())
Results:
16.7872819901
18.7160351276
The main advantage of StringIO is that it can be used wherever a file is expected. For example (Python 2):
import sys
import StringIO
out = StringIO.StringIO()
sys.stdout = out
print "hi, I'm going out"
sys.stdout = sys.__stdout__
print out.getvalue()
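For Python 3, the same idea can be sketched with contextlib.redirect_stdout, which restores sys.stdout automatically. This is only an illustration: capture_print is a made-up helper name, not something from the question.

```python
import contextlib
import io

def capture_print(message):
    """Return whatever print() writes for `message`, captured in memory.

    capture_print is a hypothetical helper used only for this example.
    """
    buffer = io.StringIO()
    # redirect_stdout temporarily points sys.stdout at the buffer and
    # puts the real stdout back when the with-block exits.
    with contextlib.redirect_stdout(buffer):
        print(message)
    return buffer.getvalue()

captured = capture_print("hi, I'm going out")
print(repr(captured))
```

The print() call never reaches the terminal while the with-block is active; the text ends up in the StringIO object instead.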
If you measure for speed, you should use cStringIO.
From the docs:
The module cStringIO provides an interface similar to that of the StringIO module. Heavy use of StringIO.StringIO objects can be made more efficient by using the function StringIO() from this module instead.
But the point of StringIO is to be a file-like object, for when something expects such and you don't want to use actual files.
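A small sketch of the "file-like object" case in Python 3: csv.writer expects something with a write() method, and a StringIO object qualifies. The helper name rows_to_csv_text and the sample rows are assumptions made up for this example.

```python
import csv
import io

def rows_to_csv_text(rows):
    """Serialize rows to CSV text without touching the filesystem.

    rows_to_csv_text is a hypothetical helper for illustration.
    """
    buffer = io.StringIO()
    # csv.writer only needs an object with write(); lineterminator is set
    # explicitly so the result does not use the default "\r\n".
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()

csv_text = rows_to_csv_text([["name", "count"], ["spam", 3]])
print(csv_text)
```

Here a list of strings would not help, because csv.writer wants to call write() on its target rather than hand you pieces to join.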
Edit: I noticed you use from io import StringIO, so you are probably on Python >= 3 or at least 2.6. The separate StringIO and cStringIO modules are gone in Python 3. I'm not sure what implementation they used to provide io.StringIO. There is io.BytesIO too.
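As a rough illustration of the difference between the two io classes in Python 3: io.StringIO holds text (str) and io.BytesIO holds bytes. The variable names below are arbitrary.

```python
import io

# StringIO accepts and returns str.
text_buffer = io.StringIO()
text_buffer.write("caf\u00e9")

# BytesIO accepts and returns bytes, so text must be encoded first.
byte_buffer = io.BytesIO()
byte_buffer.write("caf\u00e9".encode("utf-8"))

# Mixing the two is an error: writing bytes to a StringIO raises TypeError.
try:
    text_buffer.write(b"raw bytes")
except TypeError:
    mixing_allowed = False
else:
    mixing_allowed = True

print(text_buffer.getvalue(), byte_buffer.getvalue(), mixing_allowed)
```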
Well, I don't know if I would call that using it as a "buffer"; you are just repeating a string 100 times, in two complicated ways. Here is an uncomplicated way:
def meth3(string):
    return string * 100
If we add that to your test:
if __name__ == '__main__':
    from timeit import Timer
    string = "This is test string"
    # Make sure it all does the same:
    assert meth1(string) == meth3(string)
    assert meth2(string) == meth3(string)
    print(Timer("meth1(string)", "from __main__ import meth1, string").timeit())
    print(Timer("meth2(string)", "from __main__ import meth2, string").timeit())
    print(Timer("meth3(string)", "from __main__ import meth3, string").timeit())
It turns out to be way faster as a bonus:
21.0300650597
22.4869811535
0.811429977417
If you want to create a bunch of strings, and then join them, meth1() is the correct way. There is no point in writing it to StringIO, which is something completely different, namely a string with a file-like stream interface.
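To make the distinction concrete, here is a minimal Python 3 sketch: joining is for assembling pieces, while StringIO gives an existing string the read/seek/iterate behaviour of a file. The sample data is made up for illustration.

```python
import io

# Building a string from many pieces: collect them, then join once.
pieces = [str(n) for n in range(5)]
joined = ",".join(pieces)

# Treating a string as a stream: iterate over lines, seek, read.
stream = io.StringIO("first line\nsecond line\n")
lines = [line.rstrip("\n") for line in stream]  # iterates like a file
stream.seek(0)                                  # rewind to the start
head = stream.read(5)                           # read the first 5 characters

print(joined, lines, head)
```

None of the stream operations (iteration by line, seek, partial read) are available on a plain list of strings, which is why the two tools are not really competing.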
Another approach, based on Lennart Regebro's approach. This is faster than the list method (meth1):
def meth4(string):
    a = StringIO(string * 100)
    contents = a.getvalue()
    a.close()
    return contents
if __name__ == '__main__':
    from timeit import Timer
    string = "This is test string"
    print(Timer("meth1(string)", "from __main__ import meth1, string").timeit())
    print(Timer("meth2(string)", "from __main__ import meth2, string").timeit())
    print(Timer("meth3(string)", "from __main__ import meth3, string").timeit())
    print(Timer("meth4(string)", "from __main__ import meth4, string").timeit())
Results (sec.):
meth1 = 7.731315963647944
meth2 = 9.609279402186985
meth3 = 0.26534052061106195
meth4 = 2.915035489152274