Concatenate strings read from a file with Python?
Emacs's auto-fill mode hard-wraps long lines to make the document look nice. I need to join the wrapped lines back together after reading them from the file.
For example (CR marks a line break here; it is not a literal character in the file),
- Blah, Blah, and (CR)
 Blah, Blah, Blah, (CR)
 Blah, Blah (CR)
- A, B, C (CR)
 Blah, Blah, Blah, (CR)
 Blah, Blah (CR)
is read into a list of strings with the readlines() function, and should be joined to produce
["Blah, Blah, and Blah, Blah, Blah, Blah, Blah", "A, B, C Blah, Blah, Blah, Blah, Blah"]
I thought about writing a loop that checks for '-' and concatenates all the strings stored before it, but I expect Python has a more efficient way to do this.
ADDED:
Based on kindall's code, I could get what I want as follows.
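lines = ["- We shift our gears toward nextGen effort"," contribute the work with nextGen."]
# prefix continuation lines with a space and every new item with a newline
out = [(" " if line.startswith(" ") else "\n") + line.strip() for line in lines]
print(out)
# join everything, split on the newlines, and drop the empty first element
res = ''.join(out).split('\n')[1:]
print(res)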
The result is as follows.
['\n- We shift our gears toward nextGen effort', ' contribute the work with nextGen.']
['- We shift our gears toward nextGen effort contribute the work with nextGen.']
As I read it, your problem is to undo hard-wrapping and restore each set of indented lines to a single soft-wrapped line. This is one way to do it:
# hard-coded input, could also readlines() from a file
lines = ["- Blah, Blah, and",
         " Blah, Blah, Blah,",
         " Blah, Blah",
         "- Blah, Blah, and",
         " Blah, Blah, Blah,",
         " Blah, Blah"]
# continuation lines get a leading space, new items a leading newline
out = [(" " if line.startswith(" ") else "\n") + line.strip() for line in lines]
# drop the first newline, then split the joined text back into items
out = ''.join(out)[1:].split('\n')
print(out)
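The same idea can also be written as an explicit loop, which may be easier to adapt. This is only a sketch: the function name unwrap and the sample data are made up for illustration, and it assumes that continuation lines start with a space while new items do not.

```python
def unwrap(lines):
    """Join hard-wrapped continuation lines onto the item that precedes them."""
    items = []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # ignore blank lines
        if line.startswith(" ") and items:
            # indented line: continuation of the previous item
            items[-1] += " " + text
        else:
            # unindented line: start of a new item
            items.append(text)
    return items


# sample input shaped like readlines() output (trailing newlines included)
wrapped = [
    "- Blah, Blah, and\n",
    " Blah, Blah, Blah,\n",
    " Blah, Blah\n",
    "- A, B, C\n",
    " Blah, Blah, Blah,\n",
    " Blah, Blah\n",
]
result = unwrap(wrapped)
print(result)
```

Because it only looks at one line at a time, the loop should also work when you iterate over the file object directly instead of calling readlines().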
I'm not sure if you just want:
result = thefile.read()
or maybe:
result = ''.join(line.strip() for line in thefile)
or something else ...
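To make the difference between these options concrete, here is a small sketch. It uses io.StringIO as a stand-in for thefile (an assumption, so that the example runs without a real file), and the variable names are only illustrative.

```python
import io

text = "- Blah, Blah, and\n Blah, Blah, Blah,\n"

# read() returns the contents unchanged, newlines included
whole = io.StringIO(text).read()

# ''.join(...) glues the stripped lines together with nothing in between
joined = "".join(line.strip() for line in io.StringIO(text))

# ' '.join(...) keeps one space where each line break used to be
spaced = " ".join(line.strip() for line in io.StringIO(text))

print(repr(whole))
print(repr(joined))
print(repr(spaced))
```

Note that the ''.join version runs "and" and "Blah" together, so joining with a space is probably closer to what you are after.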
Use file.readlines(). It returns a list of strings, each string being a line of the file:
readlines(...)
readlines([size]) -> list of strings, each a line from the file.
Call readline() repeatedly and return a list of the lines so read.
The optional size argument, if given, is an approximate bound on the
total number of bytes in the lines returned.
EDIT: readlines() is not the best way to go, as has been pointed out in the comments. Disregard that suggestion and use the following one instead.
If you pass the Emacs output to a Python function as one long string, I would suggest this:
[s.replace("\n", "") for s in emacsOutput.split('-')]
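One caveat with splitting on every '-' is that hyphens inside the text also act as separators, and removing "\n" outright glues words together. A possible variant, sketched here with made-up sample data and the assumption that every item starts with "- " at the beginning of a line, splits only on those markers using the standard re module:

```python
import re

# sample data standing in for the Emacs output (one long string)
emacs_output = (
    "- Blah, Blah, and\n"
    " Blah, Blah, Blah,\n"
    " Blah, Blah\n"
    "- A, B, C\n"
    " well-known Blah,\n"
    " Blah, Blah\n"
)

# split only where "- " begins a line, so the hyphen in "well-known" is left alone
chunks = re.split(r"^- ", emacs_output, flags=re.MULTILINE)

# collapse each chunk's line breaks and indentation into single spaces,
# skipping the empty piece that precedes the first marker
items = [" ".join(chunk.split()) for chunk in chunks if chunk.strip()]
print(items)
```

This drops the leading "- " from each item, like the list in the question.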
Hope this helps