Data Manipulation: Stemming from an inability to select lists

2023-03-18 17:55

I am very new to Python, with no real prior programming knowledge. At my current job I am being asked to take data in the form of text from 500+ files and plot it. I understand the plotting to a degree, but I cannot figure out how to manipulate the data so that it is easy to select specific sections. Currently this is what I have for opening a file:

fp = open("file")
for line in fp:
    words = line.strip().split()
    print(words)

The result is a list for each line of the file, but afterwards I can only access the list made from the last line. Does anyone know a way that would let me select any of these lists? Thanks a lot!

The easiest way to get a list of lines from a file is as follows:

with open('file', 'r') as f:
    lines = f.readlines()

Now you can split those lines or do whatever you want with them:

lines = [line.split() for line in lines]

I'm not certain that answers your question -- let me know if you have something more specific in mind.
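For the loop in the question, one possible approach is to append each line's word list to an outer list instead of overwriting words on every pass. This is only a minimal sketch: the file name sample.txt and its contents are made up for illustration, and the temporary directory is there just so the example runs on its own.

```python
import os
import tempfile

# Build a small sample file so the sketch is self-contained.
# The file name and its contents are made up for illustration.
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
with open(path, 'w') as f:
    f.write('time voltage\n0.0 1.5\n0.1 1.7\n')

rows = []  # will hold one list of words per line
with open(path) as f:
    for line in f:
        rows.append(line.strip().split())

print(rows[0])     # words from the first line
print(rows[2][1])  # second word on the third line
```

Because every line's list is kept in rows, any of them can be selected later by index rather than only the last one.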

Since I don't understand exactly what you are asking, here are a few more examples of how you might process a text file. You can experiment with these in the interactive interpreter, which you can generally access just by typing 'python' at the command line.

>>> with open('a_text_file.txt', 'r') as f:
...     text = f.read()
... 
>>> text
'the first line of the text file\nthe second line -- broken by a symbol\nthe third line of the text file\nsome other data\n'

That's the raw, unprocessed text of the file. It's a string. Strings are immutable -- they can't be altered -- but they can be copied in part or in whole.

>>> text.splitlines()
['the first line of the text file', 'the second line -- broken by a symbol', 'the third line of the text file', 'some other data']

splitlines is a string method. It splits the string at line boundaries, such as a \n (newline) character, and returns a list containing copies of the separate sections of the string, with the newline characters removed.
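As a small aside, here is a sketch of how splitlines differs from calling split('\n') directly. The sample string is invented for the example.

```python
text = 'first line\nsecond line\n'

by_splitlines = text.splitlines()
by_split = text.split('\n')

print(by_splitlines)  # ['first line', 'second line']
print(by_split)       # ['first line', 'second line', '']
```

The trailing newline leaves an empty string at the end of the split('\n') result, which is one reason splitlines tends to be more convenient for file contents.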

>>> lines = text.splitlines()

Here I've just saved the above list of lines to a new variable name.

>>> lines[0]
'the first line of the text file'

Lists are accessed by indexing. Just provide an integer from 0 to len(lines) - 1 and the corresponding line is returned.

>>> lines[2]
'the third line of the text file'
>>> lines[1]
'the second line -- broken by a symbol'

Now you can start to manipulate individual lines.

>>> lines[1].split('--')
['the second line ', ' broken by a symbol']

split is another string method. It's like splitlines, but you can specify the character or string that you want to use as the delimiter.
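The pieces returned by split keep any whitespace that surrounded the delimiter. If that is unwanted, one option (a sketch, not the only way) is to call strip on each piece. With no argument at all, split breaks on any run of whitespace.

```python
line = 'the second line -- broken by a symbol'

# Split on the '--' delimiter, then trim the leftover spaces.
parts = [part.strip() for part in line.split('--')]
print(parts)  # ['the second line', 'broken by a symbol']

# With no argument, split() separates on whitespace.
words = line.split()
print(words)
```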

>>> lines[1][4]
's'

You can also index the characters in a string.

>>> lines[1][4:10]
'second'

You can also "slice" a string. The result is a copy of the characters at indices 4 through 9. 10 is the stop value, so the character at index 10 isn't included in the slice. (You can slice lists too.)
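Since lists slice the same way, here is a short sketch using the list of lines from the examples above (retyped here so that it runs on its own).

```python
lines = ['the first line of the text file',
         'the second line -- broken by a symbol',
         'the third line of the text file',
         'some other data']

middle = lines[1:3]   # items at indices 1 and 2; index 3 is excluded
first_two = lines[:2]  # everything before index 2
last = lines[-1]       # a negative index counts from the end

print(middle)
print(first_two)
print(last)
```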

>>> lines[1].index('broken')
19

If you want to find a substring within a string, one way is to use index. It returns the index at which the first occurrence of the substring begins. (It raises a ValueError if the substring isn't in the string. If you don't want that, use find, which returns -1 if the substring isn't in the string.)
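A quick sketch of the difference between the two, using the same line as above and a substring ('missing') that does not occur in it:

```python
line = 'the second line -- broken by a symbol'

found = line.find('broken')       # 19, same as index('broken')
not_found = line.find('missing')  # -1, no exception

try:
    line.index('missing')
    raised = False
except ValueError:
    # index() signals a missing substring with an exception.
    raised = True

print(found, not_found, raised)
```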

>>> lines[1][19:]
'broken by a symbol'

Then you can use that to slice the string. If you don't provide a stop index, it just returns the remainder of the string.

>>> lines[1][:19]
'the second line -- '

If you don't provide a start index, it returns the beginning of the string and stops at the stop index.

>>> [line for line in text.splitlines() if 'line' in line]
['the first line of the text file', 'the second line -- broken by a symbol', 'the third line of the text file']

You can also use in -- it's a boolean operator that evaluates to True if a substring is in a string. In this case, I've used a list comprehension to get only the lines that have 'line' in them. (Note that the last line is missing from the list. It has been filtered out.)
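The filter can be combined with the other string methods in the same comprehension. The following is just a sketch on the sample text from above. It picks the second word of each matching line, and uses not in to get the lines that were filtered out.

```python
text = ('the first line of the text file\n'
        'the second line -- broken by a symbol\n'
        'the third line of the text file\n'
        'some other data\n')

# Keep only lines containing 'line' and take the second word of each.
second_words = [line.split()[1] for line in text.splitlines() if 'line' in line]
print(second_words)  # ['first', 'second', 'third']

# 'not in' inverts the test.
others = [line for line in text.splitlines() if 'line' not in line]
print(others)  # ['some other data']
```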

Let me know if you have any more questions.
