Python Read Formatted String
I have a file with a number of lines formatted with the following syntax:
FIELD         POSITION   DATA TYPE
------------------------------------
COOP ID       1-6        Character
LATITUDE      8-15       Real
LONGITUDE     17-25      Real
ELEVATION     27-32      Real
STATE         34-35      Character
NAME          37-66      Character
COMPONENT1    68-73      Character
COMPONENT2    75-80      Character
COMPONENT3    82-87      Character
UTC OFFSET    89-90      Integer
The data is all ASCII-formatted.
An example of a line is:
011084 31.0581 -87.0547 26.0 AL BREWTON 3 SSE ------ ------ ------ +6
My current thought is that I'd like to read the file in a line at a time and somehow have each line broken up into a dictionary so I can refer to the components. Is there some module that does this in Python, or some other clean way?
Thanks!
EDIT: You can still use the struct module:
See the struct module documentation. It looks to me like you want to use struct.unpack().
What you want is probably something like:
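import struct

# One "Ns" per field; each "x" skips the single separator column between fields.
record = struct.Struct("6sx8sx9sx6sx2sx30sx6sx6sx6sx2s")

with open("filename.txt", "rb") as f:  # struct operates on bytes, so read in binary mode
    for line in f:
        # unpack() needs exactly record.size (90) bytes: drop the newline, pad short lines
        raw = line.rstrip(b"\r\n").ljust(record.size)[:record.size]
        (coop_id, lat, lon, elev, state, name, c1, c2, c3, utc_offset) = (
            field.decode("ascii").strip() for field in record.unpack(raw))
        lat, lon, elev = map(float, (lat, lon, elev))
        utc_offset = int(utc_offset)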
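Since the question asks for a dictionary, here is a minimal sketch of the same struct approach that zips the unpacked values with field names. The key names, the parse_record helper, and the padding inside the sample record are all assumptions made for illustration; only the struct format string comes from the layout in the question.

```python
import struct

# Same layout as above: one "Ns" per field, "x" for each separator column.
RECORD = struct.Struct("6sx8sx9sx6sx2sx30sx6sx6sx6sx2s")

# Hypothetical key names, in column order.
NAMES = ("coop_id", "latitude", "longitude", "elevation", "state",
         "name", "component1", "component2", "component3", "utc_offset")

# Fields that are not listed here stay as strings.
CONVERTERS = {"latitude": float, "longitude": float,
              "elevation": float, "utc_offset": int}


def parse_record(raw):
    """Turn one fixed-width line (bytes) into a dict keyed by field name."""
    # unpack() needs exactly RECORD.size bytes, so trim the newline and pad.
    raw = raw.rstrip(b"\r\n").ljust(RECORD.size)[:RECORD.size]
    values = (v.decode("ascii").strip() for v in RECORD.unpack(raw))
    return {name: CONVERTERS.get(name, str)(value)
            for name, value in zip(NAMES, values)}


# Sample record rebuilt from the question's values; the alignment
# inside each field is an assumption.
sample = b" ".join([
    b"011084", b"31.0581".rjust(8), b"-87.0547".rjust(9), b"26.0".rjust(6),
    b"AL", b"BREWTON 3 SSE".ljust(30), b"------", b"------", b"------", b"+6",
]) + b"\n"

station = parse_record(sample)
print(station["name"], station["longitude"], station["utc_offset"])
```

With a real file, each line read in "rb" mode could be passed to parse_record directly.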
I think I understand from your question/comments what you are looking for. If we assume that Real, Character, and Integer are the only data types, then the following code should work. (I will also assume that the format file you showed is tab delimited):
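fields = {}
types = {"Real": float, "Character": str, "Integer": int}
with open("format.txt") as f:
    for line in f:
        values = [v.strip() for v in line.split("\t")]
        # Skip the header, the dashed rule, and anything else that is not a field row
        if len(values) != 3 or values[2] not in types:
            continue
        start, end = values[1].split("-")
        # Positions are 1-based and inclusive; slices are 0-based with an exclusive end
        fields[values[0]] = {"start": int(start) - 1, "end": int(end), "type": types[values[2]]}

results = []
with open("filename.txt") as f:
    for line in f:
        result = {}
        for key, spec in fields.items():
            # strip() removes the padding around each fixed-width value
            result[key] = spec["type"](line[spec["start"]:spec["end"]].strip())
        results.append(result)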
You should end up with results containing a list of dictionaries where each dictionary is a mapping from key names in the format file to data values in the correct data type.
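To make the shape of that output concrete, here is a small self-contained sketch that runs the same idea on in-memory text instead of files. The tab-delimited format text is the assumption stated above, only three of the fields are included for brevity, and the padding of the sample data line is also assumed.

```python
import io

TYPES = {"Real": float, "Character": str, "Integer": int}

# Assumed tab-delimited format description (three fields only).
format_text = (
    "COOP ID\t1-6\tCharacter\n"
    "LATITUDE\t8-15\tReal\n"
    "UTC OFFSET\t89-90\tInteger\n"
)

fields = {}
for row in io.StringIO(format_text):
    name, position, type_name = row.rstrip("\n").split("\t")
    start, end = position.split("-")
    # 1-based inclusive columns -> 0-based slice bounds with exclusive end
    fields[name] = (int(start) - 1, int(end), TYPES[type_name])

# A 90-column sample line; the columns between LATITUDE and UTC OFFSET are left blank.
data_text = "011084  31.0581" + " " * 73 + "+6\n"

results = []
for line in io.StringIO(data_text):
    results.append({
        name: convert(line[start:end].strip())
        for name, (start, end, convert) in fields.items()
    })

print(results)  # → [{'COOP ID': '011084', 'LATITUDE': 31.0581, 'UTC OFFSET': 6}]
```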
It seems like you could write a function using strings and slices fairly simply. Since positions 1-6 hold the first field, string[0:6] would be the first element. Does it need to be extensible, or is it likely a one-off?
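A rough sketch of that slicing approach, assuming the column positions from the question are hard-coded. The FIELDS table, the lowercase key names, and parse_line are hypothetical names chosen for illustration, and the padding inside the sample record is an assumption.

```python
# Hypothetical field table: (key, first column, last column, converter),
# with columns 1-based and inclusive as in the question.
FIELDS = [
    ("coop_id", 1, 6, str),
    ("latitude", 8, 15, float),
    ("longitude", 17, 25, float),
    ("elevation", 27, 32, float),
    ("state", 34, 35, str),
    ("name", 37, 66, str),
    ("component1", 68, 73, str),
    ("component2", 75, 80, str),
    ("component3", 82, 87, str),
    ("utc_offset", 89, 90, int),
]


def parse_line(line):
    """Slice one fixed-width record into a dict keyed by field name."""
    record = {}
    for key, first, last, convert in FIELDS:
        # 1-based inclusive columns -> 0-based slice with exclusive end
        record[key] = convert(line[first - 1:last].strip())
    return record


# Sample record rebuilt from the question's values; the alignment
# inside each field is an assumption.
sample = " ".join([
    "011084", "31.0581".rjust(8), "-87.0547".rjust(9), "26.0".rjust(6),
    "AL", "BREWTON 3 SSE".ljust(30), "------", "------", "------", "+6",
])

station = parse_line(sample)
print(station["name"], station["latitude"], station["utc_offset"])
# → BREWTON 3 SSE 31.0581 6
```

For a one-off this is probably enough; if the layout changes often, the FIELDS table is the only part that would need editing.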