How to read a file in Python that contains newlines and tabs into a string?

2023-03-22 22:42

I am trying to read a file that contains tabs, newlines, and so on; the data is in JSON format.

When I read it using file.read(), readlines(), etc., all the newlines and tabs are read as well.

I have tried rstrip(), split(), etc., but in vain; maybe I am missing something:

Here is essentially what I am doing:

 f = open('/path/to/file.txt')
 line = f.readlines()
 line.split('\n')

This is the data (including the raw tabs, hence the poor formatting):

        {
      "foo": [ {
       "id1" : "1",
   "blah": "blah blah",
       "id2" : "5885221122",
      "bar" : [
              {  
         "name" : "Joe JJ", 
          "info": [                 {
         "custid": "SSN",    
         "type" : "String",             }        ]
        }     ]     }     ]  }

I was wondering if we can ignore it elegantly.

Also hoping to use json.dumps()

Why not just use json.load() if the data is json?

import json
with open('myfile.txt', 'r') as f:
    d = json.load(f)
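
The JSON parser treats tabs, newlines, and spaces between tokens as insignificant, so they don't need to be stripped before parsing. A minimal sketch, using a made-up inline string in place of the file and assuming the text is valid JSON:

```python
import json

# Hypothetical sample: valid JSON padded with raw tabs and newlines.
raw = '\t{\n\t\t"id1" :\t"1",\n   "bar" : [\n\t{ "name" : "Joe JJ" }\n ]\n}\n'

data = json.loads(raw)       # whitespace between tokens is skipped
compact = json.dumps(data)   # re-serialized on a single line
print(compact)
```

On Python 3.7+ this prints {"id1": "1", "bar": [{"name": "Joe JJ"}]}, which also covers the json.dumps() part of the question.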

A little hack, inefficient I guess:

with open("/path/to/file.txt") as f:
    lines = f.read().replace("\n", "").replace("\t", "").replace(" ", "")

print(lines)
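
One thing to watch for: replacing " " also deletes the spaces inside string values, so "blah blah" becomes "blahblah". If a compact single-line string is the goal, parsing and re-serializing may be safer. A rough sketch with a made-up sample, assuming the text is valid JSON:

```python
import json

# Hypothetical sample with a space inside a string value.
raw = '{\n\t"blah" : "blah blah",\n\t"id2" : "5885221122"\n}'

# Naive stripping also removes the space inside "blah blah".
stripped = raw.replace("\n", "").replace("\t", "").replace(" ", "")

# Parsing and re-serializing drops only the whitespace between tokens.
compact = json.dumps(json.loads(raw), separators=(",", ":"))

print(stripped)
print(compact)
```

The separators argument just removes the default spaces after "," and ":" in the output.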

Where did that structure come from? My condolences. Anyway, as a start you might try this:

import re
cleanedData = re.sub('[\n\t]', '', f.read())  # f is the file object opened in the question

That's a brute-force removal of newline and tab characters. What it returns might be suitable for feeding into json.loads. It'll depend greatly on whether or not the contents of the file are actually valid JSON once you clear out the extra whitespace and line breaks.

If you want to loop over each line, you can just:

for line in open('path/to/file.txt'):
  # Remove whitespace from both ends of line
  line = line.strip()

  # Do whatever you want with line
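
To end up with a single string rather than a loop body, the stripped lines can be joined. A small sketch, where io.StringIO and the sample text are stand-ins for the real file:

```python
import io

# io.StringIO stands in for the real file so the sketch is self-contained.
fake_file = io.StringIO('\t{\n\t  "id1" : "1",\n  "blah": "blah blah"\n}\n')

# strip() trims only the ends of each line, so spaces inside values survive.
one_line = "".join(line.strip() for line in fake_file)
print(one_line)
```

With a real file, replace fake_file with the object returned by open() inside a with block.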

What about the usage of the json module?

import json

# json.load() takes a file object; json.loads() takes a string
with open("/path/to/file.txt", "r") as source:
    tmp = json.load(source)

with open("/path/to/file2.txt", "w") as output:
    output.write(json.dumps(tmp, sort_keys=True, indent=4))

$ cat foo.json | python -mjson.tool
Expecting property name: line 11 column 41

The trailing comma in "type" : "String", is causing the JSON decoder to choke, because JSON does not allow a comma before a closing brace. If it weren't for that problem, you could use json.load() to load the file directly.
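
The same diagnosis can be made from inside Python by catching json.JSONDecodeError (available since Python 3.5). This is only a sketch with a made-up sample; the exact message and the reported position differ between Python versions, so no output is shown:

```python
import json

# Made-up sample with the same kind of trailing comma as in the question.
bad = '{\n  "custid": "SSN",\n  "type": "String",\n}'

error = None
try:
    json.loads(bad)
except json.JSONDecodeError as exc:
    error = exc
    # msg, lineno and colno describe where the decoder stopped.
    print(f"{exc.msg}: line {exc.lineno} column {exc.colno}")
```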

In other words, you have malformed JSON, meaning you'll need to perform a replacement operation before feeding it to json.loads(). Since you'll need to read the file into a string completely to do the replacement operation anyway, use json.loads(jsonstr) instead of json.load(jsonfilep):

    >>> import json, re
    >>> jsonfilep = open('foo.json')
    >>> jsonstr = re.sub(r'''(["'0-9.]\s*),\s*}''', r'\1}', jsonfilep.read())
    >>> jsonobj = json.loads(jsonstr)
    >>> jsonstr = json.dumps(jsonobj)
    >>> print(jsonstr)
    {"foo": [{"blah": "blah blah", "id2": "5885221122", "bar": [{"info":
    [{"type": "String", "custid": "SSN"}], "name": "Joe JJ"}], "id1": "1"}]}

I used the re module rather than a plain string replacement only because the trailing comma could follow any value, whether a number or a string.
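
If trailing commas can also appear before a closing ], a slightly broader pattern might help. This is a sketch only: strip_trailing_commas is a hypothetical helper, the sample is made up, and the regex is not string-aware, so a string value that itself contains ",}" or ",]" would be altered too.

```python
import json
import re

def strip_trailing_commas(text):
    """Hypothetical helper: drop a comma that directly precedes } or ]."""
    # Not string-aware: a value containing ",}" or ",]" would also be changed.
    return re.sub(r",\s*([}\]])", r"\1", text)

# Made-up sample with trailing commas before both } and ].
raw = '{"info": [{"custid": "SSN", "type": "String", }, ], }'

cleaned = strip_trailing_commas(raw)
data = json.loads(cleaned)
print(data)
```

For input that might contain such sequences inside strings, a tolerant parser would be a safer choice than a regex.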
