Python: How do I split the file?

2023-01-08 09:35

I have a text file containing the output of ls -R on the etc directory of a Linux system. Example file:

etc:  
ArchiveSEL  
xinetd.d

etc/cmm:  
CMM_5085.bin  
cmm_sel  
storage.cfg  

etc/crontabs:  
root

etc/pam.d:  
ftp    
rsh  

etc/rc.d:  
eth.set.sh  
rc.sysinit  

etc/rc.d/init.d:  
cmm  
functions  
userScripts  

etc/security:  
access.conf  
console.apps  
time.conf

etc/security/console.apps:  
kbdrate

etc/ssh:  
ssh_host_dsa_key  
sshd_config  

etc/var:  
setUser  
snmpd.conf

etc/xinetd.d:  
irsh  
wu-ftpd

I would like to split it by subdirectory into several files. Example output files would be: etc.txt, etcCmm.txt, etcCrontabs.txt, etcPamd.txt, ...

Can someone give me Python code that can do that? Notice that the subdirectory lines end with ':', but I'm just not smart enough to write the code. Some examples would be appreciated. Thank you :)

Maybe something like this? With re.M, ^ matches at the start of every line, so the pattern can pick out each directory header together with the entries below it. The last part just iterates over the matches and creates the files...

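import re

with open('data.txt') as src:  # the ls -R listing shown above
    data = src.read()

# A block is a "dir:" header (trailing spaces allowed) followed by its
# non-blank entry lines; the final entry may lack a trailing newline.
blocks = re.findall(r'^([^\n:]+):[ \t]*\n((?:[^\n]+(?:\n|$))*)', data, re.M)

for dirname, body in blocks:
    # 'etc/cmm' becomes 'etccmm.txt'
    with open(dirname.replace('/', '') + '.txt', 'w') as f:
        for line in body.splitlines():
            f.write(line.strip() + '\n')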

You will need to do it line by line. If line.rstrip().endswith(":"), then you are in a new subdirectory (the rstrip() matters because the header lines in the sample carry trailing spaces). From then on, each line is a new entry in that subdirectory, until another line ends with :.
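As a minimal sketch of that idea (the function name group_by_directory and the sample list are made up for illustration), the loop below collects the entries under each header in a dict, without writing anything to disk yet:

```python
def group_by_directory(lines):
    """Map each 'dir:' header to the list of entries that follow it."""
    groups = {}
    current = None  # directory we are currently inside, if any
    for raw in lines:
        line = raw.strip()  # drop the newline and any trailing spaces
        if not line:
            continue  # blank separator between directories
        if line.endswith(":"):
            current = line[:-1]  # header line: start a new group
            groups[current] = []
        elif current is not None:
            groups[current].append(line)  # entry of the current directory
    return groups


# A shortened version of the listing from the question.
sample = ["etc:  ", "ArchiveSEL  ", "xinetd.d", "",
          "etc/cmm:  ", "CMM_5085.bin  ", "cmm_sel  "]
groups = group_by_directory(sample)
print(groups)
```

Passing an open file object instead of the list should work the same way, since iterating over a file yields its lines.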

From my understanding, you just want to split one text file into several, ambiguously named, text files.

So you'd check whether a line ends with :. If it does, you open a new text file, like etcCmm.txt, and every line that you read from the source text from that point on gets written into etcCmm.txt. When you encounter another line that ends in :, you close the previously opened file, create a new one, and continue.

I'm leaving a few things for you to do yourself, such as figuring out what to call the text file, reading a file line-by-line, etc.
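For what it's worth, here is one way the pieces left open above could fit together. It is only a sketch: output_name and split_listing are hypothetical helpers, and the camel-case naming rule is guessed from the etcCmm.txt / etcPamd.txt examples in the question.

```python
import os
import re


def output_name(dirname):
    """Turn 'etc/pam.d' into 'etcPamd.txt' (naming rule assumed from the question)."""
    parts = [re.sub(r"\W+", "", p) for p in dirname.split("/")]
    # Fallback for headers such as '.:' that leave no usable characters.
    parts = [p for p in parts if p] or ["unnamed"]
    tail = "".join(p[:1].upper() + p[1:] for p in parts[1:])
    return parts[0] + tail + ".txt"


def split_listing(lines, out_dir="."):
    """Write each directory's entries to its own file; return the names created."""
    created = []
    out = None  # file currently being written, if any
    try:
        for raw in lines:
            line = raw.strip()
            if line.endswith(":"):
                if out is not None:
                    out.close()  # finished with the previous directory
                name = output_name(line[:-1])
                out = open(os.path.join(out_dir, name), "w")
                created.append(name)
            elif line and out is not None:
                out.write(line + "\n")
    finally:
        if out is not None:
            out.close()
    return created


# Example use, assuming the listing is saved as "listing.txt" (a placeholder name):
# with open("listing.txt") as src:
#     split_listing(src)
```

Only one output file is open at a time, so this should cope with listings of any size.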

Use a regexp like '.*:'.
Use file.readline().
Use loops.

If Python is not a must, you can use this one-liner (the pattern allows for trailing spaces after the colon, as in the sample):

awk '/:[ \t]*$/{gsub(/:|\/|[ \t]/,"");fn=$0}{print $0 > (fn".txt")}' file

Here's what I would do:

Read the file into memory (myfile = open(filename).read() should do).

Then split the file along the delimiters:

import re
myregex = re.compile(r"^(.*):[ \t]*$", re.MULTILINE)
arr = myregex.split(myfile)[1:] # dropping everything before the first directory entry
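To see why the [1:] and the later pairing work, here is a small illustration on a shortened sample (the variable names are only for this example). Because the pattern contains a capturing group, split() keeps the captured directory names in its result, so names and bodies alternate after the first element:

```python
import re

header = re.compile(r"^(.*):[ \t]*$", re.MULTILINE)
text = "etc:  \nArchiveSEL  \nxinetd.d\n\netc/cmm:  \ncmm_sel  \n"

parts = header.split(text)
# parts[0] is whatever precedes the first header (empty here);
# from index 1 on, directory names and their bodies alternate.
names = parts[1::2]
bodies = parts[2::2]

for name, body in zip(names, bodies):
    print(name, body.split())
```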

Then convert the array to a dict, removing unwanted characters along the way:

mydict = dict([(re.sub(r"\W+","",k), v.strip()) for (k,v) in zip(arr[::2], arr[1::2])])

Then write the files:

for name, content in mydict.items():  # use iteritems() on Python 2
    with open(name + ".txt", "w") as output:
        output.write(content)
