Bash or python for changing spacing in files

2022-12-24 09:47 问答作者：

I have a set of 10000 files. In all of them, the second line, looks like:

AAA 3.429 3.84

so there is just one space (requirement) between AAA and the two other columns. The rest of lines on each file are completely different and correspond to 10 columns of numbers.

Randomly, in around 20% of the files, and due to some errors, one gets

BBB  3.429 3.84

so now there are two spaces between the first and second column.

This is a big error so I need to fix it, changing from 2 to 1 space in the files where the error takes place.

The first approach I though开发者_如何转开发t of was to write a bash script that for each file reads the 3 values of the second line and then prints them with just one space, doing it for all the files.

I wonder what do oyu think about this approach and if you could suggest something better, bashm python or someother approach.

Thanks

Performing line-based changes to text files is often simplest to do in sed.

sed -e '2s/  */ /g' infile.txt

will replace any runs of multiple spaces with a single space. This may be changing more than you want, though.

sed -e '2s/^\([^ ]*\)  /\1 /' infile.txt

should just replace instances of two spaces after the first block of space-free text with a single space (though I have not tested this).

(edit: inserted 2 before s in each instance to tie the edit to the second line, specifically.)

Use sed.

for file in *
do
  sed -i '' '2s/  / /' "$file"
done

The -i '' flag means to edit in-place without a backup.

Or use ed!

for file in *
do
  printf "2s/  / /\nwq\n" |ed -s "$file"
done

if the error always can occur at 2nd line,

for file in file*
do
    awk 'NR==2{$1=$1}1' file >temp
    mv temp "$file"    
done

or sed

sed -i.bak '2s/  */ /' file* # do 2nd line

Or just pure bash scripting

i=1
while read -r line
do
  if [ "$i" -eq 2 ];then
    echo $line
  else
    echo "$line"
  fi
  ((i++))
done <"file"

Since it seems every column is separated by one space, another approach not yet mentioned is to use tr to squeeze all multi spaces into single spaces:
tr -s " " < infile > outfile

I am going to be different and go with AWK:

awk '{print $1,$2,$3}' file.txt > file1.txt

This will handle any number of spaces between fields, and replace them with one space

To handle a specific line you can add line addresses:

awk 'NR==2{print $1,$2,$3} NR!=2{print $0}' file.txt > file1.txt

i.e. rewrite line 2, but leave unchanged the other lines.

A line address can be a regular expression as well:

awk '/regexp/{print $1,$2,$3} !/regexp/{print}' file.txt > file1.txt

This answer assumes you don't want to mess with any except the second line.

#!/usr/bin/env python
import sys, os
for fname in sys.argv[1:]:
    with open(fname, "r") as fin:
        line1 = fin.readline()
        line2 = fin.readline()
        fixedLine2 = " ".join(line2.split()) + '\n'
        if fixedLine2 == line2:
            continue
        with open(fname + ".fixed", "w") as fout:
            fout.write(line1)
            fout.write(line2)
            for line in fin:
                fout.write(line)
    # Enable these lines if you want the old files replaced with the new ones.
    #os.remove(fname)
    #os.rename(fname + ".fixed", fname)

I don't quite understand, but yes, sed is an option. I don't think any POSIX compliant version of sed has an in file option (-i), so a fully POSIX compliant solution would be.

sed -e 's/^BBB  /BBB /' <file> > <newfile>

Use sed:

sed -e 's/[[:space:]][[:space:]]/ /g' yourfile.txt >> newfile.txt

This will replace any two adjacent spaces with one. The use of [[:space:]] just makes it a little bit clearer

sed -i -e '2s/  / /g' input.txt

-i: edit files in place

继续阅读：bash python

Bash or python for changing spacing in files

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？