How to remove trailing whitespace in code, using another script?

2023-02-19 02:46

Something like:

import fileinput

for lines in fileinput.FileInput("test.txt", inplace=1):
    lines = lines.strip()
    if lines == '': continue
    print(lines)

But nothing is being printed on stdout.

Assuming some string named foo:

foo.lstrip() # to remove leading white space
foo.rstrip() # to remove trailing whitespace
foo.strip()  # to remove both leading and trailing whitespace
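
As a quick illustration of how the three methods differ, here is a small sketch. The sample string is made up for this example and is not from the question.

```python
# Hypothetical sample string with leading and trailing whitespace.
foo = "  \thello world \t \n"

left_stripped = foo.lstrip()    # leading whitespace removed
right_stripped = foo.rstrip()   # trailing whitespace (including the newline) removed
both_stripped = foo.strip()     # both ends removed

# repr() makes the remaining whitespace visible.
print(repr(left_stripped))
print(repr(right_stripped))
print(repr(both_stripped))
```

Note that with no argument, rstrip() also removes the newline at the end of a line, so it has to be added back when writing lines out again.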

fileinput seems to be for multiple input streams. This is what I would do:

with open("test.txt") as file:
    for line in file:
        line = line.rstrip()
        if line:
            print(line)

You don't see any output from the print statements because FileInput redirects stdout to the input file when the keyword argument inplace=1 is given. This causes the input file to effectively be rewritten and if you look at it afterwards the lines in it will indeed have no trailing or leading whitespace in them (except for the newline at the end of each which the print statement adds back).

If you only want to remove trailing whitespace, you should use rstrip() instead of strip(). Also note that the if lines == '': continue is causing blank lines to be completely removed (regardless of whether strip or rstrip gets used).
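
If the goal is to rewrite the file in place while keeping blank lines, something along these lines might work. It is only a sketch: the function name strip_trailing_in_place is made up for this example, and it assumes Python 3, where FileInput can be used as a context manager.

```python
import fileinput


def strip_trailing_in_place(filename):
    """Rewrite filename with trailing whitespace removed from each line.

    Blank lines are kept as empty lines instead of being dropped.
    (Hypothetical helper name, not part of the fileinput module.)
    """
    # With inplace=True, print() writes into the file, not to the terminal.
    with fileinput.FileInput(filename, inplace=True) as stream:
        for line in stream:
            # rstrip() also removes the newline; print() adds it back.
            print(line.rstrip())
```

Calling strip_trailing_in_place("test.txt") would then rewrite test.txt itself, so it is worth trying on a copy first.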

Unless your intent is to rewrite the input file, you should probably just use for line in open(filename):. Otherwise you can see what's being written to the file by simultaneously echoing the output to sys.stderr using something like the following (which will work in both Python 2 and 3):

from __future__ import print_function
import fileinput
import sys

for line in (line.rstrip() for line in
                fileinput.FileInput("test.txt", inplace=1)):
    if line:
        print(line)
        print(line, file=sys.stderr)

If you're looking to tidy up for PEP8, this will trim trailing whitespace for your whole project:

import os

PATH = '/path/to/your/project'

for path, dirs, files in os.walk(PATH):
    for f in files:
        file_name, file_extension = os.path.splitext(f)
        if file_extension == '.py':
            path_name = os.path.join(path, f)
            with open(path_name, 'r') as fh:
                new = [line.rstrip() for line in fh]
            with open(path_name, 'w') as fh:
                fh.writelines('%s\n' % line for line in new)

This is the sort of thing that sed is really good at: $ sed 's/[ \t]*$//'. Be aware that, depending on your sed, you may need to type a literal TAB character instead of \t for this to work.
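
For comparison, roughly the same substitution can be written with Python's re module. This is a sketch rather than a drop-in replacement for sed: the names TRAILING_WS and strip_trailing are made up here, and it works on a string already read into memory.

```python
import re

# Matches a run of spaces or tabs immediately before the end of a line.
# re.MULTILINE makes $ match at the end of every line, not only the last.
TRAILING_WS = re.compile(r"[ \t]+$", re.MULTILINE)


def strip_trailing(text):
    """Return text with trailing spaces and tabs removed from every line."""
    return TRAILING_WS.sub("", text)


cleaned = strip_trailing("def f():  \n\treturn 1\t \n")
print(repr(cleaned))
```

Leading indentation is left alone because the pattern only matches whitespace that runs up to the end of a line.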

It seems fileinput.FileInput behaves like a generator: you can iterate over it only once. After that, all items have been consumed and calling its next method raises StopIteration. If you want to iterate over the lines more than once, you can put them in a list:

list(fileinput.FileInput('test.txt'))

Then call rstrip on them.
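
Put together, that might look like the sketch below. The helper name read_stripped_lines is made up for this example, and it assumes Python 3, where FileInput can be used as a context manager.

```python
import fileinput


def read_stripped_lines(filename):
    """Return the lines of filename as a list, without trailing whitespace.

    (Hypothetical helper name, used only for illustration.)
    """
    # Materialise the lines first, so they can be reused afterwards.
    with fileinput.FileInput(filename) as stream:
        lines = list(stream)
    return [line.rstrip() for line in lines]
```

Unlike the FileInput object itself, the returned list can be looped over as many times as needed.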

Save as fix_whitespace.py:

#!/usr/bin/env python
"""
Fix trailing whitespace and line endings (to Unix) in a file.
Usage: python fix_whitespace.py foo.py
"""

import os
import sys


def main():
    """ Parse arguments, then fix whitespace in the given file """
    if len(sys.argv) == 2:
        fname = sys.argv[1]
        if not os.path.exists(fname):
            print("Python file not found: %s" % sys.argv[1])
            sys.exit(1)
    else:
        print("Invalid arguments. Usage: python fix_whitespace.py foo.py")
        sys.exit(1)
    fix_whitespace(fname)


def fix_whitespace(fname):
    """ Fix whitespace in a file """
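    # newline="" reads the file without translating line endings (Python 3)
    with open(fname, "r", newline="") as fo:
        original_contents = fo.read()
    # Default text mode (universal newlines) converts line endings to Unix
    with open(fname, "r") as fo:
        contents = fo.read()
    lines = contents.split("\n")
    fixed = 0
    for k, line in enumerate(lines):
        new_line = line.rstrip()
        if len(line) != len(new_line):
            lines[k] = new_line
            fixed += 1
    # newline="\n" writes Unix line endings on every platform
    with open(fname, "w", newline="\n") as fo:
        fo.write("\n".join(lines))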
    if fixed or contents != original_contents:
        print("************* %s" % os.path.basename(fname))
    if fixed:
        slines = "lines" if fixed > 1 else "line"
        print("Fixed trailing whitespace on %d %s" \
              % (fixed, slines))
    if contents != original_contents:
        print("Fixed line endings to Unix (\\n)")


if __name__ == "__main__":
    main()

It's a bit surprising seeing multiple answers suggesting to use python for this task, as there's no need to write a multi-line program for this.

Standard Unix tools like sed, awk or perl can achieve this easily straight from the command-line.

For example, anywhere you have perl (Windows, Mac, Linux), the following should achieve what the OP asked:

perl -i -pe 's/[ \t]+$//;' files...

Explanation of the arguments to perl:

-i   # run the edit "in place" (modify the original file)
-p   # implies a loop with a final print over every input line
-e   # next arg is the perl expression to apply (to every line)

s/[ \t]+$// is a substitution regex of the form s/FROM/TO/: it replaces every non-empty run of spaces or tabs at the end of a line with nothing.

Advantages:

One liner, no programming needed
Works on any number of files
Works correctly on standard-input (no file arguments given)

Edit:

Newer versions of perl support \h (any horizontal-space character), so the solution becomes even shorter:

perl -i -pe 's/\h+$//;' files...

More generally, if you want to modify any number of files directly from the command line, replacing every appearance of FOO with BAR, you may always use this generic template:

perl -i -pe 's/FOO/BAR/g' files...
