Find extra space / new line after a closing ?> (php tag)

So I have a space/new line after a closing ?> (php tag) that is breaking my application.

How can I find it easily? I have thousands of files and 100000 lines of code in this app.

Ideally I'm after some regex combined with find and grep to run on a Unix box.


The problem here is normal grep doesn't match multiple lines. So, I would install pcregrep and try the following command:

pcregrep -rMl '\?>[\s\n]+\z' *

This will match all files in the folder and subfolders (the -r part) using PCRE multiline match (the -M part), and only list their filenames (the -l part).

As for the pattern, well that matches ?> followed by 1 or more whitespace or newline characters, followed by the end of the file \z. I found though, when I ran this on my folder, many of the PHP files do in fact end with a single newline. So you can update that regex to be '\?>[\s\n]+\n\z' to match files with whitespace over and above the single \n character terminator.

Lastly, you can always use od -c filename to print an unambiguous representation of the file if you need to check the exact character sequence at its end.
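For a large file it may be easier to dump only the last few bytes rather than the whole thing. A minimal sketch, assuming a POSIX shell; the function name show_ending and the default of 8 bytes are illustrative choices, not part of the answer above:

```shell
# show_ending FILE [N]: print the last N bytes (default 8) of FILE
# in od's unambiguous character notation.
show_ending() {
    tail -c "${2:-8}" "$1" | od -c
}

# Demo on a throwaway file that ends with "?>", a space and two newlines.
demo=$(mktemp)
printf '<?php echo 1; ?> \n\n' > "$demo"
show_ending "$demo"
rm -f "$demo"
```

Anything od shows after the ? and > characters (spaces, \n, \r, \t) is output that PHP will send to the client.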


Use Perl:

perl -0777 -i -pe 's/\s*$//s' *.php
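  • -0777 slurps the whole file (-0 will be OK too for files that contain no NUL bytes)
  • -i edits in place, so each file is replaced with the result
  • -p loops over the input and prints it after the expression has run
  • -e supplies the Perl expression

s/\s*$//s - with the whole file read as a single string, replace any whitespace at the end with nothing. Note that this also removes the final newline.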


This is possible with regular grep

grep -Pz '\?>[\s]+$' -Rl

This searches all files, starting from the current directory, and lists those that have a ?> followed by whitespace at the end of the file.

  • -P Interpret the pattern as a Perl-compatible regular expression (PCRE).
  • -z Treats the input file as one long line - this is in part what makes it work
  • [\s]+ matches at least one white space - including newlines

If you want to match PHP files only:

find -name '*.php' | xargs grep -Pz '\?>[\s]+$' -l

To search for whitespace at the beginning of a file, before the opening <? tag:

find -name '*.php' | xargs grep -Pz '^[\s]+<\?' -l
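Piping find into xargs splits on whitespace, so a file name containing a space would be passed to grep in pieces. A possible variant, assuming GNU grep built with PCRE support, lets find call grep directly and uses \z to anchor at the absolute end of the file; the function name list_trailing_ws is illustrative:

```shell
# list_trailing_ws DIR: list *.php files under DIR in which a "?>" is
# followed only by whitespace up to the very end of the file.
# Assumes GNU grep with -P support.
list_trailing_ws() {
    # -exec ... {} + hands file names to grep intact, spaces included.
    find "$1" -type f -name '*.php' -exec grep -lPz '\?>\s+\z' {} +
}

# Demo in a scratch directory with one clean and one offending file.
demo=$(mktemp -d)
printf '<?php echo 1; ?>' > "$demo/clean.php"
printf '<?php echo 1; ?>\n\n' > "$demo/two newlines.php"
list_trailing_ws "$demo"
rm -rf "$demo"
```

Like the commands above, this also reports files that end in a single newline after ?>; changing the pattern to '\?>\s+\n\z' tolerates that one newline.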


This works on my box:

for i in `find . -name "*.php"`; do (echo -n "$i: "; tail -c 3 $i) | grep -v "[?]>"; done

The idea is that you take just the last 3 characters with tail, then discard the files where those are '?', '>' and newline. If there's a space or another newline, you won't get the '?' character, so the file is listed.
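The backtick loop breaks on file names that contain spaces. A sketch of the same last-three-bytes check that avoids this, assuming a POSIX shell; the function name report_suspect_endings is made up for illustration, and the bytes are compared as hex so newlines are not lost:

```shell
# report_suspect_endings DIR: print each *.php file under DIR whose last
# three bytes do not contain "?>" (hex 3f 3e). Same heuristic as the loop
# above, so a file with no closing tag at all is reported too.
report_suspect_endings() {
    find "$1" -type f -name '*.php' -exec sh -c '
        for f do
            # Hex dump of the last 3 bytes with all separators removed.
            last=$(tail -c 3 "$f" | od -An -tx1 | tr -d " \n")
            case $last in
                3f3e|3f3e??|??3f3e) ;;      # "?>" is among the last 3 bytes
                *) printf "%s\n" "$f" ;;    # something else trails the tag
            esac
        done
    ' sh {} +
}

report_suspect_endings .
```

A file ending in ?> plus one newline, or in ?> with nothing after it, is treated as clean; two newlines or a trailing space get the file listed.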


sed -e :a -e '/^[ \n]*$/{$d;N;ba' -e '}' -e '$ s/\([^ ]\)*?>[ ]*/\1?>/' file.php > new_file.php

Run this for each file. It has not been completely tested.

Remember to write to a temporary file and, once sed has finished, copy the new file over the original.
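The temporary-file workflow could be wrapped in a small loop. This is only a sketch, assuming GNU sed and mktemp; it reuses the sed script above unchanged, so it inherits whatever limitations that script has, and the function name clean_with_sed is illustrative:

```shell
# clean_with_sed FILE...: run the sed script on each file via a temporary
# file, and overwrite the original only when sed succeeded and the result
# actually differs.
clean_with_sed() {
    for f do
        tmp=$(mktemp) || return 1
        if sed -e :a -e '/^[ \n]*$/{$d;N;ba' -e '}' \
               -e '$ s/\([^ ]\)*?>[ ]*/\1?>/' "$f" > "$tmp" &&
           ! cmp -s "$f" "$tmp"
        then
            # Writing through cat keeps the permissions of the original.
            cat "$tmp" > "$f"
        fi
        rm -f "$tmp"
    done
}

# Example call (commented out because it modifies files):
# clean_with_sed ./*.php
```

Files that sed leaves unchanged are never rewritten, so their modification times stay as they were.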


This works for me...

<\?php | \?>

If you need to use it in a sublime-settings file or something like that which doesn't like backslashes, you might have to add an extra backslash for each of them, like so...

<\\?php | \\?>

Hope that helps!


This worked for me to find whitespace after the closing tag at the end of PHP files:

find -name '*.php' | xargs grep -Pz '\?>[\s]+$' -l


grep '?> ' *.php? Of course, it may not be a space and could be a linebreak or a tab, so you may want to try other characters.
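Rather than trying one character at a time, a POSIX character class covers spaces, tabs and carriage returns in a single pass. A minimal sketch; the function name list_same_line_ws is illustrative. Plain grep works line by line, so it cannot see extra blank lines after the tag, only whitespace on the same line:

```shell
# list_same_line_ws FILE...: list files where "?>" is followed by one or
# more whitespace characters before the end of that line.
list_same_line_ws() {
    grep -l '?>[[:space:]][[:space:]]*$' "$@"
}

# Example call:
# list_same_line_ws *.php
```

This checks every line, not only the last one, so a match in the middle of a template is listed as well.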


Using Notepad++ you can easily replace across all documents at the same time: drag and drop the folder of files into it and press Ctrl + H to open the Replace dialog. You can also use regex there.
