Bash: how to automatically insert spacing characters according to indent?

Posted 2023-02-05 20:17

I've got lots of debug statements making for unreadable stacktraces (not my call) like this:

 00:53:59,906  - j.util.indexing.FileBasedIndex - START INDEX SHUTDOWN 
 00:53:09,192  - .impl.stores.XmlElementStorage - Document was not loaded for $APP_CONFIG$/macros.xml 
 00:53:09,195  - s.impl.stores.FileBasedStorage - Document was not loaded for $APP_CONFIG$/quicklists.xml file is null 
 00:53:09,195  - .impl.stores.XmlElementStorage - Document was not loaded for $APP_CONFIG$/quicklists.xml 
 00:53:09,696  - ij.openapi.wm.impl.IdeRootPane - App initialization took 5584 ms 
 00:53:11,677  -                  TestNG Runner - Create TestNG Template Configuration 
 00:53:13,628  - indexing.UnindexedFilesUpdater - Unindexed files update started 
 00:53:15,370  - indexing.UnindexedFilesUpdater - Unindexed files update done 
 00:53:20,873  - tor.impl.FileEditorManagerImpl - Project opening took 10569 ms 
 00:53:31,862  - s.impl.stores.FileBasedStorage - Document was not loaded for $APP_CONFIG$/intentionSettings.xml file is null 
 00:53:31,862  - .impl.stores.XmlElementStorage - Document was not loaded for $APP_CONFIG$/intentionSettings.xml 
 00:54:00,723  - j.util.indexing.FileBasedIndex - END INDEX SHUTDOWN

And I want them to look like this:

 00:53:59,906  - j.util.indexing.FileBasedIndex - START INDEX SHUTDOWN 
 00:53:09,192  - .impl.stores.XmlElementStorage -   Document was not loaded for $APP_CONFIG$/macros.xml 
 00:53:09,195  - s.impl.stores.FileBasedStorage -   Document was not loaded for $APP_CONFIG$/quicklists.xml file is null 
 00:53:09,195  - .impl.stores.XmlElementStorage -   Document was not loaded for $APP_CONFIG$/quicklists.xml 
 00:53:09,696  - ij.openapi.wm.impl.IdeRootPane -     App initialization took 5584 ms 
 00:53:11,677  -                  TestNG Runner -       Create TestNG Template Configuration 
 00:53:13,628  - indexing.UnindexedFilesUpdater -        Unindexed files update started 
 00:53:15,370  - indexing.UnindexedFilesUpdater -        Unindexed files update done 
 00:53:20,873  - tor.impl.FileEditorManagerImpl -     Project opening took 10569 ms 
 00:53:31,862  - s.impl.stores.FileBasedStorage -    Document was not loaded for $APP_CONFIG$/intentionSettings.xml file is null 
 00:53:31,862  - .impl.stores.XmlElementStorage -   Document was not loaded for $APP_CONFIG$/intentionSettings.xml 
 00:54:00,723  - j.util.indexing.FileBasedIndex - END INDEX SHUTDOWN

My code is formatted automatically, so I know exactly how many characters each indent level takes in my source code.

What I want to do is replace automatically every line in my code looking like:

log.debug("some silly debug line");

with:

log.debug(" " + "some silly debug line");

or:

log.debug("  " + "some silly debug line");

depending on the number of indents before the log.debug line.

I'm pretty sure some Bash/shell magic can do this automatically.

Basically I'd start with a:

find . -iname "*java" -exec ...

then I'd like to insert " " on lines starting with log.debug depending on the indent level of these lines.

How would I go about it?

EDIT

Damn: it needs to be able to run several times, without re-shifting the lines already containing log.debug(" " +). It gets more complicated but now thanks to the good answer I've got something to start from :)

Note: before yelling and shooting as to why this ain't a good idea, read my comment(s)

This should work when run multiple times on files:

sed -i.bak 's/^\([[:blank:]]*\)\(log.debug(\)\("[[:blank:]]*" + \)\?/\1\2"\1" + /; s/^log.debug("" + /log.debug(/' test.java

Combined with your find command:

find . -iname "*java" -exec sed -i.bak 's/^\([[:blank:]]*\)\(log.debug(\)\("[[:blank:]]*" + \)\?/\1\2"\1" + /; s/^log.debug("" + /log.debug(/' {} \;

Test run:

$ sed -i.bak '...' test.java
$ diff test.java test.java.bak
(report of differences)
$ sed -i.bak '...' test.java
$ diff test.java test.java.bak
(no differences)
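
For a concrete version of that test run, here is a minimal sketch. It assumes GNU sed (the \? in the pattern is a GNU extension), and the file name Test.java and its three lines are made up for the demo.

```shell
# Scratch directory and sample file; names and contents are hypothetical.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
printf '%s\n' \
    'log.debug("top level");' \
    '    log.debug("one level");' \
    '        log.debug("two levels");' > Test.java

# The sed program from above, stored once so both runs are identical.
prog='s/^\([[:blank:]]*\)\(log.debug(\)\("[[:blank:]]*" + \)\?/\1\2"\1" + /; s/^log.debug("" + /log.debug(/'

sed -i.bak "$prog" Test.java    # first run inserts the indent strings
cat Test.java
sed -i.bak "$prog" Test.java    # second run; Test.java.bak now holds the first run's result
diff Test.java Test.java.bak && echo "second run changed nothing"
```

The unindented line should come out unchanged, while the other two gain a quoted prefix as wide as their own indentation.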

Explanation of sed command:

s/^\([[:blank:]]*\)\(log.debug(\)\("[[:blank:]]*" + \)\?/\1\2"\1" + /

Capture the tabs and spaces, if any, at the beginning of the line into group \1.
Capture the string "log.debug(" (only so we don't have to repeat it verbatim) into group \2.
Capture any existing indent string ("indentlevel" +) into a group so it can be overwritten in case the indentation has changed. The \? makes its existence optional. We don't care that it would be capture group \3 since we're throwing it away. The parentheses are necessary only for the \?.
Now the replacement is output. Capture groups \1 (the indent) and \2 (the "log.debug(") are put back like they were. And "indentlevel" + is added (where "indentlevel" is the same sequence of whitespace characters that indented the line). Note that this may result in nothing between the quotes.

The next part of the command removes the indentation stuff if it's null. You can omit that if you don't mind resulting lines of the form log.debug("" + "foo") when the line isn't indented.

s/^log.debug("" + /log.debug(/
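
To see both halves at work when the indentation has changed since the last run, here is a small sketch (again assuming GNU sed; the two input lines are invented). The first line carries a stale four-space prefix but now sits eight spaces deep; the second carries a stale prefix but is no longer indented at all.

```shell
prog='s/^\([[:blank:]]*\)\(log.debug(\)\("[[:blank:]]*" + \)\?/\1\2"\1" + /; s/^log.debug("" + /log.debug(/'

# Feed two hypothetical lines through the program without touching any file.
out=$(printf '%s\n' \
    '        log.debug("    " + "moved deeper");' \
    'log.debug("    " + "moved to column zero");' | sed "$prog")
printf '%s\n' "$out"
# prints:
#         log.debug("        " + "moved deeper");
# log.debug("moved to column zero");
```

The first substitution rewrites the old prefix to match the current indent, and the second one strips the empty "" + that is left on the unindented line.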

Try sed string replacement:

find . -type f -iname "*.java" -exec sed -i 's/\(\(\s\+\)log.debug(\)\(.*)\)/\1"\2" + \3/' {} \;

Let's take a closer look at the regex patterns in sed -i 's/\(\(\s\+\)log.debug(\)\(.*)\)/\1"\2" + \3/...

Group 1 is the original leading text, including whitespace and "log.debug(" (with the open parenthesis).
\(\(\s\+\)log.debug(\)

Group 2, which is inside Group 1 shown above, captures only the leading whitespace:
\(\s\+\)

Group 3 is the original remaining text after "log.debug(":
\(.*)\)

sed uses these capture groups to form a replacement string, which begins with the original leading text, followed by the original leading whitespace surrounded by quotes, then a plus symbol (for Java string concatenation), and finally the remaining original text:
\1"\2" + \3
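
A quick way to check how this behaves is to pipe a sample line through the same substitution twice. This sketch assumes GNU sed (\s and \+ are GNU extensions) and an invented input line. As far as I can tell, the pattern makes no allowance for a prefix that is already there, so a second pass stacks another one; this version would therefore not meet the run-several-times requirement from the question's EDIT.

```shell
prog='s/\(\(\s\+\)log.debug(\)\(.*)\)/\1"\2" + \3/'

# Apply the substitution once, then again to its own output.
once=$(printf '%s\n' '    log.debug("x");' | sed "$prog")
twice=$(printf '%s\n' "$once" | sed "$prog")
printf '%s\n' "$once" "$twice"
# prints:
#     log.debug("    " + "x");
#     log.debug("    " + "    " + "x");
```

Because \s\+ needs at least one whitespace character, unindented log.debug lines are left alone.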

You can try this awk script:

x.awk

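/log\.debug\(/ {
    # awk strings are 1-indexed; index() returns the position of the call.
    logStart = index($0, "log.debug(");
    # Everything before the call is the leading whitespace.
    spaces = substr($0, 1, logStart - 1);
    gap = "\"" spaces "\"";
    # Position of the opening parenthesis of log.debug(
    parenStart = logStart + length("log.debug");
    firstHalf = substr($0, 1, parenStart);    # up to and including "("
    lastHalf = substr($0, parenStart + 1);    # everything after "("
    print firstHalf gap " + " lastHalf;
    next;
}
{
    print;
}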

awk -f x.awk orig.java > new.java

Easy way to do this would be:

cd /code/root
mkdir /tmp/newcode
cp -R * /tmp/newcode
# x.awk must be reachable from this directory
find . -name "*.java" | while IFS= read -r i
do
    awk -f x.awk "$i" > "/tmp/newcode/$i"
done

Copying the original tree into /tmp/newcode first means the directory structure already exists, so the loop only has to overwrite each file with its converted version and you don't have to fight with making directories.

Untested; there can easily be an off-by-one error in the splitting of the line, but it should get you started.
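
As a self-contained sketch of the same idea, the following builds a throwaway tree, writes a small stand-in awk script and rewrites each file through a temporary copy. The directory layout, the file A.java and the script name indent.awk are all made up for the demo; the stand-in script uses match() and RLENGTH to measure the leading whitespace, which is one possible way to do the split and not necessarily what x.awk was meant to do.

```shell
# Hypothetical demo tree; note the space in the directory name.
root=$(mktemp -d)
mkdir -p "$root/src/my pkg"
printf '%s\n' 'class A {' '    void f() {' '        log.debug("in f");' '    }' '}' \
    > "$root/src/my pkg/A.java"

# Stand-in awk script: prefix the message with the line's own indentation.
cat > "$root/indent.awk" <<'EOF'
/^[ \t]*log\.debug\(/ {
    match($0, /^[ \t]*/)                 # RLENGTH = width of the leading whitespace
    indent = substr($0, 1, RLENGTH)
    # RLENGTH + 11 skips the indent plus the 10 characters of "log.debug("
    print indent "log.debug(\"" indent "\" + " substr($0, RLENGTH + 11)
    next
}
{ print }
EOF

# $1 is the awk script, $2 the Java file; quoting keeps names with spaces intact.
find "$root/src" -type f -name '*.java' -exec sh -c '
    awk -f "$1" "$2" > "$2.tmp" && mv "$2.tmp" "$2"
' sh "$root/indent.awk" {} \;

cat "$root/src/my pkg/A.java"
```

Letting find hand each file to sh -c avoids word-splitting on file names, which a for loop over backticks cannot do.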

And you were right!¹

sed -i .tmp -e 's/^\( *\)log.debug(/\1log.debug("\1" + /' prog.java

Integrated with find(1) it becomes:

find . -name \*.java -a -exec sed -i .tmp -e 's/^\( *\)log.debug(/\1log.debug("\1" + /' {} \;

This finds every file named *.java and runs an in-place sed (stream editor) program on it. The program matches the lines to change with a \( capture group \) around the leading spaces, then puts the captured spaces back where it found them and also adds them, in quotes, in front of your string.

The -a, -exec, \; and {} are just the arcane find syntax for "and", "execute program", "end the command being executed" and "substitute the found filename here".
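
Since the in-place edit rewrites files, it can help to preview first. A minimal sketch, assuming a throwaway directory and an invented B.java: the same substitution run with sed -n and a trailing p flag prints only the lines that would change and leaves the files untouched. Also note that -i .tmp with a space is the BSD/macOS spelling; as far as I know GNU sed wants the suffix attached, as in -i.tmp.

```shell
# Hypothetical sample file in a scratch directory.
demo=$(mktemp -d)
printf '%s\n' 'class B {' '  void g() {' '    log.debug("in g");' '  }' '}' > "$demo/B.java"

# -n suppresses normal output; the p flag prints only lines the substitution changed.
out=$(find "$demo" -name '*.java' -exec sed -n -e 's/^\( *\)log.debug(/\1log.debug("\1" + /p' {} \;)
printf '%s\n' "$out"
# prints:
#     log.debug("    " + "in g");
```

Once the preview looks right, the in-place command above can be run for real.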

¹ When you said: "I'm pretty sure some Bash/shell magic can do this automatically."
