
At what stage is sed's pattern space printed?

I have heard that for the pattern space, the maximum number of addresses is two.

And that sed goes through each line of the text file, and for each of them, runs through all the commands in the script expression or script file.

When does sed print the pattern space? Is it at the end of the text file, after it has processed the last line? Or is it at the end of processing each line, just after it has run through all the commands?

Can anybody demonstrate:

a) the max limit of the pattern space being two?

b) when the pattern space is printed? And, if you can, please provide a textual source that says so too.

And why is it that, in my attempt to see the size of the pattern space, it looks like it can fit a lot, when this tutorial says:

http://www.thegeekstuff.com/2009/12/unix-sed-tutorial-7-examples-for-sed-hold-and-pattern-buffer-operations/

Sed G function

The G function appends the contents of the holding area to the contents of the pattern space. The former and new contents are separated by a newline. The maximum number of addresses is two.

Here is what I found about the size of the pattern space while trying, unsuccessfully, to see its limit of two. abc.txt is a text file containing just the character z.

sed 'h;G;G;G;G;G;G;G;G' abc.txt

prints many zs, so I guess it can hold more than two.

So I've misunderstood something.


An address is a way of selecting lines. Lines can be selected using zero, one or two addresses. This has nothing to do with the capacity of pattern space.

Consider the following input file:

aaa
bbb
ccc
ddd
eee

This sed command has zero addresses, so it processes every line:

s/./X/

Result:

Xaa
Xbb
Xcc
Xdd
Xee

This command has one address; it selects only the third line:

3s/./X/

Result:

aaa
bbb
Xcc
ddd
eee

An address of $ as in $s/./X/ would function the same way, but for the last line (regardless of the number of lines).

Here is a two-address command. In this case, it selects the lines based on their content. A single address command can do this, too.

/b/,/d/s/./X/

Result:

aaa
Xbb
Xcc
Xdd
eee
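As a rough sketch of the single-address variant mentioned above, the following selects lines by content with just one regular-expression address. The printf input simply recreates the five-line sample file, and the variable name out is only there for illustration.

```shell
# Recreate the sample input and apply a one-address command that
# selects by content: only lines matching /c/ are changed.
out=$(printf 'aaa\nbbb\nccc\nddd\neee\n' | sed '/c/s/./X/')
printf '%s\n' "$out"
```

Only the ccc line matches /c/, so it should be the only line altered.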

Pattern space is printed when an explicit p command runs (P prints only up to the first newline), and also, unless the -n (suppress automatic printing) option is in place, when the script is complete for the current line of the input file (which includes ending the processing of the file with the q command).
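A small sketch of the difference between automatic printing, an explicit p, and -n. The two-line input and the variable names (input, auto, twice, only2) are made up for illustration.

```shell
input=$(printf 'aaa\nbbb\n')

# Default: pattern space is auto-printed at the end of each cycle.
auto=$(printf '%s\n' "$input" | sed '')
# With an explicit p as well, every line appears twice.
twice=$(printf '%s\n' "$input" | sed 'p')
# With -n, only the explicit p on line 2 produces output.
only2=$(printf '%s\n' "$input" | sed -n '2p')

printf 'auto:\n%s\ntwice:\n%s\nonly2:\n%s\n' "$auto" "$twice" "$only2"
```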

Here's a demonstration of sed printing each line immediately upon receiving and processing it:

for i in {1..3}; do echo aaa$i; sleep 2; done | sed 's/./X/'
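If you would rather not rely on timing, here is another way to see the same thing, sketched with the = command, which writes the current line number to standard output as soon as it runs. If pattern space were only dumped at the end of the file, all the numbers would come out first; instead they should interleave with the lines.

```shell
# '=' prints the line number immediately; the automatic print of
# pattern space then follows at the end of that same cycle.
out=$(printf 'aaa\nbbb\nccc\n' | sed '=')
printf '%s\n' "$out"
```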

The capacity of pattern space (and hold space) has to do with the number of characters it can hold (and is implementation dependent) rather than the number of input lines. The newlines separating those lines are simply more characters in that total. The G command appends a newline followed by a copy of hold space onto the end of what's in pattern space. Multiple applications of the G command append that many copies.
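To make that concrete, here is a sketch that counts what the command from the question produces, and uses the l command to show pattern space with its embedded newlines made visible. The one-line input stands in for abc.txt; the variable names are illustrative only.

```shell
# h copies "z" into hold space; each G then appends a newline plus
# a copy of hold space, so eight Gs give nine lines in pattern space.
lines=$(printf 'z\n' | sed 'h;G;G;G;G;G;G;G;G' | wc -l | tr -d ' ')

# l prints pattern space unambiguously: embedded newlines appear
# as \n and the end of pattern space is marked with $.
shown=$(printf 'z\n' | sed -n 'h;G;G;l')

printf 'lines: %s\nshown: %s\n' "$lines" "$shown"
```

All nine lines are emitted by a single automatic print at the end of the one cycle, which is why nothing stops at two.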

In the tutorial that you linked to, the statement "The maximum number of addresses is two." is somewhat ambiguous. What it means is that you can use zero, one or two addresses to select the lines that command applies to. As in the above examples, you could apply G to all lines, one line or a range of lines. Depending on the command, the maximum number of addresses it accepts is zero, one, or two. See man sed under the Synopsis section for subheadings that group the commands by the number of addresses they accept.
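Here is a sketch of G used with zero, one and two addresses. Hold space starts out empty, so each G only adds an empty line, and counting the empty lines shows how many input lines were selected. The last command tries a third address, which I would expect any sed to reject as a syntax error; the four-line input and the variable names are assumptions for illustration.

```shell
input=$(printf 'aaa\nbbb\nccc\nddd\n')

# Count the empty lines that G adds in each case.
zero=$(printf '%s\n' "$input" | sed 'G'    | grep -c '^$')  # every line
one=$(printf '%s\n' "$input"  | sed '2G'   | grep -c '^$')  # line 2 only
two=$(printf '%s\n' "$input"  | sed '2,3G' | grep -c '^$')  # lines 2 through 3

# A third address is not valid syntax, so sed should exit with an error.
if printf '%s\n' "$input" | sed '1,2,3G' >/dev/null 2>&1; then
  third=accepted
else
  third=rejected
fi

printf 'zero=%s one=%s two=%s third=%s\n' "$zero" "$one" "$two" "$third"
```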

From info sed:

3.1 How `sed' Works

'sed' maintains two data buffers: the active pattern space, and the auxiliary hold space. Both are initially empty.

'sed' operates by performing the following cycle on each line of input: first, 'sed' reads one line from the input stream, removes any trailing newline, and places it in the pattern space. Then commands are executed; each command can have an address associated to it: addresses are a kind of condition code, and a command is only executed if the condition is verified before the command is to be executed.

When the end of the script is reached, unless the '-n' option is in use, the contents of pattern space are printed out to the output stream, adding back the trailing newline if it was removed. Then the next cycle starts for the next input line.

Unless special commands (like 'D') are used, the pattern space is deleted between two cycles. The hold space, on the other hand, keeps its data between cycles (see commands 'h', 'H', 'x', 'g', 'G' to move data between both buffers).
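The point about hold space surviving between cycles can be sketched with a well-known one-liner that prints a file in reverse line order. This is only an illustration of the cycle described in the quote, with a made-up three-line input.

```shell
# 1!G : on every line except the first, append hold space to pattern space
# h   : copy the combined pattern space back into hold space
# $p  : on the last line, print what has accumulated
# -n  : suppress the automatic print at the end of each cycle
out=$(printf 'aaa\nbbb\nccc\n' | sed -n '1!G;h;$p')
printf '%s\n' "$out"
```

Pattern space is replaced with a fresh line on every cycle, while hold space keeps growing, so the final p emits the lines last to first.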
