How to read and extract information from a file that is being continuously updated?
This is how I am planning to build the utilities for a project:
logdump dumps log results to a file, log. If the file already exists, new results are appended to it (for example, if a new file is created every month, all results for that month go into that month's file).
extract reads the log file and pulls out the relevant results, depending on the arguments it is given.
The thing is, I do not want to wait for logdump to finish writing to log before I start processing it. Nor do I want to have to remember how far into log I have already read before extracting more information.
I need live results, so that whenever something is added to the log file, extract picks up the new entries.
The processing extract does will be generic (it will depend on its command-line arguments), but it will certainly work line by line.
This means reading the file as it is being written to, and continuing to monitor it for new updates even after reaching the end of the log file.
How can I do this using C, C++, shell scripting, or Perl?
tail -f will read from a file and keep monitoring it for updates when it reaches EOF, instead of quitting outright. It's an easy way to read a log file "live". It could be as simple as:
tail -f log.file | extract
Or use tail -n 0 -f so it only prints new lines, not existing lines, or tail -n +0 -f to display the entire file and then continue updating thereafter.
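For that pipeline, extract only needs to read standard input line by line; tail takes care of following the file. A minimal sketch of such a consumer in Perl (the bracketed output format is a placeholder for whatever processing the real extract does):
#! /usr/bin/perl
use warnings;
use strict;

# Read lines from STDIN as tail -f delivers them; <STDIN> blocks
# until tail writes more, so the loop runs for as long as tail does.
while (my $line = <STDIN>) {
    chomp $line;
    print "extracted [$line]\n";  # placeholder: real filtering goes here
}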
The traditional unix tool for this is tail -f, which keeps reading data appended to its argument until you kill it. So you can do
tail -c +1 -f log | extract
(the -c +1 makes tail start at the first byte, so the existing contents of log are processed too).
In the unix world, reading from continuously appended-to files has come to be known as “tailing”. In Perl, the File::Tail module performs the same task.
use File::Tail;

my $log_file = File::Tail->new("log");
while (defined(my $log_line = $log_file->read)) {
    process_line($log_line);
}
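File::Tail also takes named options that control how often it checks the file for new data; a small sketch (the values are illustrative, not recommendations):
use File::Tail;

# name is the file to follow; interval and maxinterval bound how long
# File::Tail sleeps between checks for new data.
my $log_file = File::Tail->new(
    name        => "log",
    interval    => 1,
    maxinterval => 5,
);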
Use this simple stand-in for logdump
#! /usr/bin/perl
use warnings;
use strict;

open my $fh, ">", "log" or die "$0: open: $!";
select $fh;
$| = 1;  # disable buffering

for (1 .. 10) {
    print $fh "message $_\n" or warn "$0: print: $!";
    sleep rand 5;
}
together with the skeleton for extract below to get the processing you want. When logfile encounters end-of-file, logfile.eof() is true. Calling logfile.clear() resets the stream's error and end-of-file state, and then we sleep and try again.
#include <iostream>
#include <fstream>
#include <cerrno>
#include <cstring>
#include <unistd.h>

int main(int argc, char *argv[])
{
    const char *path;
    if (argc == 2) path = argv[1];
    else if (argc == 1) path = "log";
    else {
        std::cerr << "Usage: " << argv[0] << " [ log-file ]\n";
        return 1;
    }

    std::ifstream logfile(path);
    std::string line;

next_line:
    while (std::getline(logfile, line))
        std::cout << argv[0] << ": extracted [" << line << "]\n";

    if (logfile.eof()) {
        sleep(3);
        logfile.clear();
        goto next_line;
    }
    else {
        std::cerr << argv[0] << ": " << path << ": " << std::strerror(errno) << '\n';
        return 1;
    }

    return 0;
}
It's not as interesting as watching it live, but the output is:
./extract: extracted [message 1]
./extract: extracted [message 2]
./extract: extracted [message 3]
./extract: extracted [message 4]
./extract: extracted [message 5]
./extract: extracted [message 6]
./extract: extracted [message 7]
./extract: extracted [message 8]
./extract: extracted [message 9]
./extract: extracted [message 10]
^C
I left the interrupt in the output to emphasize that this is an infinite loop.
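To watch it live, run the logdump stand-in in one terminal and extract in another; the C++ program compiles with any standard compiler, for example g++ -o extract extract.cpp.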
Use Perl as a glue language to make extract get lines from the log by way of tail:
#! /usr/bin/perl
use warnings;
use strict;

die "Usage: $0 [ log-file ]\n" if @ARGV > 1;
my $path = @ARGV ? shift : "log";

open my $fh, "-|", "tail", "-c", "+1", "-f", $path
    or die "$0: could not start tail: $!";

while (<$fh>) {
    chomp;
    print "$0: extracted [$_]\n";
}
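The question mentions that the processing depends on command-line arguments. One hypothetical way to extend the glue script is to accept a pattern and keep only matching lines (the pattern argument and its semantics are an assumption, not part of the original):
#! /usr/bin/perl
use warnings;
use strict;

# Hypothetical interface: extract PATTERN [ log-file ]
die "Usage: $0 pattern [ log-file ]\n" if @ARGV < 1 or @ARGV > 2;
my $pattern = shift;
my $path = @ARGV ? shift : "log";

open my $fh, "-|", "tail", "-c", "+1", "-f", $path
    or die "$0: could not start tail: $!";

while (<$fh>) {
    chomp;
    print "$0: extracted [$_]\n" if /$pattern/;  # keep only matching lines
}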
Finally, if you insist on doing the heavy lifting yourself, there's a related Perl FAQ:
How do I do a tail -f in perl?
First try
seek(GWFILE, 0, 1);
The statement seek(GWFILE, 0, 1) doesn't change the current position, but it does clear the end-of-file condition on the handle, so that the next <GWFILE> makes Perl try again to read something.
If that doesn't work (it relies on features of your stdio implementation), then you need something more like this:
for (;;) {
    for ($curpos = tell(GWFILE); <GWFILE>; $curpos = tell(GWFILE)) {
        # search for some stuff and put it into files
    }
    # sleep for a while
    seek(GWFILE, $curpos, 0);  # seek to where we had been
}
If this still doesn't work, look into the clearerr method from IO::Handle, which resets the error and end-of-file states on the handle. There's also a File::Tail module from CPAN.
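Putting the FAQ's advice together, a self-contained sketch using a lexical filehandle (the default file name and the bracketed output format are placeholders):
#! /usr/bin/perl
use warnings;
use strict;

my $path = @ARGV ? shift : "log";
open my $fh, "<", $path or die "$0: open $path: $!";

for (;;) {
    while (my $line = <$fh>) {
        chomp $line;
        print "$0: extracted [$line]\n";  # placeholder processing
    }
    sleep 1;          # wait for the writer to append more
    seek $fh, 0, 1;   # clear the EOF flag without moving the position
}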