Use tee (or equivalent) but limit max file size or rotate to new file

Posted: 2023-03-20 09:01

I would like to capture output from a UNIX process but limit max file size and/or rotate to a new file.

I have seen logrotate, but it does not work in real time. As I understand it, it is a "clean-up" job that runs in parallel with the program.

What is the right solution? I guess I will write a tiny script to do it, but I was hoping there was a simple way with existing text tools.

Imagine:

my_program | tee --max-bytes 100000 log/my_program_log

This would always write the latest output to log/my_program_log.

Then, when that file fills up, it would be renamed to log/my_program_log000001 and a new log/my_program_log would be started.

Use split:

my_program | tee >(split -d -b 100000 -)

Or if you don't want to see the output, you can directly pipe to split:

my_program | split -d -b 100000 -

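As for the log rotation, there's no tool in coreutils that does it automatically. You could create a symlink that points at the newest file and periodically update it with a bash loop. Note that ln takes the file to point at first and the link name second, and that x* matches split's default output names (x00, x01, ...):

while true; do ln -fns "$(ls -t x* | head -1)" target_log_name; sleep 1; done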
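
The same idea can be sketched in Python, which makes it easier to avoid picking the symlink itself as the "newest" file and to swap the link atomically. This is only a sketch under assumptions: the function name point_link_at_newest is made up for illustration, the pattern x* assumes split's default output names, and target_log_name is the placeholder link name from the loop above.

```python
import os
from pathlib import Path


def point_link_at_newest(directory, link_name="target_log_name", pattern="x*"):
    """Make directory/link_name a symlink to the newest file matching pattern.

    Returns the name of the file the link now points at, or None if no
    file matched.
    """
    directory = Path(directory)
    link = directory / link_name
    # Skip symlinks so the link itself is never chosen as the newest file.
    candidates = [p for p in directory.glob(pattern)
                  if p.is_file() and not p.is_symlink()]
    if not candidates:
        return None
    newest = max(candidates, key=lambda p: p.stat().st_mtime_ns)
    # Create the link under a temporary name, then rename it over the old
    # link. The rename is atomic on POSIX, so readers never see a gap.
    tmp = directory / (link_name + ".tmp")
    if tmp.is_symlink() or tmp.exists():
        tmp.unlink()
    os.symlink(newest.name, tmp)
    os.replace(tmp, link)
    return newest.name
```

Calling this once a second from a small loop would play the role of the bash one-liner.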

The apache2-utils package includes a utility called rotatelogs that fully meets your requirements.

Synopsis:

rotatelogs [ -l ] [ -L linkname ] [ -p program ] [ -f ] [ -t ] [ -v ] [ -e ] [ -c ] [ -n number-of-files ] logfile rotationtime|filesize(B|K|M|G) [ offset ]

Example:

your_program | rotatelogs -n 5 /var/log/logfile 1M

The full manual is part of the Apache HTTP Server documentation for rotatelogs.

Or use awk:

program | awk 'BEGIN{max=100} {n+=length($0); print $0 > ("log." int(n/max))}'

It keeps lines together, so the max is not exact, but this could be nice especially for logging purposes. You can use awk's sprintf to format the file name.
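
To see why the limit is only approximate, here is a small Python sketch of the same bucketing rule. The function name bucket_index is made up for illustration, and like awk's length($0) it counts characters without the trailing newline, so it is a model of the one-liner rather than an exact byte count.

```python
def bucket_index(lines, max_size=100):
    """Yield (file_index, line) pairs the way the awk one-liner assigns them.

    The running total is updated before the index is computed, so a line is
    never split: it goes wherever the total lands after adding that line.
    """
    total = 0
    for line in lines:
        total += len(line)  # like awk's length($0): newline not counted
        yield total // max_size, line


if __name__ == "__main__":
    sample = ["a" * 60, "b" * 60, "c" * 10]
    for index, line in bucket_index(sample):
        print(f"log.{index}: {len(line)} chars")
```

With the sample above, the first line goes to log.0 and the other two go to log.1, so the files hold 60 and 70 characters rather than exactly 100. A single very long line can also skip file numbers entirely.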

Here's a pipable script, using awk:

#!/bin/bash
maxb=$((1024*1024))    # default 1MiB
out="log"              # output file name
width=3                # width: log.001, log.002
while getopts "b:o:w:" opt; do
  case $opt in
    b ) maxb=$OPTARG;;
    o ) out="$OPTARG";;
    w ) width=$OPTARG;;
    * ) echo "Unimplemented option." >&2; exit 1;;
  esac
done
shift $((OPTIND-1))

# awk reads the named file if one is given, otherwise the pipe on stdin;
# either way leading whitespace is preserved.
awk -v b="$maxb" -v o="$out" -v w="$width" '{
    n += length($0) + 1                       # +1 for the newline
    f = sprintf("%s.%0" w "d", o, int(n/b))   # e.g. log.000, log.001
    if (f != prev) { if (prev != "") close(prev); prev = f }
    print > f }' "$@"

Save this to a file called 'bee', run 'chmod +x bee', and you can use it as

program | bee

or to split an existing file as

bee -b1000 -o proglog -w8 file

To limit the size to 100 bytes, you can simply use dd:

my_program | dd bs=1 count=100 > log

Once 100 bytes have been written, dd exits and closes its end of the pipe. On its next write, my_program receives SIGPIPE (or an EPIPE error if it ignores that signal).
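
If the limit has to be enforced from inside a script rather than with dd, the same behaviour can be sketched in a few lines of Python. The function name copy_limited and its parameters are made up for illustration; it simply stops reading after the limit, and once the script exits the producer sees the closed pipe exactly as it does with dd.

```python
import io


def copy_limited(src, dst, limit, chunk_size=4096):
    """Copy at most `limit` bytes from src to dst and return the byte count."""
    copied = 0
    while copied < limit:
        # Never ask for more than what is still allowed.
        chunk = src.read(min(chunk_size, limit - copied))
        if not chunk:  # EOF before the limit was reached
            break
        dst.write(chunk)
        copied += len(chunk)
    return copied


if __name__ == "__main__":
    source = io.BytesIO(b"x" * 250)
    target = io.BytesIO()
    print(copy_limited(source, target, 100), "bytes copied")
```

In a real pipeline, src would be sys.stdin.buffer and dst a log file opened in binary mode.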

The most straightforward way to solve this is probably to use Python and its logging module, which was designed for this purpose. Create a script that reads from stdin, writes to stdout, and implements the log rotation described below.

The "logging" module provides the

class logging.handlers.RotatingFileHandler(filename, mode='a', maxBytes=0,
              backupCount=0, encoding=None, delay=False, errors=None)

which does exactly what you are asking about.

You can use the maxBytes and backupCount values to make the file roll over at a predetermined size.

From docs.python.org:

Sometimes you want to let a log file grow to a certain size, then open a new file and log to that. You may want to keep a certain number of these files, and when that many files have been created, rotate the files so that the number of files and the size of the files both remain bounded. For this usage pattern, the logging package provides a RotatingFileHandler:

import glob
import logging
import logging.handlers

LOG_FILENAME = 'logging_rotatingfile_example.out'

# Set up a specific logger with our desired output level
my_logger = logging.getLogger('MyLogger')
my_logger.setLevel(logging.DEBUG)

# Add the log message handler to the logger
handler = logging.handlers.RotatingFileHandler(
              LOG_FILENAME, maxBytes=20, backupCount=5)

my_logger.addHandler(handler)

# Log some messages
for i in range(20):
    my_logger.debug('i = %d' % i)

# See what files are created
logfiles = glob.glob('%s*' % LOG_FILENAME)

for filename in logfiles:
    print(filename)

The result should be 6 separate files, each with part of the log history for the application:

logging_rotatingfile_example.out
logging_rotatingfile_example.out.1
logging_rotatingfile_example.out.2
logging_rotatingfile_example.out.3
logging_rotatingfile_example.out.4
logging_rotatingfile_example.out.5

The most current file is always logging_rotatingfile_example.out, and each time it reaches the size limit it is renamed with the suffix .1. Each of the existing backup files is renamed to increment the suffix (.1 becomes .2, etc.) and the .6 file is erased.

Obviously this example sets the log length far too small, as an extreme example. You would want to set maxBytes to an appropriate value.
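
To connect this back to the original question, here is a minimal sketch of the stdin filter described at the start of this answer. It is not taken from the Python docs: the function names, the logger name tee_rotate, and the default sizes are assumptions chosen for illustration. The formatter is set to the bare message so the log contains the program's lines unchanged.

```python
import logging
import logging.handlers
import sys


def make_rotating_logger(path, max_bytes=100_000, backup_count=5):
    """Return a logger that writes raw lines to a size-rotated file."""
    logger = logging.getLogger("tee_rotate")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep lines out of the root logger
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backup_count)
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    return logger


def tee_rotate(stream, logger, echo=None):
    """Log every line of stream; optionally echo it, like tee does."""
    for line in stream:
        line = line.rstrip("\n")
        logger.info(line)
        if echo is not None:
            echo.write(line + "\n")
```

Saved as, say, tee_rotate.py with a final line such as tee_rotate(sys.stdin, make_rotating_logger(sys.argv[1]), echo=sys.stdout), it could be used as my_program | python tee_rotate.py log/my_program_log. The newest output is then always in log/my_program_log, with older output in .1, .2 and so on.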

Another solution is to use the Apache rotatelogs utility.

Or use the following script:

#!/bin/ksh
#rotatelogs.sh -n numberOfFiles pathToLog fileSize[B|K|M|G]
numberOfFiles=10
while getopts "n:fltvecp:L:" opt; do
    case $opt in
  n) numberOfFiles="$OPTARG"
    if ! printf '%s\n' "$numberOfFiles" | grep '^[0-9][0-9]*$' >/dev/null;     then
      printf 'Numeric numberOfFiles required %s. rotatelogs.sh -n numberOfFiles pathToLog fileSize[B|K|M|G]\n' "$numberOfFiles" 1>&2
      exit 1
    elif [ $numberOfFiles -lt 3 ]; then
      printf 'numberOfFiles < 3 %s. rotatelogs.sh -n numberOfFiles pathToLog fileSize[B|K|M|G]\n' "$numberOfFiles" 1>&2
      exit 1
    fi
  ;;
  *) printf '-%s ignored. rotatelogs.sh -n numberOfFiles pathToLog fileSize[B|K|M|G]\n' "$opt" 1>&2
  ;;
  esac
done
shift $(( $OPTIND - 1 ))
pathToLog="$1"
fileSize="$2"
if ! printf '%s\n' "$fileSize" | grep '^[0-9][0-9]*[BKMG]$' >/dev/null; then
  printf 'Numeric fileSize followed by B|K|M|G required %s. rotatelogs.sh -n numberOfFiles pathToLog fileSize[B|K|M|G]\n' "$fileSize" 1>&2
  exit 1
fi
sizeQualifier=`printf "%s\n" "$fileSize" | sed "s%^[0-9][0-9]*\([BKMG]\)$%\1%"`
multip=1
case $sizeQualifier in
B) multip=1 ;;
K) multip=1024 ;;
M) multip=1048576 ;;
G) multip=1073741824 ;;
esac
fileSize=`printf "%s\n" "$fileSize" | sed "s%^\([0-9][0-9]*\)[BKMG]$%\1%"`
fileSize=$(( $fileSize * $multip ))
fileSize=$(( $fileSize / 1024 ))
if [ $fileSize -le 10 ]; then
  printf 'fileSize %sKB < 10KB. rotatelogs.sh -n numberOfFiles pathToLog fileSize[B|K|M|G]\n' "$fileSize" 1>&2
  exit 1
fi
if ! touch "$pathToLog"; then
  printf 'Could not write to log file %s. rotatelogs.sh -n numberOfFiles pathToLog fileSize[B|K|M|G]\n' "$pathToLog" 1>&2
  exit 1
fi
lineCnt=0
while IFS= read -r line   # keep leading whitespace and backslashes intact
do
  printf "%s\n" "$line" >>"$pathToLog"
  lineCnt=$(( $lineCnt + 1 ))
  if [ $lineCnt -gt 200 ]; then
    lineCnt=0
    curFileSize=`du -k "$pathToLog" | awk '{print $1}'`   # size in KB, first field of du output
    if [ $curFileSize -gt $fileSize ]; then
      DATE=`date +%Y%m%d_%H%M%S`
      gzip -c "$pathToLog" >"${pathToLog}.${DATE}.gz" && : >"$pathToLog"
      curNumberOfFiles=`ls "$pathToLog".[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]_[0-9][0-9][0-9][0-9][0-9][0-9].gz | wc -l`
      while [ $curNumberOfFiles -ge $numberOfFiles ]; do
        fileToRemove=`ls "$pathToLog".[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]_[0-9][0-9][0-9][0-9][0-9][0-9].gz | head -1`
        if [ -f "$fileToRemove" ]; then
          rm -f "$fileToRemove"
          curNumberOfFiles=`ls "$pathToLog".[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]_[0-9][0-9][0-9][0-9][0-9][0-9].gz | wc -l`
        else
          break
        fi
      done
    fi
  fi
done
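
The pruning step at the end of that script is the part that is easiest to get wrong, so here it is sketched on its own in Python. The function name prune_archives is made up for illustration. It mirrors the script's rule: while the number of timestamped .gz archives is at least numberOfFiles, delete the oldest one. Because the names embed a YYYYMMDD_HHMMSS stamp, sorting them alphabetically also sorts them by age.

```python
import glob
import os

# Glob pattern for the YYYYMMDD_HHMMSS stamp used in the archive names.
STAMP = "[0-9]" * 8 + "_" + "[0-9]" * 6


def prune_archives(path_to_log, number_of_files):
    """Delete the oldest archives until fewer than number_of_files remain.

    Returns the list of paths that were removed, oldest first.
    """
    pattern = glob.escape(path_to_log) + "." + STAMP + ".gz"
    archives = sorted(glob.glob(pattern))  # name order == chronological order
    excess = len(archives) - number_of_files + 1
    removed = archives[:max(excess, 0)]
    for old in removed:
        os.remove(old)
    return removed
```

Files that do not match the timestamp pattern are left alone, just as in the shell version.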

Limiting the max size can also be done with head:

my_program | head -c 100  # Limit to the first 100 bytes

See this answer for the benefits over dd: https://unix.stackexchange.com/a/121888/
