开发者

Problem with multi-threaded Perl script run via SSH on Linux

I cannot understand the behavior of the following simple Perl script in case I run it remotely via SSH.

use strict;
use warnings;
use threads;
use threads::shared;
use POSIX;

my $print_mutex : shared;

################################################################################

sub _print($)
{
    my $str = shift;
    lock($print_mutex);
    my $id = threads->tid();
    my $time = strftime('%H:%M:%S', localtime time);
    print "$time [$id] $str";
    return;
}

############################################################开发者_如何转开发####################

sub run()
{
    for my $i (1 .. 3)
      {
        _print("Begin $i\n");
        sleep 1;
        _print("End $i\n");
      }
    return threads->tid();
}

################################################################################

_print "Starting test.\n";
my @threads;
for my $thr_num (1 .. 2)
  {
    my $thr = threads->create('run');
    push @threads, $thr;
    _print "Thread created.\n";
  }
foreach (@threads)
  {
    my $id = $_->join;
    _print "Thread '$id' finished.\n";
  }
_print "Test finished.\n";

################################################################################

When I run it normally on my Linux box with Perl-5.10.0 I get expected results:

$ perl /tmp/a.pl
14:25:54 [0] Starting test.
14:25:54 [0] Thread created.
14:25:54 [1] Begin 1
14:25:54 [0] Thread created.
14:25:54 [2] Begin 1
14:25:55 [1] End 1
14:25:55 [1] Begin 2
14:25:55 [2] End 1
14:25:55 [2] Begin 2
14:25:56 [1] End 2
14:25:56 [1] Begin 3
14:25:56 [2] End 2
14:25:56 [2] Begin 3
14:25:57 [1] End 3
14:25:57 [0] Thread '1' finished.
14:25:57 [2] End 3
14:25:57 [0] Thread '2' finished.
14:25:57 [0] Test finished.
$

However, when I run it via SSH (on the same local host, but it doesn't matter) I get very strange results (look closely at timestamps and thread IDs):

$ ssh localhost 'perl /tmp/a.pl'
14:26:11 [0] Starting test.
14:26:11 [0] Thread created.
14:26:11 [1] Begin 1
14:26:12 [1] End 1
14:26:12 [1] Begin 2
14:26:13 [1] End 2
14:26:13 [1] Begin 3
14:26:14 [1] End 3
14:26:11 [2] Begin 1
14:26:12 [2] End 1
14:26:12 [2] Begin 2
14:26:13 [2] End 2
14:26:13 [2] Begin 3
14:26:14 [2] End 3
14:26:11 [0] Thread created.
14:26:14 [0] Thread '1' finished.
14:26:14 [0] Thread '2' finished.
14:26:14 [0] Test finished.
$

I've never seen this in single-threaded Perl scripts and I noticed that I started seeing the problem with I/O right after the first thread has been created.

I was able to reproduce the problem with the latest Perl-5.12 on Windows, so I don't think the problem is Perl/OS specific.

Could someone please explain what's wrong here?


I was able to reproduce this myself. However, when running it from a shell over ssh, I got the expected behaviour. So what's the difference? A pseudo-terminal!

Try this:

ssh -t localhost 'perl /tmp/a.pl'


Indeed, it looks like each Perl thread has it's own output buffer. I've redirected the output to the file (the same as running script via SSH since it just disables line buffering) and run the script under strace:

$ strace -fF -tt -s200 bash -c "perl /tmp/a.pl > OUT" 2>&1 | grep write
[pid   359] 12:12:24.674142 write(1, "12:12:24 [0] Starting test.\n"..., 28) = 28
[pid   359] 12:12:24.687319 write(1, "12:12:24 [0] Thread created.\n"..., 29) = 29
[pid   360] 12:12:27.693225 write(1, "12:12:24 [1] Begin 1\n12:12:25 [1] End 1\n12:12:25 [1] Begin 2\n12:12:26 [1] End 2\n12:12:26 [1] Begin 3\n12:12:27 [1] End 3\n"..., 120) = 120
[pid   361] 12:12:27.706137 write(1, "12:12:24 [2] Begin 1\n12:12:25 [2] End 1\n12:12:25 [2] Begin 2\n12:12:26 [2] End 2\n12:12:26 [2] Begin 3\n12:12:27 [2] End 3\n"..., 120) = 120
[pid   359] 12:12:27.711343 write(1, "12:12:24 [0] Thread created.\n12:12:27 [0] Thread '1' finished.\n12:12:27 [0] Thread '2' finished.\n12:12:27 [0] Test finished.\n"..., 125) = 125
$ 

It becomes clear that each thread places all the data into thread-local buffer and only then (in this example just before thread termination) calls "write" system call on that buffer. IMHO, thread-local output buffers is very bad idea because people get confusing results even if they you explicit serialization of "print" calls.

The solution I found is to use explicit serialization and enable autoflush on STDOUT so that thread-local buffers are always empty.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜