How to use threads to replace looping a subroutine in perl/pdl

2023-01-15 00:04

I have a perfectly good Perl subroutine written as part of a Perl module. Without going into too many details, it takes a string and a short list as arguments (often taken from the terminal) and spits out a value (right now always a floating point number, but this may not always be the case).

Right now, the list portion of my argument takes two values, say (val1,val2). I save the output of my subroutine for hundreds of different values for val1 and val2 using for loops. Each iteration takes almost a second to complete--so completing this entire process takes hours.

I recently read of a mystical (to me) computational tool called "threading" that apparently can replace for loops with blazing fast execution time. I have been having trouble understanding what these are and do, but I imagine they have something to do with parallel computing (and I would like to have my module as optimized as possible for parallel processors.)

If I save all the values I would like to pass to val1 as a list, say @val1 and the same for val2, how can I use these "threads" to execute my subroutine for every combination of the elements of val1 and val2? Also, it would be helpful to know how to generalize this procedure to a subroutine that also takes val3, val4, etc.

Update:

I do not use PDL, so I did not know that a thread in PDL does not correspond exactly to the notion of threading I have been talking about. See PDL threading and signatures:

First we have to explain what we mean by threading in the context of PDL, especially since the term threading already has a distinct meaning in computer science that only partly agrees with its usage within PDL.

However, I think the explanation below is still useful to you as one would need to know what threading in the regular sense is to understand how PDL threads are different.

Here is the Threads entry on Wikipedia for background.

Using threads cannot make your program magically faster. If you have multiple CPUs/cores and if the computations you are carrying out can be divided into independent chunks, using threads can allow your program to carry out more than one computation at a time and cut down on the total execution time.

The easiest case is when the subtasks are embarrassingly parallel, requiring no communication or coordination between threads.
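For the grid of val1 and val2 values described in the question, one way to apply this is to build every combination up front and hand each thread its own share. The following is only a minimal sketch: it assumes a perl built with thread support, and slow_sub, the 'label' string, the sample values and $n_workers are hypothetical stand-ins rather than anything from the asker's module.

```perl
#!/usr/bin/perl

use strict; use warnings;
use threads;

# Hypothetical stand-in for the slow subroutine from the question:
# takes a string and a short list, returns a number.
sub slow_sub {
    my ($name, $v1, $v2) = @_;
    return $v1 * $v2;
}

my @val1 = (1, 2, 3);
my @val2 = (10, 20);
my $n_workers = 2;    # assumption: roughly one worker per core

# Every (val1, val2) combination, built before any thread starts.
my @pairs = map { my $v1 = $_; map { [ $v1, $_ ] } @val2 } @val1;

# Deal the pairs out round-robin, one chunk per worker.
my @chunks;
push @{ $chunks[ $_ % $n_workers ] }, $pairs[$_] for 0 .. $#pairs;

# Each thread works through its chunk and returns
# a list of [val1, val2, result] triples.
my @threads = map {
    my $chunk = $_;
    threads->create({ context => 'list' }, sub {
        map { [ @$_, slow_sub('label', @$_) ] } @$chunk;
    });
} grep { defined } @chunks;

# join blocks until the thread is done and hands back its list.
my %result;
for my $thr (@threads) {
    $result{"$_->[0],$_->[1]"} = $_->[2] for $thr->join;
}

print "$_ => $result{$_}\n" for sort keys %result;
```

Adding val3, val4 and so on only changes how @pairs is built (one more nested map per extra list); the chunking and joining stay the same. Creating one thread per combination would also work, but with hundreds of combinations the cost of starting each thread adds up, which is why this sketch uses a small fixed number of workers.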

Regarding possible performance gains, consider the following program:

#!/usr/bin/perl

use strict; use warnings;
use threads;

my ($n) = @ARGV;

my @threads = map { threads->create(\&act_busy) } 1 .. $n;

$_->join for @threads;

sub act_busy {
    for (1 .. 10_000_000) {
        my $x = 2 * 2;
    }
}

On my dual-core laptop running Windows XP:

C:\> timethis t.pl 1
TimeThis :  Elapsed Time :  00:00:02.375

C:\> timethis t.pl 2
TimeThis :  Elapsed Time :  00:00:02.515

C:\> timethis t.pl 3
TimeThis :  Elapsed Time :  00:00:03.734

C:\> timethis t.pl 4
TimeThis :  Elapsed Time :  00:00:04.703

...

C:\> timethis t.pl 10
TimeThis :  Elapsed Time :  00:00:11.703

Now, compare that to:

#!/usr/bin/perl

use strict; use warnings;

my ($n) = @ARGV;

act_busy() for 1 .. $n;

sub act_busy {
    for (1 .. 10_000_000) {
        my $x = 2 * 2;
    }
}

C:\> timethis s.pl 10
TimeThis :  Elapsed Time :  00:00:22.312

As Sinan says, the "threading" you were probably thinking of is "PDL threading", now renamed (as of 2.075) to "broadcasting" to match the general terminology (see docs). It allows you to replace something like this:

$x = sequence(5);
$x->set($_, $x->at($_)+2) for 0..$x->dim(0)-1;

with just this, since "+=" fundamentally operates on one thing (a zero-dimensional scalar), so with more dimensions than a scalar (such as this 1-dimensional sequence) it can "broadcast":

$x += 2; # does whole ndarray at once

This is also faster because unlike the for loop, it doesn't have to keep leaving and re-entering the Perl environment (aka "Perl-land"), but can stay in extremely fast "C-land" to do the calculations with no overhead.
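The same idea extends to the "every combination of val1 and val2" case from the question, provided the calculation can be written in terms of PDL operations (an assumption; an arbitrary Perl subroutine cannot be broadcast this way). In this sketch the multiplication is only a placeholder for the real calculation, and the values are made up:

```perl
use strict; use warnings;
use PDL;

my $val1 = pdl(1, 2, 3);    # dims: (3)
my $val2 = pdl(10, 20);     # dims: (2)

# dummy(0) inserts a size-1 dimension in front, so $val2 is seen as (1,2).
# Broadcasting then stretches both operands into a (3,2) grid, giving
# one result per (val1, val2) combination without any Perl-level loop.
my $grid = $val1 * $val2->dummy(0);

print join(',', $grid->dims), "\n";
print $grid;
```

Element ($i, $j) of $grid corresponds to element $i of $val1 and element $j of $val2. A third list would get two dummy dimensions, as in $val3->dummy(0)->dummy(0), producing a three-dimensional result.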

The motivation behind its original name was that these "broadcasted" calculations are all independent, and therefore "embarrassingly parallel", so they can be parallelised automatically. See the docs: as of 2.059, PDL by default parallelises automatically across the number of CPU cores available.
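If you want to control that behaviour rather than rely on the default, PDL exposes a few settings for it. This is a rough sketch, assuming a PDL built with pthread support; the target of 2 and the ndarray size are arbitrary values chosen for illustration:

```perl
use strict; use warnings;
use PDL;

set_autopthread_targ(2);    # aim to split work across 2 pthreads
set_autopthread_size(1);    # only split ndarrays above this size (in millions of elements)

my $x = sequence(2_000_000);
my $y = $x + 2;             # broadcast operation, eligible for splitting

# Reports how many pthreads the last operation actually used.
print "pthreads used: ", get_autopthread_actual(), "\n";
```

The number reported can be lower than the target, since PDL only splits an operation when the dimensions of the ndarray allow it.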
