How to configure parallel remote kernels in Mathematica?
When I try to configure remote kernels in Mathematica via Evaluation > Parallel Kernel Configuration..., I go to "Remote Kernels" and add hosts. After that I try to launch the remote kernels, and only some of them get launched (the number varies). I get a message like the following:
KernelObject::rdead: Subkernel connected through remote[nodo2] appears dead. >>
LinkConnect::linkc: Unable to connect to LinkObject[36154@192.168.1.104,49648@192.168.1.104,38,12]. >>
General::stop: Further output of LinkConnect::linkc will be suppressed during this calculation. >>
Any ideas how to get this working?
Note that it sometimes does load some of the remote kernels, but almost never all of them. Thanks in advance.
This is my output for $ConfiguredKernels // InputForm:
{SubKernels`LocalKernels`LocalMachine[4],
SubKernels`RemoteKernels`RemoteMachine["nodo2", 2],
SubKernels`RemoteKernels`RemoteMachine["nodo1", 2],
SubKernels`RemoteKernels`RemoteMachine["nodo3", 2],
SubKernels`RemoteKernels`RemoteMachine["nodo4", 2],
SubKernels`RemoteKernels`RemoteMachine["nodo5", 2]}
It did load all of the kernels once, but usually it launches only one or two remote kernels.
There is very little information given, so this answer may not be 100% useful.
The first issue to always consider is licensing on the remote machine. If some kernels launch, but others don't, it is possible you have run out of licenses for kernels on that machine. The rest of this post will assume licensing is not the issue.
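One rough way to check the licensing point is to compare the license counters that the kernel itself reports. This is only a sketch: it assumes you can start a kernel by hand on each remote node, or that at least one subkernel per node did launch, and what the counters show depends on how your license is set up.

```mathematica
(* Evaluate in a kernel started manually on a remote node:
   processes in use vs. the maximum the license allows *)
{$MachineName,
 $LicenseProcesses, $MaxLicenseProcesses,
 $LicenseSubprocesses, $MaxLicenseSubprocesses}

(* Or ask the subkernels that did launch to report the same counters *)
ParallelEvaluate[
 {$MachineName, $LicenseProcesses, $MaxLicenseProcesses}]
```

If the number in use has already reached the maximum on a node, further kernels on that node will likely fail to start.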
Connection Method
The remote kernel interface in Mathematica by default assumes the rsh protocol, which is not the right choice for many environments, because rsh is not a very secure protocol.
The other option is ssh, which is much more widely supported. There are many ssh clients, but I will focus on a client included with Mathematica, namely WolframSSH.jar. This client is Java-based, which has the added benefit of working the same way on all platforms supported by Mathematica (Mac, Windows and Linux).
To avoid having to type a password for every kernel connection, it is convenient to create a private/public key pair. The private key stays on your computer and the public key needs to be placed on the remote computer (usually in the .ssh folder of the remote home directory).
To generate a private/public key pair you can use the WolframSSHKeyGen.jar file, like so:
java -jar c:\path\to\mathematica\SystemFiles\Java\WolframSSHKeyGen.jar
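and follow the instructions in the dialogs that come up. When done, copy the public key to the .ssh folder on the remote machine (on an OpenSSH server this usually means appending it to the authorized_keys file in that folder). In my case, I called the private key kernel_key, and the public key was automatically named kernel_key.pub.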
You can now test the connection from a command line, like so (using the ls command on the remote machine):
java -jar c:\path\to\mathematica\SystemFiles\Java\WolframSSH.jar --keyfile kernel_key arnoudb@machine.example.com ls
If this works, you should be able to finish on the Mathematica side of things.
Remote Kernel Connection
To make a connection you need the following settings. The name of the remote machine:
machine = "machine.example.com";
The login name, usually $UserName:
user = $UserName;
The ssh binary location:
ssh = FileNameJoin[{$InstallationDirectory, "SystemFiles", "Java", "WolframSSH.jar"}];
The private key as described above:
privatekey = "c:\\users\\arnoudb\\kernel_key";
The launch command for the kernel:
math = "math -mathlink -linkmode Connect `4` -linkname `2` -subkernel -noinit >& /dev/null &";
A configuration function to put everything together:
ConfigureKernel[machine_, user_, ssh_, privatekey_, math_, number_] :=
 SubKernels`RemoteKernels`RemoteMachine[
  machine,
  (* launch command: run WolframSSH.jar with the key file, then start the kernel on the remote host *)
  "java -jar \"" <> ssh <> "\" --keyfile \"" <> privatekey <> "\" " <>
   user <> "@" <> machine <> " \"" <> math <> "\"",
  number]
This uses the configuration function and defines it to use 4 remote kernels:
remote = ConfigureKernel[machine, user, ssh, privatekey, math, 4]
This launches the kernels:
LaunchKernels[remote]
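For the setup in the question, the same function could be mapped over all five hosts. This is a sketch that assumes the definitions above (ConfigureKernel, user, ssh, privatekey, math) have already been evaluated and that the same login name and key work on every node; the names nodes, remotes and kernels are introduced here only for illustration.

```mathematica
(* Host names taken from the question's $ConfiguredKernels output *)
nodes = {"nodo1", "nodo2", "nodo3", "nodo4", "nodo5"};

(* One remote-kernel configuration per node, 2 kernels each;
   assumes the same user and private key are valid on all nodes *)
remotes = ConfigureKernel[#, user, ssh, privatekey, math, 2] & /@ nodes;

(* LaunchKernels accepts a list of configurations *)
kernels = LaunchKernels[remotes];

(* Number of kernels that actually started *)
Length[kernels]
```

If the count is lower than expected, launching one node at a time with LaunchKernels[remotes[[i]]] may help narrow down which host is failing.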
This command verifies that the kernels are all connected and running on the remote machine:
ParallelEvaluate[$MachineName]
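When several machines are involved, it can be easier to read a summary than a flat list of names. A small sketch using only built-in functions:

```mathematica
(* How many running subkernels each machine contributes *)
Tally[ParallelEvaluate[$MachineName]]

(* Pair each kernel's ID with the machine it runs on *)
ParallelEvaluate[{$KernelID, $MachineName}]
```

A machine that is missing from the tally, or has fewer kernels than configured, is the one to investigate.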