
Unable to run OpenMPI across more than two machines

When attempting to run the first example in the boost::mpi tutorial, I was unable to run across more than two machines. Specifically, this seemed to run fine:

mpirun -hostfile hostnames -np 4 boost1

with each hostname in hostnames given as <node_name> slots=2 max_slots=2. But when I increase the number of processes to 5, it just hangs. I have decreased the number of slots/max_slots to 1 with the same result whenever the run spans more than two machines. On the nodes, this shows up in the job list:

<user> Ss orted --daemonize -mca ess env -mca orte_ess_jobid 388497408 \
-mca orte_ess_vpid 2 -mca orte_ess_num_procs 3 -hnp-uri \
388497408.0;tcp://<node_ip>:48823

Additionally, when I kill it, I get this message:

node2- daemon did not report back when launched
node3- daemon did not report back when launched

The cluster is set up with the MPI and Boost libraries accessible on an NFS-mounted drive. Am I running into a deadlock with NFS, or is something else going on?

Update: To be clear, the Boost program I am running is:

#include <boost/mpi/environment.hpp>
#include <boost/mpi/communicator.hpp>
#include <iostream>
namespace mpi = boost::mpi;

int main(int argc, char* argv[]) 
{
  mpi::environment env(argc, argv);
  mpi::communicator world;
  std::cout << "I am process " << world.rank() << " of " << world.size()
        << "." << std::endl;
  return 0;
}

Following @Dirk Eddelbuettel's recommendations, I compiled and ran the MPI example hello_c.c, as follows:

#include <stdio.h>
#include "mpi.h"

int main(int argc, char* argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("Hello, world, I am %d of %d\n", rank, size);
    MPI_Barrier(MPI_COMM_WORLD);
    MPI_Finalize();

    return 0;
}

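For reference, the build and a local run look something like this with the Open MPI compiler wrapper (exact invocation may differ on your setup):

mpicc hello_c.c -o hello
mpirun -np 4 ./hello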
It runs fine on a single machine with multiple processes, including when I ssh into any of the nodes and run it there. Each compute node is identical, with the working directory and the MPI/Boost directories mounted from a remote machine via NFS. When running the Boost program from the fileserver (identical to a node except that Boost/MPI are local), I am able to run on two remote nodes. For "hello world", however, running the command mpirun -H node1,node2 -np 12 ./hello gives

[<node name>][[2771,1],<process #>] \
[btl_tcp_endpoint.c:638:mca_btl_tcp_endpoint_complete_connect] \
connect() to <node-ip> failed: No route to host (113)

while all of the "Hello, world" lines are printed, and then it hangs at the end. However, the behavior differs when launching from a compute node onto a remote node.

Both "Hello world" and the boost code just hang with mpirun -H node1 -np 12 ./hello when run from node2 and vice versa. (Hang in the same sense as above: orted is running on remote machine, but not communicating back.)

The fact that the behavior differs between the fileserver (where the MPI libraries are local) and a compute node suggests that I may be running into an NFS deadlock. Is that a reasonable conclusion? Assuming it is, how do I configure MPI so that I can link it statically? Also, I don't know what to make of the error I get when running from the fileserver; any thoughts?


The answer turned out to be simple: Open MPI authenticates via ssh and then opens TCP/IP sockets between the nodes. The firewalls on the compute nodes were set up to accept only ssh connections from each other, not arbitrary connections. So, after updating iptables, "hello world" runs like a champ across all of the nodes.
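For anyone hitting the same issue: because Open MPI picks its TCP ports dynamically, the fix amounts to letting the compute nodes accept arbitrary TCP connections from each other, not just ssh. A minimal sketch of the kind of rule I mean, assuming a private cluster subnet of 10.0.0.0/24 (substitute your own), run on each node:

# insert ahead of any existing REJECT rule: accept all TCP from the cluster subnet
iptables -I INPUT -p tcp -s 10.0.0.0/24 -j ACCEPT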

Edit: It should be pointed out that the fileserver's firewall allowed arbitrary connections, which is why an MPI program run from it behaved differently than one run from the compute nodes.


My first recommendation would be to simplify:

  • can you build the standard MPI 'hello, world' example?
  • can you run it several times on localhost?
  • can you execute it on the other hosts via ssh?
  • is the path identical on every host? (a quick check is sketched at the end of this answer)

and if so, then

mpirun -H host1,host2,host3 -n 12 ./helloworld

should travel across. Once you have these basics sorted out, try the Boost tutorial ... and make sure you have the Boost and MPI libraries on all hosts you plan to run on.
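A quick way to check the ssh and path points above is a short loop like this (host names are placeholders; run it from the directory that contains the binary):

for h in host1 host2 host3; do
    echo "== $h =="
    # $PWD expands locally, so this verifies the remote host sees the same absolute path
    ssh "$h" "which mpirun; ls -l $PWD/helloworld"
done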


Consider using the parameter --mca btl_tcp_if_include eth0 to make the nodes use only the eth0 interface, preventing Open MPI from trying to figure out the best network on its own. There is also --mca btl_tcp_if_exclude eth0. Remember to substitute eth0 with your particular interface.
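For example, the flag is passed straight to mpirun, using the host and program names from the question:

mpirun --mca btl_tcp_if_include eth0 -H node1,node2 -np 12 ./hello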

My /etc/hosts contained lines like these:

10.1.2.13 node13
...
10.1.3.13 node13-ib

When I launched mpirun, the TCP network was selected and the nodes used it. However, after some time (about 20 seconds), Open MPI discovered the 10.1.3.XXX addresses and tried to use them, which caused the error message.

I hope this helps.
