Many Linux-based clustering solutions already claim to provide an easy way to
obtain or build a Linux Beowulf cluster[1]. In practice, most of the links to
them are either dead or completely outdated. We'll concentrate on a more
specific class of cluster that uses a diskless node approach, where the nodes
boot off a Single System Image (SSI) through the network interface (this
process is explained in the Creating the SSI section below). The Clustermonkey
web site[2] has an article[3] which discusses the use of such a configuration
in specific conditions, depending on the intended use of the cluster. One of
the key conditions where diskless nodes are useful is when the nodes need to
share file-based data during runs. However, if all processes compute and
manipulate their data independently, local storage becomes more attractive. In
our case, the nodes do have a local disk that we could configure as a local
"scratch pad" for such purposes.
With commercial and non-commercial solutions available (refer to the References
at the end of the article), one may wonder why we would want to build our own
cluster solution from the bottom up. We'll provide a few of the key answers
here.
Gentoo is becoming more and more popular due to its flexibility and
manageability. Recent IEEE journal articles have been written on this subject,
so we won't debate it here[5]. Since hardware and software in the Linux and
research worlds evolve at a frantic rate, a fast-evolving OS is a necessity.
Gentoo offers this bleeding edge and also provides the means to easily
integrate new packages into the system through Portage's ebuilds.
In our proof of concept, we will be using the following material to build our mini-cluster:
We will use the most basic and common network topology for building this
cluster. All Slave Nodes connect to one switch, which in turn connects to the
Master Node through a 100BaseT Ethernet network. The Master Node has two
Ethernet devices to ensure that the Slave Nodes are on an isolated network. In
principle, users should not access the nodes directly; jobs are to be launched
through the Master Node.
The steps for creating a Gentoo-based Single System Image are documented in
the Gentoo Diskless Client section of this wiki. Although a little general, it
contains the key elements for creating an SSI that will be usable by the
Gentoo Headnode Configuration document. However, there are some applications
that we do implicitly add to the nodes. Here is a short listing of some of the
Gentoo packages:
sys-cluster/openmpi
sys-cluster/torque
sys-cluster/ganglia
Ganglia is used for monitoring the entire cluster. Its installation is
detailed in the Gentoo Headnode Configuration document. Note that the openmpi
and torque ebuilds come from the www.gentooscience.org overlay.
This section is
detailed by the Gentoo Headnode Configuration Document. Please refer to it for
details on how to configure the Master Node.
We use SSH to launch commands on each node as an alternative to RSH. There are
many arguments for using RSH instead, chiefly the overhead of SSH. However, SSH
is more portable than RSH when it comes to carrying the environment over to the
other nodes. Since SSH is only used for launching commands and not for the
actual communications, it adds no overhead to the actual computation.
Since your home directory is NFS-mounted across all nodes, you only need to
create one key in it, and the key will automatically be present on all nodes.
Here is the sequence to perform:
cd ~/.ssh/
ssh-keygen -t dsa -b 1024 -f id_dsa
The ssh-keygen command will prompt for a passphrase. Don't enter anything,
since we don't want to be asked for one when logging onto the nodes. We then
add the newly generated key to authorized_keys:
cat id_dsa.pub >> authorized_keys
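With its default "StrictModes yes" setting, sshd refuses public-key logins when the home directory, ~/.ssh, or authorized_keys is writable by anyone other than the owner. If key-based logins fail after the step above, permissions are a common cause. The following is a minimal sketch of tightening them; the function name secure_ssh_dir is hypothetical and not part of OpenSSH, and the directory defaults to ~/.ssh.

```shell
#!/bin/sh
# secure_ssh_dir is a hypothetical helper name, not part of OpenSSH.
# It restricts an SSH directory and its authorized_keys file to the owner.
# Usage: secure_ssh_dir "$HOME/.ssh"
secure_ssh_dir() {
    dir="${1:-$HOME/.ssh}"
    # Only the owner may read, write, or enter the directory.
    chmod 700 "$dir" || return 1
    # authorized_keys may not exist yet; only change it when present.
    if [ -f "$dir/authorized_keys" ]; then
        chmod 600 "$dir/authorized_keys" || return 1
    fi
}
```

Because $HOME is shared over NFS, running this once on the Master Node should be enough for all nodes.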
Now we must log onto all the nodes so that each node's host key is added to
our known_hosts file. To make the process simpler, we can loop over the nodes
as such:
for Num in $(seq 1 24); do ssh thinkbig${Num} hostname; done
This will log you onto each node and get the hostname value (we use hostname so
that ssh only launches a simple command and doesn't actually open a session on
the node). Here is an example output. Note that some of the nodes aren't
available (ssh: thinkbig20: Name or service not known) and some of them were
already registered (they simply return their hostname):
eric@headless ~ $ for Num in $(seq 1 24); do ssh thinkbig${Num} hostname; done
thinkbig1
The authenticity of host 'thinkbig2 (10.0.1.12)' can't be established. RSA key fingerprint is
22:c1:2a:28:44:f2:1d:a6:7e:57:72:16:ee:d5:28:4c. Are you sure you want to
continue connecting (yes/no)? yes
Warning: Permanently added 'thinkbig2,10.0.1.12' (RSA) to the list of known hosts.
thinkbig2
The authenticity of host 'thinkbig3 (10.0.1.13)' can't be established. RSA key
fingerprint is 22:c1:2a:28:44:f2:1d:a6:7e:57:72:16:ee:d5:28:4c. Are you sure
you want to continue connecting (yes/no)? yes
Warning: Permanently added 'thinkbig3,10.0.1.13' (RSA) to the list of known hosts.
thinkbig3
ssh: thinkbig4: Name or service not known
The authenticity of host 'thinkbig5 (10.0.1.15)' can't be established. RSA key fingerprint is
22:c1:2a:28:44:f2:1d:a6:7e:57:72:16:ee:d5:28:4c. Are you sure you want to
continue connecting (yes/no)? yes
Warning: Permanently added 'thinkbig5,10.0.1.15' (RSA) to the list of known hosts.
thinkbig5
ssh: thinkbig6: Name or service not known
thinkbig7
ssh: thinkbig8: Name or service not known
thinkbig9
thinkbig10
ssh: thinkbig11: Name or service not known
thinkbig12
thinkbig13
ssh: thinkbig14: Name or service not known
thinkbig15
thinkbig16
thinkbig17
thinkbig18
thinkbig19
ssh: thinkbig20: Name or service not known
thinkbig21
ssh: thinkbig22: Name or service not known
thinkbig23
thinkbig24
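Answering each prompt by hand becomes tedious on larger clusters. One alternative is to collect the host keys with ssh-keyscan. The sketch below assumes that the keys can be trusted without manual fingerprint checks because the nodes sit on the isolated cluster network. The helper names node_names and collect_host_keys are hypothetical; the thinkbig prefix and the 1 to 24 range are taken from the loop above.

```shell
#!/bin/sh
# node_names is a hypothetical helper: print PREFIX followed by each
# number from FIRST to LAST, one name per line.
node_names() {
    prefix="$1"; first="$2"; last="$3"
    i="$first"
    while [ "$i" -le "$last" ]; do
        echo "${prefix}${i}"
        i=$((i + 1))
    done
}

# collect_host_keys is also hypothetical: append the RSA host key of every
# reachable node to a known_hosts file. Nodes that cannot be resolved or
# reached contribute nothing, since their errors go to standard error.
# Usage: collect_host_keys "$HOME/.ssh/known_hosts"
collect_host_keys() {
    known_hosts="${1:-$HOME/.ssh/known_hosts}"
    node_names thinkbig 1 24 | while read -r node; do
        ssh-keyscan -t rsa "$node" 2>/dev/null >> "$known_hosts"
    done
}
```

This skips the fingerprint confirmation entirely, so it is only appropriate where the network between the Master Node and the Slave Nodes is trusted.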
The loop described above can also be
used to generate a list of available nodes as such:
for Num in $(seq 1 24); do ssh thinkbig${Num} hostname 2> /dev/null | grep -e '^think' >> hostfile ; done
Only run this after having added the hosts to your known_hosts as performed by
the loop above. The file named hostfile now contains:
eric@headless ~ $ cat hostfile
thinkbig1
thinkbig2
thinkbig3
thinkbig5
thinkbig7
thinkbig9
thinkbig10
thinkbig12
thinkbig13
thinkbig15
thinkbig16
thinkbig17
thinkbig18
thinkbig19
thinkbig21
thinkbig23
thinkbig24
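Because the loop appends to hostfile with >>, running it a second time would list every node twice. The following is a minimal sketch of removing duplicates and counting the nodes before handing the file to mpirun. The helper names dedupe_hostfile and count_nodes are hypothetical, and the --hostfile option shown in the usage comment should be checked against the installed Open MPI version.

```shell
#!/bin/sh
# dedupe_hostfile (hypothetical name): drop repeated and empty lines while
# keeping the original node order. The file is rewritten in place.
dedupe_hostfile() {
    file="$1"
    tmp="${file}.tmp.$$"
    awk 'NF && !seen[$0]++' "$file" > "$tmp" && mv "$tmp" "$file"
}

# count_nodes (hypothetical name): print the number of non-empty lines.
count_nodes() {
    awk 'NF { n++ } END { print n + 0 }' "$1"
}

# Example use, assuming a compiled MPI program named hello:
#   dedupe_hostfile hostfile
#   mpirun --hostfile hostfile -np "$(count_nodes hostfile)" hello
```

Removing the old hostfile before rerunning the loop would achieve the same result; the helper is simply safer when the file has also been edited by hand.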
kyron@headless ~ $ export LD_LIBRARY_PATH=/usr/lib/openmpi/1.0.2-gcc-4.1/lib64; /usr/lib/openmpi/1.0.2-gcc-4.1/bin/mpirun -np 2 hello
Hello, world. I am 1 of 2
Hello, world. I am 0 of 2
Execution of OpenMPI on the 32-bit nodes including the 64-bit head node... This
is a heterogeneity issue.
Here, in no particular order: