6. May 2009 18:39 by David
Supercomputing has officially reached the desktop: it is now possible to buy a Linux or Microsoft based cluster for a few thousand dollars. Further, compute clusters can easily be spun up in the cloud, where they can perform some work and then be switched off once the CPU intensive task has been completed. One of the great benefits of this migration from big iron to clusters of affordable commodity hardware is that the software libraries that were designed to help scientists predict the weather or find correlations between diseases and genes are now available for use in line of business applications.
Desktop versus Application Server versus Compute Cluster
The main differences between the three main computer architectures (Desktop, Application Server and Compute Cluster) are no longer based on hardware (vector computers such as those from Cray are slowly being replaced with Intel/AMD x64 machines) but on usage scenarios.
| Computer Architecture | Usage |
| --- | --- |
| Desktop | General purpose computer; multiple applications running concurrently; optimised for GUI feedback and response |
| Application Server | Dedicated to running a particular program; long uptime requirements; optimised for network throughput |
| Compute Cluster/Supercomputer | Batch processing – the cluster is dedicated to working on a problem; typically little or no external input once the problem starts; Single Program Multiple Data (SPMD) style problems; collaboration between the nodes is often more complex than the problem being solved |
Supercomputers are typically batch processing devices. A university or government department that has spent millions of dollars on its cluster needs to ensure that the infrastructure is fully utilised throughout its lifetime (24x7x365). This is typically achieved by scheduling jobs weeks in advance and making sure that there are no periods where the cluster is idle. Often, when work is not completed within its allocated schedule, the processes are automatically killed so that the next job can execute. Since the cluster is designed to run different pieces of software, and there might be hundreds of servers involved, the concept of a "Single System Image" becomes important: software that is "deployed" to the cluster is seamlessly deployed onto all the nodes within the cluster.
Data Sharing
There are two basic methods for sharing information between nodes within a cluster:
- Shared Memory Model – Combine the memory of all of the machines in the cluster into a single logical unit, so that a process on any machine is able to access the shared data held on any of the other machines.
- Message Passing Model – Use messaging to pass data between nodes.
Since the shared memory model is similar to how software runs on a local machine, it is very familiar to develop against. Unfortunately, while accessing the data is transparent, the actual time to load data off the network is far slower than reading from local memory. Further, data contention can arise when different servers in the cluster try to update the same logical memory location. For these reasons message passing has gained prominence, and the Message Passing Interface (MPI) protocol has become a standard across the supercomputing industry.
MPI.NET
MPI.NET allows the .NET developer to hook into Microsoft's MPI implementation (MS-MPI), and so it can be used to get .NET code running on a cluster.
MPI.NET Installation steps
- Install the Microsoft implementation of MPI - download
- Install the MPI.NET SDK – download
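Once both packages are installed, a quick way to confirm that MS-MPI and MPI.NET can talk to each other is a two-process ping. The sketch below is illustrative only: it uses MPI.NET's point-to-point Send/Receive methods, the executable name (PingPong.exe) is made up, and the mpiexec path is the one used later in this post.

open System
open MPI

// Illustrative check: node 0 sends "ping" to node 1, node 1 replies with "ping pong".
// Run with: "c:\Program Files\Microsoft HPC Pack 2008 SDK\Bin\mpiexec.exe" -n 2 PingPong.exe
let args = Sys.argv
using (new MPI.Environment(ref args)) (fun environment ->
    let comm = Communicator.world
    if comm.Rank = 0 then
        comm.Send("ping", 1, 0)                    // send a string to node 1 with tag 0
        let reply = comm.Receive<string>(1, 0)     // wait for node 1's reply
        Console.WriteLine("Node 0 received: " + reply)
    elif comm.Rank = 1 then
        let msg = comm.Receive<string>(0, 0)       // receive node 0's message
        comm.Send(msg + " pong", 0, 0))            // send it straight back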
Converting the Monte Carlo F# PI example to use MPI.NET
The following is the original F# code to calculate PI:
open System

let inUnitCircle (r:Random) =
    let y = 0.5 - r.NextDouble()
    let x = 0.5 - r.NextDouble()
    match y * y + x * x <= 0.25 with
    | true -> 1.0
    | _ -> 0.0

let calculate_pi_using_monte_carlo (numIterations:int) (r:Random) =
    let numInCircle = List.sum_by (fun _ -> inUnitCircle r) [1 .. numIterations]
    4.0 * numInCircle / (float)numIterations
To estimate PI after 1,000 tests have been executed, call calculate_pi_using_monte_carlo 1000 r.
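As a quick sanity check before moving to the cluster, the estimator can be run sequentially a few times; the snippet below is just an illustration and is not part of the final program.

open System

// Run the sequential estimator three times to see how much the Monte Carlo estimate varies.
let r = new Random()
for i in 1 .. 3 do
    let estimate = calculate_pi_using_monte_carlo 1000 r
    Console.WriteLine("Run " + i.ToString() + ": " + estimate.ToString())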
To speed up the calculation process we want the F# code to run on our cluster.
Steps:
- Reference the MPI assembly from the GAC
- Pass the command line arguments to MPI
let args = Sys.argv

using (new MPI.Environment(ref args)) (fun environment ->
    // Code here
    ())
- When MPI starts it gives each node an ID (its rank). Usually node 0 is used for communication and the other nodes are used for processing.
- We want to insert the call to calculate PI at the // Code here placeholder. However, once each node has completed the calculation, the result needs to be passed back to node 0 so that it can be combined with the other results in one place.
- Determine the Node ID
using (new MPI.Environment(ref args)) (fun environment ->
    let comm = Communicator.world
    // Rank is the ID that MPI assigned to this process
    let nodeId = comm.Rank
    Console.WriteLine("Running as node " + nodeId.ToString()))
- Use the node ID to seed the Random instance
- Use the Reduce method to collect the result from each node, combine them at node 0 and return the averaged value (there is a short note on this averaging step after the execution steps below)
open System
open MPI

let args = Sys.argv

using (new MPI.Environment(ref args)) (fun environment ->
    let comm = Communicator.world
    // Seed each node's random number generator differently by offsetting the clock with its rank
    let seed = DateTime.Now.AddDays((float)comm.Rank)
    let r = new Random((int)seed.Ticks)
    // Reduce sums the per-node estimates at node 0; dividing by Size turns the sum into an average
    let pi:double = comm.Reduce(calculate_pi_using_monte_carlo 1000 r, Operation<double>.Add, 0) / (double)comm.Size
    if comm.Rank = 0 then
        Console.WriteLine("Pi " + pi.ToString()))
- Execute the code
- "c:\Program Files\Microsoft HPC Pack 2008 SDK\Bin\mpiexec.exe" -n 15 PebbleSteps.MPI.exe
- mpiexec then spins up 15 processes, runs the F# application within each process, and provides the environment settings that let MPI.NET determine each node's ID.
- Node 0 will finally display the estimated value of PI.
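A note on the averaging step: Operation<double>.Add sums the per-node estimates at node 0, so dividing by comm.Size turns the sum into an average; with 15 processes each drawing 1,000 samples the printed figure is effectively based on 15,000 points. A variation, shown below purely as a sketch (it is not part of the original program and reuses calculate_pi_using_monte_carlo from above), is to fix the total number of samples and split it across the nodes:

open System
open MPI

// Sketch only: divide a fixed total number of samples across the nodes,
// then average the per-node estimates exactly as before.
let args = Sys.argv
using (new MPI.Environment(ref args)) (fun environment ->
    let comm = Communicator.world
    let totalSamples = 1000000
    let samplesPerNode = totalSamples / comm.Size
    let seed = DateTime.Now.AddDays((float)comm.Rank)
    let r = new Random((int)seed.Ticks)
    let estimate = calculate_pi_using_monte_carlo samplesPerNode r
    let pi = comm.Reduce(estimate, Operation<double>.Add, 0) / (double)comm.Size
    if comm.Rank = 0 then
        Console.WriteLine("Pi " + pi.ToString()))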