David Ross's Blog Random thoughts of a coder

Visual F# in VS 2010

28. May 2009 23:55 by David in

I have finally got around to installing the VS 2010 Beta, and debugging with F# ACTUALLY works!  Six months of having to add variables to the watch window, instead of being able to hover the mouse over a variable declaration, was getting very tiring…  Until now the F#/IDE integration story has been very poor, which made it hard to justify using F# for simple problems that can be solved just as easily using C#.  That excuse thankfully no longer applies.

I loaded up a couple of the F# projects from work and they all compile and pass our unit tests, which as you can imagine was a great relief.  There appears to have been a “big” cleanup of naming conventions within the core F# libraries, and many of the methods that we were using have now been marked as obsolete.  Thankfully Microsoft decided to “obsolete” the methods rather than remove the old functions entirely.

Opensource .NET Exchange III Lineup

28. May 2009 23:13 by David in

Gojko has announced the lineup for the 3rd Open Source .NET Exchange.

The speaker list is as follows:

  • Ian Cooper: A First Look at Boo
  • Dylan Beattie: Managing Websites with Web Platform Installer and msdeploy
  • Scott Cowan: Spark View Engine
  • David: Introduction to MPI.NET
  • Gojko Adzic: Acceptance testing in English with Concordion .NET
  • Sebastien Lambla: What OpenRasta does other frameworks can’t
  • Phil Trelford: F# Units of Measure

This time I will be introducing MPI.NET and covering many of the topics that I have been blogging about over the last few weeks.  I’ve decided to do all the examples/slides using C# as opposed to F# which I will continue to cover in my blog posts.

Personally I am looking forward to seeing the session on Boo and the improvements Sebastien has made to OpenRasta.

MPI.NET – Distributed Computations with the Message Passing Interface in F# – Part 2

17. May 2009 12:17 by David in

In a previous post I described how to calculate PI using a Monte Carlo simulation.

Using the same technique it is possible to price financial products such as insurance.  Customers pay a premium for health cover.  If they get sick the insurance company is then obligated to pay any medical costs that “may” occur.  This means that the customer has a fixed, “known” upfront cost, while the insurance company’s possible costs range from almost nothing to an extremely expensive medical bill resulting from a surgical procedure.  It is possible to run thousands of “what if” scenarios and use the results to estimate the amount of capital that is needed to cover the costs for all of the organisation’s customers.

Another example is pricing an Option which is a simple version of Insurance and is used to lock in the price of a product that the company wants to buy or sell at a future date. 

Buying an option – going Long:

The buyer has no obligation to exercise the Option; the seller of the Option, however, is legally obligated to provide the product at the price indicated in the contract.  The buyer pays a premium to the seller.

  • Call – Buyer assumes the price is going to rise and wants to lock in the price – An airline buys an option to lock in the price of fuel for its fleet as it fears that prices will rise in 6 months.  If the price drops in six months the airline buys oil from the market at the cheaper price.
  • Put – Buyer assumes that the price is going to fall and wants to lock in the price – An oil manufacturer fears that oil prices will be lower in 6 months as the economy is slowing.  If the oil price increases the manufacturer sells the oil at the higher market price.

Selling an option – going Short:

The seller is obligated to provide the product at the price indicated in the contract if the buyer exercises the contract.  An exercised contract is a loss for the seller.  In comparison, if the contract is not exercised the seller makes a profit equal to the premium.

  • Call – Seller assumes the price is going to fall
  • Put – Seller assumes that the price is going to rise
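The long and short positions described above mirror each other, which a small sketch can make concrete.  The helper names (longProfit, shortProfit) and the figures (an exercise price of 100 and a premium of 5) are made up for illustration and are not taken from any contract or library.

```fsharp
// Payoff to the buyer at expiry (never negative: the buyer can simply not exercise)
let callPayoff price exercise = max 0.0 (price - exercise)
let putPayoff price exercise = max 0.0 (exercise - price)

// Long position: the buyer pays the premium up front and receives the payoff
let longProfit payoff premium = payoff - premium
// Short position: the seller keeps the premium but owes the payoff
let shortProfit payoff premium = premium - payoff

// Hypothetical figures: exercise price 100, premium 5
let exercise, premium = 100.0, 5.0
for price in [ 80.0; 100.0; 120.0 ] do
    let payoff = callPayoff price exercise
    printfn "price %.0f: long call %.1f, short call %.1f"
        price (longProfit payoff premium) (shortProfit payoff premium)
```

Whatever the final price, the two profits add up to zero: the most the seller can make is the premium, while the buyer can never lose more than the premium.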

Simulating a European Option

The Financial Numerical Recipes in C++ web site includes a number of tutorials using C++ to calculate Bond Prices, Option Prices etc.  I have recently been porting the code snippets to F# to gain more familiarity with the language.

The C++ code to simulate the price of an Option is here

The first part of the simulation is to randomly generate what the price will be in the future.  The value has an equal probability of being higher or lower than the starting price.  The inputs to the simulation are:

  • Current Value (S)
  • Interest Rate (r) – Since we are simulating the price in the future we want to convert the price back into today’s money
  • Time (time) – Duration to the contract’s exercise date
  • Volatility (sigma) – The magnitude of the random movements, at each point in time, that the price is expected to have – Best guess based on historical data and is the most problematic and difficult part of pricing Options. 
    • A low volatility implies that the final price WILL NOT have diverged far from the current value S
    • A high volatility implies that the final price WILL have diverged far from the current value S
  • Type of Walk – Stock prices are assumed to follow a “lognormal walk”, which means that each price movement is a percentage change.  For example, with 10% moves:
    • S1 = S * (1 + 0.1 * randomChoiceOf(1 or -1))
    • S2 = S1 * (1 + 0.1 * randomChoiceOf(1 or -1))
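The “percentage change” walk from the list above can be sketched in a few lines.  The function name walk, the fixed seed and the 10% step size are assumptions made for the example; the real simulation draws the size of each move from a normal distribution rather than using a fixed percentage.

```fsharp
open System

// Each step multiplies the price by (1 + pct) or (1 - pct) with equal probability
let walk (rng: Random) (pct: float) (steps: int) (start: float) =
    let step price =
        let direction = if rng.Next(2) = 0 then 1.0 else -1.0
        price * (1.0 + pct * direction)
    // List.scan keeps every intermediate price: [S; S1; S2; ...]
    List.scan (fun price _ -> step price) start [ 1 .. steps ]

// Five 10% moves starting from a price of 100
let prices = walk (Random(42)) 0.1 5 100.0
prices |> List.iter (printfn "%.2f")
```

Because every move is a multiplication, the price can never go negative, which is the reason for modelling stock prices this way rather than adding or subtracting fixed amounts.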

The C++ simulation uses a library to generate a random normal distribution, which in turn is used to create the lognormal walk.  The System.Random object in .NET, by contrast, provides random numbers that are spread evenly across the range 0 to 1.  A random normal distribution generates values that follow a Gaussian distribution, with the “mean” being zero and the shape looking like a bell curve.  While it is easy to write a method that generates a normal distribution, the excellent Math.NET Project already provides this capability.

let logNormalRandom = new MathNet.Numerics.Distributions.NormalDistribution()
let next = logNormalRandom.NextDouble()
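For comparison, here is a rough sketch of how a normal generator could be hand-rolled on top of System.Random using the Box-Muller transform.  This is one well-known technique, not necessarily what Math.NET does internally, and the function name nextStandardNormal is made up for the example.

```fsharp
open System

// Box-Muller transform: two independent uniform samples give one standard normal sample
let nextStandardNormal (rng: Random) =
    let u1 = 1.0 - rng.NextDouble()   // shift [0,1) to (0,1] so that log never sees 0
    let u2 = rng.NextDouble()
    sqrt (-2.0 * log u1) * cos (2.0 * Math.PI * u2)

// Sanity check: the mean should be close to 0 and the variance close to 1
let rng = Random(1)
let samples = List.init 100000 (fun _ -> nextStandardNormal rng)
let mean = List.average samples
let variance = samples |> List.averageBy (fun x -> (x - mean) ** 2.0)
printfn "mean %.3f, variance %.3f" mean variance
```

In practice the library version is the better choice, as it has already been tested and saves maintaining this code.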

This leads to the following ported code:

let R = (r - (0.5 * Math.Pow(sigma, 2.0))) * time
let SD = sigma * Math.Sqrt(time)
let FuturePrice = S * Math.Exp(R + SD * logNormalRandom.NextDouble())
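Written as a formula, the three lines above compute the following, where T stands for time and Z is one draw from the standard normal distribution (assuming the Math.NET generator is left at its default of mean 0 and standard deviation 1):

```latex
R = \left(r - \tfrac{1}{2}\sigma^{2}\right) T, \qquad
SD = \sigma \sqrt{T}, \qquad
S_T = S \, e^{R + SD \cdot Z}, \quad Z \sim \mathcal{N}(0, 1)
```

The $-\tfrac{1}{2}\sigma^{2}$ term offsets the upward bias introduced by the exponential, so that the average simulated price grows at the interest rate: $\mathbb{E}[S_T] = S\,e^{rT}$.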

The code above returns what the future price will be for a particular simulation.  Following the explanation of Options above, the buyer only exercises the option if doing so makes a profit over buying the product directly from the market (In the Money).  Programmatically this is as follows:

let europe_call_payoff price exercise = Math.Max(0.0, price - exercise)
let europe_put_payoff price exercise = Math.Max(0.0, exercise - price)

The final code is here:

#light

open System

// Financial Numerical Recipes in C++
// http://finance-old.bi.no/~bernt/gcc_prog/recipes/recipes/recipes.html
let europe_call_payoff price exercise = Math.Max(0.0, price - exercise)
let europe_put_payoff price exercise = Math.Max(0.0, exercise - price)

let option_price_call_european S X r sigma time payoff sims =
    // Normal generator from Math.NET (assumed to default to mean 0, standard deviation 1)
    let logNormalRandom = new MathNet.Numerics.Distributions.NormalDistribution()

    let R = (r - (0.5 * Math.Pow(sigma, 2.0))) * time
    let SD = sigma * Math.Sqrt(time)

    // One simulated price at the exercise date, converted into a payoff
    let option_price_simulation() =
        let S_T = S * Math.Exp(R + SD * logNormalRandom.NextDouble())
        payoff S_T X

    // Tail-recursive sum of exactly i simulated payoffs
    let rec futureValueIter i value =
        match i with
        | 0 -> value
        | _ -> futureValueIter (i - 1) (option_price_simulation() + value)

    let futureValue = futureValueIter sims 0.0
    // Discount the average payoff back into today's money
    System.Math.Exp(-r * time) * (futureValue / (double)sims)

And the test:

#light

open MbUnit.Framework
open OptionPricingModel

[<Test>]
let simulate_call_option() =
    let result = option_price_call_european 100.0 100.0 0.1 0.25 1.0 europe_call_payoff 500000
    Assert.AreApproximatelyEqual(14.995, result, 0.03)

Once again the simulation is “close” to the correct value, in this case within the tolerance of 0.03 that the test allows.  The C++ code shows techniques to improve the accuracy of the simulation, which I will cover in a future post, and at the same time I will host the simulation within MPI.NET.

MPI.NET – Distributed Computations with the Message Passing Interface in F# – Part 1

6. May 2009 18:39 by David in

Supercomputing has officially reached the desktop: it is now possible to buy a Linux or Microsoft based cluster for a few thousand dollars.  Further, compute clusters can easily be spun up in the cloud, where they can perform some work and then be switched off once the CPU intensive task has been completed.  One of the great benefits of this migration from big iron to clusters of affordable commodity hardware is that the software libraries that were designed to help scientists predict the weather or find the correlation between diseases and genes are now available to use in line of business applications.

Desktop versus Applications Server versus Compute cluster

The main differences between the three main computer architectures (Desktop, Application Server and Compute Cluster) are no longer based on hardware (vector computers such as those from Cray are slowly being replaced with Intel/AMD x64 machines) but on usage scenarios.

Computer architecture and usage:
Desktop
  • General purpose computer
  • Multiple applications running concurrently
  • Optimised for GUI feedback and response
Application Server
  • Dedicated to run a particular program
  • Long uptime requirements
  • Optimised for network
Compute Cluster/Supercomputer
  • Batch processing – Cluster is dedicated to work on a problem
  • Typically low or no external input once problem starts
  • Single Program Multiple Data type problems
  • Often collaboration between nodes is more complex than the problem that is being solved

Supercomputers are typically batch processing devices.  A university or government department that has spent millions of dollars on its cluster needs to ensure that the infrastructure is fully utilised throughout its lifetime (24x7x365).  This is typically achieved by scheduling jobs weeks in advance and making sure that there are no periods where the cluster is idle.  Often, when work is not completed within its allocated period, the processes are automatically killed so that the next job can execute.  Since the cluster is designed to run different pieces of software and there might be hundreds of servers involved, the concept of a “Single System Image” becomes important, where software that is “deployed” to the cluster is seamlessly deployed onto all the nodes within the cluster.

Data Sharing

There are two basic methods for sharing information between nodes within a cluster:

  • Shared Memory Model – Combine the memory of all of the machines in the cluster into a single logical unit so that processes on any of the machines are able to access the shared data on any of the other machines.
  • Message Passing Model – Use messaging to pass data between nodes.

Since the shared memory model is similar to the way software runs on a local machine it is very familiar to develop against.  Unfortunately, while accessing the data is transparent, the actual time to load data off the network is far slower than reading from local memory.  Further, data contention can arise when different servers in the cluster try to update the same logical memory location.  For this reason Message Passing has gained prominence and the Message Passing Interface protocol has become a standard across the supercomputing industry.

MPI.NET

MPI.NET allows the .NET developer to hook into Microsoft’s implementation of MPI and can therefore be used to run code on a cluster.

MPI.NET Installation steps

  1. Install the Microsoft implementation of MPI - download
  2. Install the MPI.NET SDK – download

Converting the Monte Carlo F# PI example to use MPI.NET

The following is the original F# code to calculate PI:

open System

// Returns 1.0 when a random point in the unit square falls inside the inscribed circle (radius 0.5)
let inUnitCircle (r:Random) =
    let y = 0.5 - r.NextDouble()
    let x = 0.5 - r.NextDouble()

    match y * y + x * x <= 0.25 with
    | true -> 1.0
    | _ -> 0.0

// The circle covers PI/4 of the square, so 4 * (hits / tests) approximates PI
let calculate_pi_using_monte_carlo (numInterations:int) (r:Random) =
    let numInCircle = List.sum_by (fun f -> (inUnitCircle r)) [1 .. numInterations]
    4.0 * numInCircle / (float)numInterations

To calculate PI from 1000 tests, call calculate_pi_using_monte_carlo 1000 r.
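As a rough illustration of how the estimate behaves as the number of tests grows, here is a self-contained sketch of the same calculation.  It uses List.sumBy, the current name for List.sum_by; the shorter name estimatePi and the fixed seed are choices made for the example.

```fsharp
open System

// 1.0 when a random point in the unit square lands inside the inscribed circle
let inUnitCircle (r: Random) =
    let y = 0.5 - r.NextDouble()
    let x = 0.5 - r.NextDouble()
    if y * y + x * x <= 0.25 then 1.0 else 0.0

let estimatePi (numIterations: int) (r: Random) =
    let numInCircle = List.sumBy (fun _ -> inUnitCircle r) [ 1 .. numIterations ]
    4.0 * numInCircle / float numIterations

let r = Random(2009)   // fixed seed so that runs are repeatable
for n in [ 1000; 100000; 1000000 ] do
    printfn "%d tests: %f" n (estimatePi n r)
```

The error shrinks only with the square root of the number of tests, so each extra digit of accuracy needs roughly a hundred times more work, which is why spreading the tests over a cluster is attractive.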

To speed up the calculation process we want the F# code to run on our cluster.

Steps:

  1. Reference the MPI assembly from the GAC
  2. Pass the command line arguments to MPI
       let args = Sys.argv

       using (new MPI.Environment(ref args)) (fun environment ->
           // Code here
       )
  3. When MPI starts it gives each node an ID – Usually Node 0 is used for communication and the other nodes are used for processing.
  4. We want to insert the call to calculate PI in place of the “Code here” comment (line 4 of the snippet above).  However, once each node has completed the calculation the result needs to be passed back to Node 0 so that it can be combined there with the other results.
  5. Determine the Node ID
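       using (new MPI.Environment(ref args)) (fun environment ->
           let comm = Communicator.world
           let nodeId = comm.Rank
           // Printed only so that the fragment compiles and shows which node it runs on
           printfn "Running on node %d" nodeId
       )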
  6. Use the Node Id to seed the Random method
  7. Use the Reduce method to retrieve the results of each different cluster instance and return that value back to the client
       using (new MPI.Environment(ref args)) (fun environment ->
           let comm = Communicator.world
           // Offset the clock by one day per rank so that every node gets a different seed
           let seed = DateTime.Now.AddDays((float)comm.Rank)
           let r = new Random((int)seed.Ticks)
           // Sum the per-node estimates on Node 0, then divide by the number of nodes
           let pi:double = comm.Reduce(calculate_pi_using_monte_carlo 1000 r, Operation<double>.Add, 0) / (double)comm.Size
           if (comm.Rank = 0) then
               Console.WriteLine("Pi " + pi.ToString())
       )
  8. Execute the code
       "c:\Program Files\Microsoft HPC Pack 2008 SDK\Bin\mpiexec.exe" -n 15 PebbleSteps.MPI.exe
  9. MPI then spins up 15 processes, runs the F# application within each process and provides the environment settings so that MPI.NET can determine what each node’s ID is.
  10. The program will finally display the calculated value of PI
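To see what the Reduce step is doing without installing MPI, the sketch below imitates it inside a single process: each “rank” produces its own estimate, the estimates are summed and the total is divided by the number of ranks.  None of this calls MPI.NET; size, perNode and the per-rank seeds are stand-ins chosen for the example.

```fsharp
open System

// Per-node work: the same Monte Carlo estimate as in the post
let estimatePi (numIterations: int) (r: Random) =
    let hits =
        Seq.init numIterations (fun _ ->
            let x = 0.5 - r.NextDouble()
            let y = 0.5 - r.NextDouble()
            if x * x + y * y <= 0.25 then 1.0 else 0.0)
        |> Seq.sum
    4.0 * hits / float numIterations

// Stand-in for comm.Size and comm.Rank: ranks 0 .. size-1, each with its own seed
let size = 15
let perNode = [ for rank in 0 .. size - 1 -> estimatePi 1000 (Random(rank + 1)) ]

// Stand-in for Reduce with Operation<double>.Add, followed by the division by comm.Size
let pi = List.sum perNode / float size
printfn "Pi %f" pi
```

Averaging 15 estimates of 1000 tests each is statistically the same as running 15000 tests on one machine; the gain from the cluster is that the 15 batches run at the same time.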