Lab (2) - Introduction to Distributed MATLAB
Rahman Tashakkori & Darren Greene, CS - Appalachian State University, Distributed Computing Workshop, 2006
Distributed Computing in MATLAB
In this lab we will learn to convert the sequential PI program we wrote
in the previous lab to a parallel version.
Preparations
You can use MATLAB on the Windows machines in the lab or on the student Linux box. If
MATLAB is still open from the first lab, you are ready to go.
Otherwise, choose Start, All Programs, CS Software, MATLAB to start
MATLAB again.
Example (1)
Suppose we want to add the two elements of each of three one-by-two
matrices. For example, a = [1 1] results in 1+1 = 2, b = [2 2]
results in 4, and [3 3] results in 6.
1. Use findResource to locate a job manager and create the job manager
object jm, which represents the job manager in the cluster
whose name is MyJobManager.
jm = findResource('jobmanager', 'LookupURL', 'grid0.cs.appstate.edu')
2. Create a job. Create job j on the job manager.
j = createJob(jm);
3. Create tasks. Create three tasks on the job j. Each task evaluates
the sum of the array that is passed as an input argument.
createTask(j, @sum, 1, {[1 1]});
createTask(j, @sum, 1, {[2 2]});
createTask(j, @sum, 1, {[3 3]});
4. Submit the job to the queue. The job manager moves the job into
the queue to be executed when workers are available.
submit(j);
5. Retrieve results. Wait for the job to complete, then get the
results from all the job’s tasks.
waitForState(j)
results = getAllOutputArguments(j)
results =
[2]
[4]
[6]
The three tasks above were handed out to the available workers (up to
three at a time), and the results were returned to the client at the end.
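Before involving the cluster, it can help to confirm locally what each task should return. The following is only a minimal sketch and not part of the original lab: it applies sum to the same three inputs on the client with cellfun, so no job manager is contacted.

```matlab
% Local check (no job manager needed): apply sum to each task's input.
inputs  = {[1 1]; [2 2]; [3 3]};   % the same arrays passed to createTask
results = cellfun(@sum, inputs, 'UniformOutput', false);
disp(results)
```

Here results is a 3-by-1 cell array holding 2, 4, and 6, which is what getAllOutputArguments should report for the distributed job.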
Lab Activity (1) - Finding Available Workers
The following MATLAB program (listWorkers.m) lets you determine which
workers are available for use on the distributed system. Either
right-click the listWorkers.m link and save the file in your Work
directory (the preferred method), or copy and paste the program below
into a blank MATLAB Editor page and save it in your Work directory.
Note: the file name must be listWorkers.m.
% listWorkers.m
% Darren Greene - Appalachian State University - Summer 2005
% Modified by D. Greene in Nov. 2006
% Lists the idle and busy workers registered with the job manager.
jm = findResource('jobmanager', 'LookupURL', 'grid0.cs.appstate.edu');
numIdleWorkers = size(jm.IdleWorkers, 1);
numBusyWorkers = size(jm.BusyWorkers, 1);
fprintf(1, 'Idle: %i Busy: %i Total: %i\n\n', numIdleWorkers, numBusyWorkers, ...
    numIdleWorkers + numBusyWorkers);
fprintf(1, '-----Idle Workers-----\n');
for workerCount = 1:numIdleWorkers
    fprintf(1, '%s (%s)\n', jm.IdleWorkers(workerCount).Name, ...
        jm.IdleWorkers(workerCount).HostName);
end
if numBusyWorkers > 0
    fprintf(1, '\n\n');
    fprintf(1, '-----Busy Workers-----\n');
    fprintf(1, 'Total: %i\n\n', numBusyWorkers);
    for workerCount = 1:numBusyWorkers
        fprintf(1, '%s (%s)\n', jm.BusyWorkers(workerCount).Name, ...
            jm.BusyWorkers(workerCount).HostName);
    end
end
fprintf(1, '\n');
You can now run this program to find out how many workers are
available and which ones they are. To do so, type the following at the
MATLAB prompt:
>> listWorkers
Below we will use these workers to perform some computations.
The following is the MATLAB Monte Carlo code we wrote in Lab (1). We
will work on parallelizing (I hope not paralyzing) it.
Lab Activity (2) - Monte Carlo Sequential PI Calculator
To save the following program, either right-click the pi_comp.m link
and save the file in your Work directory, or copy and paste the
following lines into a file and save it as pi_comp.m in your Work
directory.
Example
% pi_comp.m
% Rahman Tashakkori - Appalachian State University
% Monte Carlo Computation of pi.
% n is the total number of histories we will use
n = input(' Enter n number of histories you wish to use: ');
% Number of histories that fall inside the circle
inside_count = 0;
% Generate n random points in the unit square [0,1]x[0,1].
% The quarter of the unit circle inside that square is x^2+y^2 <= 1,
% so the fraction of points that fall inside it is approximately pi/4.
% Accumulate the sum of X_i^2 for the variance and the 1 STD error
x_er = 0;
start_time = cputime; % Start timing of algorithm
for i = 1:n
    x = rand;
    y = rand;
    if x^2 + y^2 <= 1
        inside_count = inside_count + 1;
        x_er = x_er + 1;
    end
end
elapsed_time = cputime - start_time; % Elapsed time to perform algorithm
pi_approx = 4*(inside_count/n)
err = pi - pi_approx
x_er = x_er/n;
% 1 STD error of the estimate, computed the same way as in pi_mont.m:
% 4 * sqrt(variance of X_i) / sqrt(n)
sigma = 4*sqrt(x_er - (inside_count/n)^2)/sqrt(n);
fprintf('1 STD error is: %f \n', sigma)
fprintf('pi-STD is: %f \n', pi_approx-sigma)
fprintf('pi+STD is: %f \n', pi_approx+sigma)
fprintf('Computing time is: %f \n', elapsed_time)
Lab Activity (3) - Monte Carlo Distributed PI Calculator
There are several ways to distribute this computation to the available
processors. Here we summarize two possibilities:
1) Submit the same calculation with the same number of histories to
multiple processors, but with different seeds for the random number
generator, collect the results, and average them to find an estimate
for pi.
2) Divide the circle into segments according to the number of
processors involved, send the calculation of pi on each segment to one
of the processors, collect the results, and average them to find the
answer.
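The first method can be tried on a single machine before any job is submitted. The sketch below is an illustration under stated assumptions, not the lab's distributed program: the values of n and np are arbitrary, the loop stands in for the workers, and rng is the seeding call in newer MATLAB releases (older releases, such as the one this lab was written for, use a call such as rand('state', p) instead).

```matlab
% Sketch of method 1 run locally: np independent estimates, one seed each.
n  = 100000;   % histories per simulated worker (illustrative value)
np = 4;        % number of simulated workers (illustrative value)
estimates = zeros(np, 1);
for p = 1:np
    rng(p);                      % distinct seed for each simulated worker
    x = rand(n, 1);
    y = rand(n, 1);
    % Fraction of points inside the quarter circle, scaled to estimate pi
    estimates(p) = 4 * sum(x.^2 + y.^2 <= 1) / n;
end
pi_avg = mean(estimates);        % average of the np independent estimates
fprintf('Averaged estimate: %.5f (error %.5f)\n', pi_avg, pi - pi_avg);
```

If every simulated worker used the same seed, all np estimates would be identical and averaging them would gain nothing, which is why the seeds must differ.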
Below is a program that uses the first method to compute the
value of pi. Save this file as pi_dist_mont.m
in your Work directory, or use the copy-and-paste method to create the
file. Note that you also need a second file, pi_mont.m,
which is listed immediately after pi_dist_mont.m below.
% pi_dist_mont.m
% Rahman Tashakkori - Appalachian State University
% Parallel Monte Carlo Computation of pi.
n = input(' Enter Number of Histories n: ');
disp(' ')
disp(' ')
disp('------------------------------------- ')
disp('These workers are available: ')
disp('------------------------------------- ')
listWorkers
disp('------------------------------------- ')
disp(' ')
disp(' ')
np = input(' Enter Number of Processors you wish to use: ');
jm = findResource('jobmanager', 'LookupURL', 'grid0.cs.appstate.edu');
j = createJob(jm);
set(j, 'FileDependencies', {'pi_mont.m'});
% Each task generates n random points in the square [0,1]x[0,1].
% The fraction of these that lie in the quarter disk
% x^2+y^2 <= 1 will be approximately pi/4.
% Think of this as taking the average of N independent
% identically distributed random variables X_i, where
% X_i = 1 if point i lies in the disk, 0 otherwise.
for p = 1 : np
    TASK(p) = createTask( j, @pi_mont, 1, {n} );
end
%_____________________________________________________________________
tic; % Start wall-clock timing of the job
submit( j )
%_____________________________________________________________________
% Block until the job finishes instead of polling j.State in a busy loop
waitForState( j )
elapsed_time = toc; % Elapsed wall-clock time to run the job
% Each task returns [pi_approx sigma]; average both over the np tasks
PI = 0;
for p = 1 : np
    OUTPUT = get( TASK(p), 'OutputArguments' );
    A = OUTPUT{1};
    PI = PI + A;
end
PI = PI/np;
%_____________________________________________________________________
disp( [' Computed PI and 1STD error: ' num2str(PI, 15) ] )
disp( [' Compare this to pi(): ' num2str(pi, 15) ] )
fprintf('Computing time is: %f \n', elapsed_time)
The second file you need is pi_mont.m,
which is listed below.
% pi_mont.m
% Task function for the pi_dist_mont.m program.
% Returns results = [pi_approx sigma] for n histories.
function results = pi_mont(n)
% Compute expected value of X_i^2 to use in error estimate.
Eofxsq = 0;
count = 0;
for i = 1:n
    x = rand;
    y = rand;
    if x^2 + y^2 <= 1
        count = count + 1;
        Eofxsq = Eofxsq + 1^2;
    end
end
pi_approx = 4*(count/n);
Eofxsq = Eofxsq/n;
% Variance in individual approximations to pi/4.
varx = Eofxsq - (count/n)^2;
% Std dev in individual approximations to pi/4.
sigx = sqrt(varx);
% 1 STD error of pi_approx (output suppressed so workers print nothing)
sigma = 4*sigx/sqrt(n);
results(1) = pi_approx;
results(2) = sigma;
end
Run the above program for different numbers of histories and numbers of
processors, and record the error and the computing times in the
following table.
| n (histories) | np (processors) | Computed PI | Estimated Error (1 STD) | Computing Time |
| 100000 | 1 | | | |
| 250000 | 1 | | | |
| 1000000 | 1 | | | |
| | | | | |
| 100000 | 2 | | | |
| 250000 | 2 | | | |
| 1000000 | 2 | | | |
| | | | | |
| 100000 | 4 | | | |
| 250000 | 4 | | | |
| 1000000 | 4 | | | |
| | | | | |
| 100000 | 6 | | | |
| 250000 | 6 | | | |
| 1000000 | 6 | | | |
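As a rough sanity check on the Estimated Error (1 STD) column, the expected size of the error can be worked out ahead of time. The sketch below assumes each history is an independent hit-or-miss trial with success probability pi/4, and that the np task estimates are independent; the value of np is only illustrative.

```matlab
% Expected 1 STD error of the Monte Carlo estimate for the table's n values.
p = pi / 4;                              % chance a point lands in the quarter circle
n = [100000 250000 1000000];             % histories per task, from the table
sigma_one = 4 * sqrt(p * (1 - p) ./ n);  % 1 STD error of a single task's estimate
np = 4;                                  % illustrative number of tasks
sigma_avg = sigma_one / sqrt(np);        % 1 STD error after averaging np estimates
fprintf('n = %7d  one task: %.4f  average of %d tasks: %.4f\n', ...
        [n; sigma_one; np * ones(size(n)); sigma_avg]);
```

For a single task this gives about 0.0052, 0.0033, and 0.0016 for the three values of n. Under the independence assumption, averaging np estimates shrinks the error by a factor of sqrt(np), so measured errors far outside these ranges suggest a problem such as workers sharing the same random seed.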
Workers Watch Program
This is another handy program that can be run to monitor the
workers' participation in the computation process. We will call
it watchWorkers.m; it is listed below.
% watchWorkers.m
% Darren Greene - Appalachian State University - Summer 2005
function watchWorkers()
jm = findResource('jobmanager', 'LookupURL', 'grid0.cs.appstate.edu');
run = true;      % determines when to stop monitoring workers
active = false;  % keeps track of whether any workers have been busy
while(run)
    pause(0.5);
    % show how many workers are being allocated
    fprintf(1, '%i of %i', jm.NumberOfBusyWorkers, ...
        jm.NumberOfIdleWorkers + jm.NumberOfBusyWorkers);
    busy = jm.BusyWorkers;
    fprintf(1, ' --[');
    for workerCount = 1:size(busy, 1)
        fprintf(1, ' %s ', busy(workerCount).Name);
    end
    fprintf(1, ' ]--');
    fprintf(1, '\n');
    % stop once the workers go idle again after having been busy
    if jm.NumberOfBusyWorkers == 0
        if active == true
            fprintf(1, 'No activity\n');
            run = false;
        end
    else
        active = true;
    end
end
In-Lab Assignment
In Lab (1), we wrote a program that computes pi from the expansion.
Convert that program to a parallel version.
Solution can be found here.