MATLAB

Monash has a site license for MATLAB. In addition, there are no licensing restrictions on code compiled using the MATLAB Compiler.

What is MATLAB

MATLAB is a general-purpose numerical analysis and graphics package from MathWorks. With more than 600 mathematical, statistical, and engineering functions, MATLAB provides immediate access to high-performance numerical computing. This functionality is extended with interactive graphical capabilities for creating plots, images, surfaces, and volumetric representations. Toolbox algorithms enhance MATLAB's functionality in domains such as signal and image processing, data analysis and statistics, mathematical modeling, and control design. Toolboxes are collections of algorithms, written by experts in their fields, that provide application-specific numerical, analysis, and graphical capabilities.

Matlab compiler

The MATLAB Compiler (mcc) can be used to translate M-files into C files. The resultant C files can be used in any of the supported executable types, including MEX, executable or library, by generating an appropriate wrapper file. A wrapper file contains the required interface between the Compiler-generated code and a supported executable type. There are three main reasons for compiling M-files:

  • To create stand-alone applications.

  • To hide proprietary algorithms.

  • To speed up the code.

Refer to the documentation for full details on using the compiler. The following shows the procedure for a simple example.

module load matlab
mcc -mv cantileverLinear.m

SLURM Submission Script

#!/bin/sh
#SBATCH --job-name=example
#Name of the partition/queue
#SBATCH --partition=comp
## how much time is requested [ upper bound ]
#SBATCH --time=01:00:00
#specify the number of processors needed
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#specify memory needed (MB)
#SBATCH --mem=2000

##    notify me about this job
#SBATCH --mail-type=BEGIN,END,FAIL
#SBATCH --mail-user=first.last@monash.edu
##    specify output file
#SBATCH --output=SimpleJob-%j.out

module load matlab/r2016a

MCR_CACHE_ROOT=$TMPDIR
export MCR_CACHE_ROOT
#
# Please note that MATLABROOT needs to be defined by the user in MonV2
# In MonARCH V1 this was placed in the module file
# Please point it to the appropriate directory for your version of Matlab
#
export MATLABROOT=/usr/local/matlab/r2016a

### on this next line is where you actually run your program
echo 10 | ./run_cantileverLinear.sh  $MATLABROOT

### Here are some explanations:
### notice that when you compiled you used:
###     mcc -mv name.m
### then this produces as output the file:
###     run_name.sh
### to run this use the line above:
###     ./run_name.sh $MATLABROOT
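
The script above points MCR_CACHE_ROOT at $TMPDIR. A minimal sketch of a more defensive variant follows, giving each job its own cache directory and removing it on exit. The fallback to /tmp, the use of the shell PID when no job ID is available, and the mcr_cache_ prefix are assumptions for illustration, not part of the MonARCH setup.

```shell
#!/bin/bash
# Sketch: give each job its own MATLAB Runtime cache directory.
# MCR_CACHE_ROOT is the variable used in the script above; the fallback
# values and the "mcr_cache_" prefix are assumptions for illustration.

# Fall back to /tmp when the scheduler has not set TMPDIR.
base_dir="${TMPDIR:-/tmp}"

# Outside a SLURM job there is no job ID, so use the shell's PID instead.
job_tag="${SLURM_JOB_ID:-$$}"

# mktemp -d creates a new, uniquely named directory from the template.
MCR_CACHE_ROOT="$(mktemp -d "${base_dir}/mcr_cache_${job_tag}_XXXXXX")"
export MCR_CACHE_ROOT

# Remove the cache directory when the script exits, even after an error.
cleanup() { rm -rf "$MCR_CACHE_ROOT"; }
trap cleanup EXIT

echo "MATLAB Runtime cache: $MCR_CACHE_ROOT"
```

These lines would go before the call to run_cantileverLinear.sh, in place of the two MCR_CACHE_ROOT lines.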

Parallel Computing

MATLAB can run some operations in parallel using the Parallel Computing Toolbox. If you intend to use parallel computing in MATLAB, set up your jobs to take the shared file system into account. By default MATLAB stores some run-time information in your home folder, which is normally fine. However, when you run multiple jobs at once they all try to write to the same location, causing errors such as:

Starting parallel pool (parpool) using the 'local' profile ... Error using parpool (line 103)
Not enough input arguments.

The problem can be resolved by specifying, in your MATLAB scripts or functions, the directory where the temporary files are stored. See the following code for an example of how to set up your job. The example uses the job ID assigned by SLURM to create a unique folder, which is removed when the code completes.

%% Example code for parallel Matlab jobs

%% Create a local cluster object before creating the pool
pc = parcluster('local');

% Make temporary folder for storing the cluster control files changing to suit
% your username and project details (replace things in <> with your specific paths)
scratch_dir = strcat('/home/<user_name>/<scratch_folder>/', getenv('SLURM_JOB_ID'));
mkdir(scratch_dir);

% Explicitly set the JobStorageLocation to the temp directory that you
% have created in this script (above)
pc.JobStorageLocation = scratch_dir;

% Start the parallel pool with the number of workers that matches your job
poolobj = parpool(pc, str2double(getenv('SLURM_NTASKS')));

%% Insert the code you want to run after this

% Example loop spread across the defined pool
parfor i=1:20, c(:,i) = eig(rand(1000)); end

%% Insert the code you want to run before this

% At the completion of the job tidy up
delete(poolobj);

% Clean up the temporary files
rmdir(scratch_dir,'s');

Parametric Jobs

It is common to run a large number of MATLAB programs using the same executable with a range of input parameters. This can be achieved with array jobs in SLURM. Here is an example of running MATLAB with a SLURM array job. See Submitting jobs to MonARCH.

First you must rewrite your Matlab program so that it executes from a single function call. This is equivalent to a ‘main’ function in C or C++. Consider the following Matlab code in a file called print_input.m:

Build Command

module load matlab/r2016a
#
# Please note that MATLABROOT needs to be defined by the user in MonV2
# In MonARCH V1 this was placed in the module file
# Please point it to the appropriate directory for your version of Matlab
#
export MATLABROOT=/usr/local/matlab/r2016a

mcc -v -m  print_input.m

We can run the code to test it. Note that most of the output is from the shell script, and not the executable.

sh run_print_input.sh $MATLABROOT 99
Setting up environment variables
---
LD_LIBRARY_PATH is .:/usr/local/sw/matlabR2011a/runtime/glnxa64:
/usr/local/sw/matlabR2011a/bin/glnxa64:
/usr/local/sw/matlabR2011a/sys/os/glnxa64:
/usr/local/sw/matlabR2011a/sys/java/jre/glnxa64/jre/lib/amd64/native_threads:
/usr/local/sw/matlabR2011a/sys/java/jre/glnxa64/jre/lib/amd64/server:
/usr/local/sw/matlabR2011a/sys/java/jre/glnxa64/jre/lib/amd64/client:
/usr/local/sw/matlabR2011a/sys/java/jre/glnxa64/jre/lib/amd64
Input parameter is  99

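Now we create a shell script for the SLURM array job. Note that we use the --array parameter to define a range of values. The script below runs task IDs 1 to 300; to step through the range in fives, append a step to the range, as in --array=1-300:5.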

#!/bin/sh
#SBATCH --job-name=sample_array
#SBATCH --time=10:00:00
#SBATCH --mem=4000
#SBATCH --array=1-300
#SBATCH --output=job.out
#SBATCH --open-mode=append

module load matlab/r2016a
#
# Please note that MATLABROOT needs to be defined by the user in MonV2
# In MonARCH V1 this was placed in the module file
# Please point it to the appropriate directory for your version of Matlab
#
export MATLABROOT=/usr/local/matlab/r2016a

module list
ulimit -s 40000
echo "SLURM_ARRAY_TASK_ID is $SLURM_ARRAY_TASK_ID"
./run_print_input.sh $MATLABROOT $SLURM_ARRAY_TASK_ID

Submit the job.

sbatch slurm.sh
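
Before submitting, it can help to check which task IDs an array range with a step would produce. The following is a rough sketch that does this arithmetic locally for a range of 1 to 300 in steps of five and prints the commands the first few tasks would run. Nothing is submitted or executed, and the variable names are illustrative only.

```shell
#!/bin/bash
# Sketch: work out which task IDs "--array=1-300:5" would generate and
# print the command each task would run. Nothing is submitted or executed.
first=1
last=300
step=5

# Integer arithmetic: IDs are first, first+step, ... up to at most last.
task_count=$(( (last - first) / step + 1 ))
last_id=$(( first + (task_count - 1) * step ))
echo "Tasks: $task_count (IDs $first to $last_id in steps of $step)"

# Show the first three commands; \$MATLABROOT is printed literally.
for task_id in $(seq "$first" "$step" "$last" | head -n 3); do
    echo "would run: ./run_print_input.sh \$MATLABROOT $task_id"
done
```

With these values the range yields 60 tasks, the last of which has ID 296.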

Advanced Array Jobs

In the above example, the SLURM environment variable $SLURM_ARRAY_TASK_ID is passed to the program, which has to use it to determine which parameters to run. If your code base cannot be changed, you can use the following method to create a command line parameter that is a function of the $SLURM_ARRAY_TASK_ID variable.

First, create a file, in this case called LIST, which has all the parameters for each invocation of your program. The first line contains the first set of parameters. The second line contains the second set, and so on.

1.0 110
2.0 210
3.0 430
  :

The task at hand is to extract the required line of parameters from LIST, using $SLURM_ARRAY_TASK_ID. That is done with the following line of Bash shell code, which uses two Unix commands, head and tail. The command prints the Nth line of the file, where N is set to $SLURM_ARRAY_TASK_ID.

PARAMS=$(head -n "$SLURM_ARRAY_TASK_ID" LIST | tail -n 1)

So to call it with Matlab:

PARAMS=$(head -n "$SLURM_ARRAY_TASK_ID" LIST | tail -n 1)
./myMatlabExeScript.sh $MATLABROOT $PARAMS
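
One thing to watch with head and tail is that a task ID larger than the number of lines in LIST silently returns the last line. The sketch below is one possible way to guard against that, using sed to select the line after a range check. The temporary sample file and the fallback task ID of 2 are assumptions so the snippet can be tried outside a SLURM job.

```shell
#!/bin/bash
# Sketch: pick line N of a parameter file, where N is the array task ID.
# The sample file contents and the fallback task ID of 2 are assumptions
# so the snippet can be tried outside a SLURM job.
list_file="$(mktemp)"
printf '%s\n' '1.0 110' '2.0 210' '3.0 430' > "$list_file"

task_id="${SLURM_ARRAY_TASK_ID:-2}"

# Arithmetic expansion strips any padding that wc adds on some systems.
line_count=$(( $(wc -l < "$list_file") ))

if [ "$task_id" -lt 1 ] || [ "$task_id" -gt "$line_count" ]; then
    echo "Task ID $task_id is outside 1..$line_count" >&2
    PARAMS=""
else
    # sed -n 'Np' prints only line N, like head -n N | tail -n 1.
    PARAMS=$(sed -n "${task_id}p" "$list_file")
fi

echo "Parameters for task $task_id: $PARAMS"
rm -f "$list_file"
```

In a real job script the sample file would be replaced by LIST, and $PARAMS would be passed unquoted to the run script so that each value becomes a separate argument.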

Common Problems

Matlab crashes before running

With this problem, the program runs fine on a headnode or compute node, but Matlab crashes when running inside the queue environment. The crash occurs when loading Matlab, and before it starts running your code. You may see error messages in stdout that advise you to contact support.

The problem is that MATLAB is running out of stack memory on the compute node. The solution is to increase the stack size by putting the following in your shell script before the MATLAB executable is called.

Increase stack size in the Unix shell

ulimit -s 60000 #increase stack size in the shell
echo 10 | time ./run_cantileverLinear.sh $MATLABROOT

This increases the stack size to 60,000 kB. Adjust this figure up or down to suit your application.
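
The request can fail if the node's hard limit is lower than the value asked for. As a rough sketch, the snippet below checks the current soft limit first, raises it only when needed, and reports the hard limit if the change is refused. The 60000 target simply mirrors the example above.

```shell
#!/bin/bash
# Sketch: raise the soft stack limit only when it is below a target.
# The 60000 value mirrors the example above; treat it as a starting point.
wanted_kb=60000
current=$(ulimit -S -s)

if [ "$current" = "unlimited" ]; then
    echo "Stack size is already unlimited"
elif [ "$current" -ge "$wanted_kb" ]; then
    echo "Stack size is already $current kB"
elif ulimit -S -s "$wanted_kb" 2>/dev/null; then
    echo "Stack size raised from $current kB to $(ulimit -S -s) kB"
else
    # The soft limit cannot exceed the hard limit set by the administrator.
    echo "Could not raise stack size; hard limit is $(ulimit -H -s) kB" >&2
fi
```

Using -S changes only the soft limit, so the hard limit stays as the administrator set it.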

Missing Function

You have compiled your MATLAB executable, but when you run it you get an error like:

Error using myfunct
Undefined function 'newfunction' for input arguments of type 'double'.

The MATLAB compiler mcc tracks dependencies automatically, but it is not perfect. The solution is to include the second MATLAB file (e.g. 'newfunction.m') on the mcc command line, for example:

mcc -mv cantileverLinear.m newfunction.m

A warning message “which: no ifconfig in <path>”

A warning message “which: no ifconfig in <path>” appears when compiling Matlab scripts.

This is a harmless warning from an earlier version of Matlab. The files should compile and then execute without any other issues. To remove this warning you can either:

  • Use a more recent version of Matlab

  • Put the path to ifconfig in your PATH variable, i.e. add /sbin to your PATH variable.
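
Adding /sbin to PATH, as the second option suggests, might look like the sketch below. It appends the directory only if it is not already present, so it is safe to run more than once. It assumes ifconfig lives in /sbin, as stated above.

```shell
#!/bin/bash
# Sketch: append /sbin to PATH only if it is not already present.
# Wrapping PATH in colons lets the pattern match the first and last entries.
case ":$PATH:" in
    *:/sbin:*)
        echo "/sbin is already on PATH"
        ;;
    *)
        PATH="$PATH:/sbin"
        export PATH
        echo "Added /sbin to PATH"
        ;;
esac
```

Placing these lines before the mcc command limits the change to the current shell session.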