The Budapest Quantum Optics Group

Developing and Running Programs


Torque

To submit a job to the cluster, please use the Torque queue manager. The queue manager ensures that all nodes are kept busy whenever there are pending requests, and that no node runs more than one job at a time.

The preferred way of using the manager is

  1. write a script for your job, example: script.pbs
  2. copy the script to somewhere on the cluster, for example scp script.pbs asboth@optics.szfki.kfki.hu:/h/kakas/h/asboth/
  3. log in to kakas using, e.g., ssh asboth@kakas.szfki.kfki.hu
  4. use qsub to submit the script to the cluster: qsub script.pbs
  5. once the job has finished, you will find the standard output in the file myjob.o1234, where 1234 is the numerical id assigned to your job by Torque, and "myjob" is the name you gave the job in the script. The standard error is saved in the file myjob.e1234. (See below for how to monitor a job while it is waiting or running.)
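While a job is waiting or running, you can monitor it from kakas with the standard Torque client tools. A minimal sketch (the job id 1234 and the user name asboth are just placeholders):

qstat -u asboth    # list your own jobs and their states (Q = queued, R = running)
qstat -f 1234      # show the full status of a single job
qdel 1234          # remove a job from the queue, or kill it if it is already running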

To share computational resources in a fair way, we prefer that you submit jobs that terminate in a short time (short meaning less than 12 hours). Submit short jobs to the "short" queue (see below for how): jobs in this queue can use any number of nodes, but will be terminated after 12 hours.

If you have code that takes a long time to run, try to cut it up into shorter chunks. If that is not possible, you can submit it to the "long" queue (see below). A job in the "long" queue has priority over a job in the "short" queue as long as fewer than 60% of the nodes are running "long" jobs: whenever a node finishes a job, the waiting "long" job will be started rather than the "short" one. However, if 60% of the nodes are already running "long" jobs, a new "long" job has to wait until one of the running "long" jobs finishes.
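The queue can also be chosen at submission time instead of in the script; qsub's -q option overrides the queue set by the #PBS directives. For example:

qsub -q long script.pbs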

At the heart of the above process is the script. Here is an example, including the directive that sends the job to the "short" queue:

### Set the job name
#PBS -N myjob
### Run in the queue named "short", or "long". More on this later. 
#PBS -q short
### (optional): To send email when the job is completed:
##PBS -M your@email.address
### Specify the number of cpus for your job.  This example will allocate 4 cores
### using 2 processors on each of 2 nodes.
##PBS -l nodes=2:ppn=2
### Another example will allocate 8 cores on a single node
##PBS -l nodes=1:ppn=8
### This example allocates 16 cores on one machine and 8 cores on each of two further machines
##PBS -l nodes=1:ppn=16+2:ppn=8
### For submitting a single "threaded" job on the poultry farm, use this resource spec
#PBS -l nodes=1:rmki
### (optional):  Tell PBS how much memory you expect to use. Use units of 'b','kb', 'mb' or 'gb'.
##PBS -l mem=256mb
### (optional): Tell PBS the anticipated run-time for your job, where walltime=HH:MM:SS
##PBS -l walltime=1:00:00
### (optional): Tell PBS to use a specific node for your job
###PBS -l nodes=csirke
### Switch to the working directory; by default TORQUE launches processes
### from your home directory.
cd /h/kakas/h/asboth/topological_walk
### Display the job context
echo Running on host `hostname`
echo Time is `date`
echo Directory is `pwd`
### Launch the actual program:
python python/2D.py -te 1 -t 20 -disp 0
### 
echo Job ended at `date`


The #PBS directives are read by the Torque system. The lines starting with ##PBS have been commented out.

A useful site for background information is here.

For techniques on submitting multiple processing jobs, see the Job Submission page of the TORQUE documentation. Notable possibilities are requesting multiple processors on a node (ppn), and the submission of array jobs. OpenMPI might be useful for making the most out of the latter.
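As a sketch of the array-job mechanism (assuming the installed Torque version supports qsub's -t option): adding the directive

#PBS -t 0-9

to the script submits ten copies of the same job, and each copy can read its own index from the PBS_ARRAYID environment variable, for example

python python/2D.py -te 1 -t 20 -disp 0 -run $PBS_ARRAYID

where the -run flag is a hypothetical parameter of your own program, shown only to illustrate passing the index on.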

GNU compilers

The following compiler commands are available:

  • gcc [4.4.6] (gcc-4.3, gcc-4.4, gcc-4.6)
  • g++ [4.4.6] (g++-4.3, g++-4.4, g++-4.6)
  • gfortran [4.4.6] (gfortran-4.3, gfortran-4.4, gfortran-4.6)

Compilation for 32 bit and 64 bit architectures

By default, all compilers generate 32 bit code, which can be executed on every machine of the network. Machines with "-amd64" appended to their kernel names can also run x86_64 executables. 64 bit code can be generated by supplying the "-m64" switch on the command line, e.g.

gcc -m64 -o test test.c
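You can check which architecture a binary was compiled for with the file utility:

file test

which reports "ELF 64-bit LSB executable" for code compiled with -m64, and "ELF 32-bit LSB executable" otherwise.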


OpenMP

The OpenMP library supports easy parallelization within a single multi-core machine. The OpenMP libraries are installed only for the version 4.3 GNU compilers, and can be activated by supplying the "-fopenmp" flag on the command line, e.g.

gcc-4.3 -fopenmp -o test test.c
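The number of threads used at run time can then be controlled with the standard OMP_NUM_THREADS environment variable, e.g.

OMP_NUM_THREADS=4 ./test

(This assumes test.c actually contains OpenMP parallel regions; without the -fopenmp flag the OpenMP pragmas are ignored and the program runs single-threaded.)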


LAPACK/ACML

Standard Debian LAPACK libraries are installed for both 32 bit and 64 bit architectures.
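A minimal linking sketch, assuming the usual Debian library names (liblapack, libblas):

gcc -o test test.c -llapack -lblas -lm
gcc -m64 -o test test.c -llapack -lblas -lm

The -m64 variant links against the 64 bit versions of the libraries, provided the default linker search paths include them.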

Intel compilers and MKL

The Intel Composer XE 2011_sp1.6.233 is installed under the directory

/usr/local/intel


with the following components:

Intel C++ Compiler XE 12.1 on Intel(R) 64
Intel Fortran Compiler XE 12.1 Intel(R) 64
Intel Debugger 12.1 on Intel(R) 64
Intel Math Kernel Library 10.3 Update 6 on Intel(R) 64
Intel Threading Building Blocks 4.0
Intel Integrated Performance Primitives 7.0 Update 5 on Intel(R) 64

Compilation for the 64 bit architecture

Initialize the necessary environment variables by sourcing

/usr/local/intel/compilervars.sh intel64

(bash flavor shells)

/usr/local/intel/compilervars.csh intel64

(csh flavor shells)

For successful linking, the argument -m elf_x86_64 must be passed to the linker ld. You can tell the Intel compilers to pass this option by including

-Xlinker -melf_x86_64


among the linking options.
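Putting the pieces together, a minimal sketch of a 64 bit build with the Intel C compiler (assuming icc is the compiler driver of this installation):

source /usr/local/intel/compilervars.sh intel64
icc -Xlinker -melf_x86_64 -o test test.c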

nVidia CUDA

GPGPU graphics cards are currently installed in three machines: pulyka, kakas, and liba. The working version of the CUDA Toolkit is 1.1, although 2.0 beta is also installed; the locations are /usr/local/cuda and /usr/local/cuda-2.0, respectively.

Using the CUDA SDK, method one

The recommended way of using the CUDA Toolkit is to copy the entire SDK tree (/usr/local/cuda/sdk) to somewhere in your home directory and to use the projects there as templates for your CUDA program. To build the CUDA binaries successfully, you must have the following environment variables set:

 export CUDA_BASE=/usr/local/cuda/ 
 export LD_LIBRARY_PATH=${CUDA_BASE}/usr/lib:${CUDA_BASE}/lib:${CUDA_BASE}/X11R6/lib:$LD_LIBRARY_PATH
 export PATH=${CUDA_BASE}/bin:${CUDA_BASE}/usr/bin:/usr/local/bin64:$PATH
 alias make='make CUDA_INSTALL_PATH=${CUDA_BASE}/usr'
 export LD_RUN_PATH=${CUDA_BASE}/usr/lib64:${CUDA_BASE}/usr/lib:${CUDA_BASE}/lib:${CUDA_BASE}/usr/X11R6/lib


These variables can be set by sourcing

. /usr/local/cuda/env.sh
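A typical session then looks something like the following sketch (assuming the matrixMul sample project is present in the SDK tree, and that the make alias from env.sh is in effect):

cp -r /usr/local/cuda/sdk ~/cuda-sdk
cd ~/cuda-sdk/projects/matrixMul
make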


The SDK and the Toolkit are set up to use (a slightly hacked) gcc 4.3, and therefore OpenMP can be used in CUDA projects. However, OpenMP syntax cannot be used in .cu files: always move OpenMP code to a .cpp file. To use OpenMP, the following lines must be added to your project's Makefile (after the inclusion of common.mk):

# Add OpenMP flag and lib
NVCCFLAGS += --compiler-options -fopenmp
CXXFLAGS += -fopenmp
CFLAGS += -fopenmp
LIB += -lgomp


Using the CUDA SDK, method two

Alternatively, you can just copy any sample project directory from /usr/local/cuda/sdk/projects, change its Makefile to include /usr/local/cuda/sdk/sdk.mk instead of ../../common/common.mk, and use the pre-compiled libraries from the system-wide SDK. Unfortunately, there is no guarantee that this method will always work. If you prefer the comfort of having your projects outside the SDK tree, while not wanting to rely on some lazy super-user, you are free to copy the entire SDK subdirectory to somewhere in your home directory and tailor your sdk.mk to reflect the correct path to your SDK installation by setting ROOTDIR accordingly.

Your OpenMP projects' makefiles will look something like this:

 ################################################################################
 #
 # Build script for CUDA w/ OpenMP project
 #
 ################################################################################

 # Add source files here
 EXECUTABLE      := deviceQueryOmp
 # Cuda source files (compiled with cudacc)
 CUFILES         := deviceQueryOmp.cu
 # CUDA dependency files
 CU_DEPS         := 
 # C/C++ source files (compiled with gcc / c++)
 CCFILES         := deviceQueryOmp_gold.cpp


 ################################################################################
 # Rules and targets

 include /usr/local/cuda/sdk/sdk.mk

 # Add OpenMP flag and lib
 NVCCFLAGS   += --compiler-options -fopenmp
 CXXFLAGS    += -fopenmp
 CFLAGS      += -fopenmp
 LIB         += -lgomp
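After running make in the project directory, the SDK build system places the binary under the SDK's bin tree; with a copy of the SDK at ~/cuda-sdk (a hypothetical location, matching the sketch above), this would be something like

~/cuda-sdk/bin/linux/release/deviceQueryOmp

The exact path depends on how ROOTDIR is set in your sdk.mk.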


MATLAB

MATLAB R2008a is installed on griff. Only one user can run it at a time. Both the 32 bit and 64 bit versions are available. To start the 32 bit version, type

/contrib/MatlabR2008a/bin/matlab


To start the 64 bit version, type

export LD_LIBRARY_PATH=/usr/local/cuda/usr/lib:$LD_LIBRARY_PATH
/contrib/MatlabR2008a_64/bin/matlab
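For non-interactive runs (for example from a shell script), MATLAB's standard batch options can be used; a minimal sketch:

/contrib/MatlabR2008a/bin/matlab -nodisplay -nosplash -r "disp(2+2); exit"

The -r option executes the given MATLAB commands, and the final exit closes the session, freeing MATLAB for the next user.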

