Roy and Niels

Sunday, January 22, 2012

The Quick and Dirty Guide for Parallelizing FLUKA

(Single PC version)

Imagine you have a desktop or laptop PC with 4 or perhaps even 8 CPU cores available, and you want to run the Monte Carlo particle transport program FLUKA on it using all CPU cores.
The FLUKA execution script rfluka, however, was designed to run in "serial" mode. That is, if you request to repeat your simulation many times (say, 100) by issuing the command rfluka -N0 -M100 example, the runs are launched one after another instead of utilizing all available cores on your PC.

A solution can be to use a job queuing system and a scheduler. Here, I'll present one way to do it on a Debian-based Linux system. Ubuntu might work just as well, since Ubuntu is very similar to Debian. A feature of the method presented here is that it can easily be extended to cover several PCs on your network, so you can use the computing power of your colleagues when they do not use their PCs (e.g. at night). However, this post will try to keep it very simple, namely setting it up just on your own PC. In less than 10 minutes you'll have it up and running...

The idea is to use TORQUE in a very minimal configuration. There will be no fuss with Maui or similar schedulers; we will only use packages we can get from the Debian/Ubuntu software repositories.
In order to be friendly to all the Ubuntu users out there, all commands issued as root are here prefixed with the "sudo" command. As a Debian user you can become root using the "su" command first.

First install these packages:

$ sudo apt-get install torque-server torque-scheduler 
$ sudo apt-get install torque-common torque-mom libtorque2
and either
$ sudo apt-get install torque-client
or
$ sudo apt-get install torque-client-x11

After installation we need to set up TORQUE properly. Here I assume that your PC's hostname cannot be resolved by DNS, which is quite common on small local networks. You can test whether your hostname can be resolved with the "host" command. Assuming your PC has the name "kepler", you may get an answer like:

$ host $HOSTNAME
Host kepler not found: 3(NXDOMAIN)

This means you may need to edit the /etc/hosts file, so your PC can associate an IP address with your hostname. Debian-like distros tend to assign the hostname to 127.0.1.1, which will not work with TORQUE. Instead, I looked up my IP address (which in my case is pretty static) using /sbin/ifconfig, and edited /etc/hosts accordingly with my favourite text editor (emacs, gedit, vi...).
My /etc/hosts file ended up looking like this:

127.0.0.1 localhost
#127.0.1.1 kepler.lan kepler
192.168.1.108   kepler

If the hostname of your PC can be resolved, you can omit the last line, but under all circumstances you must comment out the line starting with 127.0.1.1.
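If you want to double-check the file before going on, something like the following could do it. This is only a minimal sketch: check_hosts is a helper name I made up (it is not part of TORQUE), and it simply looks for an active, uncommented 127.0.1.1 line that names your host.

```shell
#!/bin/bash
# Sketch: report whether a hosts file still maps a hostname to 127.0.1.1.
# check_hosts is a hypothetical helper, not a TORQUE or Debian tool.
check_hosts() {
    local hosts_file="$1" name="$2"
    # Skip commented lines, then look for an active 127.0.1.1 entry naming the host.
    if awk -v n="$name" '
        /^[[:space:]]*#/ { next }
        $1 == "127.0.1.1" { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit found ? 0 : 1 }' "$hosts_file"; then
        echo "loopback"   # the entry that needs to be commented out is still active
    else
        echo "ok"
    fi
}

# Usage (not executed here):
#   check_hosts /etc/hosts "$HOSTNAME"
```

It prints "loopback" if the 127.0.1.1 line is still active for that name, and "ok" otherwise.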


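Once this is done, execute the following commands to configure TORQUE. (The files are written through "sudo tee", since with "sudo echo ... > file" the redirection would be done by your own unprivileged shell and fail with "permission denied".)
$ echo $HOSTNAME | sudo tee /etc/torque/server_name
$ echo $HOSTNAME | sudo tee /var/spool/torque/server_name
$ sudo pbs_server -t create
$ echo $HOSTNAME np=`grep -c ^processor /proc/cpuinfo` | sudo tee /var/spool/torque/server_priv/nodes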
$ sudo qterm
$ sudo pbs_server
$ sudo pbs_mom

(Update: If qterm fails, you probably have a problem with your /etc/hosts file. You can still kill the server with $ sudo killall -r "pbs_*".)
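The nodes file written above holds one line per host, of the form "name np=N", where N is the number of job slots (here: CPU cores). As a rough sketch of how that line can be built and inspected before writing it anywhere, assuming a Linux-style cpuinfo file with one "processor" line per core (count_cores and nodes_line are names I made up):

```shell
#!/bin/bash
# Sketch: build the "<host> np=<cores>" line for the TORQUE nodes file.
# count_cores and nodes_line are hypothetical helper names.
count_cores() {
    # Count the lines starting with "processor" in a cpuinfo-style file.
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

nodes_line() {
    local host="$1" cpuinfo="${2:-/proc/cpuinfo}"
    printf '%s np=%s\n' "$host" "$(count_cores "$cpuinfo")"
}

# Usage (not executed here):
#   nodes_line "$HOSTNAME"
```

Anchoring the pattern at the start of the line avoids counting other lines that merely contain "proc". On systems with a recent coreutils, the nproc command should give the same number.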

Now let's see if things are running as expected:
$ pbsnodes -a
kepler
     state = free
     np = 4
     ntype = cluster
     status = rectime=1326926041,varattr=,jobs=,state=free,netload=3304768553,gres=,loadave=0.09,ncpus=4,physmem=3988892kb,availmem=6643852kb,totmem=7876584kb,idletime=2518,nusers=2,nsessions=8,sessions=1183 1760 2170 2271 2513 15794 16067 16607,uname=Linux kepler 3.1.0-1-amd64 #1 SMP Tue Jan 10 05:01:58 UTC 2012 x86_64,opsys=linux

and also
$ sudo momctl -d 0 -h $HOSTNAME

Host: kepler/kepler   Version: 2.4.16   PID: 16835
Server[0]: kepler (192.168.1.108:15001)
  Last Msg From Server:   279 seconds (CLUSTER_ADDRS)
  Last Msg To Server:     9 seconds
HomeDirectory:          /var/spool/torque/mom_priv
MOM active:             280 seconds
LogLevel:               0 (use SIGUSR1/SIGUSR2 to adjust)
NOTE:  no local jobs detected

Now set up a queue, which here is called "batch".
$ sudo qmgr -c 'create queue batch'
$ sudo qmgr -c 'set queue batch queue_type = Execution'
$ sudo qmgr -c 'set queue batch resources_default.nodes = 1'
$ sudo qmgr -c 'set queue batch resources_default.walltime = 01:00:00'
$ sudo qmgr -c 'set queue batch enabled = True'
$ sudo qmgr -c 'set queue batch started = True'
$ sudo qmgr -c 'set server default_queue = batch'
$ sudo qmgr -c 'set server scheduling = True'

[Update: you may want to increase the walltime to 10:00:00 so jobs don't stop after 1 hour.]
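Changing the walltime uses the same qmgr syntax as the queue setup above. Here is a small sketch that only prints the command for a given queue and number of hours, so you can check it before running it with sudo; walltime_cmd is a made-up helper name, and the hours must be a plain integer without leading zeros.

```shell
#!/bin/bash
# Sketch: print the qmgr command that sets a queue's default walltime.
# walltime_cmd is a hypothetical helper; it prints the command and runs nothing.
walltime_cmd() {
    local queue="$1" hours="$2"
    printf "qmgr -c 'set queue %s resources_default.walltime = %02d:00:00'\n" "$queue" "$hours"
}

# Print the command for a 10 hour limit on the "batch" queue.
walltime_cmd batch 10
```

Alternatively, a single job can ask for more time at submission with qsub -l walltime=10:00:00, which leaves the queue default untouched.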

Then start the scheduler:
$ sudo pbs_sched

The rest of the commands can be issued as a normal user (i.e. non-root).

Let's see if all servers are running:
$ ps -e | grep pbs
 1286 ?        00:00:00 pbs_mom
 1293 ?        00:00:00 pbs_server
 2174 ?        00:00:00 pbs_sched

Anything in the queue?
$ qstat
$ 
Nope, it's empty.

Let's try to submit a simple job:
$ echo "sleep 20" | qsub

and within the next 20 seconds you can check whether it's in the queue:
$ qstat
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
0.kepler                 STDIN            bassler                0 R batch


Great, now we're ready to rock 'n' roll! This is really a minimalistic setup, which just works. For more bells and whistles, check the TORQUE manual.

All we need, is a simple FLUKA job submission script: rtfluka.sh
#!/bin/bash
#
# how to use this
# change to directory with the files you want to run
# and enter:
# $ qsub -V -t 0-9 -d . rtfluka.sh
#
#PBS -N FLUKA_JOB
#
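# rfluka runs cycles from -N (last completed) to -M (last to run),
# so each array job runs exactly one cycle.
start="$PBS_ARRAYID"
stop=$((start + 1))
stop_pad=$(printf "%03i" "$stop")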
#
# Init new random number sequence for each calculation. 
# This may be a poor solution.
cp $FLUPRO/random.dat ranexample$stop_pad
sed -i '/RANDOMIZE        1.0/c\RANDOMIZE        1.0 '"${RANDOM}"'.0 \' example.inp
$FLUPRO/flutil/rfluka -N$start -M$stop example -e flukadpm3

Update: Note that your RANDOMIZE card in your own .inp file must match the sed regular expression above, else you may repeat the exact same simulation over and over again...
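A quick way to guard against this is to test for the card before submitting. The sketch below assumes the same pattern as the sed command in rtfluka.sh; has_randomize_card is a name I made up.

```shell
#!/bin/bash
# Sketch: check that an input file contains the RANDOMIZE card that the
# sed command in rtfluka.sh expects. has_randomize_card is a hypothetical helper.
has_randomize_card() {
    # Same pattern as in the sed address: RANDOMIZE, 8 spaces, 1.0
    grep -q 'RANDOMIZE        1.0' "$1"
}

# Usage (not executed here):
#   has_randomize_card example.inp || echo "RANDOMIZE card not matched, the seed will not change" >&2
```

If the function fails, either add a RANDOMIZE card to your input file or adapt the pattern in both places.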


Let's submit 10 jobs:
$ qsub -V -t 0-9 -d . rtfluka.sh

And watch the blinkenlichts.
$ qstat
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
15-0.kepler               FLUKA_JOB-0      bassler                0 R batch          
15-1.kepler               FLUKA_JOB-1      bassler                0 R batch          
15-2.kepler               FLUKA_JOB-2      bassler                0 R batch          
15-3.kepler               FLUKA_JOB-3      bassler                0 R batch          
15-4.kepler               FLUKA_JOB-4      bassler                0 Q batch          
15-5.kepler               FLUKA_JOB-5      bassler                0 Q batch          
15-6.kepler               FLUKA_JOB-6      bassler                0 Q batch          
15-7.kepler               FLUKA_JOB-7      bassler                0 Q batch          
15-8.kepler               FLUKA_JOB-8      bassler                0 Q batch          
15-9.kepler               FLUKA_JOB-9      bassler                0 Q batch 
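
If you would rather have a one-line summary than the full table, the state column can be counted with awk. This is a sketch that assumes the qstat layout shown above (two header lines, state in the second-to-last column); count_states is a made-up name.

```shell
#!/bin/bash
# Sketch: summarize qstat output read from stdin.
# count_states is a hypothetical helper; it assumes two header lines and
# the job state (R, Q, ...) in the second-to-last column.
count_states() {
    awk 'NR > 2 && NF >= 6 { n[$(NF-1)]++ }
         END { printf "running=%d queued=%d\n", n["R"], n["Q"] }'
}

# Usage (not executed here):
#   qstat | count_states
```

Wrapped in watch (e.g. watch -n 5 with a small script), this gives a simple progress display.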

Surely, this can be improved a lot; suggestions are most welcome in the comments below. One problem, for instance, is that the random number seed comes from bash's $RANDOM, which only yields integers from 0 to 32767 (15 bits) and thus covers only a very small fraction of the possible seeds for the RANDOMIZE card.
Update: There is also a very small risk that the same seed occasionally is used twice (or more often). Alternatively one could just add a random number to a starting seed after each run. (Any MC random number experts out there?)
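
One way to rule out repeated seeds is to derive the seed from the array index instead of $RANDOM, e.g. a fixed base seed plus $PBS_ARRAYID. The following is only a sketch of that idea, not something I have validated against FLUKA: the base seed and both helper names are assumptions, the card layout assumes FLUKA's fixed format with 10-column fields, and the allowed range for the seed should be checked in the FLUKA manual.

```shell
#!/bin/bash
# Sketch: derive a distinct seed per array job and format a RANDOMIZE card.
# seed_for_job and randomize_card are hypothetical helper names.
seed_for_job() {
    local base="$1" id="$2"
    # Distinct array IDs give distinct seeds for the same base.
    echo $(( base + id ))
}

randomize_card() {
    # Assumed fixed format: keyword in columns 1-10,
    # WHAT(1) in columns 11-20, WHAT(2) (the seed) in columns 21-30.
    printf '%-10s%10s%10s\n' RANDOMIZE 1.0 "$1."
}

# Usage inside a job script (not executed here):
#   seed=$(seed_for_job 1000000 "$PBS_ARRAYID")
#   randomize_card "$seed"
```

Distinct seeds avoid exact repetitions; whether the resulting sequences are statistically independent depends on FLUKA's generator, so the question to the MC random number experts still stands.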

Output data can be processed in the regular ways, using flair.
Alternatively you may use some of the scripts in the auflukatools package, which for instance can do the merging of USRBIN output with a single command. Auflukatools also includes rtfluka.sh as well as a CONDOR job submission script rcfluka.py, which is better suited for heterogeneous clusters.

Finally, here is a job script for SHIELD_HITxxA (which is even shorter):

#!/bin/bash
#
# how to use
# change to directory you want to run
# $ qsub -V -t 0-9 -d . rtshield.sh
#
#PBS -N SHIELD_JOB
shield_exe  -N$PBS_ARRAYID

Enjoy!

Totally unrelated: englishrussia.com just posted some nice pics from the Budker institute for Nuclear Physics in Novosibirsk, Russia. Certainly worth visiting, have a look at:
http://englishrussia.com/2012/01/21/the-budker-institute-of-nuclear-physics/
 :-) Heaps of pioneering accelerator technology was developed there, such as electron cooling, the first collider, lithium lenses (e.g. for capturing antiprotons), and they supplied the conventional magnets for the beam transfer lines to the LHC at CERN. I visited the center many years ago but my pics are not as good. :-/ The German wiki about Budker himself, is also worth reading.


Wednesday, January 11, 2012

Visit at the Primary Standards Laboratory in Slovakia

This post is not related to computing, but more to medical physics. Primary standards dosimetry laboratories (PSDL) are important for medical physicists, since they define fundamental quantities such as dose. If you buy a dosimeter, say, an ionization chamber, it is most likely calibrated at a PSDL (for ample amounts of money) or at a secondary standards dosimetry laboratory (SSDL) which is traceable to a PSDL. Not all countries have a PSDL or SSDL, and some countries (like Slovakia in this case) have both PSDL and SSDL facilities. To my knowledge, Denmark quite recently got SSDL status at the National Institute for Radioprotection.

During summer vacation 2011, after finishing the run at CERN, I had a rather messy tour across Europe and also went to Bratislava to visit some friends. Here I got the chance to get a tour of the Slovakian PSDL at the Metrological (not to be mistaken for the Meteorological) Institute. It is my second visit at a PSDL - a few years ago, I visited the PSDL at the National Physical Laboratory in the UK, but I only got crappy mobile phone pictures.

The Metrological Institute of Slovakia, Bratislava.

The director of the institute, Jozef Dobrovodsky, gave me a tour of the facility. They have a close cooperation with the NPL - most physicists working with particle therapy may have heard of proton dosimetry expert Hugo Palmans, who works at NPL near London but (quite conveniently) actually lives in Bratislava.

Jozef and Hugo looking at Roos plane parallel ionization chambers from PTW, well suited for measuring depth-dose curves of pencil shaped ion beams.
Outline of the facility.
The facility has a Betatron, a Cobalt-60 unit, a 320 kV X-ray unit, a Caesium-137 irradiator and a neutron vault with various neutron sources.

Mock-up models of Cs-137 sources (these are NOT radioactive).
A part of the control room, with the very well known UNIDOS electrometer by PTW, which I worked a lot with e.g. while at CERN.
Probably the most important room is where the Co-60 irradiator is kept. Co-60 has a long history of serving as reference radiation for a wide range of dosimetric tasks. Beam quality is usually expressed relative to the Co-60 standard. However, Co-60 irradiators are getting rare. Radiation treatment with Co-60 is rather something seen in developing countries; at most hospitals they were replaced with megavolt linear accelerators, also for safety reasons. (Messing with radioactive sources is always a bad thing and should be avoided.) As a researcher, it gets increasingly difficult to get access to a proper reference radiation.

The yellow box holds the Co-60 source, behind the tank an additional collimator is visible which can be mounted in front of the Co-60 unit.

Co-60 irradiation room. The tank holds a Markus ionization chamber, and the dose-rate can be reduced by increasing the distance to the Co-60 irradiation unit.
Next we took a look at the X-ray irradiation room. X-rays have lower energies than Co-60 and are made electrically and not from a radioactive source.
Two X-ray sources are seen here, one in the background with wheels of various copper filters which can be positioned in the beam. Copper filters can remove characteristic lines of the X-ray spectrum, thereby flattening it. In front an X-ray diagnostic device is visible.
How to make 90 Volts. :)
Cs-137 provides a photon field at about half the energy of the MeV Co-60 photons.
Cs-137 irradiation room.
Cs-137 irradiator seen from the front. Aperture can slide to the right, exposing the room to the source.
A real gem was their betatron: it's a Czech construction, which can deliver both electron and photon beams. The betatron is an old design originally invented by the Norwegian Widerøe, who also invented the idea of drift tubes, widely used at almost any accelerator today. Betatrons (especially functioning betatrons) are a very rare sight today; most were replaced with LINACs a long time ago. I once saw a betatron at the physics department of Freiburg in Germany, but it was not operational anymore. This one however is still functioning! (Look how clean and tidy it is... I am used to messy laboratories.)

Second time I ever see a betatron. Yay! :-)
You can extract either photons or electrons on either side. This is the photon exit (I think).
They even had a spare betatron tube, heavily tarnished by radiation damage to the glass.
Control console for the betatron. Nice and sleek.
Power supply and controls for the betatron. Many components are still genuine Czech, manufactured by TESLA.
Finally we visited the neutron vault. Here they had three kinds of neutron irradiators: two different accelerator based sources and a range of radioactive neutron emitters. The neutron sources were kept in a cave in the floor, shielded with lots of plastic material for neutron moderation/absorption. The sources they have are quite common: some intense alpha source (Plutonium or Americium) mixed with some light material (Beryllium). There was also a Californium source, which fissions spontaneously.

Neutron vault. In the floor several neutron sources are kept, and can be raised out of the cave by the visible holder. I was a bit hesitant, when the scientist suddenly pulled a string, and the holder surfaced out of the neutron cave, as shown on this picture. “Is it empty?!” “Sure it's empty.”
This accelerator was very cute: protons accelerated towards a tritium target produce a neutron beam. The design of the high voltage terminal looks very much like the design found in the terminal of our Van de Graaff KN machine in Aarhus.
The ion source can be seen in the middle.
Beam is directed against a tritium metal hydride target, which is rotated to redistribute the dissipated power over a larger area. This produces a neutron beam, exiting to the lower left.

I always found accelerator technology very interesting, especially old designs where you can easily recognize what is going on (or not). If it is an Eastern European design, it's even more interesting, since they tend to look rather different and often show signs of various improvisations.

This is some Russian accelerator based neutron source. However, it was not really used if I remember correctly, and unfortunately I didn't pick up all the details about it.
I was once told that it's very characteristic for Russian accelerator systems that the vacuum tubes are fixed with 4 screws only.

This concludes our little tour at the irradiation facilities of the primary standard lab in Bratislava. Thanks to Hugo Palmans and Jozef Dobrovodsky for the tour!

An antiproton and a proton dosimetry researcher meet. No annihilation, but some sort of a bound state, clearly sharing common goals and interests.