Like the rest of the world, GridPP is seeing a lot of discussion about the use of clouds and virtualization.
Using virtualization will have a performance impact, so it may not be the best fit for our type of computing (HPC/HTC). But just how large is the impact? A quick search of the web suggests anywhere from 3% to 30%. Most of the overhead appears to be in the kernel and in I/O.
I decided to run some tests of my own, focused on the type of work we do in GridPP.
Testbed: a 24-thread Westmere system running at 2.66 GHz with 48 GB of memory, running Scientific Linux 6.3 (essentially RHEL 6). I'm using the default KVM install, with the virtual image stored as a local file and the guest configured to use all 24 threads.
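Before trusting any guest numbers it is worth a quick sanity check that the host CPU exposes hardware virtualization and that the KVM modules are actually loaded; otherwise QEMU falls back to far slower software emulation and the comparison is meaningless. A minimal sketch, assuming a Linux host:

```shell
# Count logical CPUs advertising hardware virtualization support:
# the flag is "vmx" on Intel (as on this Westmere box) or "svm" on AMD.
caps=$(grep -c -E 'vmx|svm' /proc/cpuinfo || true)
echo "virtualization-capable threads: $caps"

# KVM only accelerates guests if its kernel modules are loaded.
lsmod | grep -E '^kvm' || echo "kvm modules not loaded"
```

On the testbed above this should report 24 capable threads and show the kvm and kvm_intel modules.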
Benchmarks:
1) Unpack and build the ROOT analysis package using 24 threads.
2) As 1, but using only one thread.
3) Generate 500,000 Monte Carlo events using the HERWIG++ generator.
4) As 3, but also including the time taken to unpack and install HERWIG++.
5) Run the HEP-SPEC06 benchmark.
For tests 1 to 4 I use the time command to obtain the real (wall-clock) time taken (smaller is better); for test 5 I report the HEP-SPEC06 score (larger is better). I run the benchmarks on the bare-metal install and in the VM on the same hardware and compare the results.
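For tests 1 to 4 the number recorded is the "real" line that time reports. A minimal sketch of the capture, assuming bash, with a one-second sleep standing in for the actual workload (the real runs time things like make -j24 on the ROOT source):

```shell
# bash's `time` keyword honours TIMEFORMAT; '%R' prints only the elapsed
# (real) wall-clock seconds, which is the figure reported for tests 1-4.
TIMEFORMAT='%R'
elapsed=$( { time sleep 1; } 2>&1 )   # stand-in for e.g. `make -j24`
echo "real: ${elapsed}s"
```

Real time, rather than user or sys time, is the fairest guest-versus-host comparison here: it captures everything the job had to wait for, including kernel work done on its behalf.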
Out-of-the-box KVM performance shows a reduction of ~3% (CPU-intensive tests) to 20% (system-call-intensive tests). There is some indication of a correlation with the ratio of sys time to user time (a particularly strong effect with make/tar/gzip?); this is not seen in the HEP-SPEC06 result. Sys time is the CPU time spent inside the kernel, and from previous studies we expect it to incur a high performance penalty under virtualization.
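The percentage figures come from comparing the bare-metal and in-guest timings of the same test. A minimal sketch of the arithmetic, with illustrative numbers rather than the measured ones:

```shell
# Hypothetical real-time results (seconds) for one benchmark.
bare=3600   # run on bare metal
vm=3720     # same run inside the KVM guest
# Overhead = (vm - bare) / bare, expressed as a percentage.
awk -v b="$bare" -v v="$vm" \
    'BEGIN { printf "overhead: %.1f%%\n", 100 * (v - b) / b }'
# prints: overhead: 3.3%
```

The same arithmetic applied to the user and sys lines of the time output gives the sys/user ratio discussed above.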
If I get the time, I intend to: repeat the analysis with optimisations applied (e.g. the guest image on LVM rather than in a file); repeat it under Fedora 18 (roughly RHEL 7); repeat it on a Sandy Bridge CPU; and look at network performance (e.g. iozone against Lustre).