Performance test of ExaML

Introduction

We are looking into buying hardware to use for the development of the SUPER-SMART pipeline. To do this, we need a better understanding of how the performance of the pipeline scales on different hardware. The pipeline has two steps where the analyses can be run in parallel, namely the alignment of a large set of matrices and the phylogenetic inference step that runs ExaML. The pipeline is currently running on "node0" (48 AMD Opteron 6168 1.9 GHz cores, 96 GB RAM) of the Albiorix cluster, at the Department for Biological and Environmental Sciences, University of Gothenburg. Initial tests, allocating different numbers of cores to the analyses, will be run on this machine at times when few or no other analyses are running, in order to prevent competition for resources between processes. Although this test is performed on a computer cluster, all analyses run on a single machine. Hence, no networking between nodes in the cluster is needed, and none has interfered with the test.
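The test plan described above (the same ExaML analysis repeated with different numbers of cores) could be scripted roughly as follows. This is only a sketch: the flags are copied from the mpirun command shown under Results, while the helper names, the run-name pattern and the choice of three repeats per core count are assumptions made for illustration. The script only prints the commands; it does not start any analysis.

```python
import shlex

# Flags copied from the mpirun command shown in this report.
EXAML_ARGS = ["-S", "-D", "-m", "GAMMA"]


def examl_command(cores, binary, tree, run_name):
    """Return the argv list for one ExaML run on `cores` MPI processes."""
    return ["mpirun", "-np", str(cores), "examl", *EXAML_ARGS,
            "-s", binary, "-t", tree, "-n", run_name]


def benchmark_plan(core_counts, repeats, binary, tree):
    """Yield (cores, repeat, command string) for every planned run."""
    for cores in core_counts:
        for rep in range(1, repeats + 1):
            # Hypothetical naming scheme so that repeats do not share a run name.
            run_name = f"test_{cores}cores_rep{rep}.out"
            cmd = examl_command(cores, binary, tree, run_name)
            yield cores, rep, shlex.join(cmd)


if __name__ == "__main__":
    plan = benchmark_plan([4, 8, 12, 16, 24, 32], 3,
                          "rbcL-matK-ndhF-exemplars.binary",
                          "rbcL-matK-ndhF-exemplars.dnd")
    for cores, rep, cmd in plan:
        # Prefix with `time` to collect real/user/sys figures for each run.
        print(f"time {cmd}")
```

Printing the commands instead of executing them makes it easy to inspect the plan first and to run it only when the machine is otherwise idle.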

Results

4 cores

[mtop@compute-0-0 examl]$ grep ">" rbcL-matK-ndhF-exemplars.fa | wc -l
510
[mtop@compute-0-0 examl]$ mpirun -np 4 examl -S -D -m GAMMA -s rbcL-matK-ndhF-exemplars.binary -t rbcL-matK-ndhF-exemplars.dnd -n test.out
...
Partition: 0
Alignment Patterns: 3225
Name: No Name Provided
DataType: DNA
Substitution Matrix: GTR

ExaML was called as follows:
examl -S -D -m GAMMA -s rbcL-matK-ndhF-exemplars.binary -t rbcL-matK-ndhF-exemplars.dnd -n test.out

Memory Saving Option: ENABLED
...
Likelihood of best tree: -69176.641937
Overall Time for 1 Inference 1669.720315
real	27m49.767s
user	111m13.006s
sys	0m3.100s
Likelihood of best tree: -69176.641937
Overall Time for 1 Inference 1647.488822
...
real	27m28.535s
user	109m44.331s
sys	0m2.913s
Likelihood of best tree: -69176.641937
Overall Time for 1 Inference 1640.897833
...
real	27m20.941s
user	109m17.957s
sys	0m2.928s

Mean execution time using 4 cores was 1653 seconds (27 min. 33 sec.).
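The mean above is taken over the "Overall Time for 1 Inference" values of the three runs. A minimal sketch of how such a mean could be extracted from the ExaML output is given below; the function names are made up for this example, and the log text is limited to the timing lines copied from the 4-core runs.

```python
import re
from statistics import mean

# Matches the timing line that ExaML prints at the end of a run.
TIME_RE = re.compile(r"Overall Time for 1 Inference\s+([0-9.]+)")


def inference_times(log_text):
    """Return all inference times (in seconds) found in the log text."""
    return [float(m.group(1)) for m in TIME_RE.finditer(log_text)]


def format_duration(seconds):
    """Round to whole seconds and also express the value as minutes and seconds."""
    total = round(seconds)
    minutes, secs = divmod(total, 60)
    return f"{total} seconds ({minutes} min. {secs} sec.)"


# Timing lines copied from the three 4-core runs in this report.
log = """
Overall Time for 1 Inference 1669.720315
Overall Time for 1 Inference 1647.488822
Overall Time for 1 Inference 1640.897833
"""

times = inference_times(log)
print(f"Mean of {len(times)} runs: {format_duration(mean(times))}")
```

The same two functions can be applied unchanged to the output of the runs with other core counts.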

8 cores

Likelihood of best tree: -69171.998868
Overall Time for 1 Inference 921.272277
...
real    15m23.357s
user    122m9.701s
sys     0m19.891s
Likelihood of best tree: -69171.998868
Overall Time for 1 Inference 880.679393
...
real    14m41.744s
user    117m18.505s
sys     0m3.968s
Likelihood of best tree: -69171.998868
Overall Time for 1 Inference 873.304107
...
real    14m34.358s
user    116m20.487s
sys     0m3.127s

Mean execution time using 8 cores was 892 seconds (14 min. 52 sec.).

12 cores

Likelihood of best tree: -69178.832109
Overall Time for 1 Inference 634.310779
...
real	10m35.372s
user	126m45.723s
sys	0m3.238s
Likelihood of best tree: -69178.832109
Overall Time for 1 Inference 640.420086
...
real	10m41.484s
user	127m58.757s
sys	0m3.531s
Likelihood of best tree: -69178.832109
Overall Time for 1 Inference 642.514302
...
real	10m43.578s
user	128m23.830s
sys	0m3.533s

Mean execution time when using 12 cores was 639 seconds (10 min. 39 sec.).

16 cores

Likelihood of best tree: -69171.998868
Overall Time for 1 Inference 515.302811
...
real    8m36.400s
user    137m18.702s
sys     0m3.375s
Likelihood of best tree: -69171.998868
Overall Time for 1 Inference 521.257728
...
real    8m42.340s
user    138m53.816s
sys     0m3.577s
Likelihood of best tree: -69171.998868
Overall Time for 1 Inference 516.501244
...
real    8m37.590s
user    137m38.115s
sys     0m3.105s

Mean execution time using 16 cores was 518 seconds (8 min. 38 sec.).

24 cores

Likelihood of best tree: -69171.998868
Overall Time for 1 Inference 415.391650
...
real	6m56.499s
user	166m2.159s
sys	0m4.641s
Likelihood of best tree: -69171.998868
Overall Time for 1 Inference 415.971764
...
real	6m57.074s
user	166m13.707s
sys	0m6.866s
Likelihood of best tree: -69171.998868
Overall Time for 1 Inference 416.255413
...
real	6m57.362s
user	166m22.643s
sys	0m4.762s

Mean execution time using 24 cores was 416 seconds (6 min. 56 sec.).

32 cores

Likelihood of best tree: -69171.998869
Overall Time for 1 Inference 359.711132
...
real    6m0.856s
user    191m42.713s
sys     0m5.692s
Likelihood of best tree: -69171.998869
Overall Time for 1 Inference 361.735561
...
real    6m2.898s
user    192m47.172s
sys     0m5.946s
Likelihood of best tree: -69171.998869
Overall Time for 1 Inference 361.249953
...
real    6m2.419s
user    192m31.000s
sys     0m6.196s

Mean execution time using 32 cores was 361 seconds (6 min. 1 sec.).
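To compare the runs, the timings can be converted into speedup and parallel efficiency. The sketch below uses the "Overall Time for 1 Inference" values copied from the logs above. Since this report contains no single-core run, the 4-core mean is used as the baseline; that choice, the definition of efficiency relative to that baseline, and the function name are assumptions made for this example.

```python
from statistics import mean

# "Overall Time for 1 Inference" values (seconds) copied from the logs above,
# keyed by the number of cores passed to mpirun -np.
TIMES = {
    4: [1669.720315, 1647.488822, 1640.897833],
    8: [921.272277, 880.679393, 873.304107],
    12: [634.310779, 640.420086, 642.514302],
    16: [515.302811, 521.257728, 516.501244],
    24: [415.391650, 415.971764, 416.255413],
    32: [359.711132, 361.735561, 361.249953],
}


def scaling_table(times, baseline=4):
    """Return (cores, mean time, speedup, efficiency) rows.

    Speedup is the baseline mean time divided by the mean time of the run.
    Efficiency is the speedup divided by the ideal speedup (cores / baseline).
    """
    base = mean(times[baseline])
    rows = []
    for cores in sorted(times):
        t = mean(times[cores])
        speedup = base / t
        efficiency = speedup / (cores / baseline)
        rows.append((cores, t, speedup, efficiency))
    return rows


print(f"{'cores':>5} {'mean (s)':>10} {'speedup':>8} {'efficiency':>10}")
for cores, t, speedup, efficiency in scaling_table(TIMES):
    print(f"{cores:>5} {t:>10.1f} {speedup:>8.2f} {efficiency:>10.2f}")
```

An efficiency close to 1 means that the extra cores are used almost fully, while lower values indicate diminishing returns from adding cores for this data set.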