...
Job | Kupe | Fitz | speed-up |
Emod3d | 14 | 34 | 2.43 |
Post-emod3d | 0.016666667 | 0.4 | 24 |
HF | 0.5 | 4.2 | 8.4 |
BB | 0.1 | 41 | 410 |
Total | 14.6 | 79.6 | 5.45 |
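The speed-up column above appears to be the Fitz time divided by the Kupe time. The following is a minimal sketch that recomputes it from the table values. The dictionary layout and the `speed_up` helper are illustrative names, not part of the workflow, and the time unit is assumed to be the same in both columns.

```python
# Timings copied from the table above: job -> (Kupe, Fitz).
# Assumption: both columns use the same time unit.
timings = {
    "Emod3d": (14, 34),
    "Post-emod3d": (0.016666667, 0.4),
    "HF": (0.5, 4.2),
    "BB": (0.1, 41),
}


def speed_up(kupe, fitz):
    """Ratio of Fitz time to Kupe time; values above 1 mean Kupe is faster."""
    return fitz / kupe


# Per-job speed-up, then the speed-up of the summed run times.
per_job = {job: speed_up(k, f) for job, (k, f) in timings.items()}
total_kupe = sum(k for k, _ in timings.values())
total_fitz = sum(f for _, f in timings.values())
total_speed_up = speed_up(total_kupe, total_fitz)

for job, s in per_job.items():
    print(f"{job}: {s:.2f}")
print(f"Total: {total_kupe:.1f} | {total_fitz:.1f} | {total_speed_up:.2f}")
```

The total speed-up is computed from the summed times rather than by averaging the per-job ratios, so the long-running Emod3d step dominates it.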
Further testing
To test scalability, we performed a run on a much larger model (Cant February 2011, by Hoby). This model has nx=1400, ny=1200, nz=460.
The following results come from the LF calculation using EMOD3D for this model. The table below shows how the mean time per 100 time steps varies with the number of cores.
Requested cores | Physical cores | Nodes | Mean time for 100 time steps |
80 | 80 | 2 | 90.3 |
128 | 80 | 2 | 129.6 |
160 | 80 | 2 | 97.47 |
160 | 160 | 4 | 46.2 |
256 | 160 | 4 | 65.5 |
320 | 160 | 4 | 48.6 |
Note that Kupe uses hyper-threading by default. If we request N nodes, the best performance is obtained with 40*N cores; requesting more than that seems to penalize the execution time.
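The 40*N rule can be checked against the measurements in the table above. This is only a sketch: the constant `PHYSICAL_CORES_PER_NODE` and both helper functions are hypothetical names, and the value of 40 physical cores per node is inferred from the table (2 nodes give 80 physical cores, 4 nodes give 160).

```python
# Assumption: 40 physical cores per Kupe node, inferred from the table.
PHYSICAL_CORES_PER_NODE = 40

# (requested cores, nodes, mean time for 100 time steps), copied from the table.
measurements = [
    (80, 2, 90.3),
    (128, 2, 129.6),
    (160, 2, 97.47),
    (160, 4, 46.2),
    (256, 4, 65.5),
    (320, 4, 48.6),
]


def recommended_cores(nodes):
    """Core count suggested by the 40*N rule of thumb."""
    return PHYSICAL_CORES_PER_NODE * nodes


def fastest_request(nodes):
    """Requested core count with the lowest measured time on `nodes` nodes."""
    rows = [(time, cores) for cores, n, time in measurements if n == nodes]
    return min(rows)[1]


for n in (2, 4):
    print(n, recommended_cores(n), fastest_request(n))
```

For both node counts measured here, the fastest request matches 40*N. With only three core counts per node count, this supports the rule of thumb but does not prove it.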
Based on the table above, we estimated the full run time for the Cant 2011 earthquake (with sim_duration=100.0), obtaining:
Machine | Cores | Time (hh:mm:ss) | Time (s) | Core-seconds | Core-hours for the LF part of the simulation |
Kupe | 160 | 02:30:00 | 9200 | 1472000 | 408.889 |
Fitz | 512 | 01:50:00 | 6600 | 3379200 | 938.667 |
Speedup | | | | | 2.296 |
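The core-hour figures follow from cores × run time in seconds ÷ 3600, and the speedup is the ratio of Fitz to Kupe core-hours. The sketch below reproduces that arithmetic from the core counts and run times in seconds given in the table. The function name `core_hours` is an illustrative choice.

```python
def core_hours(cores, seconds):
    """Core-hours consumed by a job using `cores` cores for `seconds` seconds."""
    return cores * seconds / 3600.0


# Core counts and run times (in seconds) taken from the table above.
kupe = core_hours(160, 9200)
fitz = core_hours(512, 6600)

# Speedup expressed in core-hours: how many times more core-hours Fitz needs.
speedup = fitz / kupe

print(f"Kupe: {kupe:.3f} core-hours")
print(f"Fitz: {fitz:.3f} core-hours")
print(f"Speedup: {speedup:.3f}")
```

This speedup compares resource cost (core-hours), not wall-clock time. In wall-clock terms Fitz finishes sooner here, because it runs on more cores.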