...
Stampede2 (TACC) | |
---|---|
Model | Dell PowerEdge C6320P/C6420 |
Number of CPUs | 367,024 cores (Xeon Phi 7250 68C 1.4GHz) |
Total Memory | 736 TB |
Scheduler | SLURM |
Max num of submission per user | KNL: 4,200 nodes, each with 68 cores (1 socket) = 272 hyperthreads and 96 GB + 16 GB of memory; 64-68 MPI tasks per node are advisable. SKX: 1,736 nodes, each with 48 cores (2 sockets × 24 cores/socket) = 96 hyperthreads. SKX is slightly more expensive than KNL. |
Dev env. | Default compiler: Intel 18. |
File system | $HOME: 10 GB (200,000 files). $SCRATCH: unlimited, not backed up; files are deleted if not accessed for 10 days. NeSI path equivalents: /nesi/project/nesi00213 == $HOME/project; /nesi/nobackup/nesi00213 == $HOME/nobackup or $SCRATCH/nobackup. |
Gotchas | Building: load the modules with `module add fftw3/3.3.8`, `module add intel/18.0.2`, `module add impi/18.0.2` and `module add cmake/3.10.2`. Set the CMake MPI variables: `MPI_C_LIB_NAMES = mpifort;mpi;mpigi;dl;rt;pthread`, `MPI_dl_LIBRARY = /usr/lib64/libdl.so`, `MPI_pthread_LIBRARY = /usr/lib64/libpthread.so`, `MPI_rt_LIBRARY = /usr/lib64/librt.so`. By default gcc-6.5 creeps in and the build is attempted with gcc-6.5 instead of icc; enforce icc with `CC=icc`. I found `make VERBOSE=1` extremely useful for debugging build issues.<br>Issue: emod3d has a rounding error with icc and returns a wrong "ny", which fails the post-emod3d test. Jonney has a fix (RobG adds 0.5 instead of using the round() function). Rob Graves fixed this by converting float to double in the function get_n1n2() in misc.c. The fix is included in 3.0.6. (On Nurion, however, this fix was found to be insufficient.)<br>Running: the project name must be CamelCase: DesignSafe-Graves. The Slurm script needs -N for the number of nodes, e.g. `#SBATCH -N 4`. Use "ibrun" instead of "srun".<br>Workflow: a number of hardcoded bits that assume a NeSI machine need to be updated. Check the "stampede" branches of workflow and qcore. |
Usage check | (python3_stampede) sungbae@stampede21(1):~$ /usr/local/etc/taccinfo |
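
The build notes in the Gotchas row can be sketched as a small shell function. This is only a sketch under assumptions, not the project's actual build script: the names `run` and `build_with_icc`, the `DRY_RUN` switch, the out-of-source `build` directory, and passing the MPI variables as `-D` cache entries on the `cmake` command line are all illustrative, and the pthread library is assumed to live at `/usr/lib64/libpthread.so`.

```shell
#!/bin/sh
# Hypothetical helper: run a command, or only print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical build wrapper. $1 is the source directory holding CMakeLists.txt.
# The body is a subshell, so the caller's working directory and CC are untouched.
# CC=icc is exported before cmake runs so that gcc-6.5 is not picked up.
build_with_icc() (
    run module add fftw3/3.3.8 &&
    run module add intel/18.0.2 &&
    run module add impi/18.0.2 &&
    run module add cmake/3.10.2 &&
    run mkdir -p "$1/build" &&
    run cd "$1/build" &&
    export CC=icc &&
    run cmake .. \
        "-DMPI_C_LIB_NAMES=mpifort;mpi;mpigi;dl;rt;pthread" \
        -DMPI_dl_LIBRARY=/usr/lib64/libdl.so \
        -DMPI_pthread_LIBRARY=/usr/lib64/libpthread.so \
        -DMPI_rt_LIBRARY=/usr/lib64/librt.so &&
    run make VERBOSE=1
)
```

Calling `DRY_RUN=1 build_with_icc /path/to/source` prints the commands without running them, which is a cheap way to review the sequence before trying it on a login node.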
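
The running notes in the Gotchas row (`-N` for the node count, the CamelCase project name, `ibrun` instead of `srun`) can be combined with the advice of 64-68 MPI tasks per KNL node into a small job-script generator. This is a hedged sketch: the function names, the job name `emod3d`, the queue `normal`, the one-hour time limit and the binary path in the usage note are assumptions for illustration, not values taken from this page.

```shell
#!/bin/sh
# Nodes needed for a given number of MPI tasks (ceiling division).
# $1: total MPI tasks, $2: MPI tasks per node (e.g. 64 on KNL)
nodes_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# Print a Slurm batch script to stdout.
# $1: total MPI tasks, $2: tasks per node, $3: project name,
# remaining arguments: the command to launch with ibrun.
# Note: plain sh has no local variables, so these names are global.
make_job_script() {
    ntasks=$1
    per_node=$2
    project=$3
    shift 3
    nodes=$(nodes_needed "$ntasks" "$per_node")
    cat <<EOF
#!/bin/bash
#SBATCH -J emod3d
#SBATCH -p normal
#SBATCH -N $nodes
#SBATCH -n $ntasks
#SBATCH -t 01:00:00
#SBATCH -A $project

ibrun $*
EOF
}
```

For example, `make_job_script 256 64 DesignSafe-Graves ./emod3d-mpi > run_emod3d.sl` writes a script that can then be submitted with `sbatch run_emod3d.sl`; `./emod3d-mpi` is a placeholder for the real binary.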
...