Running HPC Jobs with LoadLeveler
One question we are often asked is: why won't my job run? The tools needed to answer it are available to all users. First, find out which jobs are currently queued, using llq, or llq -u abc123 to list only the jobs belonging to one user (here, usercode abc123), or llq -b for Blue Gene jobs. You'll get a listing looking something like this:
Suppose your job is l3n01-c.17802.0 and you want to see why it hasn't started yet. Type llq -s l3n01-c.17802.0 and you'll get a long listing of information about the job. Right at the end is the information you want, which will say something like:
Blast! There aren't enough CPUs for you. You could cancel the job and resubmit it with fewer CPUs specified, or you could wait a little longer for more CPUs to become free. See also the LoadLeveler page for some other tricks you can try.
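If you find yourself doing this often, the steps above can be wrapped in a couple of small shell functions. This is only a sketch: the function names why_queued and resubmit_job are made up for illustration, the choice of the last 20 lines of the llq -s report is an assumption (adjust it to suit the length of the explanation on your system), and the command file you pass to resubmit_job is assumed to be your own job command file, already edited to ask for fewer CPUs.

```shell
# Hypothetical helper: show the end of the "llq -s" report for one job,
# which is where LoadLeveler explains why the job has not started.
why_queued() {
    job_id="$1"
    if [ -z "$job_id" ]; then
        echo "usage: why_queued <job-id>" >&2
        return 2
    fi
    # Assumption: the last 20 lines are enough to show the explanation.
    llq -s "$job_id" | tail -n 20
}

# Hypothetical helper: cancel a queued job and resubmit it from a
# job command file that you have already edited to request fewer CPUs.
resubmit_job() {
    job_id="$1"
    cmd_file="$2"
    if [ -z "$job_id" ] || [ -z "$cmd_file" ]; then
        echo "usage: resubmit_job <job-id> <command-file>" >&2
        return 2
    fi
    # Resubmit only if llcancel reported success.
    llcancel "$job_id" && llsubmit "$cmd_file"
}

# Example use (myjob.cmd is a placeholder for your own command file):
#   why_queued l3n01-c.17802.0
#   resubmit_job l3n01-c.17802.0 myjob.cmd
```

Keeping the cancel and the resubmit in one function means the job is only resubmitted if the cancel succeeded, so you should not end up with two copies in the queue.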