Queues on Discovery Cluster are based on five node groups:
- nodes10g: These are the nodes that have only the 10 Gb/s TCP/IP backplane (no IB): “compute-0-000” to “compute-0-003” and “compute-0-008” to “compute-0-063”.
- nodesib: These are the nodes that have both the 10 Gb/s TCP/IP backplane and the FDR 56 Gb/s RDMA backplane: “compute-1-064” to “compute-1-127”.
- nodes10gint: These are nodes compute-0-000, compute-0-001, compute-0-002 and compute-0-003, which users can use for interactive work. Through LSF, users can request interactive nodes here with 1 to 16 cores on each node.
- nodesibint: These are nodes compute-1-064, compute-1-065, compute-1-066 and compute-1-067, which users can use for interactive work. Through LSF, users can request interactive nodes here with 1 to 16 cores on each node.
- nodes10ght: These are nodes compute-0-004, compute-0-005, compute-0-006 and compute-0-007, which users can use for jobs that need only the 10 Gb/s backplane but benefit from Intel Hyper-Threading (HT) and Intel Turbo Boost. These nodes have 32 logical cores, as opposed to the 16 logical cores on the other compute nodes, where HT is turned off.
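The node groups above can be written out as a small lookup table, which makes the overlaps between groups easier to see. This is only an illustrative sketch built from the ranges listed above: the helper names `expand` and `groups_for` are hypothetical and are not part of LSF or of any cluster tooling.

```python
def expand(prefix, first, last):
    """Return node names such as compute-0-000 for first..last inclusive."""
    return [f"{prefix}-{i:03d}" for i in range(first, last + 1)]

# Node groups as listed in this section (hypothetical in-memory representation).
NODE_GROUPS = {
    "nodes10g": expand("compute-0", 0, 3) + expand("compute-0", 8, 63),
    "nodesib": expand("compute-1", 64, 127),
    "nodes10gint": expand("compute-0", 0, 3),
    "nodesibint": expand("compute-1", 64, 67),
    "nodes10ght": expand("compute-0", 4, 7),
}

def groups_for(node):
    """Return the sorted names of every node group that contains the node."""
    return sorted(name for name, nodes in NODE_GROUPS.items() if node in nodes)

for name, nodes in NODE_GROUPS.items():
    print(f"{name}: {len(nodes)} nodes")

print(groups_for("compute-0-001"))  # interactive nodes also belong to nodes10g
```

Going by the ranges listed above, the interactive nodes in each group also belong to the corresponding compute group, while the four HT nodes belong to “nodes10ght” only.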
Every queue on the Discovery cluster has a wall-clock limit of 24 hours. You will need to partition long-running jobs into smaller ones that each run for no more than 24 hours. On interactive queues, you will be logged out of your assigned interactive node after 24 hours, and you will have to resubmit a request for an interactive node and log in again. Remember to save your work on interactive queues before 24 hours have elapsed since each login.
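As a rough planning aid, the number of 24-hour pieces a long job needs can be estimated as follows. This is a minimal sketch: the function name `split_runtime` is hypothetical, and it assumes the job can save its state and resume from it in a later submission, which is something your application has to provide.

```python
import math

WALL_CLOCK_LIMIT_HOURS = 24  # limit stated for every queue in this section

def split_runtime(total_hours, limit=WALL_CLOCK_LIMIT_HOURS):
    """Split an estimated runtime into segments no longer than the limit."""
    if total_hours <= 0:
        raise ValueError("total_hours must be positive")
    count = math.ceil(total_hours / limit)
    segments = [limit] * (count - 1)                   # full-length segments
    segments.append(total_hours - limit * (count - 1)) # remainder segment
    return segments

print(split_runtime(60))  # [24, 24, 12]
```

A job estimated at 60 hours would therefore need three submissions, each restarting from the state saved by the previous one.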
There are currently five user queues on the Discovery cluster. Three queues are open to all users and two are open only to users approved by the RCC (Research Computing Committee).
Queues open to all users are “interactive-10g”, “ht-10g”, and “ser-par-10g”. The first uses node group “nodes10gint”, the second uses node group “nodes10ght”, and the last uses “nodes10g”. These node groups are drawn from the compute nodes “compute-0-000” to “compute-0-063”, which have the 10 Gb/s backplane only.
Queues open to users approved by the RCC are “interactive-ib” and “parallel-ib”. The former uses node group “nodesibint” and the latter “nodesib”. These node groups are drawn from the compute nodes “compute-1-064” to “compute-1-127”, which have both the 10 Gb/s and FDR 56 Gb/s IB backplanes.
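The queue-to-node-group mapping described above can be summarized in a small table. The sketch below is illustrative only: the `QUEUES` dictionary and the `available_queues` helper are hypothetical names and do not query LSF.

```python
# Queue name -> (node group, requires RCC approval), as described in this section.
QUEUES = {
    "interactive-10g": ("nodes10gint", False),
    "ht-10g": ("nodes10ght", False),
    "ser-par-10g": ("nodes10g", False),
    "interactive-ib": ("nodesibint", True),
    "parallel-ib": ("nodesib", True),
}

def available_queues(rcc_approved):
    """Return the queues a user may submit to, given their RCC approval status."""
    return [
        queue
        for queue, (_group, needs_approval) in QUEUES.items()
        if rcc_approved or not needs_approval
    ]

print(available_queues(rcc_approved=False))
print(available_queues(rcc_approved=True))
```

A user without RCC approval sees only the three 10g queues, while an approved user sees all five.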
Further details of the five queues are shown below. For instructions on using these queues go here. Note that “interactive-10g” and “ser-par-10g” are the default interactive and run queues, respectively, if the “-q” #BSUB option is not used.
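The default-queue rule can be sketched as a small helper that mirrors what happens when “-q” is omitted. The function names `effective_queue` and `bsub_queue_line` are hypothetical, and the helper only formats text; it does not submit anything to LSF.

```python
# Defaults as stated in this section.
DEFAULT_INTERACTIVE_QUEUE = "interactive-10g"
DEFAULT_RUN_QUEUE = "ser-par-10g"

def effective_queue(requested=None, interactive=False):
    """Return the queue a job would land in, falling back to the defaults."""
    if requested:
        return requested
    return DEFAULT_INTERACTIVE_QUEUE if interactive else DEFAULT_RUN_QUEUE

def bsub_queue_line(queue):
    """Format the #BSUB directive that selects a queue in a job script."""
    return f"#BSUB -q {queue}"

print(effective_queue())                  # ser-par-10g
print(effective_queue(interactive=True))  # interactive-10g
print(bsub_queue_line("ht-10g"))          # #BSUB -q ht-10g
```

Naming the queue explicitly in a job script, even when it matches the default, keeps the intended node group visible to anyone reading the script.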