Efficient Managing and Monitoring of InfiniBand HPC Clusters
Session: HPC Trends
Presenter
Event Type: Exhibitor Forum
Time: Wednesday, November 15th, 2:30pm - 3pm
Location: 501-502
Description: The Fabriscale Monitoring System (FMS) is cluster
interconnect monitoring software that provides visual
insight into the status of your InfiniBand cluster. In
this presentation we present the key features of the FMS
and show you how to get an overview of performance and
drill down into statistics, alerts, and key metrics. With
FMS, monitoring of the cluster is automated, and alarms
are raised only when the operator's attention is
required. The operator is pointed to where the
problem has occurred, supported by relevant metrics and
statistics. This saves time and leads to faster error
recovery, less strain on operators, and reduced downtime
for your cluster.
The FMS integrates with job schedulers, e.g. Slurm, MOAB HPC Suite, and PBS Works, to leverage scheduling information to present performance information as a function of workload. Potential network bottlenecks can be identified per job, and utilization for a job can be specified per port.