Running High Performance Computing (HPC) applications on-premise is complicated. The infrastructure can be complex to build and maintain, often runs over budget and does not scale well. With AWS, it is easier for customers to build scalable and cost efficient HPC infrastructure in the cloud. However, deep cloud infrastructure expertise is required to build, deploy, and operate HPC environments on AWS. In this blog, Samrat (Sam) Priyadarshi a Senior Cloud Solutions Engineer at MontyCloud, explains how researchers and students at St. Louis University use MontyCloud DAY2 to provision and manage HPC clusters using AWS Slurm and AWS Batch. Customers like St. Louis University can save up to 70% in their total cost of cloud operations with DAY2.
– Sabrinath Rao
Researchers and students needing HPC environments often find themselves having to deal with infrastructure, operational and security overheads that extends well beyond their job requirements and skill sets. Often customers have under-utilized and out-of-warranty clusters that don’t provide optimal performance. Customers desire to deploy smaller clusters for research groups and individual students while keeping the management complexity to a minimum. Also, customers find it hard to find the right talent to maintain these environments while keeping costs under check.
For example, one of our customers Saint Louis University (SLU) uses the DAY2 Slurm Blueprint to provision HPC clusters on-demand. Their Cloud team has increased productivity and is helping drive innovation faster by enabling researchers with self-service provisioning capabilities, along with easy to use operational automation of the HPC infrastructure. SLU team has saved several weeks of manual effort and continues to save both time and cost as researchers consume more capacity for their innovation.
With DAY2, SLU researchers are choosing between two options, HPC using Slurm and HPC using AWS Batch, to instantly provision the clusters along with the ready to use self-service tasks.
In this blog I am going to share how MontyCloud DAY2 enables customers to deploy Operations-Ready HPC applications that save time, costs and accelerate research, all in just a few clicks. With DAY2 researchers can offload the burden of well-architected infrastructure buildout, deployment, and operations and instead focus on their innovation. With this solution, customers can deploy right sized HPC environments on-demand, and instantly start submitting jobs without any of the heavy lifting involved with current solutions.
Well-architected HPC blueprints in DAY2 are ready to use templates that help deploy cost efficient, compliant, and secure HPC applications when used in conjunction with the DAY2 platform. By using well architected blueprints that are designed for compliance and cost savings, and accelerating deployment times with few clicks, customers can save several weeks of effort, and run cost optimized HPC environments.
Following are the benefits of using the blueprints in DAY2:
DAY2 is a No-Code, Autonomous CloudOps platform. Customers can accomplish the desired outcome without writing any additional code and without requiring expertise in AWS or HPC technologies. Unlike off-the-shelf Infrastructure as Code templates, customers do not need to build their own images. Unlike existing solutions, researchers have less decisions to make about provisioning a cluster and focus more on their research than managing the underlying infrastructure.
Figure 1: Researchers can use readily available, well-architected HPC blueprints in various configurations available to them.
SLU provisions High Performance Computing clusters using DAY2 HPC Slurm Blueprint for multiple purposes. SLU’s Research Computing Group uses this Blueprint to run Molecular Dynamics Simulations using Gromacs which is a molecular dynamics package designed for simulations of proteins, lipids and nucleic acids. SLU’s Chemistry department also uses this Blueprint to run their ORCA jobs for quantum chemistry. Researchers use DAY2 to provision the HPC Clusters on demand whenever they need it, submit their jobs, get the output back and shut down the clusters once the job is done, all in just few clicks and with no AWS expertise.
Figure 2: Operations-Ready HPC environment with AWS Slurm
MontyCloud’s DAY2 HPC Batch Blueprint manages all the infrastructure for you, avoiding the complexities of provisioning, managing, monitoring, and scaling your batch computing jobs. By using the HPC Batch Blueprint to move compute-intensive workloads to AWS, customers have increased speed, scalability, and cost-savings. The Blueprint automates the resourcing and scheduling of the jobs to save costs and provide the job results faster to accelerate decision-making.
Figure 3: Operations-Ready HPC environment with AWS Batch
With DAY2 Operations Ready HPC Blueprints, customers can save several weeks in effort and 40-70% in total cost of cloud operations, without requiring the sophisticated AWS skills. It is now easy to provision HPC clusters on-demand, with governance guardrails, security standards and highly automated self-service operations. Administrators can easily enable self-service provisioning, and end users such as researchers can spin up managed environments that they can control easily and accelerate innovation.
Like what you read and want to get started? You can click here and request a demo.