AWS’s new HPC-as-a-service offering democratizes supercomputer access


Amazon’s cloud service AWS wants to democratize access to high-performance computing (HPC) for enterprises through its new managed services product, AWS Parallel Computing Service.

AWS Parallel Computing lets AWS customers access computer servers for large, compute-intensive workloads without the need to train systems administrators.

Ian Colle, director of advanced compute and simulation at AWS, told VentureBeat this kind of access may accelerate the pace of innovation in technology and scientific discovery that traditionally relies on access to HPC clusters.

“There are a number of existing workloads today that really should be or could be taking advantage of high-performance computing resources, but because of the perception that it’s only for large enterprises or labs, whether real or perceived, is too much that people go, you know what, I don’t even want to go there,” Colle said. 

However, Colle thinks that will change once companies realize they can use HPC clusters more easily with the new service, enabling more experimentation. 

“We’re reducing the administrative burden and thinking of making a capital procurement commitment in at least the six to seven-figure range for an HPC cluster. But now all I need is an AWS account, and I can do experiments, wondering if this workload could benefit to fan out to a thousand nodes, let me try that,” he said. 

What does the service offer

AWS Parallel Computing lets users set up and manage groups of Amazon’s Elastic Compute Cloud instances. The company tapped open-source HPC workload manager Slurm to build and maintain the clusters for system administrators. 

The company already offers customers access to HPC clusters, but the previous iteration required companies to provide their own system administrators and other professionals to maintain the network.

Customers who want to run scientific and engineering workloads at scale can use the same tools on AWS, such as the Management Console and software development kits. Since the service uses Slurm, users can migrate any existing workflows to the AWS HPC cluster without rearchitecting anything. Enterprises can also connect any APIs. 

Colle said AWS’s offering “simplifies cluster administration and unlike other products, customers can completely offload Slurm management” to the service. 

The service will first be available in AWS regions in Ohio, Northern Virginia and Oregon in the United States; Frankfurt, Stockholm and Ireland in Europe; and Sydney, Singapore and Tokyo in Asia-Pacific. Colle said some AWS customers got early access to Parallel Computing to show the breadth of use cases HPC clusters can support. Companies like Germany-based Marvel Fusion use the service for their research into unlimited zero-emissions energy. Australian company Ronin, which is working to run HPC simulations on the cloud, runs its environments on the service.

Why there’s demand for HPC clusters

Providing access to HPC clusters has gained traction in the past few years as companies began needing compute power to train large language models and other AI foundation models. More and more, HPC networks target not just the large calculations needed for drug discovery but also AI workloads.

It used to be that only large government labs researching big scientific questions and very big companies had access to supercomputers. Hardware manufacturers like AMD, Intel, Nvidia and IBM competed to create faster and ever more powerful supercomputers for government and scientific clients.

With more companies interested in using HPC clusters, “HPC-as-a-service” has grown thanks to cloud providers like AWS, Google, Microsoft Azure and Penguin Computing on Demand, which offer access to these powerful servers to clients. 

Gartner Analyst and Senior Director Tony Harvey told VentureBeat HPC-as-a-service is nothing new, but more kinds of companies are seeing new use cases for supercomputers, so cloud providers will want to offer the service more and more.

“I suspect we will see more competition in the space. A lot of the companies already offer HPC access, and there are even some that offer novel ways to access GPUs and servers because HPC use has gotten into everything, not just AI,” Harvey said.

He added that any move that further democratizes access to HPC shortens the waiting list for large supercomputers like the Hewlett Packard Enterprise-built Frontier supercomputer housed in Tennessee, where it can take months for time to open up.

“It enables people who didn’t used to use them get access and puts value in the time of the people who are running all these experimentations and predictions,” Harvey said.