

Supercomputers are becoming another cloud service. Here's what it means
source link: https://www.zdnet.com/article/supercomputers-are-becoming-another-cloud-service-heres-what-it-means/

By Mary Branscombe for 500 words into the future | August 2, 2021 -- 08:46 GMT (16:46 SGT) | Topic: Microsoft
These days supercomputers aren't necessarily esoteric, specialised hardware; they're made up of high-end servers that are densely interconnected and managed by software that deploys high performance computing (HPC) workloads across that hardware. Those servers can be in a data centre – but they could just as easily be in the cloud.
When it comes to large simulations – like the computational fluid dynamics used to simulate a wind tunnel – processing the millions of data points needs the power of a distributed system, and the software that schedules these workloads is designed for HPC systems. If you want to simulate 500 million data points, and to do that 7,000 or 8,000 times to look at a variety of different conditions, you're going to generate about half a petabyte of data; even if a single cloud virtual machine (VM) could cope with that amount of data, the compute time would run to millions of hours, so you need to distribute it – and the tools to do that efficiently need something that looks like a supercomputer, even if it lives in a cloud data centre.
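To put that scale in context, here's a minimal back-of-the-envelope sketch in Python. The bytes-per-point and per-point compute cost are illustrative assumptions rather than figures from the article, but they land in the same half-petabyte, millions-of-hours territory.

```python
# Back-of-the-envelope sizing for a CFD ensemble like the one described above.
# The bytes-per-point and core-seconds-per-point values are assumptions for
# illustration, not figures from the article.

POINTS_PER_RUN = 500_000_000    # 500 million data points per simulation
RUNS = 7_500                    # roughly 7,000-8,000 parameter variations
BYTES_PER_POINT = 128           # assumed: a few double-precision fields per point

total_bytes = POINTS_PER_RUN * RUNS * BYTES_PER_POINT
print(f"Ensemble output: {total_bytes / 1e15:.2f} PB")   # ~0.5 PB with these assumptions

CORE_SECONDS_PER_POINT = 0.002  # assumed per-point solver cost
core_hours = POINTS_PER_RUN * RUNS * CORE_SECONDS_PER_POINT / 3600
print(f"Compute budget: {core_hours / 1e6:.1f} million core-hours")
```

At those assumed rates, even a 120-core VM would need years of wall-clock time to chew through the ensemble on its own, which is why the work has to be spread across a tightly coupled cluster.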
When the latest Top 500 list came out this summer, Azure had four supercomputers in the top 30; for comparison, AWS had one entry on the list, in 41st place.
SEE: Nextcloud Hub: User tips (free PDF) (TechRepublic)
HPC users on Azure run computational fluid dynamics, weather forecasting, geoscience simulation, machine learning, financial risk analysis, modelling for silicon chip design (a popular enough workload that Azure has FX-series VMs with an architecture specifically for electronic design automation), medical research, genomics, biomedical simulations and physics simulations, as well as workloads like rendering.
They do some of that on traditional HPC hardware; Azure offers Cray XC and CS supercomputers, and the UK's Met Office is getting four Cray EX systems on Azure for its new weather-forecasting supercomputer. But you can also put together a supercomputer from H- and N-series VMs (using hardware like Nvidia A100 Tensor Core GPUs and Xilinx FPGAs, as well as the latest AMD Epyc 7003-series CPUs) with HPC images.
One reason the Met Office picked a cloud supercomputer was the flexibility to choose whatever the best solution is in 2027. As Richard Lawrence, the Met Office IT Fellow for supercomputing, put it at the recent HPC Forum, they wanted "to spend less time buying supercomputers and more time utilizing them".
But how does Microsoft build Azure to support HPC well when the requirements can be somewhat different? "There are things that cloud generically needs that HPC doesn't, and vice versa," Andrew Jones from Microsoft's HPC team told us.
Everyone needs fast networks, fast storage, fast processors and more memory bandwidth, but the focus on how all of that is integrated is clearly different, he says.
HPC applications need to perform at scale, which cloud is ideal for, but they need to be deployed on cloud infrastructure differently from typical cloud applications.
SEE: Google's new cloud computing tool helps you pick the greenest data centers
If you're deploying a whole series of independent VMs, it makes sense to spread them out across the data centre so that they're relatively independent of each other and more resilient; in the HPC world, by contrast, you want to pack your VMs as close together as possible, so they have the tightest possible network connections between each other and you get the best performance, he explains.
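That trade-off can be sketched with a toy placement model. The rack counts and latency figures below are invented for illustration and don't describe any real Azure topology.

```python
# Toy model of the placement trade-off: cloud-style spreading vs HPC-style packing.
# Latency numbers are invented for illustration; they don't describe real Azure hardware.
from itertools import combinations

SAME_RACK_US = 2.0    # assumed round-trip latency within a rack (microseconds)
CROSS_DC_US = 50.0    # assumed latency between widely separated racks

def mean_pair_latency(rack_of):
    """Average latency over all VM pairs, given each VM's rack assignment."""
    pairs = list(combinations(range(len(rack_of)), 2))
    total = sum(SAME_RACK_US if rack_of[a] == rack_of[b] else CROSS_DC_US
                for a, b in pairs)
    return total / len(pairs)

vms = 16
packed = [0] * vms              # HPC-style: every VM in the same rack
spread = list(range(vms))       # cloud-style: one VM per rack for fault isolation

print(f"packed: {mean_pair_latency(packed):.1f} us average between VMs")
print(f"spread: {mean_pair_latency(spread):.1f} us average between VMs")
```

Spreading maximises fault independence, which is what most cloud services want; a tightly coupled MPI job pays for every one of those cross-rack hops on every iteration, which is why HPC schedulers pack instead.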
Some HPC infrastructure proves very useful elsewhere. "The idea of high-performance interconnects that really drive scalable application performance and latency is a supercomputing and HPC thing," Jones notes. "It turns out it also works really well for other things like AI and some aspects of gaming and things like that."
High-speed interconnects are also enabling disaggregation in the hyperscale data centre, where memory and compute are split into separate hardware pools and you allocate as much of each as you need. That may not be useful for HPC, even though more flexibility in allocating memory would be helpful: memory is expensive, and not all the memory you allocate to a cluster will be used by every job.
"In the HPC world we are desperately trying to drag every bit of performance out of the interconnect we can and distributing stuff all over the data centre is probably not the right path to take for performance reasons. In HPC, we're normally stringing together large numbers of things that we mostly want to be as identical as possible to each other, in which case you don't get those benefits of disaggregation," he says.
Cloudy HPC
What will cloud HPC look like in the future?
"HPC is a big enough player that we can influence the overall hardware architectures, so we can make sure that there are things like high memory bandwidth considerations, things like considerations for higher power processes and, therefore, cooling constraints and so on are built into those architectures," he points out.
The HPC world has tended to be fairly conservative, but that might be changing, Jones notes, which is good timing for cloud. "HPC has been relatively static in technology terms over the last however many years; all this diversity and processor choice has really only been common in the last couple of years," he says. GPUs have taken a decade to become common in HPC.
SEE: What is quantum computing? Everything you need to know about the strange world of quantum computers
The people involved in HPC have often been in the field for a while. But new people are coming into HPC who have different backgrounds; they're not all from the traditional scientific computing background.
"I think that diversity of perspectives and viewpoints coming into both the user side, and the design side will change some of the assumptions we'd always made about what was a reasonable amount of effort to focus on to get performance out of something or the willingness to try new technologies or the risk reward payoff for trying new technologies," Jone predicts.
So just as HPC means some changes for cloud infrastructure, cloud may mean big changes for HPC.