Cloud Computing and HPC

Does cloud computing have a role to play in HPC or are the two fundamentally incompatible? Gillian Law, technical journalist at Tech Literate, spoke to UK HPC users about their current and potential future use of the cloud.

Cancer Research UK is in the middle of doubling the size of its Cambridge data centre. The server room is being expanded and the current 512 core cluster is being upgraded. The data centre will reach 1280 cores once the full refit is completed – and even that might not be quite enough to cope with peaks of demand. With ever increasing demands from the various teams using the facilities, sharing the resource is a challenge.

“In particular, the high throughput DNA sequencing done here puts a high demand on our high performance computing resources,” says Cancer Research UK’s Head of IT and Scientific Computing in Cambridge, Peter Maccallum.

“We have sequencers in-house that generate several terabytes of data a week, which we have to analyse and reprocess, and we’re increasingly working with external partners who have even larger sequencing facilities. We’re also seeing an increase in the amount of processing required for image analysis, lots of microscopy data and other imaging from MRI and so on,” Maccallum says.

The obvious solution would seem to be cloud-based – use an outside vendor to cover those peak times, instead of buying ever more kit or forcing people to wait for access. Does cloud have a place in HPC?

There’s certainly a need for an ‘emergency release valve’, a way of dealing with peaks and with short, discrete projects that don’t need to run on the main system, Maccallum says.

However, he would prefer, if possible, to approach other organisations that might have some excess capacity, or to look to a University grid provider, rather than turn to the ‘cloud’.

“Our problem with the classic cloud is that we have a balance problem. For every CPU hour that we need, we also need several hundred gigabytes of storage close to the compute. We tend to move a lot of data in, get it processed, and get the results back. So we’re paying storage costs while it’s in the cloud, and we’re paying to move the data in and out. When you’re talking about terabyte volumes, those costs start to stack up.”

Given that an organisation like Cancer Research UK is buying large amounts of hardware anyway, and can often negotiate prices similar to those paid by the cloud vendors, “and given that we don’t have to make a profit”, paying for cloud doesn’t always make financial sense.
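The balance problem Maccallum describes can be put into a rough cost model. The sketch below is illustrative only: the function name, the parameters and every price in the usage example are hypothetical placeholders rather than real vendor rates or Cancer Research UK figures. It simply shows how storage and transfer charges can dwarf the compute charge when each CPU hour drags several hundred gigabytes along with it.

```python
def cloud_job_cost(cpu_hours, data_in_gb, data_out_gb, days_stored,
                   cpu_hour_price, storage_gb_day_price, transfer_gb_price):
    """Split the cost of one burst job into compute, storage and transfer.

    All prices are caller-supplied placeholders, not real vendor rates.
    """
    compute = cpu_hours * cpu_hour_price
    # Input data is billed for every day it sits next to the compute.
    storage = data_in_gb * days_stored * storage_gb_day_price
    # Data is billed on the way in and results are billed on the way out.
    transfer = (data_in_gb + data_out_gb) * transfer_gb_price
    return compute, storage, transfer


if __name__ == "__main__":
    # Hypothetical job: 100 CPU hours with 300 GB of data per CPU hour.
    compute, storage, transfer = cloud_job_cost(
        cpu_hours=100, data_in_gb=30_000, data_out_gb=3_000, days_stored=7,
        cpu_hour_price=0.10, storage_gb_day_price=0.005, transfer_gb_price=0.05,
    )
    print(f"compute={compute:.2f} storage={storage:.2f} transfer={transfer:.2f}")
```

With placeholder numbers like these the data-related charges come out far larger than the compute charge. Whether that holds in practice depends entirely on the real prices and data volumes involved.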

Banking services

Adam Vile, Head of Technical Consulting at Excelian, tells a similar story from the very different perspective of banking.

Excelian is a consultancy specialising in capital markets and deals with many banking clients. “The scale of compute blades in banks is immense”, Vile says.

“In a Monte Carlo simulation, to get one decimal place more accuracy you need ten times more simulation. We’re in the middle of a survey of the computer requirements in investment banks, and in some cases the number of cores in place is upwards of 100,000. There are challenges in managing that level of resource,” he says.
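Vile’s rule of thumb can be set against the textbook scaling. For plain Monte Carlo with independent samples, the standard error falls as sigma / sqrt(N), so an extra decimal place (a tenfold smaller error) takes roughly a hundredfold more samples. A tenfold cost is closer to what quasi-Monte Carlo methods can achieve under favourable conditions. The article does not say which methods the banks use, so the sketch below assumes the plain case, and its function names are illustrative.

```python
import math
import random


def samples_needed(sigma, target_error):
    """Samples plain Monte Carlo needs to reach a given standard error.

    Assumes independent draws, so standard_error = sigma / sqrt(N).
    """
    return math.ceil((sigma / target_error) ** 2)


def estimate_mean(n, seed=0):
    """Estimate the mean of Uniform(0, 1), whose true value is 0.5."""
    rng = random.Random(seed)
    return sum(rng.random() for _ in range(n)) / n


if __name__ == "__main__":
    # Each extra decimal place of accuracy multiplies the sample count.
    for target in (0.1, 0.01, 0.001):
        print(target, samples_needed(1.0, target))
    print(estimate_mean(100_000))
```

Either way the conclusion Vile draws is unchanged: each additional digit of accuracy multiplies the compute bill, which is how core counts reach six figures.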

The topic of cloud comes up all the time when talking to clients, Vile says, driven by the need to save money where possible and by the simplicity that comes from handing the management over to someone else.

There is a move away from trading in structured instruments, which are high margin, high complexity products, towards high volume, low margin, low complexity instruments – which in some sense makes cloud less viable.

“Cloud doesn’t fit the low-latency requirements of many of these trading processes,” Vile says, “but it still fits the workload of straightforward overnight batch processing. Data centres are driven at 80 percent utilisation even at peak times, between 6pm and 7am, in order to account for potential burst capacity needs. During the day utilisation can be as low as 20 percent, so there is waste there that could be trimmed and cloud could be a good option to accommodate the burst needs.”
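The arithmetic behind the waste Vile mentions can be sketched as follows. It treats his two figures as flat averages over each window, which is a simplifying assumption (he gives 20 percent as a daytime low, not an average), and the 1,000-core cluster is a hypothetical size chosen for round numbers.

```python
def average_utilisation(windows):
    """Time-weighted utilisation over (hours, utilisation) windows."""
    total_hours = sum(hours for hours, _ in windows)
    busy_hours = sum(hours * util for hours, util in windows)
    return busy_hours / total_hours


# 6pm to 7am is 13 hours of batch work; the other 11 hours are the trading day.
DAY = [(13, 0.80), (11, 0.20)]

if __name__ == "__main__":
    avg = average_utilisation(DAY)
    cores = 1_000  # hypothetical cluster size
    idle_core_hours = cores * 24 * (1 - avg)
    print(f"average utilisation: {avg:.1%}")
    print(f"idle core-hours per day: {idle_core_hours:.0f}")
```

Under these assumptions the cluster averages a little over half of its capacity across the day, which is the headroom that burst capacity from a cloud provider would, in principle, let a bank stop paying for.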

To satisfy banking requirements the service levels of any cloud provider would have to be extremely high, he says. “The jobs absolutely have to be done by the time trading starts, or traders are starting the day without a proper understanding of their risk.”

As in the research world, the ‘low cost’ benefit of cloud doesn’t quite stack up when you can negotiate prices as low as the cloud providers themselves can. The benefits are more likely to be seen in the ease of management and simplicity that cloud can bring – if vendors can offer the service levels that HPC clients need.

“When you need flexible computing for bursts of activity, for testing and development, for disaster recovery – I think cloud could fly. You get access to new hardware more quickly, and a new deployment takes minutes. So – it does have potential,” he says.

HPC needs

Gary Wills of the University of Southampton, however, stresses that most HPC work does not suit the commercial cloud offerings available.

“The cloud is set up for non-time-critical processing, it’s a way of getting work done quickly without buying a lot of hardware. I don’t want to belittle cloud, but it’s a business model that allows a company to get its email and its batch processing done while improving its green credentials – it was never designed to replace High Performance Computing.

“With High Performance Computing you’ll have 600 processors all working on your particular problem for a set period of time. The current cloud offerings just aren’t designed to do that. Obviously, as academics we try to find ways of making it work, because of cost savings – and with some visualisations you can. But once they become complex, mathematical visualisations, again you need High Performance Computing,” Wills says.

This may change if a vendor chooses to offer a service that suits better, Wills says, or more likely if a service like the National Grid Service steps in and provides a tailored solution.

The cloud can come in useful if you can’t get access to ‘proper’ HPC, says John Milner, Programme Manager (Shared Information Services) at JISC.

“It does have benefits for people who just need short term access to HPC capability. If you have no other access then the ability to do your calculations outweighs any other shortfalls – the trade off between performance and timescale sort of works. But if you have access to any HPC facility then the usefulness is limited,” Milner says.

Cloud vendors simply don’t see HPC as a key target market, he says, so they don’t make the effort to make their services more compatible with HPC needs.

National Grid Service

Over at the National Grid Service (NGS), Technical Director David Wallom is in the middle of developing what he hopes will be a solution to this HPC/cloud incompatibility.

HPC does need cloud, he says. “But you need cloud in a more interesting and different way”.

Universities and research organisations are all managing multiple communities. Like Peter Maccallum at Cancer Research UK, HPC data centre managers have to find ways to support lots of groups, each of which will be using different tools and have their own service requirements.

“[Given] the flexibility that cloud can give in hosting all of these in a simple and easy way, that’s the way we want to move,” Wallom says.

Wallom’s vision involves a private cloud, with gateways or access portals that sit in front of a user’s own resources, avoiding the need to move large data sets.

Building on the work done by the NGS’s pilot cloud services, a UK Federated Cloud Group is being set up to develop a “flexible infrastructure layer connected into the National Grid Service’s own authentication, authorisation, accounting and monitoring services,” Wallom says.

Public cloud providers could be brought into this, too, Wallom says. “We’ve spoken to a couple of smaller providers and asked if they would be interested in deploying a compatible stack to the work of the UK Federated Cloud.”

Wallom is very positive about the role cloud can play.

“The NGS cloud pilots support about 100 people. We have never, ever been asked ‘How does this work?’ – they just go and do it. That’s one of the clear benefits of cloud, the ability of users to understand what they can do, and how to do it.”

Coming back to the refit of the Cancer Research UK data centre, cloud has brought one distinct advantage.

“We have 20 research groups and ten technology facilities using our service, so we have to ration it – and the layers of software that cloud vendors have developed are actually very useful in managing that. High volume storage vendors are being driven by cloud, too, so there are more people who understand what we need. More adaptable clusters, and more low cost high quality storage. Products optimised for cloud are actually very useful in a High Performance Computing environment!” Maccallum says.

Gillian Law

© Gillian Law



PlanetHPC,  University of Edinburgh | James Clerk Maxwell Building | Mayfield Road | Edinburgh | EH9 3JZ
