The past few years have seen the number of cores on a standard computer grow from one to two, then to six – and analysts suggest that hundreds of cores will be the norm very soon. Add the growth of GPUs and (to some extent) FPGAs as accelerators for specific tasks, and the hardware environment is looking very different. How are ISVs coping with this change, and what are they doing to adapt their software to let it run efficiently on the new architectures? Gillian Law reports.
Software will always lag behind hardware. The hardware manufacturers work on improving this, enhancing that, and then release new products to a market that has to work out how to use them.
To make things trickier, hardware doesn’t develop at a consistent speed, either – in recent years, for example, while the number of cores in the average machine has grown fast, the amount of memory available per core has barely grown at all. A software developer working on code for that hardware has to find the best way to work within that limitation, while taking a guess at how things will change in the future.
So while software suppliers like ANSYS, developers of engineering simulation software, have been working on parallel processing for years and are unfazed by that aspect of multicore, the difficulty lies in working with the new balance of hardware components.
“In a sense we’re well positioned to make the transition to multicore processor architectures,” says Barbara Hutchings, director of strategic partnerships. “But the new challenges in how the processing power has access to other components of the computer, such as memory – that’s one of the biggest issues for us.”
In the past, she says, you would send a task to a processor and “it had really good, fast access to all the memory you could want. And now as multicore comes along, you have multiple processing cores that share access to memory. That really creates a bottleneck on the hardware side that we have to somehow address on the software side by re-architecting our algorithms.”
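To see why shared bandwidth bites, consider the kind of memory-bound loop that simulation codes run constantly. The sketch below is a minimal illustration of our own, not ANSYS code: it performs so little arithmetic per byte fetched that a handful of cores saturates the memory channels they all share, and adding further cores then yields almost no extra speedup.

// Minimal sketch of a memory-bandwidth-bound loop (illustrative only,
// not ANSYS code).  Build with, e.g., g++ -O2 -fopenmp bandwidth.cpp
#include <cstdio>
#include <omp.h>

int main() {
    const long n = 1L << 25;        // ~256 MB per array: far bigger than any cache
    double *a = new double[n], *b = new double[n], *c = new double[n];
    for (long i = 0; i < n; ++i) { b[i] = 1.0; c[i] = 2.0; }

    double t0 = omp_get_wtime();
    #pragma omp parallel for        // "triad": ~24 bytes moved per 2 flops
    for (long i = 0; i < n; ++i)
        a[i] = b[i] + 3.0 * c[i];
    double t1 = omp_get_wtime();

    // Re-run with OMP_NUM_THREADS=1,2,4,...: the GB/s figure flattens
    // long before the core count runs out, because the cores share the
    // path to memory.
    std::printf("%d threads: %.1f GB/s\n", omp_get_max_threads(),
                24.0 * n / (t1 - t0) / 1e9);
    delete[] a; delete[] b; delete[] c;
    return 0;
}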
Rolf Fischer, software development director at Intes, faces similar issues: he works on finite element numerical simulation software that was first developed in the 1960s. Major changes were made in the late 1980s, he says, and a disruptive shift to parallelism in the late 1990s was another big step, but little has changed since then in the basic structure of the software.
“We are dealing with equation systems that are quite heavily coupled, and so data transfer is a real issue for us, network speed is a problem, and I/O speed from striped disks, too.”
John Barr, research director, high performance computing for The 451 Group, sees many ISVs struggling with this issue.
“Compared with my first PC, a disk today is probably tens of thousands of times larger. But the bandwidth to the disk is probably only ten times faster – so to write the complete contents of the disk would probably take longer than it would have done ten years ago. So you have to think carefully about where your data is – the more complex systems become with multicore and distributed systems and clusters, the more you need to make sure you have your data in the right place.”
And the biggest issue there, he says, is a lack of skills in the community, especially among the smaller ISVs.
“There is a substantial skills gap at the low end. In the mid range market I think most developers are in a reasonably good position. These guys have been doing parallel processing for years, so they just need to tweak things.
“The real problem I see is at the entry level, where a guy writes an application to run on a single processor on a very high end workstation – in the very near future he’ll be in a position where he has a hundred cores on a desktop machine, and he won’t know where the potential parallelism is or what to do about it.”
The commoditisation of entry-level high performance computing machines means there are a lot of people who can afford “not exactly a supercomputer, but a pretty beefy resource – but they don’t have the skills required to program it,” Barr says.
The problem starts at university, where parallelism isn’t taught until MSc level, Barr says. “We’ve got a whole generation of programmers who don’t really understand parallelism. People come out knowing how to write Java programs, but they don’t really understand computer architecture.”
Within a few years, Barr says, “a standard server or PC will have hundreds of cores. And the world needs to fundamentally change the way it programs things in order to exploit that.”
There are compilers available that will generate code for multicore and for GPUs, but the nature of individual codes means “there’s no magic box that you put your application in and out it pops, ready to run efficiently on whichever hardware platform you need,” Barr says. Human coding skill is still needed – and it’s in short supply.
Not everyone is racing to adapt to the latest hardware, though. As Fischer says, you could spend all your time chasing new architectures and never really catch up, “since they vanish in between, like IBM’s CellBE! We have several million lines of code. To reprogram large portions of this code when you don’t know how long the new architectural model will be valid – it’s not sustainable. Our customers want new functionality from us, and if we spend all our time reprogramming, we won’t be able to provide that,” he says. Instead, he says, Intes tries to look at each iteration of new hardware and see how it can extend concepts while balancing efficiency with hardware dependency.
Intes customers principally want reliable software, Fischer says. The fact that the hardware market is developing in different directions doesn’t mean that Intes has to follow. “We select the most efficient and general path,” he says. At the moment, that means being able to handle GPUs and Intel’s MIC (Many Integrated Core) architecture.
Eugen Riegel is an HPC software developer at FluiDyna, developing GPU software for computational fluid dynamics. Unlike Intes, FluiDyna is starting from scratch, developing software to suit the latest hardware and looking out for the new markets being created by what the hardware can now do.
“For GPU computing, to do it efficiently, you have to take a new approach, and adapt to the specific architecture of GPUs,” he says. Anyone looking to recode existing software to run on GPUs really has to “start from scratch”, he says.
You need fully parallel code, adapted to the single instruction, multiple data (SIMD) approach of GPUs, with minimal use of ‘if’ branches, which make the code inefficient.
“And with GPUs you don’t have memory caches like on CPUs, so you need to use some special strategies for memory access. It forces you to redevelop your code – you can’t just take some older code and port it,” he says.
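A minimal CUDA sketch (ours, not FluiDyna’s code) illustrates both points: consecutive threads read consecutive addresses so that loads coalesce into one memory transaction per warp, and a branch-free select replaces an ‘if’ that would otherwise diverge within the warp’s lockstep SIMD execution.

// Illustrative CUDA kernel: coalesced access and branch-free selection.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void clamp_negatives(const float *x, float *y, int n) {
    // Thread i touches element i, so consecutive threads in a warp
    // read consecutive addresses: one coalesced transaction per warp.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        // The divergent form
        //   if (x[i] < 0.0f) y[i] = 0.0f; else y[i] = x[i];
        // would serialise the two paths within a warp; fmaxf compiles
        // to a single predicated instruction instead.
        y[i] = fmaxf(x[i], 0.0f);
    }
}

int main() {
    const int n = 1 << 20;
    float *x, *y;
    cudaMalloc(&x, n * sizeof(float));
    cudaMalloc(&y, n * sizeof(float));
    cudaMemset(x, 0, n * sizeof(float));
    clamp_negatives<<<(n + 255) / 256, 256>>>(x, y, n);
    cudaDeviceSynchronize();
    std::printf("kernel finished\n");
    cudaFree(x);
    cudaFree(y);
    return 0;
}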
The process is fairly time-intensive and expensive, he says, “but if you do it right you get the high speed you want, so it’s worth it.”
Hutchings agrees. “Everything I said about programming for multicore is true for GPUs, too – but more true! You have to be a little bit clever about how you change the software to take advantage of the compute power but relatively less memory.”
Adapting to changing architecture is a serious cost for an ISV, Hutchings says.
“For our customers it’s a good news story: they have much more compute power. For us, the on-going investment is significant. We can’t just sit still and use the same algorithms we used five years ago and maintain the performance our customers expect.”
Rewriting the code from scratch isn’t an option, she says. “Commercial software of this sort has been built up over years. And you don’t want to hit your customers with a ‘revolution’, anyway – more of an evolution.”
Having said that, ANSYS has made some major changes recently. A change to an algorithm in its computational fluid dynamics software sees the company using hybrid parallelism to handle communication within each machine.
“We used to use the same method whether communicating within a machine, or between machines. Now that there are so many cores inside a machine we had to adopt a new, very specific way of handling the internal communications, to take advantage of all the speed we could get.”
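In practice this hybrid scheme usually means message passing (MPI) between machines and shared-memory threading within each one. The sketch below is a generic illustration, not the ANSYS implementation, using OpenMP for the intra-machine part: cores on the same machine cooperate through shared memory, and only inter-machine traffic goes over the network.

// Hybrid-parallel sketch (illustrative, not the ANSYS code):
// MPI between machines, OpenMP threads within each machine.
// Build with, e.g., mpicxx -fopenmp hybrid.cpp
#include <mpi.h>
#include <omp.h>
#include <cstdio>

int main(int argc, char **argv) {
    int provided, rank, nranks;
    // One MPI rank per machine; only the main thread makes MPI calls.
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    // Within the machine: cores share memory directly, so this
    // reduction involves no message passing and no data copies.
    double local = 0.0;
    #pragma omp parallel for reduction(+ : local)
    for (long i = rank; i < 10000000; i += nranks)
        local += 1.0 / (1.0 + (double)i);

    // Between machines: an explicit message over the network.
    double global = 0.0;
    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0) std::printf("sum = %f across %d ranks\n", global, nranks);

    MPI_Finalize();
    return 0;
}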
Likewise, with GPUs, careful analysis of what was happening led to changes that sped up processing, says ANSYS lead project manager Ray Browell.
“Just like the bottlenecks with I/O and memory on CPUs, on GPUs there’s the time it takes to get on and off the GPU board. So we basically looked for the most efficient use of the GPU, minimising transfers.”
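The pattern Browell describes can be sketched as follows, with a hypothetical ‘smooth’ kernel standing in for a real solver step (not the ANSYS code): the data crosses the bus once on the way in and once on the way out, while every iteration in between stays on the device.

// Transfer-minimisation sketch (illustrative): copy in once, iterate
// on the device, copy out once.
#include <cuda_runtime.h>

// Stand-in for one solver iteration; not a real CFD/FEA kernel.
__global__ void smooth(const float *in, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i > 0 && i < n - 1)
        out[i] = 0.5f * in[i] + 0.25f * (in[i - 1] + in[i + 1]);
    else if (i == 0 || i == n - 1)
        out[i] = in[i];                       // keep boundary values
}

void solve_on_gpu(float *host_u, int n, int steps) {
    float *a, *b;
    const size_t bytes = n * sizeof(float);
    cudaMalloc(&a, bytes);
    cudaMalloc(&b, bytes);
    cudaMemcpy(a, host_u, bytes, cudaMemcpyHostToDevice);   // once, in
    for (int s = 0; s < steps; ++s) {
        smooth<<<(n + 255) / 256, 256>>>(a, b, n);
        float *t = a; a = b; b = t;           // ping-pong on the device:
    }                                         // no per-iteration copies
    cudaMemcpy(host_u, a, bytes, cudaMemcpyDeviceToHost);   // once, out
    cudaFree(a);
    cudaFree(b);
}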
All the talk of GPUs inevitably leads to FPGAs – field programmable gate arrays. Used in areas like financial services and biotechnology, they are useful for very fast processing of straightforward, unchanging code. In industry, however, they really haven’t had the impact that GPUs have had.
“You can take one code and really make it scream on an FPGA,” says Hutchings, “but most customers want a general purpose hardware solution that’s going to be supported across a range of software tools.”
John Barr agrees. “FPGAs are weird and wonderful things. They’re like a blank piece of paper – you design your own processor and run your program. And you can build a processor that’s spectacularly fast – the potential benefit is stunning. But it can be very difficult, and if you’re going to want to change the program often, then FPGAs aren’t the answer.”
Hutchings stresses that software manufacturers are not complaining about the changes to hardware: “The hardware revolution continues to fuel just enormous possibility in our space. What’s possible today compared to when I started out is just truly amazing. So yes, the evolution of hardware makes it harder for us, and we have to put a tonne of investment into this – but the return for our customers is just enormous, and it continues to revolutionise what’s possible.”
© Gillian Law