Sunday, May 28, 2006

Cell Processor for Scientific Computing

Slashdot has an article summarizing a Lawrence Berkeley National Lab paper on the usefulness of the STI Cell processor for scientific applications. The short version is that the Cell processor strikes a good compromise between programming complexity, performance, and power. Compared to a vector architecture (Cray), a superscalar architecture (AMD Opteron), and a VLIW architecture (Intel Itanium2), the Cell processor ran about 10x as fast on several common scientific operations. Similar improvements were found in power efficiency. The cost of this performance improvement is that the core computations require about 10x as many lines of code, at least when hand-coding for the Cell processor. This is not a big deal, as the core computations account for only a small fraction of the lines in a complete program. Presumably, automatic optimization by a compiler could eventually eliminate the need for hand-written code.
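To put rough numbers on that last point, here is a minimal sketch in Python. The kernel fractions (1%, 5%, and 20% of a program's lines) are my own hypothetical values, not figures from the paper; only the 10x expansion factor comes from the summary above. The function name is likewise just an illustration.

```python
def overall_code_growth(kernel_fraction: float, kernel_expansion: float) -> float:
    """Return the factor by which a whole program grows when only its
    core kernels are rewritten at `kernel_expansion` times their size.

    kernel_fraction is the share of the original lines that belong to
    the core computations (a hypothetical input, not measured data).
    """
    if not 0.0 <= kernel_fraction <= 1.0:
        raise ValueError("kernel_fraction must be between 0 and 1")
    untouched = 1.0 - kernel_fraction               # lines left as they are
    rewritten = kernel_fraction * kernel_expansion  # kernel lines after hand coding
    return untouched + rewritten


if __name__ == "__main__":
    # Hypothetical kernel shares; 10x is the expansion quoted above.
    for fraction in (0.01, 0.05, 0.20):
        growth = overall_code_growth(fraction, 10.0)
        print(f"kernels = {fraction:.0%} of code -> program grows {growth:.2f}x")
```

Under these assumed fractions, the whole program grows far less than 10x. The growth only approaches 10x if nearly all of the code is core computation.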

One of the deficiencies of the Cell is that it is not optimized for double precision (DP) floating-point operations, although its DP speed is still several times that of the other architectures. The authors include a few sections describing minor modifications that could be made to the Cell to bring its DP performance up to par with its single precision (SP) performance.

This paper should put to rest any doubts about the real-world benefits of the Cell architecture. Here, we have the nation's top technical researchers writing a paper describing the performance benefits of the Cell processor for scientific computing, and going on to recommend further improvements. Furthermore, the paper reveals that a team of several researchers at a government-sponsored national lab spent over a month on this project. This means that serious researchers are seriously interested in the Cell processor.

While Cell performance is the explicit subject of this article, there is another very important implication. The researchers used IBM's freely available Cell simulator software to test their analytic Cell performance model. I don't think their research would have been possible if these tools and other Cell-related documentation were not as openly available as they are. IBM has set up a Cell Broadband Engine Resource Center that makes it easy for even hobbyists like me to learn about the Cell processor and tinker with the Cell simulator. (I haven't actually done that yet; I still need to get my hands on a computer suitable for running the simulator.)

I foresee a Cell processor-based system making it onto the Top 500 Supercomputer Sites list within a year or two. I expect most of these systems will be made by IBM, the current leader in the Top 500 list. I would be amused if a cluster of Sony PlayStation 3s also made it onto the list.
