Level: Introductory Power Architecture editors, developerWorks, Federated Integration Test team, IBM China Development Lab
30 Mar 2004 Updated 15 Dec 2005
In the last decade alone, IBM scientists have announced one semiconductor breakthrough after another: copper technology, silicon-on-insulator, silicon germanium, strained silicon, and low-k dielectrics. All of these technologies came out of IBM's fertile in-house research community. This prowess in modern chipmaking know-how didn't come out of a vacuum -- rather, it came out of the hermetically-sealed clean rooms of the most advanced R&D department in the semiconductor industry.
In the beginning, each computer's central processing unit, or CPU,
was
unique. Each had its own instruction set, which was incompatible with any
other. All of that changed back in the thermionic valve (or "vacuum tube")
days with the introduction of the IBM S/360™ line of computers, in 1964.
Suddenly, code didn't have to be thrown away and reimplemented every time
you bought a new computer. Today's IBM mainframes still maintain
backwards-compatibility with that revolutionary 1964 instruction set. And
the same spirit of compatibility infuses IBM's other CPU lines.
At the user-mode level, the instruction set of the PowerPC®
family of processors provides full application compatibility, from the lowliest automated traffic light to the powerful BladeCenter JS20 or the Apple Xserve G5. In
addition, PowerPC microprocessors share a large common instruction set
with IBM's other RISC processor lines, POWER™ and Star, which leads to
"near" compatibility across all three families. In many cases, this equals
binary compatibility; in some cases, it means that a simple recompilation is
needed; in all cases, it means that porting is a breeze.
IBM's four families of processors -- the Power Architecture™, the
PowerPC
family of processors, the Star chips, and even the line of chips that
power IBM mainframes -- all have a common ancestor: the IBM 801.
The fifth and newest processor family to join this illustrious group is
the Cell Broadband Engine Architecture family. While its central
processor, or PPU, is Power-like -- for instance, compatible enough to run
PowerPC Linux --
it is surrounded by a number of SPUs (in
the first iteration, eight) which use a completely different ISA. For this
reason and because it is jointly designed and owned by IBM, Sony, and
Toshiba rather than IBM alone, it is really a separate product line, rather
than another PowerPC.
Family inheritance
The IBM 801 started out attempting to solve the same problem as a lot of
the
computers in the 1970s: switching telephone calls. The design
team's goal was to complete one instruction per clock cycle, and to
accommodate 300 calls per minute.
Most of the computers of the day, such as the IBM S/360 mainframe, had
complex and redundant instruction sets known today as
CISC (complex instruction set computer). The trend towards miniaturization
in computing, which began with the 1947 invention of the transfer resistor
(or "transistor") only exacerbated this. As integrated circuits grew
smaller, designers took advantage of the extra space to cram even more
instructions into the chip. All of this complexity meant that by the
1970s, computer chips could do really amazing things (like power
increasingly complex digital watches). But it also meant that the chips
needed more machine time to execute, making it impossible for the 801 team
to achieve their performance goals.
IBM's John Cocke was no stranger to the battle against
complexity. He had
already worked on the IBM Stretch computer, a rival to the IBM 704
mainframe, and on Stretch successor ACS (Advanced Computing Systems),
rival to the 704's successor, the S/360.
Compatible by design
The PowerPC architecture is organized into three instruction-set
levels called "books." Book I is the base set of user
instructions and registers that should be common to all PowerPC
implementations. Book II defines additional user-level functionality that
is outside the normal requirements for application software. Book III
defines privileged operations typically required by operating systems.
Each Power Architecture processor has its own edition of Book IV, with
implementation-specific details. Book IV is selectively available under
NDA only.
Book E describes the Enhanced PowerPC architecture, which has
enhancements of particular interest to embedded developers.
For links to each of the publicly available books, see the Power Architecture zone's standards page.
He sliced away at the redundancy in the instruction set and designed a
machine with half the circuits of its contemporaries -- that ran
twice as fast as they did. The fast core and fewer circuits led not only
to greater performance, but also to lower power consumption and -- perhaps
most important for many consumers today -- much lower costs. This
architecture became known as RISC (reduced instruction set computer).
Some prefer to call it "load-store," pointing out that the instruction set
of a RISC computer can number as many as 100 instructions or more (as the
Power Architecture does). Others counter that the RISC is not a reduced
set of instructions, but rather a set of reduced instructions -- each of
the complex instructions of the CISC is broken down into shorter basic
building blocks that can then be combined.
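To make the "set of reduced instructions" view concrete, here is a minimal Python sketch of a toy load-store machine. The opcode names (LOAD, ADD, STORE), the register names, and the memory layout are all invented for illustration; they are not actual 801 or POWER instructions. The point is only that arithmetic works on registers, memory is reached solely through loads and stores, and a single CISC-style memory-to-memory add turns into four simple steps.

```python
# Toy load-store machine: arithmetic touches registers only;
# memory is reached solely through LOAD and STORE.
# All opcode and register names here are invented for illustration.

def run(program, memory):
    """Execute a list of (opcode, *operands) tuples and return the memory."""
    regs = {}
    for op, *args in program:
        if op == "LOAD":        # LOAD reg, addr  -> reg = memory[addr]
            reg, addr = args
            regs[reg] = memory[addr]
        elif op == "ADD":       # ADD dst, a, b   -> dst = a + b (registers only)
            dst, a, b = args
            regs[dst] = regs[a] + regs[b]
        elif op == "STORE":     # STORE reg, addr -> memory[addr] = reg
            reg, addr = args
            memory[addr] = regs[reg]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


# A hypothetical CISC-style "add memory[0] and memory[1] into memory[2]"
# becomes four reduced instructions.
risc_program = [
    ("LOAD", "r1", 0),
    ("LOAD", "r2", 1),
    ("ADD", "r3", "r1", "r2"),
    ("STORE", "r3", 2),
]

print(run(risc_program, [5, 7, 0]))  # [5, 7, 12]
```

Each step is trivial on its own, which is what lets the hardware aim for one instruction per clock cycle and leaves the job of combining the steps to the compiler.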
In any case, the complexity that was removed from the CPU didn't
just
disappear; it was pawned off on the compiler. In order to do that
gracefully, John Cocke became not only an expert in compilers, but
especially in optimizing them. His work on RISC and optimizing compilers
won him many awards, not the least of which was the 1987 Turing Award.
As for the IBM 801, it never did become a telephone switcher. Instead, it
became the first RISC chip and powered many IBM hardware products -- for
a time, it even did a stint as a minicontroller and processor in its
rival, the IBM mainframe series.
The RISC architecture soon came to dominate the workstation and embedded
markets, and John Cocke moved on to other projects. In the 1980s, he had a
chance to refine his 801 design in a project that was code-named
"America" and that would become the POWER series of chips. He even had a
hand in the development of the PowerPC architecture a few years later.
Like the 801, the PowerPC was designed to be a universal microprocessor
that could run on any machine, from the smallest to the tallest.
Today, the RISC architecture is the single most common CPU type in use
and is
the basis for everything from workstations to cell phones, video game
consoles to supercomputers, traffic lights to desktops, and broadband
modems to automobile fuel-injection and collision avoidance systems.
Even x86 chip manufacturers, which continued for quite a time to produce
CISC chips, have based their 5th- and 6th-generation chips on RISC
architectures (and translate x86 opcodes into RISC operations to make them
backwards-compatible).
POWER
POWER stands for Performance Optimization With Enhanced RISC, and it is the main
processor in many IBM servers, workstations, and supercomputers. Descended
directly from the 801 CPU, it is a 2nd-generation RISC processor. Introduced in 1990 to power the RS/6000, or RISC System/6000, UNIX® workstations (now called
the eServer™ pSeries®), POWER exists in iterations from POWER1 and
POWER2™ through POWER5™ and beyond.
The 801 was a very simple design. But because all instructions completed
in one clock cycle, it lacked floating-point and superscalar (parallel
processing) ability. The POWER architecture set out to correct this --
some would say, perhaps, to overcorrect it. With more than 100 instructions, the
POWER is a pretty complex RISC.
Highlights for each iteration follow; for comprehensive details, please
see the links listed in Resources.
-
POWER1
Released in 1990: 800,000 transistors per chip
Unlike other RISC processors of the day, POWER1 was functionally partitioned.
This gave it superscalar abilities beyond those of mortal chips. It
also had separate floating-point
registers and could scale from the low to the high end of the UNIX
workstations it was built for. The very first POWER1 was actually several
chips on a single motherboard; this was soon refined down to one RSC (RISC
Single Chip) with more than a million transistors. The RSC implementation
of the POWER1 microprocessor was used as the central processor for the
Mars Pathfinder mission and is the chip from which the PowerPC line is
descended.
-
POWER2
Released in 1993 and in use until 1998: 15 million transistors per chip
The POWER2 added a second floating-point unit (FPU) and more cache. The
P2SC superchip, a single-chip implementation of the POWER2's eight-chip
architecture, powered the 32-node IBM DEEP BLUE® supercomputer that beat
world champion Garry Kasparov at chess in 1997.
-
POWER3
Released in 1998: 15 million transistors per chip
The first 64-bit symmetric multiprocessor (SMP), POWER3 is completely
compatible with the original POWER instruction set -- and compatible with
the PowerPC instruction set as well. The POWER3 was designed for work on
scientific and technical computing applications from aerospace and pharma
to weather prediction. It features a data prefetch engine, non-blocking
interleaved data cache, dual floating point execution units, and many
other goodies. The POWER3-II reimplemented POWER3 using copper
interconnects, delivering double the performance at about the same price.
-
POWER4
Released in 2001: 174 million transistors per processor
A gigaprocessor incorporating 0.18-micron copper and SOI
(Silicon-on-Insulator) technology, the POWER4 was the single most powerful
chip on the market when it was introduced.
It inherited all of the characteristics of the
POWER3 -- including compatibility with the PowerPC instruction set -- but
reinvented itself with a completely new design. Each processor has two
64-bit 1GHz+ PowerPC cores, making it the first server processor with a
multicore design on a single die (also known as "SMP on a chip," or
"system on a chip"). Each processor can execute as many as 200
instructions simultaneously. The POWER4 supersedes the Star family of
processors and is the power behind the IBM Regatta servers as well as
being the father of the PowerPC 970 processor (also known as the Apple
G5). The POWER4+™ (also known as POWER4-II) does the same,
but at higher frequencies and with less power consumption. It was the
first to use
the 130-nanometer copper/SOI process.
-
POWER5
™
Released in 2003: 276 million transistors per processor
Like the POWER3 and POWER4, the POWER5 unifies the POWER and PowerPC
architectures. The POWER5 is also based on the 130-nanometer copper/SOI
process, and features communications acceleration, chip multiprocessing,
a larger L2 cache, a memory controller on the chip, simultaneous
multithreading, advanced power management, eFuse (morphing)
and hypervisor technology. IBM servers built with the POWER5
feature up to ten LPARs capable of running up to 256
independent operating systems on the highest end. POWER5 processors can be
found hanging about in iSeries and pSeries servers, as well as in the first IBM
entry-level UNIX/Linux box, the OpenPower™ line. IBM introduced the
POWER5+™ processors,
which are built with a 90-nanometer process similar to that used with the
Cell Broadband Engine, in 2005. POWER5+ ups the clockspeed significantly
-- on a smaller die.
-
POWER6
™
Under wraps for the most part, but IBM is taking the next step
in the POWER line's evolution by involving customers
more in setting the requirements for the design of the POWER5+ and higher
iterations. The POWER6 is
said to be code-named Eclipz, is
expected to be based on
the 65-nanometer process, and is expected by many to debut in the
2006-2007 timeframe.
STAR power: PowerAS and RS64
The PowerAS family first appeared in 1995, and the first chips using the
RS64 name appeared in 1997. These chips are known within IBM as the Star
family, because most of the code names for the various iterations
contained the word "star" or something like it (the notable
exceptions are
the original RS64, code-named "Apache", and the PowerAS chips).
Descended from a modified PowerPC architecture, they also inherited a
number of traits from
the POWER line. From inception they were
optimized for one thing only: commercial workloads. This degree of
specialization put them at the top of the UNIX server game for
roughly six years.
The RS64 family left things like branch prediction, exceptional
floating-point powers, and hardware prefetch to its POWER3 cousin and
focused instead on exceptional integer performance and large,
sophisticated on- and off-chip caches. The RS64 family was 64-bit for
its entire span, and introduced multithreading in the RS64 II
iteration in 1998. The RS64 could scale to 24-way SMP in a
single machine, and -- unlike its POWERful cousins -- consumed as few as 15
watts per processor.
These qualities made it ideal for things like on-line transaction
processing (OLTP), business intelligence, enterprise resource planning
(ERP), and other large and hyphenated, function-rich, database-enabled,
multi-user, multi-tasking jobs with high cache miss rates -- including Web
serving. RS64 chips shipped in the IBM eServer pSeries™ (RS series) and
iSeries (AS series) only. The contrast with the (also highly specialized) Cell
Broadband Engine Architecture design is interesting, and shows a very different
set of design priorities.
-
PowerAS
Released in 1995, model numbers A10, code name: Cobra; and A30, code name: Muskie
The A10 was a uniprocessor, and the A30 was 4-way SMP.
-
RS64
Released in 1997, model number A35, code name: Apache
The first RS64 and the world's first 64-bit PowerPC RISC. Both
superscalar and scalable, it was more compatible with POWER1 than later
RS64 chips would be. By focusing on commercial workloads, it was able to
implement functions on one chip that had previously required seven. It
was used in the AS/400® (then called A35) and RS/6000®.
-
RS64 II
Released in 1998, model number A50, code name: Northstar
The second iteration featured four processors per card and up to three cards
per RS/6000 to create a 4-way, 8-way, or 12-way SMP system.
-
RS64 III
Released in 1999, code name: Pulsar
The first RS64 to use IBM copper and SOI (Silicon on Insulator), now with
six processor cards scaling up to 24-way SMP.
-
RS64 IV
Released in 2001, code names: IStar, SStar
The first mass-market processor to implement multithreading, RS64 IV was
faster and smaller than its predecessors.
Today, the convergence of commercial and scientific computing has created
a need for a single processor to address both markets, and the Star family has
merged into the POWER family, starting with the POWER4.
PowerPC
The PC in PowerPC stands for performance computing. Descended from the
POWER architecture, it was
introduced in 1993. Like the IBM 801, it was designed from the beginning
to run on a broad range of machines, from battery-operated handhelds to
supercomputers and mainframes. But it saw its first commercial use on the
desktop, in the Power Macintosh 6100.
Born of an alliance between Apple, IBM, and Motorola (also known as the
AIM
alliance), the PowerPC was based on POWER, but with a number of
differences. For instance, PowerPC is open-endian, supporting both
big-endian and little-endian memory models, where POWER had been
big-endian. The original PowerPC design also focused on floating-point
performance and multiprocessing capabilities. Still, it did and
still does include most of the POWER instructions. Many applications work
on both, perhaps with a recompile to make the transition.
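For readers who have not run into the endian question before, the following short Python sketch shows what the two memory models mean for a single 32-bit value. It uses only the standard `struct` module; the value 0x12345678 is an arbitrary example, and nothing here is specific to PowerPC hardware.

```python
import struct

value = 0x12345678  # arbitrary 32-bit example value

big = struct.pack(">I", value)     # big-endian: most significant byte first
little = struct.pack("<I", value)  # little-endian: least significant byte first

print(big.hex())     # 12345678
print(little.hex())  # 78563412

# Reading bytes with the wrong byte order yields a different number,
# which is why software has to agree with the memory model in use.
misread = struct.unpack("<I", big)[0]
print(hex(misread))  # 0x78563412
```

The same four bytes are stored in both cases; only their order in memory differs.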
Since 1993, the PowerPC ecosystem has of course evolved; Apple is no
longer actively involved in PowerPC, and
Motorola's PowerPC (and other microprocessor) development work has been
spun off into the independent semiconductor company known as Freescale.
In 2004, AMCC acquired the 4xx line of customizable embedded PowerPC cores, and
in 2005, HCL Technologies of India announced it would open the first
non-IBM Power Architecture Design Center: the landmark agreement enables
HCL to sublicense end-to-end Open SystemC models, core hardening and
integration services, SoC prototyping, and other services around the IBM
PowerPC 405 and PowerPC 440 embedded microprocessor cores without any
involvement from IBM.
The Power.org consortium launched at the end of 2004 to foster
innovation in and growth around
the Power Architecture ecosystem (including PowerPC).
Top-of-the-line custom processors, such as the next-gen Microsoft® Xbox 360®
processor, are based on the PowerPC
architecture, and IBM remains firmly committed to the PowerPC family as an
important component of its microprocessor lineup (note that Freescale
hints that
the PowerPC might undergo a name change in 2006!).
While IBM, Freescale, and AMCC develop their chips separately, at the
user level, all PowerPC processors run the same core PowerPC instruction set,
ensuring full ABI compatibility for the software products that run on them.
Since 2000, Freescale (then still Motorola) and IBM PowerPC chips have followed
the Book E spec, which
provides additional enhancements to make PowerPC more attractive for embedded
processor applications such as networking and storage equipment as well as
for consumer devices.
Aside from compatibility, one of the best things about the PowerPC
architecture is that it is open: it specifies an instruction
set architecture (ISA) that allows anyone to design and fabricate
PowerPC-compatible processors; and source code for software modules
developed in support of PowerPC is freely available. Finally, the small
size of the PowerPC core leaves a great deal of room on each die for
additional components, from added cache to coprocessors, allowing for an
amazing amount of design flexibility. The Xbox 360 processor and the Cell Broadband Engine processor
both give excellent examples of this, as do the various System-on-Chip
processors, such as the AMCC 405GPr.
Two of IBM's five server lines are based on the PowerPC
architecture, as are the current generation of
Apple Computer desktop and server lines, the Nintendo GameCube, many
IBM BladeServers, and
the IBM Blue Gene® supercomputer.
Today, the three main PowerPC families are the embedded PowerPC 400
series
and the stand-alone PowerPC 700 and PowerPC 900 families. For historical
perspective, we will also give highlights for the stand-alone PowerPC 600,
because it was the first.
-
PowerPC 600 family
The PowerPC 601 was the first chip in the first PowerPC family. A sort
of bridge between the POWER and PowerPC architectures, it maintained more
compatibility with POWER1 than later PowerPCs (even those from the same
family), as well as compatibility with the Motorola 88110 bus. The
PowerPC 601 made its debut in the very first PowerMac 6100 in 1994,
running at blazing speeds of up to 66 MHz. The next chip in the line was
the PowerPC 603™, a low-end, low-power core that is the chip most often found in
cars. Released at the same time as the 603, in its day the PowerPC 604™ was the
most powerful high-volume chip in the industry. Both the 603 and 604 were
rereleased in tweaked "e" versions (the 603e and 604e) with improved
performance. Finally, the first 64-bit PowerPC, the very high-end PowerPC 620®, was
released in 1995.
-
PowerPC 700 family
With a debut in 1998, the PowerPC 740 and PowerPC 750 were very similar
to the 604e -- some people would say they are all members of the same
600/700 family. The PowerPC 750 was the world's first copper-based
microprocessor, and when used in Apple computers is usually known as the
G3. It was rather quickly eclipsed by the G4, or Motorola 7400. The 32-bit
PowerPC 750FX wowed the industry with speeds of up to 1 GHz when it was
released in 2002. IBM followed this in 2003 with the 750GX, which incorporates
1MB of L2 cache at speeds of 1GHz at around seven watts of power consumption.
The 6xx processors have been retired, but the 7xx line is very much alive
and kicking at the high end of the embedded space: IBM
introduced the new RoHS-compliant, low-power
PowerPC 750GL in 800MHz and 933MHz flavors
in 2005.
-
PowerPC 900 family
The 64-bit PowerPC 970, a single-core version of the POWER4, can process
200 instructions at once at speeds of up to 2 GHz and beyond -- all while consuming
just tens of watts of power. Its low power consumption makes it a favorite
with notebooks and other portable applications on the one hand, and with
large server and storage farms on the other. Its 64-bit capability and
single instruction multiple data (SIMD) unit accelerate
computationally intensive workloads such as multimedia and graphics. It is
used in Apple desktops, Apple Xserve servers, imaging applications, and --
increasingly -- in networking applications.
Apple's Xserve G5, launched in March of 2004, represented the first use of
the new PowerPC 970FX -- the first chip made using both
strained silicon and SOI technologies together, enabling the chip to run
at even greater speeds with even less power consumption --
in an off-the-shelf system.
IBM introduced the PowerPC 970MP, with a slew
of intriguing power-saving features and capabilities, in 2005.
Today the PowerPC 970 is still found in
some Apple products as well as in IBM BladeServers and in Terra Soft Solutions
and Genesi systems, and it has
even been known to make the occasional appearance in the embedded space.
-
PowerPC 400
This is the embedded family of PowerPC processors. The PowerPC's flexible
architecture allows for a great deal of specialization, and that is
nowhere so apparent as in the 4xx family, which is equally at home in
applications
ranging from set-top boxes to the IBM Blue
Gene supercomputer. On one end of the spectrum, the PowerPC 405EP
consumes just one watt of power to achieve speeds of up to 200 MHz, while
the copper-based 800 MHz PowerPC 440 series offers the 4xx line's highest
performance for an embedded processor. Each 4xx subfamily can be specialized
as well; for instance, the PowerPC 440GX's dual Gigabit ethernet and
TCP/IP off-load acceleration can decrease utilization for packet-intensive
applications by more than 50%. A large array of products
are built around
highly modified PowerPC 400 family cores, not the least of which is the
Blue Gene supercomputer with two PowerPC 440 processors and two FP (floating point) cores per chip.
AMCC acquired the 4xx line in 2004, although IBM is still handling
manufacturing for these processors. AMCC has introduced new designs,
including the security-conscious 440GRx and 440EPx, the next-gen PowerNP™
family, and the RAID-enabled 440SP and 440SPe, in 2005.
Originally thought of as a desktop chip, the PowerPC's low power
needs
make it an excellent candidate for the embedded space, and its high
performance makes it attractive for advanced applications. It is well
suited for
everything from video game consoles and multimedia
entertainment systems, to personal digital assistants and cell phones, to
base stations and PBX switches. It is at home in broadband modems, hubs
and routers, automotive subsystems, printers, copiers, and faxes. And of
course, server systems and workstations, too.
Welcoming the newest member of the family: It's a bouncing baby BE!
The Cell Broadband Engine™ (Cell BE) architecture is a new
architecture which extends the 64-bit Power Architecture technology. Capable of massive floating point processing
and ideal for compute-intensive
tasks, the Cell BE processor is a
single-chip multiprocessor no bigger than a fingernail, with nine
processors operating on a shared, coherent memory. The Cell BE processor
contains a Power Architecture-based control processor (PPU) augmented with
eight (or more) SIMD Synergistic Processor Units (SPUs) and a rich set of
DMA commands for efficient communications between them all.
While it was originally designed for use in the Sony® PlayStation® 3, Sony,
Toshiba, and IBM (known collectively as STI) have wider-ranging plans for
the new family than that. While some in the developer community fear the
complexity of its dual-ISA, others rightly point out that the chip world
is going multiprocessor across the board, and you've got to start
programming for them sometime. You can do that right now with the IBM Cell
BE SDK (see Resources); and it is
expected that you will be able to do it on real hardware -- including
multiprocessor systems, blade systems, some
special-purpose hardware, and at least one evaluation platform --
beginning in 2006.
So far, the new Cell Broadband Engine Architecture family has only one
member:
-
Cell Broadband Engine processor
Expected release date: 2006; 234 million transistors
The Cell BE processor is manufactured with a 90nm process and crams 234
million transistors onto a 221-square-millimeter die. This vast amount of real
estate is inhabited by the 64-bit Power-based VMX-enabled central core
(the PPE) and eight synergistic cores (the SPEs), each of which has its own
dedicated DMA engine, over one hundred 128-bit registers, and 256KB
of Local Store. As well, the tiny die houses 512KB of L2 cache, the
chip's specially designed Element Interconnect Bus (EIB), the shared
Memory Interface Controller (MIC), and Rambus XDR and FlexIO interfaces.
The Cell Broadband Engine supports hyper-pipelining, resource allocation,
locking caches, virtualization, power management, and just enough
redundancy to make it really reliable!
CMOS
Moore's Law on the fast track
Everyone knows that IBM originated FORTRAN and disk drives; but did you
know IBM is also one of the leading innovators in chip technology? Here is
a quick rundown:
-
Chemically amplified photoresists
In which the application of chemicals, followed by deep UV light followed
by heating blazes a trail to the goal of sub-100nm optical lithography,
allowing ever-smaller circuit features to be reliably transferred to the
silicon wafer for manufacture. The goal of the IBM sub-100nm project is to
reach the fabled 1-10nm scale.
-
Copper conductors
The semiconductor industry has dreamed of using copper, which conducts electricity
40 percent more efficiently than aluminum, for ages. But it was only
recently that a manufacturing process was discovered to accomplish this
goal. Taking a page from Edison's notebook, IBM researchers use tungsten
to create copper-based chips capable of running significantly faster than
their aluminum-based brethren.
-
Silicon-germanium (SiGe)
Used in bipolar chipmaking in place of the more expensive gallium
arsenide process, SiGe allows for significant improvements in operating
frequency, current, noise, and power capabilities.
-
Silicon on insulator (SOI)
Placing a thin layer of insulation between the silicon surface and the
transistors protects the transistors from "electrical effects," leading
to higher performance and lower power consumption.
-
Strained silicon
This technique strains (or stretches) silicon, thus speeding the flow of
electrons through a chip, increasing performance and lowering power
consumption without any miniaturization. When coupled with SOI, the use
of strained silicon speeds performance and decreases power consumption
even more.
You will remember that the 801 project was in great part a reaction to
the
complexity of CISC systems and specifically, the extreme CISC of the IBM
mainframe. Nevertheless, IBM mainframes were also beneficiaries of the 801
project, and so are distantly related to IBM's three lines of RISC
processors. IBM's "fourth family" of processors, the
mainframe chips, have
a very complicated family history of their own.
One of the reasons for this is that mainframes rely much less on the CPU
and more on system architecture and I/O channels than other types of
computers. The revolutionary S/360 family of mainframes that introduced
compatibility to the industry were still powered by magnetic cores. With a
name change to S/370™ in 1971, they became the first mainframes in the
industry to switch to chips. Of course they used CISC chips; specifically,
bipolar junction transistors with a CISC architecture. Some of the S/360 and
S/370 systems adopted some RISC design techniques, implementing part of
the instruction set in hardware, which actually improved performance!
An even more significant change came when they began to use CMOS
instead of bipolar transistors; the first generation (or G1) CMOS
mainframe chips came out around 1994, and by 1997 IBM announced that
henceforth all mainframes would ship only with CMOS and never again with
bipolar transistors. And it isn't only mainframes that have made the
switch to CMOS: while bipolar transistors ruled the early chipmaking
world, most of the processors made today are CMOS.
So what are these CMOS chips, exactly? Well, CMOS (complementary
metal-oxide semiconductor) chips use metal-oxide semiconductor field
effect transistors (MOSFETs). These are fundamentally different from bipolar
transistors. A few of the effects of those differences are highlighted here;
see Resources for details.
Bipolar transistors are blazingly fast, but they consume a great deal of
power, even in a standby or steady state. Meanwhile, a FET is
achingly slow, but consumes no power at all in a steady state. Thus, for
applications where long battery life is crucial -- and performance isn't
-- FETs are the way to go. So, in the days when computing was
still so primitive that people thought that digital watches were
a really neat idea, it was CMOS chips that powered them. They also powered other
applications requiring little power and none-too-fast performance -- like
housing a personal computer's BIOS.
Now, another big difference between bipolar and FET transistors is
topology: bipolar transistors have a vertical layout, while FET-based chips are built
on the horizontal. Thus, there is more room on a FET-based chip.
Eventually, around the cusp of the 1980s and 90s, the relentless march of
miniaturization approached sizes so small that the larger area of the
slower FET-based chips could be filled with enough transistors to whomp the
performance superiority of the bipolar model. FET-based chips have one last thing
going for them, which is that they interfere electronically with their
neighbors much less than bipolar transistors do. So, while bipolar
transistors run up against a wall where making them any smaller leads to
unacceptable levels of electrical interference, FET-based chips can be
made even
smaller than that, and so packed even more densely in their larger surface
area. Thus, most of the latest advances in nano-scale chip processing have
been on CMOS chips.
The other really interesting thing about mainframe chips is their level of
redundancy. They are usually packaged together in Multi-Chip Modules (MCM)
of 20 or 30 chips or more: fully one half of them are there as backups,
ready to take over if an active chip fails. Further, mainframes process
each instruction they receive twice, on separate chips, and check their
answer before returning it. As we reach the milestone of one billion
transistors on a single chip, we may find that kind of stability applied
to consumer processors as well.
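The "do it twice and compare" idea can be sketched in a few lines of Python. This is purely illustrative: mainframes perform the comparison in hardware, and the function names below (`checked_execute`, `healthy_unit`, `faulty_unit`) are hypothetical stand-ins rather than anything from an IBM design.

```python
def healthy_unit(op, a, b):
    """A hypothetical execution unit that computes correctly."""
    if op == "add":
        return a + b
    if op == "mul":
        return a * b
    raise ValueError(f"unknown operation: {op}")


def faulty_unit(op, a, b):
    """A hypothetical failing unit: flips the lowest bit of every result."""
    return healthy_unit(op, a, b) ^ 1


def checked_execute(op, a, b, unit_a, unit_b):
    """Run one operation on two units; return the result only if they agree."""
    result_a = unit_a(op, a, b)
    result_b = unit_b(op, a, b)
    if result_a != result_b:
        # This sketch only reports the mismatch; it does not model
        # what happens next (for example, a backup taking over).
        raise RuntimeError(f"mismatch on {op}: {result_a} != {result_b}")
    return result_a


print(checked_execute("add", 2, 3, healthy_unit, healthy_unit))  # 5

try:
    checked_execute("add", 2, 3, healthy_unit, faulty_unit)
except RuntimeError as err:
    print("fault detected:", err)
```

The cost is obvious (every operation is paid for twice), which is why this kind of checking has so far been reserved for machines where a wrong answer is worse than a slow one.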
Custom chips
What do the Nintendo GameCube's Gekko, Cray's X1 supercomputer
chips, NVIDIA's latest GeForce processor, and the next-generation
Microsoft Xbox
and Sony PlayStation all have in common? All of them use chip technology
licensed from or manufactured by IBM.
In the last few years, IBM has begun to open its foundries -- and its
research -- to outside business like never before. The E&TS division has
well over a thousand engineers for hire, available to work on software,
technology, and chip engineering for their clients.
E&TS did much of the work on the Xbox 360 processor. Between the Xbox 360
processor (a three-core PowerPC), the Cell Broadband Engine (a dual-threaded
PowerPC with eight specialized mathematical processors), and the processor in
the Nintendo Revolution (an IBM chip, though it is not yet confirmed
whether it uses Power Architecture technology), IBM Semiconductor
solutions has managed a sweep of next-generation gaming console hardware.
Various System-on-Chip designs are also based on Power Architecture
technology, and HCL Technologies in India is developing designs of its own
built around it.
Figure 1. It's wafer-thin: 300mm wafers yield more chips
And of course one of the many reasons that Power Architecture technology is so
appealing is the new top-of-the-line IBM fab in Fishkill,
New York. The Fishkill fab is so up-to-date that it is capable of
producing chips with all of the latest acronyms, from copper CMOS technology to
Silicon-on-Insulator (SOI) and low-k
dielectrics -- all on 300mm wafers. The Fishkill fab is so with-it that
the server room runs exclusively on Linux. And the Fishkill fab is
so amazingly, mind-bogglingly hip that
it won Semiconductor International's 2005 Top Fab award.
In addition, IBM foundries are the world's leading supplier of ASICs
(application-specific integrated circuits). Their offerings range from
Customizable Control Processor (CCP) options -- where a large portion of the
design is fixed, but there is plenty of room left for customization --
through IBM design expertise in tailoring an existing product to a new
application, to support for other suppliers' processors and coprocessors.
In short, they're ready for anything.
Fab future
Just twenty years ago, chip components were measured in microns, or
thousands of nanometers. Today, chips produced on 300mm wafers contain
components with an average size measured in the tens of nanometers. You
will of course recall that one nanometer is one millionth of a millimeter,
and that a human hair has a thickness of about 100,000 nanometers. At this
rate, we will soon be measuring components in angstroms (tenths of a
nanometer).
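For readers who like to check the arithmetic, the unit relationships in this
paragraph can be worked through in a short Python sketch. The hair thickness
is the figure quoted above; the 90-nanometer feature size is only an
illustrative assumption standing in for "tens of nanometers", and the
function names are made up for this example.

```python
# Unit relationships (exact by definition of the metric prefixes)
NM_PER_MICRON = 1_000        # one micron is a thousand nanometers
NM_PER_MM = 1_000_000        # one nanometer is a millionth of a millimeter
ANGSTROMS_PER_NM = 10        # one angstrom is a tenth of a nanometer


def microns_to_nm(microns: int) -> int:
    """Convert microns to nanometers."""
    return microns * NM_PER_MICRON


def nm_to_mm(nanometers: int) -> float:
    """Convert nanometers to millimeters."""
    return nanometers / NM_PER_MM


def nm_to_angstroms(nanometers: int) -> int:
    """Convert nanometers to angstroms."""
    return nanometers * ANGSTROMS_PER_NM


HAIR_NM = 100_000   # human hair thickness quoted in the article
FEATURE_NM = 90     # assumption: an illustrative "tens of nanometers" size

hair_mm = nm_to_mm(HAIR_NM)                      # hair thickness in mm
features_per_hair = HAIR_NM // FEATURE_NM        # whole features per hair width
feature_angstroms = nm_to_angstroms(FEATURE_NM)  # the same feature in angstroms
```

Under these assumptions, a hair is a tenth of a millimeter thick, and more
than a thousand such components would fit side by side across its width.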
Inexpensive processors with a billion transistors per chip are just
around the
corner, and industry watchers suggest we will reach clock speeds of 100GHz by
2010. The Cell Broadband Engine (a cooperative effort of Sony, Toshiba, and
IBM) is widely considered an
effective exploratory leap in one of the directions that might get us
there.
In the nearer term, we can look forward to the release of
the Sony PlayStation 3 and the Toshiba Cell BE development board -- and
maybe even the POWER6 -- in 2006, as well as to another fabulous year at the
IBM developerWorks Power Architecture technology zone.
Attributions
Cell Broadband Engine is a trademark of Sony Computer Entertainment Inc.
Resources
Learn
- Learn more about John Cocke
-
John Cocke's first assignment upon joining IBM in the late 1950s was
to work on the Stretch computer. While it never delivered on its promise
to outperform the IBM 704 mainframe by a factor of 100, it did outpace the
704 by a factor of 30. It also pioneered lookahead, pipelining, branch
prediction, multiprogramming, memory protection, generalized interrupts,
and the 8-bit byte -- and more! All of these were later used in the IBM
System/360 line, and have since trickled out to most chips on the market
today.
-
The successor to the 704 was known as Project X, and it competed
internally with the successor to Stretch, Project Y. While Project X was
to become the IBM S/360 family of mainframe computers, Project Y would
become ACS (Advanced
Computing Systems), IBM's first attempt at a supercomputer. ACS
was the project John worked on after Stretch, and was the forefather of
John's next assignment, the 801.
-
A hacker in the true sense of the word, John Cocke changed chip design --
and the computing world -- forever. For this he received a number of
industry and national awards, not the least of which are a commendation
from The Franklin Institute and the 1987 Turing Award.
- More mainframes
-
The S/360 mainframe was released almost exactly 40 years ago, priced to
move at a mere US$133,000 for a basic configuration. Read a copy of the press
release dated April 7, 1964.
-
For more information on how mainframe architecture and issues have
influenced Power Architecture technology, take a look at the Big Iron
series in the Power Architecture technology zone.
-
The newest IBM mainframe was introduced in 2005. Representing a major systems strategy shift, the new z9 can
eat its predecessor, the T-Rex, for lunch.
- More background
-
As Wikipedia, the free encyclopedia, will explain, bipolar junction
transistors (BJTs) are not only doped sandwiches, but also something
essentially contrary to CMOS. If only we could successfully apply the same
advanced processes used in CMOS manufacture today to bipolar chips, we
would make a quantum leap forward to chips with absolutely undreamed-of
levels of performance. This CMOS gates demonstration will take your
understanding of CMOS to the next level.
-
"Maintaining the benefits of CMOS scaling when scaling bogs down" by E. J.
Nowak (IBM Journal of Research and Development, 2002) attempts,
among other things, to answer the question "What happens when we get to 5
nanometers?", while "The future of CMOS technology" by R.D. Isaac (IBM
Journal of Research and Development, 2000) offers great background on the
challenges facing chip designers, and on how and why CMOS has displaced
bipolar transistor designs (note especially Table 1!).
- IBM Journal of Research and Development
-
Most of IBM Journal of Research and Development Volume 34, Issue 1 (1990)
is devoted to the original POWER architecture, also often referred to in
those days as "the RS/6000 processor" because it powered RS/6000 machines.
This issue includes an article on "The evolution of RISC technology at
IBM" by John Cocke himself (with Victoria Markstein).
-
All of the 2005 issues of IBM JoRD have been devoted to chippy topics as
well. See: Electrochemical Technology in Microelectronics (Volume 49,
Number 1), Blue Gene (Volume 49, Number 2/3), POWER5 and Packaging
(Volume 49, Number 4/5), IBM BladeCenter Systems (Volume 49, Number 6),
and (coming soon) Spintronics (Volume 50, Number 1).
- IBM Semiconductor solutions now
-
IBM Semiconductor
solutions has a great Photo
Catalog, and a nice group of resources on its technology and
innovation page, nifty new video presentations,
and oodles
of documentation.
-
In addition, IBM Semiconductor solutions offers Custom chip solutions with
the broadest architecture support in the business, covering Power
Architecture cores as well as other cores, including other suppliers'
cores.
-
IBM Semiconductor solutions also offers Evaluation
kits for PowerPC cores, which come with schematics, source code,
design details, and a comprehensive selection of tools to enable
development of PowerPC-based applications. You can download the IBM PowerPC
970FX Evaluation Kit and the IBM PowerPC
750GX-750FX Evaluation Kit from the developerWorks
Power Architecture downloads page.
-
AMCC announced a number of 4xx PowerPC cores in 2005,
including the PowerPC 440GR
and the low-cost, low-power, security-enabled PowerPC
440GRx and PowerPC 440EPx.
-
And in India, HCL Enterprise can also help you design
your own PowerPC core.
- Absolutely fabulous
-
Keep abreast of the next new Power Architecture breakthrough as it
happens: subscribe to the Power
Architecture Community Newsletter.
-
The Power.org organization is an
excellent starting point for exploring the Power Architecture community.
Become a developer-level member
and join the conversation.
-
Find more resources for Power developers in the developerWorks Power
Architecture zone.
About the author
The developerWorks Power Architecture editors welcome your comments on this article. E-mail them at dwpower@us.ibm.com.