Electronic Connections: Economics slowly shift toward the multichip option

In the world of mainstream chipmaking, the idea of putting multiple chips in each package has made theoretical sense but rarely worked in practice. Often, the multichip module was put out of its misery by the economic reality that you generally wind up with cheaper silicon if you pour as much as possible into a single chip. The multichip module has mostly been a stopgap: a way to get more performance out of a system while Moore’s Law catches up with demand.

On each iteration, however, the economics have swung slightly against the monolithic, single-chip option. The technological and design burden of putting these enormously dense chips together gradually causes companies to get out of the business, leaving just the largest incumbents in play.

At the same time, memory has proved a major stumbling block to integration pretty much since the invention of the dynamic random access memory (DRAM). The processes needed for the DRAM do not play well with logic and analogue circuitry. In order to get the size of phone circuitry down and make more space for the battery, companies like Apple have made multichip modules viable in high volume and, in doing so, reduced the cost differential between those modules and printed circuit boards (PCBs) designed around single-chip packages.

With those lower differentials, it now makes sense for some sectors to look more closely at going the multichip route. Whereas it used to be a no-brainer to scale everything down onto one device because of the smaller cost-per-transistor, there are markets where the production costs are outweighed by the amortised cost of design. This is where the cloud-computing companies are now.
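
To make that trade-off concrete, the back-of-the-envelope sketch below compares the amortised cost per unit of a big monolithic design against a cheaper-to-design multichip alternative. Every figure in it – design (NRE) cost, per-unit production cost, shipment volumes – is an invented placeholder for illustration, not a number from the article.

```python
# Back-of-the-envelope comparison: amortised design cost versus production cost.
# All figures below are hypothetical placeholders.

def cost_per_unit(design_cost, unit_production_cost, volume):
    """Total cost per shipped unit once the one-off design cost is spread over volume."""
    return unit_production_cost + design_cost / volume

# Two hypothetical ways to fill a cloud operator's accelerator socket.
monolithic = dict(design_cost=150e6, unit_production_cost=80.0)   # big custom single chip
multichip  = dict(design_cost=60e6,  unit_production_cost=110.0)  # reused chiplets, pricier assembly

for volume in (100_000, 500_000, 2_000_000):
    mono = cost_per_unit(volume=volume, **monolithic)
    multi = cost_per_unit(volume=volume, **multichip)
    print(f"volume={volume:>9,}  monolithic=${mono:8.2f}/unit  multichip=${multi:8.2f}/unit")
```

At modest volumes the amortised design cost dominates and the cheaper-to-design module wins; only at very high volumes does the lower per-unit production cost of the monolithic chip pull ahead, which is exactly the crossover that matters to the cloud operators.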

The likes of Baidu, Facebook, Google and Microsoft Azure have turned to custom silicon to make their servers run faster on workloads such as machine learning. Traditional general-purpose processors such as the Intel Xeon do not handle these arithmetically intensive programs well and the second choice of the graphics processing unit (GPU) is too energy-hungry to be a good fit. So, they are turning to custom accelerators while they wait for a better alternative to appear.

At a recent meeting in San Jose convened by the Open Compute Project (OCP) – an organisation that brings together these big-iron users – Bapi Vinnakota, director of silicon architecture program management at network interface specialist Netronome, explained why accelerators are changing the nature of design. For one, there is the silicon-area issue. “Accelerators tend to lead to larger die,” he claimed.

Even today, there is a hard limit on how big an individual chip can get: it is set by the size of the optics in the lithographic steppers that print the circuitry onto the surface of each die. There is also an important soft limit. Random defects become more problematic on large chips, and unless a design can tolerate a lot of redundancy, pushing die area well above 1cm² leaves you with wafers on which almost nothing works properly. As a result, it makes sense to break designs up into smaller chip-scale chunks if you can.
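
The soft limit is easy to see with a toy defect model. The sketch below uses a simple Poisson yield approximation with an assumed defect density; the die areas and defect rate are illustrative only, and real yield models are considerably more sophisticated.

```python
import math

def poisson_yield(area_cm2, defects_per_cm2):
    """Fraction of dies that pick up zero random defects under a simple Poisson model."""
    return math.exp(-defects_per_cm2 * area_cm2)

D0 = 0.2  # assumed random defect density, defects per cm^2 (illustrative)

# Option A: one large 8 cm^2 monolithic accelerator die.
big_area = 8.0
big_yield = poisson_yield(big_area, D0)
silicon_per_good_die = big_area / big_yield  # wafer area consumed per working die

# Option B: four 2 cm^2 chiplets, each tested before assembly so that only
# known-good dies go into the module (assembly losses ignored).
chiplet_area = 2.0
chiplet_yield = poisson_yield(chiplet_area, D0)
silicon_per_good_module = 4 * (chiplet_area / chiplet_yield)

print(f"monolithic: yield {big_yield:.1%}, {silicon_per_good_die:.1f} cm^2 of wafer per good die")
print(f"chiplets:   yield {chiplet_yield:.1%} each, {silicon_per_good_module:.1f} cm^2 per good module")
```

With these made-up numbers the monolithic die yields around 20 per cent, so roughly 40cm² of wafer is processed for every working part, whereas the pre-tested chiplets waste far less: a defect only scraps a small die, not the whole design.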

In addition, Vinnakota said: “Each accelerator will serve a smaller market than a general-purpose CPU. Both impact the economics of building accelerators.”

The answer, in the view of Netronome and others working in the OCP, is to go with multichip modules assembled from ‘chiplets’. These are regular chips, but designed in such a way that they are only ever used inside a module: they do not have the antistatic protection and resilient I/O ports of chips intended to go directly onto a PCB.

Late last year, Netronome kicked off an effort called the Open Domain-Specific Accelerator (ODSA) group, which is now part of OCP. The name is slightly misleading in that it’s more about chiplets than accelerators, but as the lead market is in servers, it will do for now. The ODSA group wants to create a set of standards that will make it possible for designers to mix and match chiplets almost as easily as today’s PCB designers do with conventional integrated circuits (ICs). That means finding standard ways for them to talk to each other at high speed and with as little power as possible. This may be one of the biggest technical stumbling blocks as there are stringent requirements and some of them conflict.

For example, the serial links used today to interconnect high-speed processors can easily meet the low-power limits expected of chiplets. However, they add hundreds of nanoseconds of latency, which is bad news for a designer trying to break closely coupled processors apart to go into chiplets that will be just millimetres apart. Old-school parallel buses get around the latency problem and can also hit the energy budget with some tweaks, but they will make it harder to lay out the module and may call for much more expensive substrates. The manufacturers of these devices have to agree on what they will use and that is never an easy process.
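
A toy budget makes the conflict visible. In the sketch below the lane counts, data rates, energy-per-bit and latency figures are stand-ins chosen only to illustrate the shape of the trade-off; they are not measured values for any real interface.

```python
# Rough die-to-die link budget: a narrow serial (SerDes-style) link versus a
# wide parallel bus. Every figure is a placeholder, not a measured value.

links = {
    # name: (lanes, per-lane Gbit/s, energy in pJ/bit, added latency in ns)
    "serial link":       (8,    56.0, 1.0, 200.0),
    "wide parallel bus": (1024,  2.0, 0.7,   5.0),
}

for name, (lanes, gbps, pj_per_bit, latency_ns) in links.items():
    bandwidth_gbps = lanes * gbps
    power_w = bandwidth_gbps * 1e9 * pj_per_bit * 1e-12  # bits/s x pJ/bit -> watts
    print(f"{name:17s} {bandwidth_gbps:6.0f} Gbit/s  {power_w:4.1f} W  "
          f"{latency_ns:5.0f} ns latency  {lanes:5d} signals to route")
```

Even in this toy comparison the shape of the argument is clear: the serial option keeps the signal count manageable but carries the latency penalty, while the parallel bus removes the latency at the cost of over a thousand signals that have to be routed through the package substrate.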

The problems facing chiplet users are not just technical. There are serious business reasons as to why it is hard to get an open market in the components off the ground. The biggest stumbling block is the semiconductor industry’s strategic use of secrecy.

If there is one thing no company making semiconductors wants to talk about, it is yield at the wafer level. This kind of information becomes readily available if you are buying chiplets: they will often be supplied on complete wafers. You can learn a lot about a company’s operating margins simply by counting the red dots that mark the failed dies on each wafer, so it’s no surprise that management do not want to give up that information if they can help it. In reality, they often have to, because many outsource production to packaging houses and test specialists. However, woe betide anyone at those suppliers who lets word get out about a particular client’s troubles: leaks in such a controlled environment should be easy to track down.
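
To see why a wafer map is so commercially sensitive, consider a trivial sketch: given pass/fail marks for every die site, yield drops straight out of a count, and with it a fair guess at the supplier’s cost structure. The map below is entirely made up.

```python
# Counting the 'red dots': a pass ('.') / fail ('X') map of die sites on a
# wafer is all it takes to estimate a supplier's yield. This map is invented.
wafer_map = [
    "..X....X..",
    ".......X..",
    "..X.......",
    "....X...X.",
    "..........",
    ".X....X...",
]

total_dies = sum(len(row) for row in wafer_map)
failed_dies = sum(row.count("X") for row in wafer_map)
yield_pct = 100.0 * (total_dies - failed_dies) / total_dies

print(f"{total_dies} die sites, {failed_dies} failures -> {yield_pct:.1f}% yield")
```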

With an open market in chiplets, any of your customers could give the game away about how much of each wafer they receive is junk. The few chiplet assemblers out there right now either consume their own chiplets or have painstakingly built up a position of trust with a select few major suppliers. Intel and Marvell fall into the former category. The second includes the large manufacturers who today stack memories on top of their own processors – think Apple – and the even rarer startups who believe multichip modules are the future, companies like Octavo and zGlue. Although Octavo recently persuaded STMicroelectronics to assemble its products into modules, that was after establishing a track record with Texas Instruments, where Octavo’s CEO Gene Frantz was a senior technologist before moving to his startup. Convincing ST was a major step forward for the Octavo business and for how a chiplet market is perceived.

Due to the lower volumes involved, it is possible that the module assemblers will mostly work with parts that have already been diced and put into chip-scale packages. This is not necessarily a problem, even for high-performance accelerators. The packaged devices will most likely use serial interconnect, but for things like management processors and networking controllers that latency is manageable. The custom accelerator part, on the other hand, would be under the control of the designer and so can make use of the trickier, low-latency buses without worrying about yield data. That data is not going outside of their captive supplier.

Even now, the multichip module faces an uphill struggle for acceptance. Yet the economics and business models are shifting in its favour and this latest effort, despite all the technical and commercial obstacles, may be the one that finally creates a sustainable market.

Chris Edwards, E&T News

https://eandt.theiet.org/content/articles/2019/04/electronic-connections-economics-slowly-shift-toward-the-multichip-option/
