AMD is unveiling its Carrizo APU at ISSCC this week. This new chip occupies an odd intersection between the old and the new. It’s the last of the Bulldozer-derived architectures, and we’ve known for months that the core would remain on the now-mature 28nm process. Despite these apparent limitations, AMD is quite bullish on this core’s potential.

New features and capabilities

As expected, Carrizo will be the first AMD APU to use High Density Libraries (HDL) on the PC side. AMD made a number of changes to the CPU’s structure to support this: Excavator uses nine metal layers, compared to 15 for Steamroller. We also know that the L1 data cache is larger in this core, growing from 16KB to 32KB per core. Total IPC improvements for Excavator are estimated at ~5%. That’s on top of the 7-10% that AMD delivered with Steamroller, but unlike that chip, the only major change disclosed so far is the larger L1 data cache.
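
To put the stacked IPC figures in perspective, here is a minimal sketch of how the two generations’ gains would combine. It assumes the percentages are each measured against the previous core and multiply cleanly, which AMD has not confirmed; the function name is ours. Under that assumption, the combined uplift works out to roughly 12% to 15.5%.

```python
def compound_gain(*gains):
    """Combine fractional per-generation gains multiplicatively."""
    total = 1.0
    for g in gains:
        total *= 1.0 + g
    return total - 1.0


EXCAVATOR_GAIN = 0.05  # ~5% estimate quoted in the article

# Steamroller delivered 7-10%, so compute both ends of the range.
low = compound_gain(0.07, EXCAVATOR_GAIN)
high = compound_gain(0.10, EXCAVATOR_GAIN)
print(f"Combined IPC uplift: {low:.1%} to {high:.1%}")
```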
Pic Here

The graphs and diagrams to the left show the advantage of using HDL as opposed to HP (high performance) libraries for Carrizo’s design. The CPU is dramatically smaller than it would’ve been otherwise, with a 30-40% reduction for specific areas. AMD claimed that it could get the benefits of a full node shrink from moving to HDL and based on die and feature sizes, it appears to have succeeded. As for performance, the graph on the right shows how the power-optimized version of Excavator is able to hit higher normalized frequency compared to Kaveri. The implication of the chart is that Excavator / Carrizo is capable of ~10% higher frequencies at 15W TDP as compared to Kaveri at the same power point.

The power saving mechanisms and metrics in Carrizo will have the most benefit at lower TDPs. This duplicates the pattern we saw with Kaveri, where the 95W desktop chips were often a wash when compared to the previous Richland processors but 15-35W mobile Kaveri was significantly faster than Trinity/Richland chips. Carrizo will target 12W – 35W TDPs, but the chip will shine the brightest around the 15W point.

Pic Here

The CPU inside Carrizo changed more than the GPU, but the graphics core inside Carrizo is getting a point update, to Tonga. AMD rebuilt the GPU core — the orange dots show the percentage of total transistors in the core that were previously of a particular type, while the blue dots show the percentage of transistors in Carrizo of that same type at normalized Ion / Ioff ratios. The entire GPU has shifted down and to the left, meaning it draws significantly less power. AMD claims that they can drive 10% higher frequency on the GPU core at the same power level or cut power consumption by 20% at the same frequency.
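
AMD’s two GPU claims are not equivalent in efficiency terms, and a quick back-of-the-envelope calculation shows why. This sketch assumes performance scales linearly with GPU frequency, which is a simplification of ours rather than anything AMD has stated.

```python
def perf_per_watt_gain(freq_scale, power_scale):
    """Fractional perf/W change, assuming performance tracks frequency."""
    return freq_scale / power_scale - 1.0


# Option 1: 10% higher frequency at the same power.
same_power = perf_per_watt_gain(freq_scale=1.10, power_scale=1.00)

# Option 2: same frequency at 20% lower power.
same_freq = perf_per_watt_gain(freq_scale=1.00, power_scale=0.80)

print(f"Same power, faster clock: {same_power:+.0%} perf/W")
print(f"Same clock, lower power:  {same_freq:+.0%} perf/W")
```

Under that assumption, the power-saving option is the larger efficiency win (about 25% versus 10%), which fits with Carrizo’s focus on low-TDP parts.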

One other major change actually addresses an issue in the lowest-power Kaveri cores that we weren’t previously aware of. Some of AMD’s high-end mobile cores had eight GPU clusters for a total of 512 cores — but didn’t actually enable all eight Compute Units (CUs) at the same time. These systems are effectively limited to six CUs to keep the core from overheating. Carrizo removes this limitation, which should substantially improve performance.
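
The arithmetic behind that limitation is simple enough to sketch. The figures below come straight from the paragraph above; the resulting headroom is a theoretical ceiling on shader resources only, since real-world gains will also depend on clocks and memory bandwidth.

```python
TOTAL_SHADERS = 512    # eight CUs, 512 cores in total
TOTAL_CUS = 8
ACTIVE_CUS_KAVERI = 6  # effective limit on the affected Kaveri parts

shaders_per_cu = TOTAL_SHADERS // TOTAL_CUS
kaveri_active = shaders_per_cu * ACTIVE_CUS_KAVERI

# Extra shader resources available if all eight CUs can run at once.
headroom = TOTAL_SHADERS / kaveri_active - 1.0

print(f"{shaders_per_cu} shaders per CU, {kaveri_active} active on Kaveri")
print(f"Up to {headroom:.0%} more shader resources with all CUs enabled")
```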



Power improvements via AVFS

Carrizo’s largest gains are expected to be in battery life. Some of this comes from rebuilding the APU with HDL and some of it is from integrating the southbridge on-die. A huge chunk of the improvements, however, purportedly comes from Carrizo’s use of AVFS (Adaptive Voltage and Frequency Scaling).

Some of you are likely familiar with DVFS (Dynamic Voltage and Frequency Scaling) as used by both Intel and AMD. In DVFS, the chip is programmed to use certain voltage levels that correspond to particular frequency targets. This creates a stairstep pattern of adjustment. DVFS is designed with large margins on purpose: the pre-defined levels have to operate smoothly across a wide range of operating environments and temperatures, and they have to be set so that every chip coming off a given process operates properly.
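
The stairstep behavior is easiest to see as a lookup table. The sketch below is purely illustrative: the `P_STATES` values are made-up numbers of ours, not AMD’s or Intel’s actual operating points, and real implementations live in firmware and hardware rather than Python.

```python
import bisect

# Hypothetical P-state table: (max frequency in MHz, voltage in volts).
# Illustrative values only, not any vendor's real operating points.
P_STATES = [(800, 0.80), (1600, 0.95), (2400, 1.10), (3200, 1.25)]


def dvfs_voltage(target_mhz):
    """Return the voltage of the lowest P-state that covers the target."""
    freqs = [freq for freq, _ in P_STATES]
    idx = bisect.bisect_left(freqs, target_mhz)
    if idx == len(P_STATES):
        raise ValueError("target frequency exceeds the highest P-state")
    return P_STATES[idx][1]


# Any target between 1601 and 2400 MHz lands on the same voltage step.
for mhz in (1600, 1601, 2000, 2400):
    print(mhz, "MHz ->", dvfs_voltage(mhz), "V")
```

A chip that only needs 1601 MHz pays the same voltage as one running at 2400 MHz, and every step carries worst-case margin on top of that.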

Pic Here

Unlike DVFS, AVFS uses real-time monitoring of chip conditions to measure various characteristics of a processor as it operates. The clock speed and voltage can be adjusted by this monitoring hardware in real-time, which greatly reduces the operating margins that DVFS requires for smooth operation.
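
Conceptually, this is a closed feedback loop rather than a fixed table. The following is a generic sketch of such a loop, not AMD’s actual algorithm: `measured_slack`, the guard band, the step size, and the voltage limits are all hypothetical values chosen for illustration.

```python
def avfs_step(voltage, measured_slack, guard_band=0.05, step=0.00625,
              v_min=0.70, v_max=1.30):
    """Run one control iteration and return the adjusted voltage.

    measured_slack is a hypothetical normalized timing margin (the fraction
    of the clock period left over) as reported by on-die monitors.
    """
    if measured_slack < guard_band:
        voltage += step   # too little margin: raise voltage for safety
    elif measured_slack > 2 * guard_band:
        voltage -= step   # surplus margin: trim voltage to save power
    # Otherwise the margin is in the target window, so hold steady.
    return min(v_max, max(v_min, voltage))


# Example: a cool, lightly loaded chip reports plenty of slack,
# so the loop walks the voltage down one small step at a time.
v = 1.00
for slack in (0.20, 0.18, 0.15, 0.08):
    v = avfs_step(v, slack)
print(f"Voltage after four iterations: {v:.5f} V")
```

The point of the design is that the margin is set by what the silicon is actually doing at that moment, not by the worst chip in the worst environment.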

Using AVFS is more complicated than DVFS, and it requires more implementation hardware, but the benefits can be significant. Certainly AMD believes it’s worth it — the Excavator CPU core incorporates 10 separate AVFS monitoring points and roughly 500 frequency sensing paths. The impact on low-power operation is seen in the graph to the right. At 15.0W, the Excavator with AVFS is clocking significantly higher than the Steamroller chip without this technology. The benefits appear to be nearly as large as the gains from moving to HDL with Excavator from Steamroller at 15W.

One thing to note is that both the blue and purple lines arc back down toward the original yellow line as TDP and frequency ramp up. Just as mobile Kaveri was strongest below 65W, Carrizo will offer the best performance gains over Kaveri at 15W-20W. That’s the downside to the HDL approach — packing transistors into tighter spaces gives you a die size advantage, but it comes at a price.

Pic Here

AMD’s other major innovation with Carrizo is support for the S0i3 sleep state. This is expected to cut power consumption as significantly as the old S3 option, but with vastly improved exit and entry times.

Putting it all together

For months, AMD has been saying that Carrizo would be the most power-efficient CPU it had ever built. Obviously we need to see shipping hardware to make that determination for ourselves, but looking at the systems and concepts the company has implemented, I can see why they’re saying it. The use of AVFS and the modest IPC improvements, combined with higher clock speeds, should allow Carrizo to hit dramatically better performance per watt figures, offer better battery life, and close the gap between AMD and Intel.

Pic Here

We don’t expect AMD to seriously challenge Intel in the ultrabook space, but the company has previously targeted the $300-$400 segment with its APUs. Pushing performance up by 10-15% should help them win fresh designs, as will H.265 decode. More importantly, these gains establish Carrizo as something more than the also-ran shoved out the door to buy time for Zen. If the AVFS technology performs as advertised, it’s easy to see AMD incorporating this hardware into future products as well. What AMD needs most is a product that’s “good enough” to convince OEMs to at least hold orders and business steady. Based on what the company is unveiling at ISSCC, Carrizo seems like it could do that, at minimum — possibly with a fresh set of wins as icing on the cake.