Smarter


Intel reiterated that simply increasing the size of the buffers and the number of execution units isn't enough; it takes smart algorithmic management to strike a balance between performance and the power budget. That management revolves around improving branch prediction accuracy and reducing latency under load. The net effect is a 'significant' increase in IPC, though Intel didn't provide specific measurements, promising instead to share more information as products come to market.

Intel also designed Sunny Cove to address specific use cases, like cryptography, AI, and compression/decompression workloads. The company accomplished these goals by creating new instructions and features to improve performance.

Exploding Memory Capacity

Intel also increased the amount of memory the processor can address, a key consideration given its goal of boosting memory capacity with Optane DC Persistent Memory DIMMs. The speedy Optane memory modules provide up to 512GB of memory-addressable storage per DIMM, meaning memory capacity is set to explode as more data centers transition to the technology.

Intel's Sunny Cove moves to a five-level paging structure (up from a four-level structure). That expands the virtual address space to 57 bits and the physical address space to 52 bits, meaning the processor can address up to 4 petabytes of physical memory. That's up from 64TB of addressing capability with Skylake's 46-bit physical addressing.
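Those capacities follow directly from the address widths, since each extra bit doubles the addressable range. As a quick sanity check (assuming Skylake's 46-bit physical addressing, which is what the 64TB figure implies), the arithmetic works out like this:

```cpp
#include <cstdio>

int main() {
    // Addressable bytes = 2^bits; 1TiB = 2^40 bytes, 1PiB = 2^50 bytes.
    const int skylake_physical_bits    = 46;  // implied by the 64TB figure above
    const int sunny_cove_physical_bits = 52;  // Sunny Cove physical addresses
    const int sunny_cove_virtual_bits  = 57;  // Sunny Cove virtual addresses (five-level paging)

    printf("Skylake physical:    %llu TiB\n", (1ULL << skylake_physical_bits) >> 40);    // 64
    printf("Sunny Cove physical: %llu PiB\n", (1ULL << sunny_cove_physical_bits) >> 50); // 4
    printf("Sunny Cove virtual:  %llu PiB\n", (1ULL << sunny_cove_virtual_bits) >> 50);  // 128
    return 0;
}
```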

Intel's New Course with OneAPI

Intel's Sunny Cove is an innovative design that looks promising, but as with all designs, we won't know the true benefits until we see the silicon in our labs. Intel's new vision to decouple its CPU core designs from its process improvements is the real advance that will help the company remain competitive in the future. Intel surely can't afford another period of stagnation like we've seen during its struggles with the 10nm process.

Third-party fabs have proven to be Intel's greatest competition. TSMC has taken the process lead with its pending 7nm node, and as a result, new 7nm chips will soon flow from the stalwarts of the semiconductor space, like AMD, Apple, Qualcomm, and Nvidia. These companies work with TSMC to bring their new designs to market, meaning that Intel isn't just competing with one company -- it faces the combined might of several behemoths of the chip market.

Intel does have a plan to outmaneuver its rivals by leveraging its wide-ranging product stack, but it surprisingly revolves around software. Intel is working on its new "OneAPI" software, which is designed to simplify programming across its CPUs, GPUs, FPGAs, and AI accelerators. The effort goes by the tagline "no transistor left behind," which neatly captures its goal. The software provides unified libraries that will allow applications to move seamlessly between Intel's different types of compute. If successful, this could be a key differentiator that rivals, which don't field as many forms of compute, will find hard to match.
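Intel hasn't published the OneAPI programming model yet, so the following is only an illustrative sketch of the "single source, any device" idea it is chasing, written against the Khronos SYCL standard rather than anything Intel has shown. All names below come from SYCL, not OneAPI; the point is that the device selection, not the kernel source, decides where the work runs.

```cpp
#include <CL/sycl.hpp>   // Khronos SYCL header (not an Intel OneAPI header)
#include <iostream>
#include <vector>

namespace sycl = cl::sycl;

int main() {
    constexpr size_t N = 1024;
    std::vector<float> a(N, 1.0f), b(N, 2.0f), c(N, 0.0f);

    // default_selector picks whatever device the runtime finds: CPU, GPU, or accelerator.
    sycl::queue q{sycl::default_selector{}};

    {   // Buffers copy their contents back to the host vectors when they go out of scope.
        sycl::buffer<float, 1> bufA(a.data(), sycl::range<1>(N));
        sycl::buffer<float, 1> bufB(b.data(), sycl::range<1>(N));
        sycl::buffer<float, 1> bufC(c.data(), sycl::range<1>(N));

        q.submit([&](sycl::handler& h) {
            auto A = bufA.get_access<sycl::access::mode::read>(h);
            auto B = bufB.get_access<sycl::access::mode::read>(h);
            auto C = bufC.get_access<sycl::access::mode::write>(h);
            // The same kernel source runs unchanged on whichever device the queue targets.
            h.parallel_for<class vector_add>(sycl::range<1>(N), [=](sycl::id<1> i) {
                C[i] = A[i] + B[i];
            });
        });
    }

    std::cout << "c[0] = " << c[0] << std::endl;  // expect 3
    return 0;
}
```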

The company does have its own plans for a resurgence on the process front, though, as evidenced by the brief display of its 10nm Ice Lake data center chip, but it didn't share any details about the new processor.

Intel has been notoriously silent about its roadmap and future plans, but its Architecture Day event seems to signal a new level of openness and disclosure from the company. Several members of Intel's executive management team were present and answered our questions openly and frankly. The company also plans to hold future events to drill down further on these topics as it gets closer to delivering products to market.

Intel held roundtable discussions at the event, with Raja Koduri (R), Jim Keller (J), and Dr. Murthy Renduchintala (M) fielding questions during the final session. Here are some of the questions and answers:

Q: A lot of Intel's CPU microarchitecture work has been hamstrung by delays in process node technology. What went wrong, and what steps have been taken to make sure it doesn't happen again?

R/J: Our products will be decoupled from our transistor capability. We have incredible IP at Intel, but it was all sitting on the 10nm process node. If we had had it on 14nm, we would have had better performance on 14nm. We have a new method inside the company to decouple IP from the process technology. You must remember that customers buy the product, not a transistor family. It's the same transformation AMD had to go through to change its design methodology when it was struggling. At Apple it was called the 'bus' method.

M: This is a function of how we as a company used to think about process node technologies. The node acted as the limiting factor for how the company moved forward. We've learned a lot about how this worked with 14nm. We now have to make sure that our IP is not node-locked. The ability to port IP across multiple nodes is great for contingency planning. We will continue to take aggressive risks in our designs, but we will also have contingencies. We need as seamless a roadmap as possible in case those contingencies are needed, and we need to make sure they can be executed quickly to keep customer expectations in line. You will see future node technologies, such as 10/7, overlap much more than before to keep the designs fluid. Our product portfolio on 14nm could have been much better if our product designs were not node-locked to 10nm.

R: In the future there will be no transistor left behind, no customer left behind, and no IP left behind.

Q: Will we ever see a 10nm monolithic desktop CPU at the high end?

R: Yes.

Q: How is 10nm? Has it changed?

R: It is changing, but it hasn't changed. There are a lot of lessons learned in how Intel approached it to begin with. We have established a much better model between manufacturing and design. We want good abstractions between product and process node going forward. When everything was going well, this problem didn't manifest. The complexity is that when something bad happens on the process side, the whole pipeline clogs up -- the rest of the world solves this with abstraction. We need to make sure it won't happen again, and we have a desire to build resilience into the roadmap.

Q: Are there plans for mixed SoCs, combining CPU / GPU / AI / FPGA ?

R: In our roadmap there will be scalable vector/matrix combinations. What our customers want are very scalable solutions. Customers want similar programming models regardless of the silicon.

Q: What has been the effect of hiring Raja/Jim and bringing outsiders to Intel?

M: Intel is very innovative. We want to add to that chemistry and make sure we bring in people who understand Intel but also bring in good ideas. It's about respecting the rest of the market and making sure Intel is competitive. It's about balancing internal debates by making sure we challenge internal beliefs and the status quo, by bringing in people who have done this sort of thing before. It speaks to Intel's strengths in its ability to absorb interesting ideas from the outside. We went for the very best on the outside because that is what was required to join with the very best on the inside.

Q: What is Intel’s current approach to 5G, given the topics discussed today?

M: We think about 5G from the data center to the network to the edge and to the device. We at Intel believe the transition to 5G, with its implications for the network in terms of accelerating data and catalyzing a software-defined network where bespoke silicon gets replaced by containers, is as transformative as the jump from analog to digital. It will accelerate the "cloudification" of the network. The edge is important, especially for minimizing latency for new services; sub-millisecond latency for these services is critical. The over-the-air interface is important too. The intelligent cloud domain is going to be the flywheel for how fast the industry evolves. We mentioned in November that our XMM 5G modem will be in the hands of partners in the second half of 2019, with products in early 2020. It is a multi-mode 5G/LTE architecture from day one, supporting all three mmWave bands and sub-6GHz frequencies.

Q: As Thunderbolt 3 requires additional chips, how do you see future OEM adoption?

M: Integrated Type-C Thunderbolt 3 is the first generation. We will refine it in the future - that's the natural genealogy of the technology. We constantly think about how much we integrate into the chip and how much we leave on the board.

R: This is a big IP challenge, not only for TB3 but for other IP as well. Integrated PHYs are important. For example, disaggregating the transceiver in our FPGA lineup has allowed us to focus a lot on that decoupled IP.

Q: In the demo of Foveros, the chip combined big x86 cores built on the Core microarchitecture with small x86 cores built on the Atom microarchitecture. Can we look forward to a future where the big and little cores have the same ISA?

R: We are working on that. Do they have to have the same ISA? Ronak and the team are looking at that. However, I think our goal here is to keep the software as simple as possible for developers and customers. It's a challenge that our architects have taken up to ensure products like this enter the market smoothly. We'll also have a packaging discussion next year on products like this. The chip you see today, while it was designed primarily for a particular customer to begin with, is not a custom product, and in that sense it will be available to other OEMs.

M: We've made the first step on a journey. That first step is a leap, and the next step is incremental. As we've said about the OneAPI strategy, if we homogenize the API then it will go into all our CPUs. Foveros is also a new product that shows we had a gap in our portfolio; it has helped us create technologies to solve an issue, and we expect to expand on this in the future with new IP.

Q: Are you having fun with Foveros?

J: Because Raja deals in GPUs, he's having fun with high-bandwidth communications between compute elements. It's a new technology and we're still experimenting with it. What is frustrating is that as an industry we hit a limit on current flux density a year before stacking technology became viable, so for high performance on stacking we're trying a lot of things in different areas. There's no point making thermal compromises if doing so removes the reason you're using the technology in the first place. But we're having fun and trying a lot, and we'll see Foveros in a number of parts over the next five years. We will find new solutions to problems we don't even know exist yet.

Q: When is Manufacturing Tech Day?

M: We will tell you when it happens! I'm sure you all have opinions on Intel 10nm right now, and yes, we are looking at what we're doing and eating a certain amount of humble pie, but we're readjusting our process to make sure we can pick the best process no matter what the product is.