Intel unwraps Lunar Lake architecture: Up to 68% IPC gain for E-cores, 14% IPC gain for P-Cores

Lunar Lake NPU 4.0

Intel shared deep-dive architectural details of its fourth-gen NPU at the event, but I was unfortunately unable to attend this specific briefing. I do have a recording that I will watch and use to update this section, but for now I’ll have to let the slides do most of the talking.

The NPU is the central component in Intel’s AI strategy, and with 48 TOPS of performance it easily meets Microsoft’s requirements for next-gen PCs. However, the NPU is primarily designed to offload low-intensity AI work, which saves battery power. The GPU steps in for more demanding workloads with 67 TOPS of performance, while the CPU contributes another 5 TOPS. Overall, that gives Lunar Lake 120 total TOPS of AI performance (48 + 67 + 5).
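As a quick sanity check on the headline figure, the short Python sketch below adds up the per-engine TOPS numbers quoted above and shows each engine's share of the total. The dictionary and function names are illustrative assumptions rather than anything Intel publishes, and the sum is simply the platform's quoted peak, not a claim about what a single workload can use at once.

```python
# Peak AI throughput per engine in TOPS, as quoted in the article.
# The names below are illustrative assumptions, not an Intel API.
PLATFORM_TOPS = {"NPU": 48, "GPU": 67, "CPU": 5}


def total_tops(engines):
    """Return the sum of the quoted peak TOPS across all engines."""
    return sum(engines.values())


def share_by_engine(engines):
    """Return each engine's fraction of the platform total."""
    total = total_tops(engines)
    return {name: tops / total for name, tops in engines.items()}


if __name__ == "__main__":
    print(f"Platform total: {total_tops(PLATFORM_TOPS)} TOPS")
    for name, share in share_by_engine(PLATFORM_TOPS).items():
        print(f"{name}: {share:.1%} of the total")
```

By these figures the GPU, not the NPU, accounts for the largest share of the platform total, which fits the article's point that the NPU targets efficiency rather than peak throughput.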

The key architectural components include 12 enhanced SHAVE DSPs, six neural compute engines, a MAC array, and a DMA engine. The NPU is fed with twice the memory bandwidth of the prior-gen NPU on Meteor Lake, and it also has access to the 8MB shared side cache on the compute tile, which further enhances efficiency.

Overall Intel claims a 4X improvement in peak performance and a 2X improvement in performance at the same power over the previous-gen NPU 3.0 used in Meteor Lake.

Lunar Lake Platform Controller Tile and Connectivity

The Platform Controller Tile houses all of the external I/O functions for the chip, including Wi-Fi and Bluetooth, USB 3.0 and 2.0, Thunderbolt, and the PCIe 4.0 and 5.0 interfaces. It also houses the memory controllers.

Intel guarantees that all Lunar Lake laptops will have at least two ports of Thunderbolt 4 connectivity, while some models will offer up to three ports. Intel used Thunderbolt 4 instead of the newer Thunderbolt 5 due to the target market for this class of laptop. The interface also supports the new Thunderbolt Share feature, which allows the interface to provide drag-and-drop file sharing functionality between PCs, along with screen and peripheral sharing.

The platform also supports Bluetooth 5.4 and Wi-Fi 7, the latter of which is partially embedded into the Platform Controller Tile. Wi-Fi 7 functionality still requires a separate CNVi module that’s connected externally via the CNVi 3.0 interface. The new BE201 CRF module is 28% smaller than prior-gen Wi-Fi modules.

Lunar Lake Thread Director Improvements

This is another area of the architecture where I wasn't able to attend the briefing, but we'll update this section once we have more time. The above slides provide most of the high-level overview.

Thoughts

Intel’s rethinking of its first-order priorities is important as it looks to fend off Apple’s M3, Qualcomm’s new Snapdragon X Elite, and AMD’s Ryzen AI 300 series processors. Intel will release Lunar Lake as two models, at least initially, but it hasn’t shared the final specifications for those models yet. Intel plans to ship 40 million AI-enabled processors by the end of the year, and Lunar Lake wafers are already in the company's fabs. The chips will arrive in shipping systems in Q3 2024.

Intel’s Lunar Lake architecture, and all of the associated core IPs, represents a dramatic rethinking of the company’s design goals to a power-first design to maximize battery life and performance. The improved design methodology and CPU and GPU microarchitectures will soon filter down to Intel’s other mobile products, like the upcoming Panther Lake, its Arrow Lake chips for desktop PCs, and its data center Xeon 6 processors.

The current Meteor Lake processors were the first step on this road to placing multiple tiles on a single package, and Intel looks to make improvements in every aspect of the design with Lunar Lake. With more competition in the mobile sector, Intel needs its upcoming processors to be more revolutionary than evolutionary, and the architectural deep dive seems to indicate everything is in place. We'll find out this fall how it all comes together, and how Lunar Lake competes with other options.

Paul Alcorn
Managing Editor: News and Emerging Tech

Paul Alcorn is the Managing Editor: News and Emerging Tech for Tom's Hardware US. He also writes news and reviews on CPUs, storage, and enterprise hardware.

  • dimar
    I'd like to see extensive power consumption/performance benchmarks for the next gen. Intel, AMD, ARM CPUs and how slower and faster SSDs affect the battery life.
  • cyrusfox
    Fascinating the disparity in improvement between Floating point vs integer single thread uplift. Huge FP uplift! Eye popping improvements!!! That is lame though, comparing to the gimped e-cores...
    but the comparisons are once again being made to the low-power Meteor Lake E-core instead of the full E-core
    iGPU uplift is looking really solid, this seems to check all the boxes of what I need in a device. For toting around I really don't need 24+ cores, 8 is plenty and with this much GPU as well as hopefully maturing platform. Excited to see the products and where the pricing lands. Hope Framework gets a flavor of this.
  • jenci8888
    68% ipc gain e-core? That doesn't seem right... I think they meant 68% gain e-core on specfp. It should be 30% ipc around.
  • usertests
    cyrusfox said:
    That is lame though, comparing to the gimped e-cores...
    They also compare it to Raptor Cove, and if I'm reading it right, claim +2% IPC over that. With the caveat repeatedly mentioned in the article that Intel is giving itself a 10% margin of error.
  • thestryker
    Well it seems we have an answer regarding HT and that is it depends. It'll be interesting to see which version of the Lion Cove desktop ARL uses. I wouldn't be surprised if mobile ARL went without HT, but it seems like desktop could probably keep it even though they emphasize hybrid when referring to dropping it.

    Will be looking forward to seeing real world performance on LNL and ARL.
    cyrusfox said:
    That is lame though, comparing to the gimped e-cores...
    Just a guess based upon the testing Chips and Cheese did on the MTL LP E-cores: removal from the ring bus, and thus the L3 cache, can have an outsized impact on performance. This would in theory be the closest comparison to an existing product.
  • Giroro
    I think cheap Mini PCS using Intel's N100 all E-core processor are a perfectly usable office machine for many people, and I would love to see those upgraded with the new E cores.

    That said, I would never buy one, because they would definitely spec those machines with a worthless amount of non-upgradeable memory.
    One of the major features adding value of the N100 machines, is that you can usually upgrade the RAM.

    Now for the high end Lunar Lake products... There is no high end. If Intel doesn't convince manufacturers to keep ultrabooks with the highest available configuration under $1200, then they're going to have a problem
  • Evildead_666
    Can people please stop using the word Architect as a verb please ?
    "Redesigned" would have been perfect for this article.
    Architected or Rearchitected do not exist.
    Cheers.
  • Dragos Manea
    Evildead_666 said:
    Can people please stop using the word Architect as a verb please ?
    "Redesigned" would have been perfect for this article.
    Architected or Rearchitected do not exist.
    Cheers.
    That would be too similar to the article from which they copied and pasted; they had to change some words, even if it is with words that do not exist.
  • bit_user
    The article said:
    38% and 68% IPC gains in the new Skymont architecture.
    This is based on a somewhat biased performance comparison (see below).

    cyrusfox said:
    Fascinating the disparity in improvement between Floating point vs integer single thread uplift. Huge FP uplift! Eye popping improvements!!!
    That is lame though, comparing to the gimped e-cores...
    Initially, I missed what you probably meant by "gimped". As I see @thestryker has pointed out, comparing to the LP E-cores is indeed quite lame of them, since its lack of L3 cache has been shown to disadvantage it relative to the Crestmont cores on Meteor Lake's CPU tile.

    Dang. I was really excited for a minute, there.
    :(
  • bit_user
    Giroro said:
    I think cheap Mini PCS using Intel's N100 all E-core processor are a perfectly usable office machine for many people, and I would love to see those upgraded with the new E cores.
    Yeah, they seem to be somewhere around the performance of a Sandybridge or Haswell i5, which is still pretty usable. Of course, their iGPU is much better than those CPUs'.

    Giroro said:
    That said, I would never buy one, because they would definitely spec those machines with a worthless amount of non-upgradeable memory.
    One of the major features adding value of the N100 machines, is that you can usually upgrade the RAM.
    You can find some that take a DDR5 SO-DIMM. I have 32 GB in my N97 machine. It doesn't need that much, but I did it just to get dual-rank, for the small performance boost it provides.