Arm has unveiled the Cortex-X4, its latest flagship performance core and, by Arm’s account, the most powerful Arm core developed to date. This fourth-generation Cortex-X core is claimed to deliver a 15% performance improvement over its predecessor, the Cortex-X3, which is used in chips such as the Snapdragon 8 Gen 2.
The Cortex-X4 makes major changes to the front end, starting with a redesigned instruction fetch and delivery path. It drops the macro-operation cache entirely and widens the pipeline to handle up to 10 instructions per cycle, which raises instruction bandwidth. Arm has also improved the accuracy of the branch predictor, reducing stalls in real-world workloads.
In the back end, the Cortex-X4 widens its out-of-order execution resources. It now has eight integer ALUs, two integer MAC units, and a third branch unit. The core’s out-of-order buffers have grown by 20%, giving the scheduler a larger window of in-flight instructions to work with. The Cortex-X4’s 384-entry reorder buffer (ROB) is still smaller than the 512-entry ROB of Intel’s Golden Cove core, but it is larger than the 352-entry ROB of Sunny Cove.
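As a quick sanity check on the figures above, the short Python sketch below works backward from the 384-entry ROB and the 20% growth figure to the buffer size they imply for the previous generation. It assumes that the 20% increase applies to the ROB itself, which the text suggests but does not state outright; the variable names are illustrative only.

```python
# Figures quoted in the article.
X4_ROB_ENTRIES = 384
OOO_GROWTH = 0.20  # "increased by 20%"

# Assumption: the 20% growth applies directly to the ROB entry count.
implied_previous_rob = X4_ROB_ENTRIES / (1 + OOO_GROWTH)

print(f"Implied previous-generation ROB: {round(implied_previous_rob)} entries")
```

Under that assumption the previous generation would have had a ROB of about 320 entries.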
The memory subsystem has been reworked as well. Arm has rebalanced the load/store pipes, which now consist of one general-purpose AGU, two load AGUs, and one store AGU. Both the data prefetcher and the L1 temporal data prefetcher have been improved. The core’s private L2 cache can now be configured at up to 2MiB, double the maximum capacity of the previous generation.
Arm also cites a 13% improvement in IPC for the Cortex-X4. The core is aimed at the high end and is intended for advanced process nodes such as TSMC’s 3nm-class N3E. It can be integrated into systems through the DSU-120, which supports up to 32MB of shared L3 cache. The DSU-120 also introduces a power mode that reduces leakage power.
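The 15% performance figure and the 13% IPC figure quoted in this article can be related through the usual approximation that performance scales as IPC times clock frequency. The Python sketch below shows that arithmetic. It assumes that both figures were measured against the same Cortex-X3 baseline on comparable workloads, which the article does not confirm, so the result is illustrative rather than a number from Arm.

```python
# Figures quoted in the article.
perf_gain = 0.15  # overall performance improvement over Cortex-X3
ipc_gain = 0.13   # IPC improvement over Cortex-X3

# Approximation: performance = IPC * frequency, so
# frequency ratio = performance ratio / IPC ratio.
# Assumption: both gains share the same baseline and workloads.
implied_freq_ratio = (1 + perf_gain) / (1 + ipc_gain)
implied_freq_gain_pct = (implied_freq_ratio - 1) * 100

print(f"Implied clock-frequency gain: {implied_freq_gain_pct:.1f}%")
```

Under those assumptions, the gap between the two figures corresponds to a clock increase of just under 2%.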
The DSU-120 offers flexible core configurations, letting system designers combine Cortex-X4, Cortex-A720, and Cortex-A520 cores as needed. Laptops are one beneficiary: a cluster of 10 Cortex-X4 cores and 4 Cortex-A720 cores, for example, favors sustained performance over the efficiency-focused mixes typical of phones. Paired with the DSU-120, the Cortex-X4 can therefore anchor a range of high-performance designs.
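To make the laptop example concrete, the Python sketch below models a cluster as a simple mapping from core type to core count and tallies the total. The `describe` and `total_cores` helpers are hypothetical names used only for this illustration; they do not correspond to any Arm tool or API.

```python
# Laptop-style cluster from the article: 10 Cortex-X4 + 4 Cortex-A720.
# A Cortex-A520 entry could be added for other mixes; it is zero here.
laptop_cluster = {"Cortex-X4": 10, "Cortex-A720": 4, "Cortex-A520": 0}


def total_cores(cluster):
    """Return the total number of cores in the cluster."""
    return sum(cluster.values())


def describe(cluster):
    """Build a short summary, skipping core types with a zero count."""
    parts = [f"{count}x {name}" for name, count in cluster.items() if count > 0]
    return " + ".join(parts) + f" = {total_cores(cluster)} cores"


print(describe(laptop_cluster))
```

The configuration described in the text adds up to 14 cores in a single cluster.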
Arm positions the Cortex-X4 for a broad range of markets, including mobile devices, laptops, servers, and embedded systems. Its wider front end, larger out-of-order window, and reworked memory subsystem raise the bar for performance and efficiency among Arm cores. Together with its manufacturing partners, Arm is signaling that it intends to keep pushing its flagship cores forward.