Ibex iCache Stall Bug: PMP Errors Cause an Infinite Stall

by Alex Johnson

Understanding the iCache Stall Issue in Ibex

Instruction cache (iCache) stalls are normally a performance concern: the core waits while the cache fetches instructions. A recently reported issue in the lowRISC Ibex processor is more serious, because it involves how the iCache interacts with the Physical Memory Protection (PMP) unit. This bug report explains how a PMP error encountered during an iCache fetch can leave the Ibex core stalled indefinitely. We’ll cover the observed behavior, the expected correct behavior, and the steps to reproduce the issue. The goal is to document the hang clearly and help guide a resolution, which matters most in security-sensitive applications where PMP is a key feature: the result here is not a minor performance hit but an effective denial of service.

The issue involves two core components of the processor: the iCache, responsible for efficient instruction retrieval, and the PMP unit, designed to enforce memory access controls. The infinite stall means the processor freezes and cannot proceed with any execution. This is not a simple delay; it is a complete halt that requires a reset or external intervention to resolve. The problem arises when the iCache attempts to fetch an instruction from a memory region that the PMP has since been configured to deny execute access to. Normally, a PMP violation should raise an exception so that software can handle the error. In this scenario, however, the pipeline gets stuck with the program counter (PC) apparently frozen, which points to either a failure in the exception path or a deeper pipeline deadlock. Because the condition can be used to halt the processor, it is a critical bug for systems that rely on both the iCache and memory protection.

Furthermore, the reproducibility of this bug is tied to specific configurations. It requires both the iCache and the PMP to be enabled, and the error occurs only when a PMP access denial is triggered during an iCache fetch. This specificity makes it challenging to debug but also provides clear parameters for testing and verification. The testbench provided, tb_icache_pmp_error.sv, is designed to precisely replicate this scenario. It sets up a situation where an instruction is initially prefetched into the cache from an allowed region. Then, the PMP rules are changed to disallow access to that exact region, while still allowing access to other critical areas like the trap handler. When the core attempts to execute instructions from the now-denied region, the iCache encounters the PMP error, and the stall occurs. Understanding these precise steps is key to grasping the root cause and potential solutions. The fact that this is an internal issue to the Ibex RTL, not related to external memory or testbench errors, points towards a design or implementation flaw within the core logic itself. This demands careful examination of the iCache and PMP interaction logic.

Observed Behavior: The Infinite Stall Explained

Let's dive deeper into the observed behavior when this iCache stall bug is triggered. The scenario begins with a seemingly normal operation. An initial successful prefetch occurs, meaning the iCache brings instructions from a memory region that is currently permitted. For instance, the testbench might allow access to memory addresses 0x10000030-3F. Once these instructions are safely in the cache, the PMP configuration is altered. In TOR (Top of Range) mode, the PMP is set to deny execute access specifically to that 0x10000030-3F region. However, it's crucial to note that other regions, such as the trap handler located at 0x10000040-8F, remain accessible. This setup is deliberate, designed to trigger the PMP error during a subsequent iCache fetch.
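The region layout described above can be sketched as a small Python model of TOR matching. This is only an illustration, not the testbench's logic: the TorEntry class, the tor_execute_allowed function, and the default used when no entry matches are all assumptions made for the example. Real pmpaddr registers hold the address shifted right by two bits, whereas the sketch uses plain byte addresses for readability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TorEntry:
    """Hypothetical, simplified PMP entry in TOR mode."""
    top: int        # exclusive upper byte address of the region
    execute: bool   # whether instruction fetch is permitted


def tor_execute_allowed(entries, addr, default=False):
    """Return the execute permission of the first TOR entry matching addr.

    Each entry covers [previous top, this top); the first entry starts at 0.
    `default` is an assumption for the no-match case, which in real hardware
    depends on the privilege mode.
    """
    base = 0
    for entry in entries:
        if base <= addr < entry.top:
            return entry.execute
        base = entry.top
    return default


# Layout taken from the testbench log: allow up to 0x1000002F,
# deny 0x10000030-3F, allow the trap handler range 0x10000040-8F.
ENTRIES = [
    TorEntry(top=0x10000030, execute=True),
    TorEntry(top=0x10000040, execute=False),
    TorEntry(top=0x10000090, execute=True),
]

for address in (0x1000001C, 0x10000030, 0x1000003C, 0x10000080):
    print(hex(address), tor_execute_allowed(ENTRIES, address))
```

With this layout, only fetches in 0x10000030-3F are refused, while the cached loop and the trap handler stay executable, which is the condition the testbench relies on.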

When the processor core then attempts to fetch or execute an instruction from the memory region that has now been denied by the PMP (e.g., through a jump or a branch instruction targeting an address within 0x10000030-3F), the iCache encounters the access denial. This is where the bug manifests. Instead of handling the PMP denial as an exception and triggering a trap, the core enters an infinite stall. The pipeline essentially freezes, and the simulation hangs indefinitely. The provided testbench is configured to time out after exactly 100,000 cycles, a clear indicator that no further progress is being made. During this stall, there is no trap exception, no illegal instruction detection, and no ebreak instruction is executed. The program counter (PC) remains stuck at or very near the address that caused the PMP denial, typically around 0x10000030. This prevents any forward execution, including any attempts to execute code from previously allowed and cached regions, such as a loop that might be running at 0x1000001c-20. It’s a complete and utter standstill.
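A stuck PC of the kind described above can be detected mechanically from a per-cycle PC trace. The Python sketch below shows one way to do it; the find_stall function and the example traces are hypothetical and are not taken from the testbench, which uses its own cycle counter. It also assumes the PC holds exactly one value during the stall, whereas the report says only that it stays at or very near 0x10000030.

```python
def find_stall(pc_trace, limit):
    """Return the cycle index at which the PC has held a single value for
    `limit` consecutive cycles, or None if that never happens."""
    run_length = 0
    previous = None
    for cycle, pc in enumerate(pc_trace):
        run_length = run_length + 1 if pc == previous else 1
        previous = pc
        if run_length >= limit:
            return cycle
    return None


# Illustrative traces (made up for the example, one PC value per cycle).
healthy_trace = [0x1000001C, 0x10000020] * 10
stuck_trace = [0x1000001C, 0x10000020] + [0x10000030] * 5

print(find_stall(healthy_trace, limit=5))
print(find_stall(stuck_trace, limit=5))
```

In practice the limit has to be generous, since a core legitimately holds its PC for several cycles during ordinary memory waits. The testbench's budget of 100,000 cycles leaves no doubt.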

To further illustrate the severity, consider the baseline run without any PMP denial. In this scenario, execution proceeds remarkably quickly, with the core often hitting an ebreak instruction in just a few cycles (e.g., cycle 4). However, even this baseline run exhibits an unexpected behavior: it appears to jump to the trap handler PC (0x10000080) rather than executing the ebreak directly at its intended location. This suggests that there might be secondary issues at play, possibly related to misaligned fetches or other exceptions that occur even when PMP is not actively causing a denial. This observation, while separate from the main stall bug, points towards potential areas for further investigation in the iCache's fetch logic.

The core problem, the denial-of-service-like hang, is specifically reproducible only when both the iCache and PMP are enabled and configured in a way that triggers a PMP error during an iCache fetch. Critically, this issue is confirmed to be internal to the Ibex RTL. The testbench does not report any external memory errors; the problem lies entirely within the core's logic when handling the interaction between the iCache and the PMP unit. This internal nature makes it a challenging but essential bug to fix for ensuring the reliability and security guarantees of the Ibex processor.

Expected Behavior: Graceful Exception Handling

When the iCache encounters a PMP denial during an instruction fetch, the Ibex processor is expected to handle it through a clean and predictable exception mechanism. Instead of entering an infinite stall, the core should recognize the access violation and raise an instruction access fault exception. According to the RISC-V privileged specification, this corresponds to mcause = 1. The mtval register, which provides information about the fault, should be set to the memory address that caused the fault, in this case the address within the denied region (e.g., 0x10000030). Prompt and accurate reporting of the error is the first step towards a robust system.

Following the exception, the core should cleanly trap to the configured exception handler. The handler's address is typically set via the mtvec register. Assuming the trap handler is located at 0x10000080, as in the test case, the processor should transfer control to this address. From the trap handler, the software can then analyze the mcause and mtval registers to understand what happened and take corrective action, such as logging the event, restarting the instruction, or terminating the offending task. This orderly transition prevents the program from crashing or freezing unpredictably.
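The expected trap entry can be written down as a short behavioral model in Python. It is a sketch of the architectural outcome only, not of Ibex's pipeline: TrapState and fetch_step are hypothetical names, the handler address 0x10000080 comes from the test case, and mepc is included because the RISC-V privileged specification records the faulting instruction's address there.

```python
from dataclasses import dataclass

INSTR_ACCESS_FAULT = 1  # mcause value for an instruction access fault


@dataclass
class TrapState:
    """Hypothetical snapshot of the PC and the trap-related CSRs."""
    pc: int
    mcause: int = 0
    mepc: int = 0
    mtval: int = 0


def fetch_step(pc, execute_allowed, handler):
    """Model one fetch: continue at pc if allowed, otherwise trap."""
    if execute_allowed(pc):
        return TrapState(pc=pc)
    # Denied fetch: record the cause and faulting address, jump to handler.
    return TrapState(pc=handler, mcause=INSTR_ACCESS_FAULT, mepc=pc, mtval=pc)


def deny_30_3f(addr):
    """Execute permission with only 0x10000030-3F denied (from the test case)."""
    return not (0x10000030 <= addr <= 0x1000003F)


state = fetch_step(0x10000030, deny_30_3f, handler=0x10000080)
print(hex(state.pc), state.mcause, hex(state.mtval))
```

A fix could be judged against exactly these values: the PC at the handler, mcause equal to 1, and mtval holding the denied address.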

The iCache itself should also play a role in managing this error condition. Upon detecting the PMP denial, it should signal the error, possibly through an output signal such as err_o or pmp_err. It should also invalidate any partial or incomplete cache line that was being fetched when the error occurred, so that stale or incorrect instructions cannot be served later. A brief stall might be acceptable to allow for error propagation and handling (perhaps less than 100 cycles), but it should not become an indefinite hang. The pipeline should quickly resolve the error condition and transition to the trap handler, so that the system remains responsive.
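The behavior expected of the cache can be illustrated with a toy Python model. Everything here is hypothetical, including the ToyICache class, its 8-byte line size, and the dictionary returned by fetch; it bears no relation to the structure of the Ibex RTL. The point is only that a denied fetch returns an error promptly and leaves no copy of the line behind, even if the line was cached while access was still allowed.

```python
class ToyICache:
    """Toy instruction cache keyed by line address (illustration only)."""

    LINE_BYTES = 8  # assumed line size for this example

    def __init__(self, memory):
        self.memory = memory  # dict: line address -> line contents
        self.lines = {}       # currently valid cached lines

    def fetch(self, addr, execute_allowed):
        line_addr = addr - (addr % self.LINE_BYTES)
        if not execute_allowed(addr):
            # Report the error at once and drop any cached copy of the line.
            self.lines.pop(line_addr, None)
            return {"err": True, "data": None}
        if line_addr not in self.lines:
            self.lines[line_addr] = self.memory[line_addr]  # fill on miss
        return {"err": False, "data": self.lines[line_addr]}


def allow_all(addr):
    return True


def deny_30_3f(addr):
    return not (0x10000030 <= addr <= 0x1000003F)


cache = ToyICache({0x10000030: "line at 0x10000030"})
first = cache.fetch(0x10000030, allow_all)    # prefetch while allowed
second = cache.fetch(0x10000034, deny_30_3f)  # same line, now denied
print(first["err"], second["err"], cache.lines)
```

The model always answers in a single call, which stands in for the bounded stall described above: an error response is still a response.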

Regarding the baseline scenario (without PMP denial), the expected behavior is straightforward execution. The program should proceed normally to its intended target. For example, if an ebreak instruction is placed at 0x10000028, the processor should execute it directly at that address and cycle count, not be diverted to the trap handler at 0x10000080. This means low cycle counts and no unexpected faults or exceptions. The fact that the baseline run in the observed behavior did divert to the trap handler suggests a potential issue with instruction alignment or other subtle exceptions that might be masked by the PMP stall bug, but should still be addressed for full correctness.

In summary, the expected behavior is a well-defined, exception-driven response to PMP violations during iCache fetches. This ensures system stability, security, and predictability, allowing the software to handle memory access control failures gracefully rather than suffering a complete system freeze. This aligns with the principles of robust processor design and secure memory management.

Steps to Reproduce the Issue

Reproducing the iCache stall bug in the Ibex processor requires a specific setup using the provided testbench and simulation environment. The key is to create a scenario where the iCache attempts to fetch an instruction from a memory region that is actively being denied by the PMP unit. The following steps outline how to trigger this behavior using the tb_icache_pmp_error.sv testbench.

First, compile and run the simulation using the appropriate commands for your EDA tool. The example provided uses xsim, the simulator shipped with Xilinx Vivado, driven by the tb_icache_pmp_error_compile.do and xrun.do files. These scripts handle the compilation of the SystemVerilog sources and the execution of the simulation.

Here’s a breakdown of the reproduction process:

  1. Environment Setup: Ensure you have Vivado (version 2025.1 in this case) installed and configured correctly. The simulation is run within a Cygwin environment on a Windows machine, but the principles apply to other environments.
  2. Testbench: Use the tb_icache_pmp_error.sv testbench. This testbench is specifically designed to create the conditions necessary for the bug. It includes logic to manage the PMP configuration changes during the simulation.
  3. Compilation and Simulation: Execute the compilation script (e.g., tb_icache_pmp_error_compile.do) followed by the simulation script (e.g., xrun.do). The exact commands might vary depending on your setup, but they typically involve compiling the Verilog files and then running the simulation with specific top-level modules and simulation time limits.
  4. Triggering the PMP Error: The testbench internally handles the sequence:
    • It first allows the iCache to perform an initial successful prefetch of instructions from a region (e.g., 0x10000030-3F).
    • Then, it configures the PMP module to deny execute access to this exact region using TOR mode.
    • Crucially, it continues to allow access to other essential areas, such as the trap handler (e.g., 0x10000040-8F).
    • Finally, it prompts the core to attempt fetching an instruction from the now-denied memory region (e.g., via a jump or branch).

Simulation Output (Indicating the Stall):

When the bug is successfully reproduced, the simulation output will clearly indicate the stall. The testbench is designed to detect this condition and report a timeout. A typical output indicating the stall would look something like this:

Time resolution is 1 ps
run -all
Waiting for prefetch to complete...
PC reached denied region (prefetch) at time               155000
PC returned from prefetch at time               185000
Applying PMP denial forces at time               185000 (allow up to 0x1000002F, deny 0x10000030-3F, allow trap handler 0x10000040-8F)
Timeout with PMP:     100000 cycles - stall bug
Cycles with PMP:     100000
Releasing PMP forces at           1000185000
ebreak at cycle          4 (PC: 0x10000080 [trap handler])
Cycles without PMP:          4
$finish called at time : 1000245 ns :

The critical lines here are:

  • Applying PMP denial forces... : This shows when the PMP configuration is changed to deny access.
  • Timeout with PMP: 100000 cycles - stall bug : This is the definitive indicator that the core made no progress for the full 100,000-cycle budget. The testbench then gives up on the PMP phase, releases the forces, and moves on to the baseline run.
  • Cycles with PMP: 100000 : Confirms the duration of the stall before timeout.
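
The timestamps in the log can be cross-checked with a little arithmetic. The Python sketch below pulls the relevant numbers out of the log lines quoted above and derives the clock period. It assumes the timestamps are in picoseconds, as the "Time resolution is 1 ps" line suggests, and that the timeout counter starts when the PMP forces are applied. The summarize function is a hypothetical helper written for this example.

```python
import re

LOG = """\
Applying PMP denial forces at time               185000 (allow up to 0x1000002F, deny 0x10000030-3F, allow trap handler 0x10000040-8F)
Timeout with PMP:     100000 cycles - stall bug
Cycles with PMP:     100000
Releasing PMP forces at           1000185000
ebreak at cycle          4 (PC: 0x10000080 [trap handler])
Cycles without PMP:          4
"""


def first_int(pattern, text):
    """Return the first captured integer for `pattern` in `text`."""
    return int(re.search(pattern, text).group(1))


def summarize(log):
    apply_ps = first_int(r"Applying PMP denial forces at time\s+(\d+)", log)
    release_ps = first_int(r"Releasing PMP forces at\s+(\d+)", log)
    with_pmp = first_int(r"Cycles with PMP:\s+(\d+)", log)
    without_pmp = first_int(r"Cycles without PMP:\s+(\d+)", log)
    return {
        "stalled": "stall bug" in log,
        "cycles_with_pmp": with_pmp,
        "cycles_without_pmp": without_pmp,
        # Time spent under PMP denial divided by the cycles counted.
        "ps_per_cycle": (release_ps - apply_ps) / with_pmp,
    }


summary = summarize(LOG)
print(summary)
```

Under those assumptions the PMP phase lasted 1,000,000,000 ps for 100,000 cycles, or 10,000 ps (10 ns) per cycle, so the cycle count and the timestamps agree with each other.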

Baseline (No PMP Denial) Output:

For comparison, the baseline phase, which runs after the PMP forces are released, completes far sooner and reaches the ebreak instruction within a few cycles:

  • ebreak at cycle 4 (PC: 0x10000080 [trap handler])
  • Cycles without PMP: 4

Note that even in the baseline, the ebreak appears to be at the trap handler address, suggesting other potential minor issues that are distinct from the primary stall bug but worth noting.

By following these steps and observing the specific output, developers can reliably reproduce the iCache stall due to PMP errors in the Ibex processor. This is essential for debugging and verifying any proposed fixes.

My Environment and Ibex Version

To effectively diagnose and resolve the iCache stall bug related to PMP errors, understanding the specific environment and Ibex version used is crucial. This information helps pinpoint potential toolchain issues, version-specific behaviors, or hardware dependencies.

Testbench Files:

The issue was observed and reproduced using the following testbench files:

  • tb_icache_pmp_error.sv: This SystemVerilog testbench is central to reproducing the bug. It contains the logic to orchestrate the iCache prefetch and the subsequent PMP configuration change that triggers the stall.
  • tb_icache_pmp_error_compile.do: This file contains the commands for compiling the Verilog sources using the EDA tool.
  • xrun.do: This file executes the simulation.

These files were shared as GitHub attachments, so the exact setup used to find the bug is available to anyone who wants to reproduce it.

EDA Tool and Version:

The simulation was performed using Xilinx Vivado version 2025.1. Vivado is an electronic design automation suite for FPGA design and simulation. The specific version matters because simulator behavior can differ subtly between releases.

Operating System:

The simulation environment was Cygwin. Cygwin provides a Linux-like environment on Windows, allowing users to run many Unix-specific tools and applications. While the simulation logic is in Verilog/SystemVerilog, the build and run scripts might rely on Unix-like shell commands often found in Cygwin.

Version of the Ibex Source Code:

The specific version of the Ibex source code used for this report is identified by the Git commit hash: 1f2232a94581538f535f4bb32b2455d8c9beadf1. This hash points to a particular state of the Ibex repository. It is important to note whether any modifications were made to the source code beyond what this commit represents. In this case, it is implied that the report is based on this specific commit without user modifications to the core Ibex RTL files.

This detailed environmental information ensures that the bug can be accurately reproduced by others and helps in understanding if the issue is tied to a specific tool version, operating system setup, or a particular revision of the Ibex design. It forms the basis for collaborative debugging and resolution.

Conclusion and Further Acknowledgment

The iCache stall caused by PMP errors in the Ibex processor represents a critical issue that can lead to a complete system freeze, effectively a denial-of-service. This bug, observed when the iCache attempts to fetch instructions from a memory region subsequently denied by the PMP unit, bypasses the expected exception handling mechanism, resulting in an infinite pipeline stall. The accurate reproduction steps and detailed environmental information provided aim to facilitate swift debugging and resolution by the lowRISC team.

Understanding the interaction between the iCache and the PMP is vital for building secure and reliable embedded systems. PMP is a cornerstone of memory protection, and its failure to correctly trigger exceptions when violated can undermine the security guarantees of the entire system. The observed behavior deviates significantly from the expected graceful handling of instruction access faults, which should result in a clean trap to a software handler.

We trust that this detailed report, including the specific steps to reproduce and the environmental context, will enable the Ibex development team to quickly identify the root cause within the RTL. Addressing this bug is paramount for ensuring the stability and security promises of the Ibex core, especially for applications that heavily rely on memory protection mechanisms.

For further reading on processor architecture and memory management units, the following resources are highly recommended:

  • Explore the official RISC-V Specifications on the RISC-V International website, particularly sections detailing the Privileged Architecture and the Physical Memory Protection (PMP) extension. This provides the definitive guide to expected behavior for such mechanisms. You can find detailed specifications at riscv.org.
  • Learn more about the Ibex processor and its development by visiting the lowRISC GitHub repository. This is the central hub for the project, including documentation, code, and issue tracking. Check it out at github.com/lowRISC/ibex.