Overview

The Linux kernel makes extensive use of the Berkeley Packet Filter (BPF) to let user-written BPF programs execute in kernel space. BPF relies on a verifier to statically check the safety of user-supplied BPF code. Recent attacks show that BPF programs can evade these checks and gain unauthorized access to kernel memory, indicating that the verification process is not flawless. In this paper, we present Moat, a novel hardware-assisted, cross-platform isolation framework designed to protect the kernel from malicious BPF programs.
Moat leverages two classes of hardware primitives: key-based primitives, such as Intel Memory Protection Keys (MPK) and Arm Permission Overlay Extension (POE), and virtualization-based primitives, such as Arm Stage-2 translation, AMD Rapid Virtualization Indexing (RVI) and RISC-V H-mode. Given the widespread support for Intel MPK and Arm Stage-2 translation on modern processors, we select these two primitives, one from each class, as representative examples to illustrate Moat's cross-platform design.
Moat introduces a two-layer memory isolation scheme that leverages hardware features such as Intel MPK and Arm Stage-2 translation to enforce isolation. Our design overcomes several key challenges, including the limited scalability of available hardware isolation mechanisms and the risk of helper function abuse. We implement Moat for Intel x86 and Arm on Linux (ver. 6.1.38), and our evaluation shows that Moat delivers low-cost isolation of BPF programs under mainstream use cases, such as isolating a BPF packet filter with only 3% throughput loss.


FAQ

(0) What is the difference between the USENIX Security paper and the TDSC paper?

The original Moat (USENIX Security paper) was confined to the Intel x86 platform, where it leveraged Intel MPK to isolate BPF programs. In the TDSC paper, we extend Moat into a cross-platform framework. We generalize the design of Moat to support multiple architectures and introduce a new approach, Moat-vir, for platforms that offer virtualization-based hardware primitives but lack key-based primitives such as Intel MPK.

(1) How do I run Moat?

Check out our repo. It contains a detailed guide on how to set up and run Moat. If you have questions about Moat, you can contact @jwnhy and @Lijian Huang, and we will try to help you (if you cite this paper; this is a joke).
Note that this is a highly experimental prototype. DO NOT USE IT IN PRODUCTION.

(2) What challenges has Moat overcome?

Both key-based and virtualization-based hardware primitives offer limited support for scalable, fine-grained isolation, making it difficult to efficiently isolate a large number of concurrent BPF programs. To address this hurdle, we propose a novel two-layer isolation scheme that protects both the kernel and benign BPF programs from malicious BPF programs. Layer-I leverages the hardware isolation primitives to construct three isolation domains, preventing unauthorized kernel access from BPF programs. Layer-II enforces intra-BPF isolation within the same domain by assigning each BPF program a dedicated address space, while mitigating the Translation Lookaside Buffer (TLB) flush overhead with emerging hardware features.
We also propose two schemes to regulate the behavior of BPF helper functions and prevent them from being abused by malicious BPF programs.
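
The two-layer check described above can be illustrated with a small toy model. This is only a sketch of the idea, not Moat's implementation: the domain names (KERNEL, SHARED, BPF), the Page and BpfProgram types, and the can_access function are all hypothetical and chosen for illustration. The text only states that Layer-I builds three isolation domains and that Layer-II gives each BPF program a dedicated address space. Real enforcement is done by hardware (e.g., Intel MPK or Arm Stage-2 translation), not by a software check like this one.

```python
from dataclasses import dataclass, field
from enum import Enum


class Domain(Enum):
    # Hypothetical names: the text only says there are three domains.
    KERNEL = "kernel"
    SHARED = "shared"
    BPF = "bpf"


@dataclass(frozen=True)
class Page:
    addr: int
    domain: Domain


@dataclass
class BpfProgram:
    name: str
    # Layer-II (assumed model): the set of page addresses mapped into
    # this program's dedicated address space.
    address_space: set = field(default_factory=set)


# Layer-I (assumed policy): domains reachable while BPF code is running.
BPF_ALLOWED_DOMAINS = {Domain.BPF, Domain.SHARED}


def can_access(prog: BpfProgram, page: Page) -> bool:
    """Return True only if both layers permit the access."""
    # Layer-I: block any page outside the BPF-reachable domains,
    # even if it happens to be mapped.
    if page.domain not in BPF_ALLOWED_DOMAINS:
        return False
    # Layer-II: block pages that belong to another program's address space.
    return page.addr in prog.address_space


if __name__ == "__main__":
    kernel_page = Page(0x1000, Domain.KERNEL)
    page_a = Page(0x3000, Domain.BPF)
    page_b = Page(0x4000, Domain.BPF)
    prog_a = BpfProgram("prog_a", {page_a.addr, kernel_page.addr})
    for page in (page_a, page_b, kernel_page):
        print(hex(page.addr), page.domain.value, can_access(prog_a, page))
```

In this model, a kernel page is rejected by Layer-I even when its address appears in the program's address space, while another program's page passes Layer-I but is rejected by Layer-II.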

(3) What are the application scenarios for Moat?

If you want to allow unprivileged users to run BPF programs without letting those programs break your system, you might consider porting Moat to your system.
Fully enabling unprivileged BPF requires other things as well (e.g., access control); Moat only ensures the memory and helper safety of your BPF programs.

(4) What will we do in the future?

We are actively working with a company on turning Moat into a production-level system.



Publication

Moat: Towards Safe BPF Kernel Extension

Hongyi Lu, Shuai Wang, Yechang Wu, Wanning He, Fengwei Zhang

Published in the Proceedings of the 33rd USENIX Security Symposium

@inproceedings {moat,
	author = {Hongyi Lu and Shuai Wang and Yechang Wu and Wanning He and Fengwei Zhang},
	title = {{MOAT}: Towards Safe {BPF} Kernel Extension},
	booktitle = {33rd USENIX Security Symposium (USENIX Security 24)},
	year = {2024},
	isbn = {978-1-939133-44-1},
	address = {Philadelphia, PA},
	pages = {1153--1170},
	publisher = {USENIX Association},
}

Towards Secure BPF Kernel Extension with Hardware-enhanced Memory Isolation

Lijian Huang^, Hongyi Lu^, Shuai Wang*, Fengwei Zhang*

^ Equal contribution. * Corresponding authors.

To Appear in IEEE Transactions on Dependable and Secure Computing (TDSC), 2026
Early Access