In 2014 Nvidia released its own ``Nvidia Shield`` tablet. It is quite a good tablet whose main selling point is its ``Tegra K1`` SoC with a capable GPU. The tablet was aimed at gamers.
However, shortly after launch it was discovered that some batteries had a tendency to expand or burn. To solve this, Nvidia started a replacement program. Users could enter their serial number and home address on an Nvidia website and would receive a new tablet from Nvidia free of charge. The old tablet would then be disabled remotely.
## Introduction
### OTA
Back then a relatively new concept was introduced in Android devices: the **O**ver **T**he **A**ir (OTA) update, a mechanism for updating Android devices remotely. This mechanism allowed Nvidia to push updates to specific devices. The update that was pushed to the devices that were to be disabled was simple: it corrupted the BCT (more on that later) and the bootloader of the bad tablets and forced a reboot. After that the device would never wake up again (until now).
On XDA there is a [big post](https://xdaforums.com/t/kill-the-kill-switch-st-yy.3179489/) about how the update works, so all credit goes to the people who investigated the update back then (@Beauenheim, @Jackill, and @runandhide05).
In this post I restore one of the disabled tablets to a functioning state. To do this I use a BootROM exploit and a debugger that I developed in the process. The tablet also serves as an example device for several other use cases, which I will write about later.
### Nintendo Switch
The [Nintendo Switch](https://en.wikipedia.org/wiki/Nintendo_Switch) is a game console launched in 2017 that contains *almost* the same chip as the Nvidia Shield Tablet. Several independent researchers found a bug in the BootROM and used it to gain access to the Nintendo Switch (Fusée Gelée). This exploit was then adapted to the T124 chip, which is the one in the Nvidia Shield, [by LordRafa](https://github.com/LordRafa/ShofEL2-for-T124).
So there is a way to gain code execution on the device. Let's see if we can use it to restore a tablet.
## eMMC
The simplest approach to unbricking this device is to desolder the eMMC chip and reprogram it. To reprogram it you can use an eMMC writer such as EasyJTAG, or a device like the one I used below:
![EMMC programming](images/emmc_isp.jpeg)
This connects the chip as an SD card and allows you to reflash the bootloader. The chip can then be soldered back onto the tablet and it will work again.
But this is no fun at all and I wouldn't get to introduce my debugger, so let's continue with our software-only approach.
Connecting the tablet over USB gives the following in dmesg:
```
[323058.201469] usb 1-5.3.1: new high-speed USB device number 91 using xhci_hcd
[323058.302006] usb 1-5.3.1: New USB device found, idVendor=0955, idProduct=7f40, bcdDevice= 1.01
[323058.302011] usb 1-5.3.1: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[323058.302012] usb 1-5.3.1: Product: APX
[323058.302014] usb 1-5.3.1: Manufacturer: NVIDIA Corp.
```
And in lsusb:
```bash
Bus 001 Device 091: ID 0955:7f40 NVIDIA Corp. APX
```
APX is Nvidia's emergency download mode, similar to EDL mode on Qualcomm devices. Reusing the code from LordRafa, I dumped the BootROM and loaded it into Ghidra. The source code of both the T210 and T214 BootROMs has also been leaked on GitHub, which makes reversing much easier.
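Before running the exploit it helps to confirm that the tablet really is in APX mode. The following is a minimal sketch that scans lsusb output for the vendor/product ID shown above; the helper name ``find_apx_devices`` is made up for this example and is not part of any tool mentioned here.

```python
import re

# USB IDs taken from the lsusb/dmesg output above.
APX_VENDOR_ID = 0x0955
APX_PRODUCT_ID = 0x7F40

LSUSB_LINE = re.compile(
    r"Bus (?P<bus>\d+) Device (?P<dev>\d+): "
    r"ID (?P<vid>[0-9a-fA-F]{4}):(?P<pid>[0-9a-fA-F]{4})"
)


def find_apx_devices(lsusb_output):
    """Return (bus, device) pairs for every APX-mode device in lsusb output."""
    found = []
    for line in lsusb_output.splitlines():
        match = LSUSB_LINE.match(line.strip())
        if not match:
            continue
        vid = int(match["vid"], 16)
        pid = int(match["pid"], 16)
        if (vid, pid) == (APX_VENDOR_ID, APX_PRODUCT_ID):
            found.append((int(match["bus"]), int(match["dev"])))
    return found


sample = "Bus 001 Device 091: ID 0955:7f40 NVIDIA Corp. APX"
print(find_apx_devices(sample))
```

In practice the input would come from running ``lsusb`` and capturing its output; here a single sample line keeps the sketch hardware-free.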
### BCT
One of the offsets provided by LordRafa is *do_bct_boot*, which should allow loading a BCT without security checks. This would be a good starting point; however, upon further inspection it only resets the chip on my ROM.
I think the intended function was 4 bytes higher, which would allow booting a BCT from UART. This is nice, but I have no UART on this device. There is a [github issue](https://github.com/NVIDIA/tegra-nouveau-rootfs/issues/22) that describes a UART interface over the SD card controller. I don't know if it will work from within the ROM, and it's probably not worth the effort, since I would probably have to maintain control over the device after it boots the BCT (TrustZone, fastboot locks).
### PCB
The exploit chain from LordRafa allows us to load and jump into a new payload. When the device hangs you can press and hold the power button to reset it and then try again. One thing I have learned is that it's always worth the time to make a proper debug setup. So I took a PCB out of a tablet, connected it to a laptop and started shorting lines to ground on the tablet PCB. At some point there was a reset, which I could see in dmesg. I soldered this line to a button and put the whole thing on a PCB. This way I can reset the device with a single push and try again, which saves lots of time and frustration.
I of course forgot to document what I soldered before applying the hot glue, so I have no clue which pin I am shorting.
![pcb setup](images/pcb_setup.jpeg)
There is also a Raspberry Pi Pico on this board, which might be useful for glitching later on.
### Debugger instrumentation
LordRafa already found the USB send/recv functions. Upon further inspection we can also see where our payload needs to be loaded, so let's try to connect the debugger.
```c
#include <stdint.h>

// Thin wrapper around the BootROM's USB receive routine.
int recv(void *buffer, uint32_t size, uint32_t *num_xfer){
    usb_recv(buffer, size, num_xfer);
    return (int)*num_xfer; // number of bytes received
}
```
For the peek/poke part, this is all that needs to be set up for the debugger to function properly, at least for this target.
#### Memory
The debugger needs to know at compile time where it is located, as well as where the stack and storage locations are. We define them in a symbols file and pass them to the linker:
```text
debugger_storage = 0x40013000;
debugger_stack = 0x40014000;
```
Linker script:
```text
MEMORY {
    ROM (rwx): ORIGIN = 0x4000E000, LENGTH = 0x1000
}

SECTIONS
{
    . = 0x4000E000;
    .text . : {
        *(.text*)
        *(.data*)
        *(.rodata*)
    } >ROM
}
```
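A quick sanity check on such a layout is to make sure the storage and stack symbols do not land inside the region the debugger code is linked into. The sketch below parses the symbols file format shown above and compares it with the linker script's origin and length; the helper names are made up for this example.

```python
# Values copied from the linker script above.
ROM_ORIGIN = 0x4000E000
ROM_LENGTH = 0x1000

SYMBOLS_TXT = """
debugger_storage = 0x40013000;
debugger_stack = 0x40014000;
"""


def parse_symbols(text):
    """Parse 'name = 0xADDR;' lines as used in the symbols file."""
    symbols = {}
    for line in text.splitlines():
        line = line.strip().rstrip(";")
        if not line:
            continue
        name, value = (part.strip() for part in line.split("="))
        symbols[name] = int(value, 16)
    return symbols


def overlaps_code(addr, origin=ROM_ORIGIN, length=ROM_LENGTH):
    """True if addr falls inside the region the debugger code is linked into."""
    return origin <= addr < origin + length


symbols = parse_symbols(SYMBOLS_TXT)
for name, addr in symbols.items():
    print(f"{name}: {addr:#x} overlaps code: {overlaps_code(addr)}")
```

The same check becomes more useful once a second, relocated layout exists, since it is easy to move the code and forget the symbols.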
Finally we need to build the target:
```makefile
ifeq ($(ANDROID_NDK_ROOT),)
$(error Error : Set the env variable 'ANDROID_NDK_ROOT' with the path of the Android NDK (version 20))
endif
```
Let's connect to it from the ``Ghidra Assistant``. I rewrote the ShofEL2 exploit in Python; the class is called ``TegraRCM``. Maybe I will write a bit more later about how it works, but there are already plenty of descriptions of the Fusée Gelée exploit.
```python
# Overwrite all calls to make the concrete target function properly
concrete_device.copy_functions()
```
Once we see b"GiAs" over our USB endpoint we know that the debugger is functional. We can test this by sending b"PING" over USB:
```python
rcm.dev.write(b"PING") # Writes
rcm.dev.read(0x100) # try to receive some data.
b'PONG'
```
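The handshake can be wrapped in a small helper. This is only a sketch: ``FakeDevice`` stands in for the USB device so the example runs without hardware, and its ``write``/``read`` methods simply mirror the calls in the snippet above rather than any real USB library API.

```python
class FakeDevice:
    """Stand-in for the USB device so the sketch runs without hardware."""

    def __init__(self):
        self._rx = [b"GiAs"]  # banner sent once the debugger is up

    def write(self, data):
        # The debugger answers PING with PONG.
        if data == b"PING":
            self._rx.append(b"PONG")

    def read(self, size):
        return self._rx.pop(0)[:size] if self._rx else b""


def debugger_alive(dev):
    """Wait for the GiAs banner, then check that PING is answered with PONG."""
    if dev.read(0x100) != b"GiAs":
        return False
    dev.write(b"PING")
    return dev.read(0x100) == b"PONG"


print(debugger_alive(FakeDevice()))
```

Checking the banner first avoids sending a PING into a payload that never started.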
The ``ConcreteDevice`` implementation needs to be configured with the correct addresses of the debugger and told what the architecture is. In this case that is Thumb, so the architecture debugger is set to that mode.
#### Playing around
Because the debugger is running we can now do some really cool stuff. For instance let's try to see the state of the processor:
And now one of the coolest features: we can restore the context and jump to a user-specified address. So if we have the following function in our BootROM:
A variable ``bootinfo`` is loaded from a hardcoded offset in memory, populated with some version info and then passed to the function ``NvBootColdBoot``. We want to execute that function to see what the result is. The easiest method would be to patch this function so that it jumps back to the debugger. If we jump in at
```c
puVar5 = BootRomVersion;
```
We can continue execution until after ``NvBootColdBoot``. But can we patch code in the ROM? Let's test this:
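A write-then-read-back check is enough for a test like this. The sketch below simulates it with a stand-in memory class, since the real check runs against the device. ``FakeTarget``, ``read_u32`` and the ROM address range are assumptions made for illustration; only ``write_u32`` appears elsewhere in this post.

```python
class FakeTarget:
    """Simulated memory: writes into the ROM range are silently dropped."""

    ROM_START, ROM_END = 0x00100000, 0x00110000  # assumed range, for illustration

    def __init__(self):
        self.mem = {}

    def read_u32(self, addr):
        return self.mem.get(addr, 0)

    def write_u32(self, addr, value):
        if self.ROM_START <= addr < self.ROM_END:
            return  # no fault is raised, the write just has no effect
        self.mem[addr] = value & 0xFFFFFFFF


def is_writable(target, addr):
    """Invert one word, read it back, then restore the original value."""
    original = target.read_u32(addr)
    probe = original ^ 0xFFFFFFFF
    target.write_u32(addr, probe)
    changed = target.read_u32(addr) == probe
    target.write_u32(addr, original)
    return changed


target = FakeTarget()
print(is_writable(target, 0x00101AD0))  # inside the simulated ROM
print(is_writable(target, 0x40000000))  # simulated RAM
```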
The result is not written back. The reason is that the ROM is shadowed (no faults are raised when trying to write to it). This means we can't write patches to the ROM. One approach would be to remap part of the ROM to RAM. This would allow patching, but it's quite a bit of work. It will probably become a feature of the debugger in the future, but for now it's not supported by default. Maybe instead we can just set up the memory structure ourselves and execute the function?
```python
NVCOLDBOOT = 0x00101ad2

def nvbootcoldboot():
    '''
    Works, attempts to load the BCT from the EMMC
    '''
    nvbootinfo = 0x40000000
    cd.write_u32(nvbootinfo, 0x400001)
    cd.write_u32(nvbootinfo + 4, 0x400001)
    cd.write_u32(nvbootinfo + 8, 0x400001)
    # cd.write_u32(nvbootinfo + 0xc, 0x00000001) # Boot type, set later also
```
The error code in this case is 8 (``NvBootError_DeviceError``), because I desoldered the eMMC chip from this device. On a device with a working eMMC chip the error code is 24 (``NvBootError_IdentificationFailed``).
### K1 tablet
At this point I bought a working K1 tablet from Marktplaats (the Dutch eBay). Since this tablet has the same SoC, it has the same BootROM vulnerability, so I can reuse my code while also seeing what a *correct* boot flow looks like.
If we execute the same function on the K1 tablet the result is 0x0, as expected. The local variable ``&BootAddress`` is also set to ``0x4000e000``. This should be the branch target. Let's dump this region and see what is in it.
![stage2 bootloader](images/stage2_entry.png)
Looking at it in Ghidra, we can see that it's valid code. The MSR instruction is clearly visible, and you almost always find that instruction at the start of a bootloader, since security/system control is transferred to that bootloader.
Maybe we can just try to boot this code directly? There is one issue, however: the debugger is currently located at ``0x4000e000`` and loading the bootloader there would overwrite it. The debugger was built with this problem in mind, so we can just relocate it to another position. To do this, we create a new entry in the Makefile and a new symbols_reloc.txt, along with a linker script linkscript_reloc.ld:
We can load the debugger at that address and tell the Ghidra Assistant that we are relocating the debugger. We can check whether it worked by querying the ``debugger_main`` function of the debugger.
A crash, of course; I didn't really expect that to work. But executing the same code on the working tablet results in a booting device. Let's debug what is going on.
#### Hooking
The stage2 bootloader contains a lot more strings than the ROM. I searched the internet for leaked sources of this bootloader but found nothing. Reversing bootloaders can be challenging when there are almost no strings; this one has several, and they seem to be used to log the state of the bootloader over UART. Sadly we don't have UART to inspect what is going on, but we do have the debugger.
We can just place a hook in the log function and jump to the debugger. Then we can inspect the state of the device and dump the log string to see what is being logged. Setting up the hook is easy:
```python
    cd.restore_stack_and_jump(cd.arch_dbg.state.LR) # Restore as if nothing happened.
except:
    pass
```
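Inside such a hook, the main job is turning the pointer passed to the log function into a readable string. Here is a minimal sketch, assuming the first argument (``R0`` under the ARM calling convention) points at a NUL-terminated format string. It uses a small byte buffer in place of real device memory, and ``read_cstring`` is a made-up helper rather than part of the debugger.

```python
def read_cstring(memory, base, addr, max_len=256):
    """Read a NUL-terminated string at addr from a dump that starts at base."""
    start = addr - base
    if not 0 <= start < len(memory):
        raise ValueError(f"address {addr:#x} is outside the dump")
    end = memory.find(b"\x00", start, start + max_len)
    if end == -1:
        # No terminator found: cap the result at max_len bytes.
        end = min(start + max_len, len(memory))
    return memory[start:end]


# Tiny fake dump standing in for the bootloader's memory.
BASE = 0x4000E000
dump = b"\x00" * 0x10 + b"Battery Present\n\x00" + b"Reset reason: %s\n\x00"

r0 = BASE + 0x10  # pretend the hook fired with R0 pointing here
print(read_cstring(dump, BASE, r0))
```

Because only the format string is read, placeholders such as ``%s`` and ``%d`` show up unexpanded in the captured messages.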
The logged strings are fascinating. On the good device:
```
b'Checking whether Onsemi FG present \n'
b'[TegraBoot] (version %s)\n'
b'Processing in cold boot mode\n'
b'Reset reason: %s\n'
b'Battery Present\n'
b'Battery Voltage: %d mV\n'
b'Battery charge sufficient\n'
b'Error getting nvdumper carve out address! Booting normally!\n'
b'Sdram initialization is successful \n'
b'PMU BoardId: %d\n'
b'CPU power rail is up \n'
b'Performing RAM repair\n'
b'CPU clock init successful \n'
b'Bootloader downloaded successfully.\n'
b'CPU-bootloader entry address: 0x%x \n'
b'BoardId: %d\n'
b'%s Carveout Base=0x%x%08x Size=0x%x%08x\n'
b'%s Carveout Base=0x%x%08x Size=0x%x%08x\n'
b'%s Carveout Base=0x%x%08x Size=0x%x%08x\n'
b'%s Carveout Base=0x%x%08x Size=0x%x%08x\n'
b'%s Carveout Base=0x%x%08x Size=0x%x%08x\n'
b'Platform-DebugCarveout: %d\n'
b'Using GP1 to query partitions \n'
b'WB0 init successful\n'
b'Secure Os PKC Verification Success.\n'
b'Loading and Validation of Secure OS Successful\n'
b'NvTbootPackSdramParams: start. \n'
b'NvTbootPackSdramParams: done. \n'
b'Starting CPU & Halting co-processor \n\n'
```
And on the device with the corrupted bootloader:
```
b'Checking whether Onsemi FG present \n'
b'[TegraBoot] (version %s)\n'
b'Processing in cold boot mode\n'
b'Reset reason: %s\n'
b'Battery Present\n'
b'Battery Voltage: %d mV\n'
b'Battery charge sufficient\n'
b'Error getting nvdumper carve out address! Booting normally!\n'
b'Sdram initialization is successful \n'
b'PMU BoardId: %d\n'
b'CPU power rail is up \n'
b'Performing RAM repair\n'
b'CPU clock init successful \n'
b'Instance[%d] bootloader is corrupted trying for next Instance !\n'
b'Instance[%d] bootloader is corrupted trying for next Instance !\n'
b'No Bootloader is found !\n'
b'Error in %s: 0x%x !\n'
b'Error is %x \n'
```
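To pinpoint where the two boots diverge, the captured message lists can be compared entry by entry. The sketch below uses only the last few messages of each log above to stay short; ``first_divergence`` is a helper name chosen for this example.

```python
GOOD = [
    b"Performing RAM repair\n",
    b"CPU clock init successful \n",
    b"Bootloader downloaded successfully.\n",
    b"CPU-bootloader entry address: 0x%x \n",
]
BAD = [
    b"Performing RAM repair\n",
    b"CPU clock init successful \n",
    b"Instance[%d] bootloader is corrupted trying for next Instance !\n",
    b"Instance[%d] bootloader is corrupted trying for next Instance !\n",
    b"No Bootloader is found !\n",
]


def first_divergence(a, b):
    """Index of the first differing entry, or None if the logs are identical."""
    for index, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return index
    # One log may simply be a prefix of the other.
    return None if len(a) == len(b) else min(len(a), len(b))


index = first_divergence(GOOD, BAD)
print(index, GOOD[index], BAD[index])
```

The first differing pair is the interesting one: the good device reports the bootloader as downloaded, while the bad one reports it as corrupted.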
So sadly the next-stage bootloader is broken as well. Maybe we can do the same trick?
On the good device I placed a hook on the message *Bootloader downloaded successfully*. After dumping the registers it seems that the bootloader is loaded at address ``0x83d88000`` (in DRAM). This bootloader is also much bigger than the previous bootloaders, which is a good sign (it probably contains fastboot).
For the bad device I waited for the message *bootloader is corrupted* and continued execution just after the good log message:
```python
elif b"corrupted" in msg or b"GPT failed" in msg:
    # Restore bootloader
    print(msg)
    with open("/tmp/bootloader.bin", 'rb') as f:
        dat = f.read()
    cd.memwrite_region(0x83d88000, dat[:0x90000])
    # Jump to bootloader loaded
    cd.arch_dbg.state.R0 = 0 # set bootloader as loaded correctly
```
The boot now gets further, but ends with:
```
1073848940:b'Secure Os PKC Verification Failed !\n'
1073848972:b'Error in %s [%d] \n'
1073849136:b'Validation of secure os failed !\n'
1073801008:b'Device Shutdown called ...\n'
```
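The decimal number in front of each message looks like the address of the logged string. Assuming that is what it is, converting the values to hex shows that they all sit at or above ``0x4000e000``, the address the stage2 bootloader was loaded to. ``as_address`` is just a formatting helper for this example.

```python
STAGE2_BASE = 0x4000E000  # load address of the stage2 bootloader seen earlier

# Decimal prefixes and messages copied from the log above.
logged = {
    1073848940: b"Secure Os PKC Verification Failed !\n",
    1073848972: b"Error in %s [%d] \n",
    1073849136: b"Validation of secure os failed !\n",
    1073801008: b"Device Shutdown called ...\n",
}


def as_address(value):
    """Format a logged decimal value as a 32-bit hex address."""
    return f"{value:#010x}"


for value, message in logged.items():
    offset = value - STAGE2_BASE
    print(f"{as_address(value)} (stage2 + {offset:#x}): {message!r}")
```

Having the offsets relative to the load address makes it easier to find the same strings in a Ghidra project of the dumped bootloader.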
It seems something like TrustZone is used to verify the next boot stage. And since this bootloader was taken from another device (the K1 revision), it's probably signed with the wrong keys. Maybe we can just skip the TrustZone call and see if it boots?
Let's place a hook at the message *WB0 init successful* and jump to the function after the TrustZone validation:
Now, let's try to flash a good bootloader. I grabbed the latest update from Nvidia's website and unpacked it. It contains a *blob* file that needs to be flashed to the staging partition:
```bash
$ fastboot flash staging blob
< waiting for any device >
Sending 'staging' (17598 KB) FAILED (remote: 'Bootloader is locked.')
fastboot: error: Command failed
```
Of course, the bootloader is locked, so we are not allowed to flash or boot anything. Let's patch that out.
### Fastboot
When fastboot loads on this device, it reads a status from somewhere (probably some metadata partition) that determines its lock state. Luckily for us, the strings *locked* and *unlocked* are used to display whether the device is locked or not. After a bit of reversing I found this suspicious function:
It turns out that it is doing an SMC call to determine its lock status. We can of course just patch this function to tell fastboot it's always unlocked:
It works, we can now flash the staging partition. And yes the device fully boots.
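As a rough illustration of what a "always return a constant" patch can look like, the sketch below builds a two-instruction Thumb stub (``movs r0, #imm`` followed by ``bx lr``) and writes it over the start of a function in a byte buffer. Several things here are assumptions and not taken from the actual bootloader: that the function runs in Thumb mode, that a return value of 1 means unlocked, and the offset used.

```python
import struct

# Thumb encodings: MOVS r0, #imm8 is 0x2000 | imm8, BX lr is 0x4770.
def thumb_return_const(value):
    """Build a 4-byte Thumb stub: movs r0, #value ; bx lr."""
    if not 0 <= value <= 0xFF:
        raise ValueError("MOVS immediate must fit in 8 bits")
    return struct.pack("<HH", 0x2000 | value, 0x4770)


def patch(image, offset, stub):
    """Return a copy of image with stub written at offset."""
    if offset < 0 or offset + len(stub) > len(image):
        raise ValueError("patch does not fit in image")
    patched = bytearray(image)
    patched[offset:offset + len(stub)] = stub
    return bytes(patched)


image = bytes(16)        # placeholder for the bootloader image
LOCK_CHECK_OFFSET = 8    # hypothetical offset of the lock-check function
patched = patch(image, LOCK_CHECK_OFFSET, thumb_return_const(1))
print(patched.hex())
```

On the device the same bytes would be written into the bootloader copy in DRAM before jumping to it, instead of into a local buffer.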
## Debugger
The main thing I wanted to show is of course the debugger (Gupje) and the ``Ghidra Assistant``, which work very well for reversing and post-exploitation tasks like this. Thanks for reading.