WebAssembly security: potentials and pitfalls

June 19, 2018


We at Forcepoint recently touched on the topic of WebAssembly (also known as WA or Wasm), partly in an earlier blog post on in-browser coin mining. Today we look at the basics of Wasm and discuss some of the security implications of this new technology. More posts will follow in this series, including one on reverse-engineering a basic Wasm file.

What is Wasm?

In brief, Wasm is a new way of distributing code to be executed in a browser. It is a compact binary language that cannot be run directly on the processor. Instead, code is compiled to an intermediary bytecode (similar in concept to CIL) that can be quickly converted to machine code inside the browser, and then executed much more efficiently than traditional JavaScript – indeed, execution speed was the primary driver for inventing Wasm. While Wasm is designed to run not only on the web, we will focus on the web context in this blog post.

As programs written in high-level languages such as C/C++ can be compiled to Wasm and then run in a browser, this opens up new possibilities for porting existing desktop applications to web-based versions.

Wasm runs in a sandboxed execution environment, usually but not necessarily within a browser. It is a work in progress and under constant development. Version 1.0 of Wasm, described as a ‘Minimum Viable Product’ (MVP), was released in March 2017 with support from all major browsers (Chrome, Safari, Firefox and Edge).

It should be pointed out that Wasm is not intended as a replacement for JavaScript, but rather as a complement. For example, Wasm modules may be used for computation-intensive tasks, with JavaScript (and HTML) providing the UI and gluing things together.

WASM IN A BROWSER CONTEXT

While browsers may implement Wasm support differently, the sandboxed environment used is typically the JavaScript sandbox. In all cases, Wasm will enforce the browser's security and permission policies.

When run in a browser, a Wasm application needs to have its code defined either as a separate file or as an array of bytes inside a block of JavaScript. The file or block of code is then instantiated using JavaScript because, at the time of writing, Wasm cannot be loaded directly by a page without a JavaScript wrapper.

Below is a screenshot of a simple JavaScript snippet that instantiates a Wasm module declared in an array:

Figure 1: Wasm declared in array of bytes and then instantiated.
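
For readers without access to the screenshot, the same pattern can be sketched as follows. This is a minimal illustration, not the code from Figure 1: the byte array is a hand-assembled module assumed to export a single function named add that sums two 32-bit integers, and the variable names (wasmBytes, wasmModule, wasmInstance) are chosen for this example only.

```javascript
// Hand-assembled Wasm module (illustrative): exports add(a, b) -> a + b.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d,                               // magic bytes: "\0asm"
  0x01, 0x00, 0x00, 0x00,                               // version 1, little-endian
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                               // function section: one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export section: "add" -> function 0
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section: one body, no locals
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,                   // local.get 0, local.get 1, i32.add, end
]);

// Compile the bytes, then instantiate the module with an empty import object.
const wasmModule = new WebAssembly.Module(wasmBytes);
const wasmInstance = new WebAssembly.Instance(wasmModule, {});

// Exported Wasm functions are callable like ordinary JavaScript functions.
console.log(wasmInstance.exports.add(2, 3)); // prints 5
```

The synchronous constructors are used here to keep the sketch short. For anything but tiny modules, the promise-based WebAssembly.instantiate() is the more usual choice, and some browsers restrict synchronous compilation of larger modules on the main thread.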

While Wasm can be written in languages such as C/C++, it cannot by itself interact with the environment outside its sandbox. Even an operation such as printing text to the screen requires some environment around the application where the text can be printed, whether it be a console, a canvas, a browser or something else. This means that when a Wasm application wants to print text, it needs to call a function that the browser provides, and the browser in turn will print the text somewhere.

Much as with instantiating and running Wasm code, this interface is provided by JavaScript. Further, in addition to calling imported JavaScript functions, Wasm can also export functions to be called from JavaScript.
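
The import and export mechanism can be sketched as below. This is an illustration under stated assumptions rather than code from a real application: the hand-assembled module is assumed to import a function log from a namespace named env and to export a function run that calls log(42). The names env, log, run, importBytes and received are hypothetical and exist only for this example.

```javascript
// Hand-assembled module (illustrative): imports env.log(i32), exports run().
const importBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,             // magic + version 1
  0x01, 0x08, 0x02, 0x60, 0x01, 0x7f, 0x00, 0x60, 0x00, 0x00, // types: (i32) -> (), () -> ()
  0x02, 0x0b, 0x01, 0x03, 0x65, 0x6e, 0x76,                   // import section: module "env"
  0x03, 0x6c, 0x6f, 0x67, 0x00, 0x00,                         //   field "log", function of type 0
  0x03, 0x02, 0x01, 0x01,                                     // function section: one function of type 1
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x01,       // export section: "run" -> function 1
  0x0a, 0x08, 0x01, 0x06, 0x00,                               // code section: one body, no locals
  0x41, 0x2a, 0x10, 0x00, 0x0b,                               // i32.const 42, call 0 (env.log), end
]);

// Values handed to JavaScript by the Wasm code are collected here.
const received = [];

// The import object is the only door out of the sandbox for this module.
const importObject = {
  env: {
    log: (value) => {
      received.push(value);
      console.log('Wasm called log with', value);
    },
  },
};

const importInstance = new WebAssembly.Instance(
  new WebAssembly.Module(importBytes),
  importObject
);

// JavaScript calls the export, which in turn calls the imported function.
importInstance.exports.run();
```

Instantiation fails with a LinkError if the import object does not supply every function the module declares, which makes the list of imports a useful first indicator of what a module is able to do.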

WASM MEMORY HANDLING

Memory in Wasm is linear, and it is shared between the Wasm application and JavaScript. When a Wasm function returns a string to JavaScript, it actually returns a pointer to a position inside the Wasm application’s memory space. The Wasm application itself can only access the part of JavaScript’s memory that is allotted to it, not the entirety of the memory space.
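
How a returned pointer is turned into a string on the JavaScript side can be sketched as follows. To stay self-contained, the sketch uses no compiled module: writeCString is a hypothetical stand-in for what a Wasm function would do before returning, and the address 1024 is an arbitrary assumption. Only WebAssembly.Memory, TextEncoder and TextDecoder are real APIs here.

```javascript
// One page of linear memory (64 KiB), as a module would export or import it.
const linearMemory = new WebAssembly.Memory({ initial: 1 });
const heap = new Uint8Array(linearMemory.buffer);

// Hypothetical stand-in for the Wasm side: place a NUL-terminated string
// at a given offset and "return" that offset as the pointer.
function writeCString(view, ptr, text) {
  const encoded = new TextEncoder().encode(text);
  view.set(encoded, ptr);
  view[ptr + encoded.length] = 0;
  return ptr;
}

// JavaScript side: the "pointer" is just an index into linear memory.
function readCString(view, ptr) {
  let end = ptr;
  while (end < view.length && view[end] !== 0) {
    end++;
  }
  return new TextDecoder('utf-8').decode(view.subarray(ptr, end));
}

const stringPtr = writeCString(heap, 1024, 'hello from linear memory');
const decoded = readCString(heap, stringPtr);
console.log(stringPtr, decoded);
```

Because the pointer is only an offset into this one buffer, the Wasm code has no way to address anything outside it. The JavaScript side, by contrast, can read and modify every byte of the buffer.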

COMMUNICATION CONSTRAINTS

As previously noted, a Wasm application cannot directly interact with the system outside of the sandboxed environment. As such, it cannot directly open a normal network socket to the outside world. Instead, much like JavaScript, it can open a WebSocket.

This means that within a browser, a Wasm application cannot just call anywhere it wants. The receiving end has to be ready to receive a WebSocket connection, meaning that outbound connections effectively have to be wrapped in an HTTP request. This was discussed in depth in our blog post on in-browser miners.

Wasm can also take advantage of the XMLHttpRequest API exposed through JavaScript to make normal HTTP requests. These requests are subject to the browser's same-origin policy.

Detection of Wasm files

A Wasm binary file typically has the extension “.wasm”. Whether defined as a separate file or as an array of bytes inside a block of JavaScript, the Wasm binary starts with the magic bytes \x00\x61\x73\x6d. That is: the null byte followed by the ASCII codes for the string “asm”. After that come four bytes in little-endian format giving the version number, for example 0x1 for the current version. In a hex editor it might look like this:

Figure 2: Wasm header in hex editor.

Much as with PE files, this header is generally a reliable way of identifying Wasm files.
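
The header check described above can be sketched as a small helper. This is a minimal sketch: parseWasmHeader is a name chosen for this example, and it inspects only the eight header bytes without validating the rest of the file.

```javascript
// Magic bytes of a Wasm binary: NUL followed by "asm".
const WASM_MAGIC = [0x00, 0x61, 0x73, 0x6d];

// Returns { version } if the buffer starts with a Wasm header, otherwise null.
function parseWasmHeader(bytes) {
  if (bytes.length < 8) {
    return null; // too short to hold magic bytes plus version
  }
  for (let i = 0; i < WASM_MAGIC.length; i++) {
    if (bytes[i] !== WASM_MAGIC[i]) {
      return null;
    }
  }
  // The version is a 32-bit unsigned integer stored in little-endian order.
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  return { version: view.getUint32(4, true) };
}

const wasmHeader = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);
const peHeader = new Uint8Array([0x4d, 0x5a, 0x90, 0x00, 0x03, 0x00, 0x00, 0x00]);

console.log(parseWasmHeader(wasmHeader)); // { version: 1 }
console.log(parseWasmHeader(peHeader));   // null
```

A match on the header says nothing about whether the remainder of the file is well-formed. WebAssembly.validate() can be used for that where a JavaScript engine is available.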

Beyond this, the Wasm needs to be instantiated within the JavaScript wrapper – i.e. somewhere within the more easily readable code of a web page. This instantiation can be detected by looking for one of the following function calls:

new WebAssembly.Instance(...)
WebAssembly.instantiate(...)
WebAssembly.instantiateStreaming(...)

See Figure 1 further up for an example of using the WebAssembly.Instance() constructor.
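
A simple scan for these calls in page source might look like the sketch below. It is a heuristic only, assuming the calls appear in clear text: findWasmInstantiations and the regular expression are written for this example, and obfuscated or dynamically constructed calls will not be found.

```javascript
// Matches the three instantiation calls listed above, tolerating whitespace.
const INSTANTIATION_PATTERN =
  /\bnew\s+WebAssembly\s*\.\s*Instance\s*\(|\bWebAssembly\s*\.\s*(?:instantiateStreaming|instantiate)\s*\(/g;

// Returns the position and text of every match found in the source.
function findWasmInstantiations(source) {
  return Array.from(source.matchAll(INSTANTIATION_PATTERN), (m) => ({
    index: m.index,
    match: m[0],
  }));
}

const sampleSource = `
  const inst = new WebAssembly.Instance(mod, imports);
  WebAssembly.instantiateStreaming(fetch('miner.wasm'), imports);
`;

for (const hit of findWasmInstantiations(sampleSource)) {
  console.log(hit.index, hit.match);
}
```

In practice such a scan works best in combination with the magic-byte check, since either indicator alone is easy to hide.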

Wasm security model

The security model of Wasm has the following declared goals:

Protect users from applications having vulnerabilities due to inadvertent bugs.
Protect users from applications that are purposely written to be malicious.
Provide developers with good mitigations against exploitation.

To this end, there are several features designed into Wasm that work towards meeting these goals:

A Wasm application runs inside a sandbox. Any interaction with the world outside the sandbox (e.g. with JavaScript) needs to go through an appropriate API.
Function calls cannot be done to arbitrary addresses. Instead, functions are numbered, and their number is an index in a function table.
Indirect function calls are subject to a type signature check.
The call stack is protected, meaning it is not possible to overwrite a return pointer.
Control-flow integrity is implemented, meaning that calling a function that is unexpected is going to fail. Expected and unexpected paths of execution are determined at compile time. This makes it hard to hijack the control flow of a Wasm application.

These exploit mitigations go a long way towards protecting a Wasm application against abuse. However, the protections are not 100% watertight (few are), and there are conditions that can nevertheless be exploited in some circumstances.

Potential security issues with Wasm

There are several potential security issues to consider with Wasm.

Firstly, as with the implementation of any new feature, the attack surface of a browser has increased: there is a real risk of implementation bugs in Wasm support that may give attackers the opportunity to execute code in a victim’s browser.

Secondly, Wasm applications themselves are potentially exposed to some (but not all) of the vulnerability classes seen in native applications. This is a particular possibility for applications cross-compiled from pre-existing codebases with only minor modifications to allow them to work within a browser framework.

Thirdly, there currently is no way to do integrity checking on Wasm applications. This means that there is no process for verifying that a Wasm application has not been tampered with.

Potential malicious Wasm applications

Despite a Wasm design goal of protecting users against malicious applications, threat actors still have a lot of opportunities. Let us take a couple of examples.

Cryptocurrency mining has become a popular activity for malicious actors, leveraging a victim’s CPU and electricity to make money. JavaScript-based miners can be injected into compromised web pages. With a Wasm-based approach, the return on investment will be higher for the malicious actors, since heavy math calculations can be done faster with Wasm than with JavaScript. To date, the majority of Wasm samples we have analyzed have been associated with cryptocurrency miners.

Another opportunity for an attacker may be the exploitation of hardware bugs. Very recently, a new CPU side-channel attack, Speculative Store Bypass (CVE-2018-3639), was announced. Earlier this year, browsers responded to the related CPU vulnerabilities Spectre and Meltdown by lowering the precision of timers in JavaScript. However, once Wasm gets support for threads with shared memory (which is already on the Wasm roadmap), very accurate timers can be created. That may render browser mitigations of certain CPU side-channel attacks ineffective.

Analyzing Wasm

From a research and analysis standpoint, Wasm is significantly more laborious to analyze than JavaScript because Wasm is a binary format whereas JavaScript is clear text. While malicious actors often heavily obfuscate JavaScript, deobfuscation is still relatively easy. By bringing Wasm into the game, bad guys get new ways to hide and obfuscate the intentions of their code.

Because Wasm is a rather new technology, there are not many publicly available tools for analyzing Wasm binaries. Similarly, hardly any documentation exists at this time on how to analyze a Wasm application. This means that an unknown Wasm application can largely be a black box to a human analyst. The researcher may need to resort to analyzing only the network traffic, without being able to understand the inner workings of the code. We expect the situation to improve with time though, as more tools and documentation are released.

Conclusion

WebAssembly is an interesting technology, especially for developers looking to create performance-intensive programs that run platform-independently in browsers (e.g. games).

However, as with many new technologies, there are potential security issues which need to be considered. Collectively, these present new opportunities for malicious actors. Much as with JavaScript, the possibilities with Wasm are – if not quite endless – very broad.

Forcepoint takes a combined approach to detecting and blocking coin miners associated with compromised websites: blocking the instances of the scripts which we identify but, more critically, blocking the WebSocket command/relay servers which entire campaigns depend on.

Those interested in looking at Wasm in further depth should take a look at our follow-up blog post available here.