WebAssembly: Neither Web, Nor Assembly, but Revolutionary


Thank you to Jay Phelps for contributing this article to JavaScript January!

You’ve probably heard it claimed that WebAssembly is fast. But what exactly does that mean and what is it? In this post, we’ll dive into those questions and also get a glimpse into the future.

I think it’s important to start with some clarification: statements about performance are always relative, and in the case of WebAssembly there is a lot of nuance that is often difficult to see in full. So that this post doesn’t read as one continuous ramble of “well, actually”, not everything is elaborated on. Keep that in mind! Just remember that WebAssembly is exciting, but it’s not a silver bullet.


WebAssembly (aka Wasm) is an efficient, safe, low-level bytecode for the Web. That sounds nice, but let’s break this down to truly understand what we mean.

Efficient

When most people talk about WebAssembly, they talk about its potential performance benefits at runtime because it’s compiled to efficient machine code by your browser. However, runtime performance isn’t the only thing it’s efficient at.

WebAssembly was designed from the ground up to have a very compact, binary file format, so it can be fast to download, but more importantly it can be compiled to machine code while it’s being downloaded. This is called Streaming Compilation.

Before WebAssembly, if you wanted to use a language like C++ or Rust on the Web, it had to be compiled to JavaScript. While modern JavaScript virtual machines do compile to native code, that process can’t fully start until the file has finished downloading. WebAssembly, on the other hand, can be compiled to machine code as the bytes come in. That means significantly faster startup times, and it often makes your Internet speed the biggest bottleneck, particularly on mobile.
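In browsers, streaming compilation is exposed through `WebAssembly.instantiateStreaming`. The following is a minimal TypeScript sketch rather than a definitive loader: the `loadWasm` helper name is invented for this illustration, and it assumes the server sends the file with the `application/wasm` MIME type, which the streaming API requires.

```typescript
// Hypothetical helper: compile a module while its bytes are still arriving.
async function loadWasm(
  url: string,
  imports: WebAssembly.Imports = {},
): Promise<WebAssembly.Instance> {
  if (typeof WebAssembly.instantiateStreaming === "function") {
    // fetch() is passed in un-awaited, so compilation can overlap the download.
    const { instance } = await WebAssembly.instantiateStreaming(fetch(url), imports);
    return instance;
  }
  // Fallback for engines without the streaming API:
  // download everything first, then compile.
  const response = await fetch(url);
  const bytes = await response.arrayBuffer();
  const { instance } = await WebAssembly.instantiate(bytes, imports);
  return instance;
}
```

Calling it would look like `const instance = await loadWasm("/app.wasm")`, where the path is only a placeholder.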

Streaming Compilation is wonderful, but this is one of those cases that calls for nuance: just because WebAssembly can do it doesn’t mean compiling to WebAssembly is always a win. It’s not hard to craft handwritten JavaScript that’s much smaller in file size than a WebAssembly binary compiled from C++, because handwritten JavaScript doesn’t need to ship an allocator or any of the standard library functionality already provided by browser runtimes. You don’t need to include an implementation of Array, your own JSON parsing library, etc. JavaScript runtimes provide them. WebAssembly, on the other hand, has no first-class knowledge of C++ (or any other language). It doesn’t provide the C++ standard library, so the compiler has to include it in the WebAssembly binary, or at least the parts of it that are used.

Unfortunately, this is probably a bit confusing to some readers because it requires a relatively deep understanding of how JavaScript and C++ actually work, particularly how different they are. But fear not! You don’t need to understand these points to reap the benefits of WebAssembly.

Safe

The Web has enjoyed tremendous success in advancing what a browser can do, while still protecting us from malicious intent. WebAssembly continues that precedent.

WebAssembly is sandboxed, just like JavaScript is. It can’t access the user’s operating system directly; it only has access to the same APIs JavaScript does. So it can’t make arbitrary system calls and it can’t read your browser’s internal memory. If it wants to read from the file system, it has to use the Web’s File API, just like JavaScript.
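One way to see the sandbox in practice: a module can only call out to functions the host explicitly hands it at instantiation time. The sketch below uses a hand-assembled module in which the `host.log` import and the `run` export are names invented for this illustration. It is instantiated synchronously, which is fine for a module this tiny, though browsers may restrict synchronous compilation of larger ones on the main thread.

```typescript
// Text-format equivalent of the bytes below:
//   (module
//     (import "host" "log" (func $log (param i32)))
//     (func (export "run") i32.const 42 call $log))
const importingModuleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // "\0asm" magic, version 1
  0x01, 0x08, 0x02, 0x60, 0x01, 0x7f, 0x00, 0x60, 0x00, 0x00, // types: (i32)->(), ()->()
  0x02, 0x0c, 0x01, 0x04, 0x68, 0x6f, 0x73, 0x74, // import section: module "host"
  0x03, 0x6c, 0x6f, 0x67, 0x00, 0x00, //   field "log", a function of type 0
  0x03, 0x02, 0x01, 0x01, // function section: one function of type 1
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x01, // export "run" = function 1
  0x0a, 0x08, 0x01, 0x06, 0x00, 0x41, 0x2a, 0x10, 0x00, 0x0b, // body: i32.const 42; call 0
]);
const importingModule = new WebAssembly.Module(importingModuleBytes);

// Without the import, the module cannot even be instantiated.
let refusedWithoutImports = false;
try {
  new WebAssembly.Instance(importingModule, {});
} catch {
  refusedWithoutImports = true;
}

// The host decides exactly which capability to grant.
const loggedValues: number[] = [];
const sandboxedInstance = new WebAssembly.Instance(importingModule, {
  host: {
    log: (value: number) => {
      loggedValues.push(value);
    },
  },
});
(sandboxedInstance.exports.run as () => void)();
console.log(refusedWithoutImports, loggedValues);
```

The module has no way to reach anything beyond `log`; whatever that function does is entirely up to the embedding JavaScript.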

This might seem fairly limiting, and it is, but it was very important to start off with a proven, secure foundation.

WebAssembly also provides additional safety for languages like C/C++. The most common example is the good old stack smashing/overflow exploit, where a buffer is overrun and executable code is injected. Because of the way WebAssembly works, that exploit isn’t even possible! You can still overrun buffers (the C++ specification doesn’t prevent it), but there is no concept of executable memory, so injecting code would be pointless.
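To make that concrete, here is a small sketch using a hand-assembled module. The export names `memory` and `store`, and the pretend layout of an 8-byte buffer followed by a neighbouring variable, are assumptions made for this illustration. It shows an overrun clobbering adjacent data inside linear memory, while the written bytes stay plain data and an access outside linear memory traps.

```typescript
// Text-format equivalent of the bytes below:
//   (module
//     (memory (export "memory") 1)            ;; one 64 KiB page
//     (func (export "store") (param i32 i32)  ;; store(address, byte)
//       local.get 0 local.get 1 i32.store8))
const storeModuleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // "\0asm" magic, version 1
  0x01, 0x06, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x00, // type: (i32, i32) -> ()
  0x03, 0x02, 0x01, 0x00, // one function of type 0
  0x05, 0x03, 0x01, 0x00, 0x01, // one memory, minimum 1 page
  0x07, 0x12, 0x02, // export section, two exports
  0x06, 0x6d, 0x65, 0x6d, 0x6f, 0x72, 0x79, 0x02, 0x00, //   "memory" = memory 0
  0x05, 0x73, 0x74, 0x6f, 0x72, 0x65, 0x00, 0x00, //   "store" = function 0
  0x0a, 0x0b, 0x01, 0x09, 0x00, // code section, one body, no extra locals
  0x20, 0x00, 0x20, 0x01, 0x3a, 0x00, 0x00, 0x0b, // local.get 0; local.get 1; i32.store8
]);
const storeInstance = new WebAssembly.Instance(new WebAssembly.Module(storeModuleBytes));
const linearMemory = storeInstance.exports.memory as WebAssembly.Memory;
const store = storeInstance.exports.store as (address: number, value: number) => void;

// Pretend bytes 0..7 are a C buffer and byte 8 belongs to a neighbouring variable.
const memoryView = new Uint8Array(linearMemory.buffer);
memoryView[8] = 7;
store(8, 0xc3); // "overrun": writing one past the buffer clobbers the neighbour...
const neighbourAfterOverrun = memoryView[8]; // ...but 0xc3 (x86 `ret`) is only a data byte here

// Writing outside linear memory traps instead of touching anything else.
let trapped = false;
try {
  store(65536, 1);
} catch (err) {
  trapped = err instanceof WebAssembly.RuntimeError;
}
console.log(neighbourAfterOverrun, trapped);
```

The module’s code lives in the compiled `WebAssembly.Module`, not in `linearMemory`, so nothing written through `store` can ever be executed.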

That said, WebAssembly doesn’t mitigate every class of exploits. Although attackers can’t perform direct code injection attacks, it is possible to hijack control flow using code reuse attacks against indirect calls; e.g. a buffer overrun could change a stored function pointer to another existing function pointer somewhere else. Most of these other exploits exist because that’s just how C++ works. Using a language like Rust mitigates even more, but of course, nothing is 100% safe. Similar JavaScript exploits exist too.

Low-level Bytecode

WebAssembly is intended to be a compilation target, not something you’d normally write by hand (though you can). You’ll write code in a human-readable programming language (like C++ or Rust) and compile it down to machine-readable binary object code, aka bytecode.

Some C/C++ code compiled to WebAssembly bytecode

Bytecode is similar to your native computer’s machine code, except it’s designed for a virtual machine instead of a real one. This makes it very low-level, which enables performance optimizations, but still portable; you don’t have to worry about which CPU (x64, ARM, etc) your users have. Browsers provide a built-in WebAssembly virtual machine, but ironically, WebAssembly isn’t just for the Web — more on this later!
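For a feel of what that bytecode looks like, here is a complete module written out byte by byte, with its textual form in the comments. This is a hand-assembled sketch for illustration (the `add` export and the variable names are invented here), not the output of any particular compiler.

```typescript
// Text-format ("assembly") equivalent of the bytes below:
//   (module
//     (func (export "add") (param i32 i32) (result i32)
//       local.get 0 local.get 1 i32.add))
const addModuleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // magic number "\0asm"
  0x01, 0x00, 0x00, 0x00, // binary format version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00, // function section: one function using type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export section: "add" = function 0
  0x0a, 0x09, 0x01, 0x07, 0x00, // code section: one body, no extra locals
  0x20, 0x00, // local.get 0
  0x20, 0x01, // local.get 1
  0x6a, // i32.add
  0x0b, // end
]);

// The same 41 bytes run unchanged on x64, ARM, or any other host CPU.
const addInstance = new WebAssembly.Instance(new WebAssembly.Module(addModuleBytes));
const wasmAdd = addInstance.exports.add as (a: number, b: number) => number;
const wasmSum = wasmAdd(2, 3);
console.log(wasmSum); // 5
```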

You might be familiar with other bytecode formats like the .NET CIL (CLR) or Java bytecode (JVM). It’s natural to wonder why browser vendors didn’t just add support for one of these (or another) instead. The exact reasons are a bit complex and technical, but ultimately other virtual machines had incompatible goals. As one example, WebAssembly files can be verified and compiled by the virtual machine in a single pass, unlike most contenders, allowing streaming compilation.
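The verification step is available on its own through `WebAssembly.validate`. The sketch below feeds it three hand-assembled inputs (the variable names are invented for this illustration): an empty but valid module, one with a corrupted header, and one whose function body is ill-typed because `i32.add` finds only one operand on the stack.

```typescript
// Smallest valid module: just the magic number and the version.
const emptyModuleBytes = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);

// Same bytes with the first byte of the magic number damaged.
const badMagicBytes = emptyModuleBytes.slice();
badMagicBytes[0] = 0xff;

// Well-formed sections, but the body is `local.get 0; i32.add`,
// which leaves i32.add one operand short.
const illTypedBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00, // one function of type 0
  0x0a, 0x07, 0x01, 0x05, 0x00, 0x20, 0x00, 0x6a, 0x0b, // body: local.get 0; i32.add; end
]);

const validationResults = {
  empty: WebAssembly.validate(emptyModuleBytes),
  badMagic: WebAssembly.validate(badMagicBytes),
  illTyped: WebAssembly.validate(illTypedBytes),
};
console.log(validationResults);
```

Only the first input passes; the other two are rejected before any of their code could run.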

There are numerous other reasons, but even if they had chosen one of the existing bytecode formats, it wouldn’t have allowed you to run existing binaries as-is because they don’t sandbox the operating system enough. For example, the JVM provides unrestricted file system access with APIs like java.io.* which is a big no-no for the Web. Instead, the goal is to use the Web’s existing APIs and sandbox model.

Is WebAssembly going to kill JavaScript?


NOPE! JavaScript isn’t going anywhere. JavaScript was designed to be a language written by humans, while WebAssembly was designed to be a compilation target: you compile a language like C++ or Rust to it. Particularly right now, because it’s early, there are many, many cases where JavaScript is still the best choice on the Web. After all, even with its faults, modern JavaScript is a great language.

Buttttttt, no one can predict the future. So while unequivocally the intent is not to kill JavaScript, it’s theoretically possible another language compiled to WebAssembly might someday gain significant usage and “compete” so to speak. If (and that’s a big if!) it does happen, I’m betting it would be a brand new language designed for the Web, like Dart, Elm, Reason were. Just don’t hold your breath!

Will we compile JavaScript to WebAssembly?

JavaScript is an extremely dynamic language as a whole. Take a look at this (admittedly ridiculous) example:

[Image: "will we compile?" JavaScript code example]

If you wanted to compile any arbitrary JavaScript, say the lodash library, to WebAssembly, it would require shipping a rather large JavaScript runtime. To come close to the built-in runtime performance of browsers, you’d probably need to ship an entire JS virtual machine, which obviously is not practical because of file size.
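To see why a runtime has to come along, consider how little a compiler can assume about a single `+`. This is a hypothetical sketch written for illustration (the `dynamicAdd` name is invented, and the parameters are typed `any` to mimic plain JavaScript); it is not the example from the original post.

```typescript
// The same `+` means something different on every call.
function dynamicAdd(a: any, b: any): any {
  return a + b;
}

const asNumbers = dynamicAdd(1, 2); // numeric addition
const asStrings = dynamicAdd("1", 2); // string concatenation
const viaUserCode = dynamicAdd({ valueOf: () => 40 }, 2); // runs valueOf() first
console.log(asNumbers, asStrings, viaUserCode);
```

A compiler targeting WebAssembly could not emit a single `i32.add` for this function. It would have to ship the logic that picks between these behaviours at run time, and that logic is part of what a JavaScript engine already provides.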

So taking any arbitrary JavaScript and compiling to WebAssembly isn’t a good idea…but a strict subset of JavaScript or a similar dialect could be fast!

In fact, some folks at Facebook have a project called Prepack that is an experimental JavaScript optimizer. While the project has mostly worked on optimizations unrelated to WebAssembly, they have indeed toyed with the idea of statically identifying a subset of JavaScript patterns that can be safely and efficiently compiled to WebAssembly. As of now, though, it remains to be seen whether this will ever be worthwhile.

A much more likely approach is to write code in a language that looks like JavaScript, but really isn’t. AssemblyScript looks identical to a very strict subset of TypeScript, which makes it much easier for JS folks to write certain performance-critical functions of a larger JavaScript application. It shows a lot of promise, but for most people it’s not yet viable to write an entire application in it; most of its limitations are tied to future WebAssembly features. In any case, I think it’s important not to conflate AssemblyScript with real TypeScript. Real TypeScript is a superset of JavaScript, so all the dynamic features of JavaScript are possible in TypeScript.

WebAssembly v1.0 aka MVP

In my opinion, the biggest reason WebAssembly hasn’t taken off like a rocket is that it’s being designed incrementally, so certain features are “missing” for broad appeal. The initial 1.0 version, dubbed the MVP, is best suited for languages like C/C++ and Rust, and even with them, not for all use cases.

This was a wise, intentional choice because standards notoriously take forever to finalize. It also allows the standards body to take in feedback and refine their goals incrementally from real-world usage. It’s very hard, usually impossible, to remove something from the Web once it’s shipped, so making the right choices the first time is paramount.

Thankfully, efficiently supporting higher-level languages like Java and OCaml is a stated future goal. Even your Dart, Elm, Reason, etc. will likely someday compile to WebAssembly more efficiently than JavaScript. The largest missing pieces are garbage collection and direct access to host bindings (aka WebIDL, aka the DOM/HTML APIs). Doing GC in WebAssembly is possible (Go already works!) but it’s not particularly efficient. And right now, WebAssembly in the browser has to call through JavaScript to interact with the Web APIs, and can’t directly hold or interact with objects created by them, so toolchains like Emscripten and wasm-bindgen have to do extra work under the hood to hide this fact from you.
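The flavour of that extra work can be sketched by hand. A WebAssembly function in the MVP can only pass numbers, so a string crosses the boundary as a pointer and a length into linear memory, and JavaScript glue decodes it. The module below is hand-assembled, and its `env.log` import and `greet` and `memory` exports are names invented for this illustration; it is a simplified picture, not what Emscripten or wasm-bindgen actually emit.

```typescript
// Text-format equivalent of the bytes below:
//   (module
//     (import "env" "log" (func $log (param i32 i32)))
//     (memory (export "memory") 1)
//     (data (i32.const 0) "hi")
//     (func (export "greet") i32.const 0 i32.const 2 call $log))
const glueModuleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // "\0asm" magic, version 1
  0x01, 0x09, 0x02, 0x60, 0x02, 0x7f, 0x7f, 0x00, 0x60, 0x00, 0x00, // types: (i32,i32)->(), ()->()
  0x02, 0x0b, 0x01, 0x03, 0x65, 0x6e, 0x76, 0x03, 0x6c, 0x6f, 0x67, 0x00, 0x00, // import env.log
  0x03, 0x02, 0x01, 0x01, // one function of type 1
  0x05, 0x03, 0x01, 0x00, 0x01, // one memory, minimum 1 page
  0x07, 0x12, 0x02, // export section, two exports
  0x06, 0x6d, 0x65, 0x6d, 0x6f, 0x72, 0x79, 0x02, 0x00, //   "memory" = memory 0
  0x05, 0x67, 0x72, 0x65, 0x65, 0x74, 0x00, 0x01, //   "greet" = function 1
  0x0a, 0x0a, 0x01, 0x08, 0x00, 0x41, 0x00, 0x41, 0x02, 0x10, 0x00, 0x0b, // log(0, 2)
  0x0b, 0x08, 0x01, 0x00, 0x41, 0x00, 0x0b, 0x02, 0x68, 0x69, // data: "hi" at address 0
]);

const receivedStrings: string[] = [];
let glueMemory: WebAssembly.Memory;

// JavaScript glue: turn (pointer, length) back into a string.
const glueImports = {
  env: {
    log: (pointer: number, length: number) => {
      const utf8 = new Uint8Array(glueMemory.buffer, pointer, length);
      receivedStrings.push(new TextDecoder().decode(utf8));
    },
  },
};

const glueInstance = new WebAssembly.Instance(
  new WebAssembly.Module(glueModuleBytes),
  glueImports,
);
glueMemory = glueInstance.exports.memory as WebAssembly.Memory;
(glueInstance.exports.greet as () => void)();
console.log(receivedStrings);
```

A real toolchain generates this kind of decoding (and the reverse direction, plus bookkeeping for JavaScript objects) so that you never write it yourself.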

Getting Started

Here is where things get a bit awkward. Since WebAssembly is a compilation target, not something you write by hand, how exactly you “get started” depends on the higher-level language you’re going to write and ultimately compile.

If you’re already familiar with C/C++, you’ll undoubtedly want to use Emscripten, which essentially wraps Clang/LLVM and provides a bunch of JavaScript glue code for the standard libraries to interact with the Web APIs.

If you’re into Rust, you’re quite in luck because the Rust core team has really doubled down on WebAssembly. There’s even a dedicated Working Group to improve the experience. You’ll want to check out their Rust+Wasm book.

Other languages have early support for WebAssembly too, like Go. Expect these to improve significantly in the coming years.

Neither Web nor Assembly

Most of this post (and the Web in general) assumes that when we’re talking about WebAssembly, we’re talking about running it inside of a browser. However, pretty ironically I must say, WebAssembly’s name is a misnomer. It’s not an assembly language, it’s a bytecode, though there is a textual “assembly” language to represent the binary bytecode.

More importantly, it’s not just for the Web!

The WebAssembly designers knew it was unprecedented for the major tech companies of the world to get together and design a completely free, standardized instruction set. They made a conscious effort to make it generic, not fundamentally tied to browsers.

We’re just beginning to see how it’s used outside of the browser, but already there are some exciting projects.

Desktop WebAssembly

  • Nebulet, operating system microkernel that runs only WebAssembly

  • wasmjit, WebAssembly running in the kernel/ring-0

  • Wasmer, universal binary platform

Serverless WebAssembly

Crypto/Smart contracts

  • ewasm, Ethereum-flavored Wasm virtual machine

  • Parity wasmi, EVM client Wasm interpreter

Conclusion

WebAssembly can be used today in all modern browsers (i.e. not IE11), and in performance-critical, fairly algorithmic use cases it’s already providing huge performance benefits in the real world. But I’m most excited for the post-MVP future of WebAssembly and what it will mean not just for the Web, but everywhere.

I also highly recommend checking out the Awesome Wasm repo which has links to all things WebAssembly.

 

The contributors to JavaScript January are passionate engineers, designers and teachers. Emily Freeman is a developer advocate at Kickbox and curates the articles for JavaScript January.