Yesterday, I solved a long-standing question I'd had - how do you get data out of a WebAssembly program without having to copy it back? Ideally, in such a way that a web worker wouldn't have to copy it back to the main thread either. I've been able to find some information on this around the web, but much of it seems to be rather outdated or does not address the issue. I decided to have a crack at it myself and figure out the state of the art by writing a small proof-of-concept.
My first approach was to try to instantiate the WebAssembly program from bytecode held in a SharedArrayBuffer. As a bonus, we should then be able to redefine bytecode on the fly, which will be fun.
Copying from Depth-First's excellent guide (read it before this post), we arrive at something like this:
(async () => {
  const memory = new WebAssembly.Memory({ initial: 1 });
  const log = (offset, length) => {
    const bytes = new Uint8Array(memory.buffer, offset, length)
    const string = new TextDecoder('utf8').decode(bytes);
    console.log(string)
  };
  //Blob generated with `wat2wasm hello.1.wat --enable-threads --out /dev/stdout | base64 --wrap 0`
  //Map each decoded character to one byte. (TextEncoder would re-encode any byte above 0x7F as two UTF-8 bytes.)
  const unsharedData = Uint8Array.from(atob(`
AGFzbQEAAAABCQJgAn9/AGAAAAIZAgNlbnYGbWVtb3J5AgABA2VudgNsb2cAAAMCAQEHCQE
FaGVsbG8AAQoKAQgAQQBBDRAACwsTAQBBAAsNSGVsbG8sIFdvcmxkIQ==
`), char => char.charCodeAt(0))
  //Copy the bytecode into an array backed by a SharedArrayBuffer.
  const sharedData = new Uint8Array(new SharedArrayBuffer(unsharedData.length))
  sharedData.set(unsharedData)
  //Change the final '!' of the message to '?' before instantiating.
  sharedData[sharedData.length - 1] = '?'.charCodeAt(0)
  const { instance } = await WebAssembly.instantiate(sharedData, {
    env: { log, memory }
  });
  instance.exports.hello()
  //Change the same byte again, this time after instantiating.
  sharedData[sharedData.length - 1] = '.'.charCodeAt(0)
  instance.exports.hello()
})()
Here, we start by defining some WebAssembly memory to pass args around with. (The initial value is in number of 64-KiB pages to allocate.) We then define a function, log, which will take this memory and print the contents using console.log(…). We'll call this from our WASM code, which we've serialised in this case as a base64 string. (The source of which is hello.1.wat, compiled using wat2wasm from WABT.)
To get our shared memory, we create a new array backed by a SharedArrayBuffer. In JS, all the typed arrays have a backing buffer. Usually, by default, it's an ArrayBuffer. Amusingly enough, an ArrayBuffer can be shared between multiple typed arrays, even of different types. The SharedArrayBuffer is called so because it can be passed between web workers without copying as well, which a regular ArrayBuffer can't do. This is the behaviour we're after.
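To make the backing-buffer idea concrete, here is a minimal sketch in which two typed arrays view the same bytes, first over an ArrayBuffer and then over a SharedArrayBuffer. The variable names are made up for illustration and do not appear in the proof-of-concept.

```javascript
// Two views of different types over one ordinary ArrayBuffer.
const plainBuffer = new ArrayBuffer(4);
const unsignedView = new Uint8Array(plainBuffer);
const signedView = new Int8Array(plainBuffer);

unsignedView[0] = 0xff;      // write through one view...
console.log(signedView[0]);  // ...read through the other: -1

// The same works over a SharedArrayBuffer, which can additionally be
// handed to a worker with postMessage without being copied.
const sharedBuffer = new SharedArrayBuffer(4);
const viewA = new Uint8Array(sharedBuffer);
const viewB = new Uint8Array(sharedBuffer, 2, 2); // bytes 2 and 3 only
viewB[0] = 42;
console.log(viewA[2]);       // 42
```

In browsers, SharedArrayBuffer is generally only exposed on cross-origin isolated pages, so the second half may need to be run from a suitably configured page.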
So, let's test it! First, we'll set the final byte of our WASM program to ?, from its original value of !, to prove we're loading the right memory and can manipulate it. Then, we start the WebAssembly program and call the hello() function of the instance we created. This in turn calls our log(), which prints "Hello, World?".
(Note: WebAssembly.instantiate(…) will also let you pass in an ArrayBuffer/TypedArrayBuffer, in addition to the Uint8Array we have here… in Firefox and not in Chrome.)
Now we modify our memory again, this time changing the final byte to a full stop. However, calling into hello again, we find we get the same output, "Hello, World?". We can't just poke the bytecode of a running WASM program, it would seem - probably for the best. So, what do we do now?
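This matches how the WebAssembly JavaScript API is specified: the bytes are copied before compilation, so later writes to the source array never reach the compiled module. The sketch below checks this with the same hello.1 blob. It uses an ordinary Uint8Array and the synchronous WebAssembly.Module and WebAssembly.Instance constructors for brevity; `decodeBase64`, `lines` and `compiled` are illustrative names, not part of the original code.

```javascript
const base64 =
  "AGFzbQEAAAABCQJgAn9/AGAAAAIZAgNlbnYGbWVtb3J5AgABA2VudgNsb2cAAAMCAQEHCQE" +
  "FaGVsbG8AAQoKAQgAQQBBDRAACwsTAQBBAAsNSGVsbG8sIFdvcmxkIQ==";

// Hypothetical helper: base64 text -> raw bytes, one byte per decoded character.
const decodeBase64 = (text) =>
  Uint8Array.from(atob(text), (char) => char.charCodeAt(0));

const memory = new WebAssembly.Memory({ initial: 1 });
const lines = []; // collects what the WASM code asks us to log
const log = (offset, length) => {
  const bytes = new Uint8Array(memory.buffer, offset, length);
  lines.push(new TextDecoder("utf-8").decode(bytes));
};

const bytecode = decodeBase64(base64);
bytecode[bytecode.length - 1] = "?".charCodeAt(0); // edit BEFORE compiling

const compiled = new WebAssembly.Module(bytecode);
const instance = new WebAssembly.Instance(compiled, { env: { log, memory } });
instance.exports.hello();

bytecode[bytecode.length - 1] = ".".charCodeAt(0); // edit AFTER compiling
instance.exports.hello();

// Both entries end in "?": the edit made after compiling had no effect.
console.log(lines);
```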
We have one other memory-buffer-ish object we can tweak. Let's see if we can't get that initial const memory = … declaration to be a shared buffer, instead of an unshared buffer. Some brief searching later, and we find that WebAssembly.Memory can indeed take a shared flag. It's not very well supported, but let's rework our code to try to test it anyway. (I believe the shared flag is part of the WebAssembly Threads system, which seems to just be referring to using shared memory to communicate between workers versus message passing.)
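Given the patchy support, one way to find out whether the current engine honours the shared flag is to construct such a memory and inspect its buffer. This is a minimal sketch; `sharedMemory` and `isShared` are illustrative names. Note that a shared memory has to declare a maximum as well as an initial size.

```javascript
let sharedMemory = null;
try {
  // Shared memories must specify `maximum`; both sizes are in 64-KiB pages.
  sharedMemory = new WebAssembly.Memory({ initial: 1, maximum: 1, shared: true });
} catch (error) {
  console.log("Shared WebAssembly memory is not available here:", error.message);
}

// Guard the instanceof check, since SharedArrayBuffer itself may be hidden.
const isShared =
  sharedMemory !== null &&
  typeof SharedArrayBuffer !== "undefined" &&
  sharedMemory.buffer instanceof SharedArrayBuffer;

console.log(isShared);
if (isShared) {
  console.log(sharedMemory.buffer.byteLength); // one 64-KiB page: 65536
}
```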
(async () => {
  const memory = new WebAssembly.Memory({ initial: 1, maximum: 1, shared:true });
  let globalBytes = null
  const log = (offset, length) => {
    const bytes = new Uint8Array(memory.buffer, offset, length)
    //Keep a reference to the view so we can write to it from outside.
    globalBytes = bytes
    //Can't use TextDecoder because it doesn't handle shared array buffers as of 2021-04-20.
    //const string = new TextDecoder('utf8').decode(bytes);
    const string = bytes.reduce(
      (accum,byte)=>accum+String.fromCharCode(byte), '')
    console.log(string)
  };
  //Blob generated with `wat2wasm hello.2.wat --enable-threads --out /dev/stdout | base64 --wrap 0`
  //Map each decoded character to one byte. (TextEncoder would re-encode any byte above 0x7F as two UTF-8 bytes.)
  const wasm = Uint8Array.from(atob(`
AGFzbQEAAAABCQJgAn9/AGAAAAIaAgNlbnYGbWVtb3J5AgMBAQNlbnYDbG9nAAADAgEBBwk
BBWhlbGxvAAEKCgEIAEEAQQ0QAAsLEwEAQQALDUhlbGxvLCBXb3JsZCE=
`), char => char.charCodeAt(0))
  const { instance } = await WebAssembly.instantiate(wasm, {
    env: { log, memory }
  });
  instance.exports.hello()
  //Overwrite the first character of the message in shared memory.
  globalBytes[0] = '\''.charCodeAt(0)
  instance.exports.hello()
})()
With our new memory declaration returning a shared buffer… on most non-Apple desktop browsers… 😬 we can now test this method of memory manipulation. We immediately find three things:
TextDecoder doesn't accept a SharedArrayBuffer, so we have to write our own little routine here. I guess this is because, as the bytes could change at any time, we could potentially output invalid UTF-8 as our data shifted under us. We don't care for this single-threaded demo, but it would be an issue normally.
… globalBytes), so we won't bother manipulating it before we instantiate our WebAssembly program.
To test this, we call hello() again, which sets globalBytes to our "Hello, World!" message. We now set the first character to an apostrophe, and call in to our function again to test if we were able to set data visible to WASM. It prints "'ello, World!", thus demonstrating we are! Since we're working with a SharedArrayBuffer here, we can share this reference across threads to get fast, efficient data transfer.
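As an alternative to the byte-by-byte reduce in log above, one option is to copy the bytes out of the shared buffer first and decode the copy. This is only a sketch, assuming the message is valid UTF-8 and that no other thread writes to it during the copy; `decodeShared` is a hypothetical helper name, and the message is written from JS here to stand in for the WASM program.

```javascript
const sharedMemory = new WebAssembly.Memory({ initial: 1, maximum: 1, shared: true });

// Hypothetical helper: decode `length` bytes at `offset` from a shared memory.
const decodeShared = (memory, offset, length) => {
  const sharedView = new Uint8Array(memory.buffer, offset, length);
  // slice() allocates a fresh, non-shared ArrayBuffer holding a copy.
  const privateCopy = sharedView.slice();
  return new TextDecoder("utf-8").decode(privateCopy);
};

// Stand in for the WASM program writing its message at offset 0.
const message = new TextEncoder().encode("Hello, World!");
new Uint8Array(sharedMemory.buffer).set(message, 0);

console.log(decodeShared(sharedMemory, 0, message.length));
```

The copy here is only for decoding the text; the shared buffer itself can still be passed between threads without copying.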