
Programming Servo: My own private runtime

 4 years ago
source link: https://medium.com/programming-servo/programming-servo-my-own-private-runtime-8a5ba74c63c8

Servo is a big system written in Rust that does lots of different things. One of those things is running JS and/or Wasm code in a VM, with that VM doubling as a compiler when appropriate. That VM is called Spidermonkey, and this article is about how it integrates with Servo, which also makes it an article about how to integrate Spidermonkey with any Rust program.

Javascript, and Wasm, are not the Web

First of all, I think it’s important to state the difference between a VM for either Javascript or Wasm, and “the Web”.

The way I like to think about it is that “the Web” is a set of APIs offered to the code running in the VM. So if you have some code operating on numbers or objects, you’re not using the Web yet. Once you start using an API like setTimeout or fetch, that is when your code running in the VM is “using the Web”.

I think it’s also important to note that those “Web APIs” almost always involve running some work outside of the VM. For example, a call to fetch will spawn some network-related work without blocking the code in the VM, and the result of that work happening “in parallel” will eventually be made available to the code in the VM in an event-driven fashion.
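To make that concrete, here is a minimal sketch in plain Rust, using only the standard library. It is not Servo code: fake_fetch, Event and run_until_first_event are hypothetical names, and the "network work" is faked with a formatted string. The point is only the shape: the call returns immediately, the work happens on another thread, and the result comes back as an event.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread;

// Hypothetical event delivered back to the code "in the VM".
#[derive(Debug, PartialEq)]
pub enum Event {
    FetchDone { url: String, body: String },
}

// Stand-in for a Web API like fetch: it returns immediately,
// and the actual work happens on another thread.
pub fn fake_fetch(url: &str, event_loop: Sender<Event>) {
    let url = url.to_string();
    thread::spawn(move || {
        // Pretend some network work happens here, "in parallel".
        let body = format!("response for {}", url);
        // The receiver may be gone if the event-loop has shut down; ignore that.
        let _ = event_loop.send(Event::FetchDone { url, body });
    });
}

// Starts one fake fetch, then waits for the resulting event.
pub fn run_until_first_event() -> Event {
    let (sender, receiver) = channel();
    fake_fetch("https://example.org/", sender);
    // The caller was not blocked by fake_fetch; it only blocks here,
    // waiting for the next event to handle.
    receiver.recv().expect("the fetch thread sends exactly one event")
}
```

In a real engine the receiving side would be the event-loop, and handling the event would mean calling back into the VM, for example to resolve a promise.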

Web workers are a different example, allowing code to run in a “worker VM”, with communication with the “main VM” happening via a specific set of APIs.

It’s almost like the code in the VM is getting multiple threads at its disposal, although indirectly via an event-driven API.

This event-driven API is implemented via a specified event-loop and a set of task queues, which enable the coordination of “events, user interaction, scripts, rendering, networking, and so forth” by the user-agent implementing the Web APIs.

In other words, the Web is the “platform” on which the VM runs, and the various objects through which the API of the Web platform is offered to the code are referred to as “platform objects”.
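As a rough illustration of an event-loop with several task queues, here is a small sketch. The names (EventLoop, TaskSource, Task) are made up for this example, there are only two queues, and the rule for choosing between them (user interaction first) is an arbitrary assumption of the sketch rather than what the specification or Servo does.

```rust
use std::collections::VecDeque;

// A task is a unit of work that runs to completion on the event-loop.
// Here it only gets access to a log, standing in for "the state of the page".
pub type Task = Box<dyn FnOnce(&mut Vec<String>)>;

// Hypothetical task sources; a real user-agent has many more.
pub enum TaskSource {
    Networking,
    UserInteraction,
}

pub struct EventLoop {
    networking: VecDeque<Task>,
    user_interaction: VecDeque<Task>,
    log: Vec<String>,
}

impl EventLoop {
    pub fn new() -> Self {
        EventLoop {
            networking: VecDeque::new(),
            user_interaction: VecDeque::new(),
            log: Vec::new(),
        }
    }

    // Queue a task on the queue that belongs to its source.
    pub fn queue(&mut self, source: TaskSource, task: Task) {
        match source {
            TaskSource::Networking => self.networking.push_back(task),
            TaskSource::UserInteraction => self.user_interaction.push_back(task),
        }
    }

    // Pick the next task. This sketch arbitrarily prefers user interaction.
    fn next_task(&mut self) -> Option<Task> {
        let preferred = self.user_interaction.pop_front();
        preferred.or_else(|| self.networking.pop_front())
    }

    // Run one task at a time until every queue is empty, then return the log.
    pub fn run(mut self) -> Vec<String> {
        while let Some(task) = self.next_task() {
            task(&mut self.log);
        }
        self.log
    }
}
```

Keeping one queue per source is what lets the loop prioritize some kinds of work over others while still running a single task at a time.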

Servo: a Web engine embedding a JS/Wasm engine

Building further upon the distinction between “the Web platform” and “the code running in a VM”, Servo is the engine implementing the “platform” part, and is therefore embedding Spidermonkey, the engine running the VM for the actual JS and/or Wasm code running on that platform.

The code running inside the VM is actually “driven” by an event-loop, and Servo is running that event-loop for that specific VM (and there can be many of these, one per JS execution context, also known as an agent).

An event-loop is then usually implemented via a native thread, running Rust code one task at a time. This Rust code will, when it runs as the current task on the event-loop, usually call into the VM to run some user-supplied code (for example by firing an event).

The code inside the VM can then use the various Web platform APIs, and those will usually call into Servo again, almost always resulting in another task being queued on the event-loop (either directly from the event-loop, or indirectly via another set of steps running on another component in parallel to the event-loop, see for example Fetch). And this will go on until the agent/event-loop is shut down.

Picture a snake eating its own tail, and that’s basically it.

By the way, I have written elsewhere about how an event-loop can be a good way to model concurrency in Rust, even when not implementing the Web.

In Servo, almost every component runs one, or several, event-loop(s), regardless of whether that particular component is running code in a VM or not. The “event-loop” is just a way to implement concurrency via message-passing.

This also has little to do with what the event-loop is running on. You can run an event-loop on a task, a thread, or some other construct; the important element is only that it runs sequentially, one task at a time (while multiple event-loops can be running multiple tasks in parallel with each other).
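The snake can be sketched in a few lines of Rust. Everything here is hypothetical: Platform stands in for Servo, countdown stands in for user code running in the VM, and queue_task plays the role of a Web API like setTimeout that never runs the callback itself but only queues a task for later.

```rust
use std::cell::RefCell;
use std::collections::VecDeque;
use std::rc::Rc;

// A task receives a handle to the platform so it can call "Web APIs" again.
pub type Task = Box<dyn FnOnce(&Platform)>;

// Hypothetical handle to the platform, handed to the "user code".
#[derive(Clone, Default)]
pub struct Platform {
    queue: Rc<RefCell<VecDeque<Task>>>,
    pub log: Rc<RefCell<Vec<String>>>,
}

impl Platform {
    // Stand-in for a platform API: it only queues a task for later.
    pub fn queue_task(&self, task: Task) {
        self.queue.borrow_mut().push_back(task);
    }

    // The event-loop: run tasks one at a time until none are left.
    pub fn run(&self) {
        loop {
            // Take the task out first, so the queue is not borrowed
            // while the "user code" runs and queues more tasks.
            let task = self.queue.borrow_mut().pop_front();
            match task {
                Some(task) => task(self),
                None => break,
            }
        }
    }
}

// "User code": each run calls back into the platform to queue the next
// run, until the counter reaches zero.
pub fn countdown(platform: &Platform, remaining: u32) {
    platform.log.borrow_mut().push(format!("tick {}", remaining));
    if remaining > 0 {
        platform.queue_task(Box::new(move |p: &Platform| countdown(p, remaining - 1)));
    }
}
```

Calling countdown(&platform, 2) followed by platform.run() logs "tick 2", "tick 1" and "tick 0", with every tick after the first running as its own task.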

So how does Servo do it?

Here we can benefit from the fact that in Servo, if you want to know how something works, you can just follow the function calls. It’s all there, somewhere.

So let’s literally follow the code: from where Servo initializes Spidermonkey and starts using it, to how code running inside of it can call back into Servo, how that spawns some work elsewhere, and how Servo eventually calls back into Spidermonkey via a task queued on the relevant event-loop.

Down the rabbit hole we go!

Introduction to rust-mozjs and mozjs

Servo uses Spidermonkey via Rust-specific bindings, found at https://github.com/servo/rust-mozjs, itself pointing to a Servo-specific fork of Spidermonkey, found at https://github.com/servo/mozjs .

rust-mozjs contains some higher-level abstractions, and also bindings to the actual public API of mozjs. So when using rust-mozjs, you actually look a lot at mozjs, because that is where the documentation is found (or at least that is what I do).

Using that API takes a bit of getting used to, but it is actually really simple once you have. Most of the goodies are found at https://github.com/servo/mozjs/tree/master/mozjs/js/public (for example, the API to manipulate Promises is at https://github.com/servo/mozjs/blob/master/mozjs/js/public/Promise.h ). All of this stuff is re-exported by rust-mozjs at https://github.com/servo/rust-mozjs/blob/ea10bed291aaa8064fe8da8b36d46b5586a36357/src/lib.rs#L35 under js::jsapi, while the Rust-specific abstractions are found at https://github.com/servo/rust-mozjs/blob/master/src/rust.rs and exported under js::rust.

So, when using something from jsapi, it usually pays to look at the docs directly in mozjs, while when using stuff from rust, you look in rust-mozjs.

Initializing the engine

Assuming Servo is running in multi-process mode, a JS/Wasm engine will be initialized for each content-process (for more info on this structure, see a previous article). Each thread in that process (for example one running a window-agent, and another running a worker-agent) will use the same engine, although each will get its own Runtime and Context.

rust-mozjs gives us the JsEngine abstraction, which can be initialized only once (per process), and which should only be shut down when nothing is using it anymore (duh!). Internally, JsEngine uses the mozjs API call JS_Init, which comes with a pretty nice doc comment:

[Screenshot: the doc comment on JS_Init; source linked below.]

https://github.com/servo/mozjs/blob/e21c05b415dfc246175ff8d5fc48b0e8c5b4e9e9/mozjs/js/public/Initialization.h#L61

So, that’s a good example of how it can be useful to actually look at mozjs to read some docs when using an API from rust-mozjs, even when it’s via a Rust abstraction.

When does Servo call into this? There is a function called script::init which does this, and it’s called at the starting point of a content-process, in run_content_process.

So, essentially, when a content-process is started, one of the first things to do is call script::init, which will initialize Spidermonkey and make a handle to the initialized engine available to all threads in that process, via:

https://github.com/servo/servo/blob/06803a2edb992cddd90b1ec71ec60af9c36de7ea/components/script/script_runtime.rs#L435
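The "initialize once per process, then hand out a handle to every thread" pattern can be sketched with the standard library alone. This is not the rust-mozjs API: Engine, init and engine_handle are hypothetical stand-ins, and the sketch assumes a Rust version where std::sync::OnceLock is available.

```rust
use std::sync::{Arc, OnceLock};
use std::thread;

// Hypothetical stand-in for an engine that may only be initialized
// once per process.
pub struct Engine {
    pub id: u32,
}

// Process-wide slot for the handle; it can be set at most once.
static ENGINE: OnceLock<Arc<Engine>> = OnceLock::new();

// Similar in spirit to an init function called early at process start-up.
// Calling it again returns the existing handle instead of re-initializing.
pub fn init() -> Arc<Engine> {
    ENGINE.get_or_init(|| Arc::new(Engine { id: 1 })).clone()
}

// What a thread would call to get a handle to the initialized engine.
// Returns None if init has not run yet.
pub fn engine_handle() -> Option<Arc<Engine>> {
    ENGINE.get().cloned()
}

// Checks that a worker thread sees the very same engine as the main thread.
pub fn handles_are_shared() -> bool {
    let main_handle = init();
    let worker_handle = thread::spawn(|| engine_handle().expect("init ran first"))
        .join()
        .unwrap();
    Arc::ptr_eq(&main_handle, &worker_handle)
}
```

Handing out reference-counted handles also gives a natural answer to "when is nothing using it anymore": when the last handle has been dropped.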

To each event-loop, a JS Context!

When the engine has been initialized for a given content-process, and an event-loop and its agent (window or worker) start running, what do they do to get their own execution context?

This is where the JSContext comes into play.

Servo calls a function named script_runtime::new_rt_and_cx_with_parent, which creates a new js::rust::Runtime, which in turn ends up calling js::jsapi::JS_NewContext. For the purpose of this article, we can just assume a context and a runtime are the same thing: a VM tied to a specific event-loop (thread), for it to run user code supplied as JS or Wasm and downloaded over the internet.

Since window agents come with a bit more ceremony, we can for simplicity look at how a dedicated worker starts up its VM.

The entry point is a method called run_worker_scope, and after starting a new thread, it creates a new runtime/context combo.

Then, it actually loads the worker script over the internet, executes it, and starts running an event-loop. So the “execute” step is basically the code that runs when the worker starts up, and any subsequent code will (mostly) run in response to some event firing from a task queued on the event-loop.
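Those three steps (new thread, new context, execute the script and then loop) can be sketched as follows. This is not Servo's run_worker_scope: Context, WorkerMsg and run_worker are hypothetical, and "evaluating" code is faked by recording a string.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread::{self, JoinHandle};

// Hypothetical per-thread VM state, standing in for a runtime/context pair.
struct Context {
    log: Vec<String>,
}

impl Context {
    // Pretend to evaluate some code by recording it.
    fn evaluate(&mut self, code: &str) {
        self.log.push(format!("ran: {}", code));
    }
}

pub enum WorkerMsg {
    Message(String),
    Close,
}

// Start a "worker": returns a sender to post messages to it, and a handle
// that yields the log of everything that ran once the worker has closed.
pub fn run_worker(script: String) -> (Sender<WorkerMsg>, JoinHandle<Vec<String>>) {
    let (sender, receiver) = channel();
    let handle = thread::spawn(move || {
        // 1. The new thread creates its own context.
        let mut cx = Context { log: Vec::new() };
        // 2. The top-level script is executed once, at start-up.
        cx.evaluate(&script);
        // 3. From here on, code only runs in response to events.
        for msg in receiver {
            match msg {
                WorkerMsg::Message(data) => cx.evaluate(&format!("onmessage({})", data)),
                WorkerMsg::Close => break,
            }
        }
        cx.log
    });
    (sender, handle)
}
```

Note that the context never leaves the thread that created it; only messages cross thread boundaries.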

Web IDL, meet Spidermonkey

So, now that we’ve established that each content-process has a single Spidermonkey engine, that each event-loop running in that process has its own “context”, and that this event-loop will continuously run one task at a time until it is shut down, we are left to wonder: how does Rust call into the VM, and the VM call into Rust? Which could be reworded as: how are the Web platform APIs made available to the JS or Wasm code running in the VM?

A quick answer could be: https://heycam.github.io/webidl/

The slightly longer answer is that Servo contains a folder at https://github.com/servo/servo/tree/master/components/script/dom/webidls that is simply full of IDL files representing all the Web APIs that Servo offers to the code running in the Spidermonkey VM.

For example, the fetch function:

[Screenshot: the contents of Fetch.webidl; source linked below.]

https://github.com/servo/servo/blob/06803a2edb992cddd90b1ec71ec60af9c36de7ea/components/script/dom/webidls/Fetch.webidl

Above we can see that a WindowOrWorkerGlobalScope mixin is extended with a fetch method. If we look at the actual mixin, we can see that Window includes it, and finally in the window.rs file, we find the following method:

[Screenshot: the Fetch method in window.rs; source linked below.]
https://github.com/servo/servo/blob/06803a2edb992cddd90b1ec71ec60af9c36de7ea/components/script/dom/window.rs#L1278

Where is this coming from? The actual Fetch method is “implemented by hand”, but all the glue code between the Rust and the Spidermonkey code is generated automatically.

So, if you were to develop the “fetch” feature in Servo, you would only have to add the WebIDL file and then implement the actual Fetch method, without having to worry about the glue code that somehow makes those arguments available to Fetch, or that takes the returned Rc<Promise> and makes it available to the code running in the VM.

That’s pretty cool, right?
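To give a feel for that split between hand-written and generated code, here is a heavily simplified sketch. It is not what Servo's code generator emits: JsValue, Promise, the shape of the WindowMethods trait and fetch_glue are all hypothetical, and real glue deals with raw Spidermonkey values, rooting and exceptions rather than a small enum.

```rust
use std::rc::Rc;

// Hypothetical promise stand-in.
#[derive(Debug, PartialEq)]
pub struct Promise {
    pub description: String,
}

// Hypothetical dynamically-typed value, as the VM would hand it over.
#[derive(Debug, Clone, PartialEq)]
pub enum JsValue {
    Number(f64),
    Str(String),
    Object(Rc<Promise>),
}

// What a generator could derive from the IDL: a trait with typed arguments.
pub trait WindowMethods {
    fn fetch(&self, input: String) -> Rc<Promise>;
}

// The hand-written part: the actual implementation of the method.
pub struct Window;

impl WindowMethods for Window {
    fn fetch(&self, input: String) -> Rc<Promise> {
        Rc::new(Promise {
            description: format!("pending fetch of {}", input),
        })
    }
}

// The part that would be generated too: check and convert the untyped
// arguments, call the typed method, and convert the result back.
pub fn fetch_glue<W: WindowMethods>(this: &W, args: &[JsValue]) -> Result<JsValue, String> {
    match args.first() {
        Some(JsValue::Str(url)) => Ok(JsValue::Object(this.fetch(url.clone()))),
        Some(other) => Err(format!("TypeError: expected a string, got {:?}", other)),
        None => Err("TypeError: 1 argument required".to_string()),
    }
}
```

The person implementing fetch only ever sees a String going in and an Rc<Promise> coming out; the argument checking and conversions live entirely in the glue.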

Finally, please do try this at home

One thing that’s worth noting is how separate the Spidermonkey VM actually is from “the Web platform”. The VM only runs some code, without any concept of “when” that code runs; it doesn’t provide the event-driven API that “drives” the code running on the VM. That is done by the event-loop for that specific execution context (the agent of a window, worker, or worklet).

Also, the WebIDL infrastructure is actually pretty generic. You could literally take the script component in Servo, throw out all the WebIDL, and build your own platform with its own set of APIs. Even the code generation stuff would all pretty much work, I think.

It wouldn’t be “the Web”, but if I had to build a Wasm runtime from scratch (for which there seems to be some enthusiasm lately), the last thing I would do is work on a VM. Instead, I’d just use Spidermonkey, and bring my own event-loop/platform objects combo (I think this would currently require some JS around the Wasm, but this situation will evolve with the Wasm spec and Spidermonkey’s implementation of it). Also, if you’re going to run Wasm, it has to come from somewhere, right? So Spidermonkey can actually compile the code while it is still being downloaded. Now that’s really cool!

And finally, those Rust bindings are already there…

Further reading:

Here is a very good guide on embedding Spidermonkey into a C++ program: https://developer.mozilla.org/en-US/docs/Mozilla/Projects/SpiderMonkey/JSAPI_User_Guide . Since those APIs are available, or can be made available, to Rust via rust-mozjs, it’s a very good guide for a Rust program too.

This article covers how Wasm interfaces with JS, and with other APIs like those offered by the Web: https://hacks.mozilla.org/2019/08/webassembly-interface-types/

