Orca Jam Project

Orca is a launcher and sandbox for applications based on WebAssembly. It allows downloading applications from a webpage and running them in a sandbox, as well as saving them in a local library for offline use.

Usage

The main Orca UI shows the apps in your local library. You can click an app icon to see its banner and a description of what the app does. Double-clicking launches the app in a new tab.

day5_short_small.mov

You can also run an app from the web by accessing a special orca link (provided by the developer of the app) in your web browser.

Applications started from the web are normally run in a transient way, meaning their data gets cleaned up once the app closes. Optionally, you can instruct Orca to cache the application into your local app library, allowing for easier future access and offline use.

day7_short_small.mov

How does it work?

As a developer, you can build an Orca application by packaging a WebAssembly module along with some resources into an 'app bundle' following this specific directory structure:

bundle dir/
├─ run/
│  └─ main.wasm
├─ ressources/
│  ├─ icon.png
│  ├─ banner.png
│  └─ blurb.txt
└─ localfs/

The icon is displayed in the library view of the Orca launcher, and the banner and blurb text get displayed when the app is selected.
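As an illustration, a launcher or a packaging script could check a bundle against this layout before zipping or running it. The following is a minimal sketch, not Orca's actual code: the helper name `missing_bundle_entries` is hypothetical, and the paths (including the `ressources` spelling) are copied from the tree above.

```python
from pathlib import Path

# Layout copied from the directory tree above; the helper itself is hypothetical.
REQUIRED_FILES = [
    "run/main.wasm",
    "ressources/icon.png",
    "ressources/banner.png",
    "ressources/blurb.txt",
]
REQUIRED_DIRS = ["localfs"]


def missing_bundle_entries(bundle_dir):
    """Return the relative paths that are missing from an app bundle."""
    root = Path(bundle_dir)
    missing = [f for f in REQUIRED_FILES if not (root / f).is_file()]
    missing += [d for d in REQUIRED_DIRS if not (root / d).is_dir()]
    return missing
```

An empty list means the bundle has every entry the launcher expects to find.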

You can then zip the app bundle, upload it to your web server, and point users to it with a custom URL starting with orca://.

When a user clicks an orca link, the browser automatically opens the Orca launcher, passing it the URL. Orca downloads the bundle, uncompresses it into a temporary folder, and runs it in a new tab. If the user asks to cache the app, the bundle is marked as persistent and moved to the local app library.
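The flow described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the launcher's real implementation: the function names are hypothetical, and the sketch assumes an orca:// link maps to an HTTPS URL with the same host and path, which the text does not specify. The download itself is left out, so `unpack_bundle` takes the zip contents as bytes.

```python
import io
import shutil
import tempfile
import zipfile
from pathlib import Path
from urllib.parse import urlsplit, urlunsplit


def download_url_for(orca_link):
    """Map an orca:// link to a URL the launcher can fetch.

    Assumption: the bundle is served over HTTPS at the same host and path.
    """
    parts = urlsplit(orca_link)
    if parts.scheme != "orca":
        raise ValueError("not an orca link: " + orca_link)
    return urlunsplit(("https", parts.netloc, parts.path, parts.query, ""))


def unpack_bundle(zip_bytes, tmp_root=None):
    """Uncompress a downloaded bundle into a fresh temporary folder."""
    dest = Path(tempfile.mkdtemp(prefix="orca-", dir=tmp_root))
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        archive.extractall(dest)
    return dest


def cache_bundle(bundle_dir, library_dir, app_name):
    """Move a transient bundle into the local app library (must already exist)."""
    target = Path(library_dir) / app_name
    shutil.move(str(bundle_dir), str(target))
    return target
```

A transient run would only call `unpack_bundle` and delete the folder when the tab closes. Caching adds the `cache_bundle` step.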

What can an App do?

As of the jam, not much! Currently, applications can receive input such as mouse clicks and key events, and are given a GLES context into which they can draw their content.

Multi-Process Architecture

Each application instance runs in a tab backed by a separate 'host' process, which communicates with the main process through pipes. A host process creates a graphics surface that it exposes to the main process, and sets up a GLES context on that surface. It then loads and runs the app's WebAssembly module. The main process passes input events to host processes and displays a host's graphics surface when its tab is selected.
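To make the pipe communication more concrete, here is a minimal sketch of how messages could be framed on a pipe. The wire format (a 4-byte kind, a 4-byte payload length, then the payload) and the `MSG_MOUSE_UP` constant are assumptions made for this example. The actual Orca protocol is not described in this document.

```python
import os
import struct

# Hypothetical wire format: 4-byte kind, 4-byte payload length, then the payload.
HEADER = struct.Struct("<II")
MSG_MOUSE_UP = 1  # made-up message kind for this sketch


def send_message(fd, kind, payload=b""):
    """Write one framed message to a pipe file descriptor."""
    data = HEADER.pack(kind, len(payload)) + payload
    while data:
        # os.write may accept only part of the buffer, so loop until done.
        written = os.write(fd, data)
        data = data[written:]


def read_exact(fd, size):
    """Read exactly `size` bytes, looping because pipe reads can be short."""
    chunks = []
    while size > 0:
        chunk = os.read(fd, size)
        if not chunk:
            raise EOFError("pipe closed mid-message")
        chunks.append(chunk)
        size -= len(chunk)
    return b"".join(chunks)


def receive_message(fd):
    """Read one framed message and return (kind, payload)."""
    kind, length = HEADER.unpack(read_exact(fd, HEADER.size))
    return kind, read_exact(fd, length)
```

The length prefix lets the receiver know where one message ends and the next begins, since a pipe is a plain byte stream with no message boundaries of its own.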

How is it reinventing the wheel?

This jam entry was made as an idea-proofing step towards re-thinking the web from an application platform perspective.

The Promises and Shortcomings of Web Technology

The web does provide (or at least promises) huge benefits to both developers and users:

Developers only need to target one 'platform' (the browser).
Developers can distribute their apps just by setting up a website.
Users can use their (pre-installed) browser to go to a webpage and use an app right away. No (apparent) downloads, no installation, no file management.
The same apps are accessible from any of your devices (provided you have an internet connection).
There is the implicit promise that the app won't do much harm to your computer/phone, thanks to the browser's sandboxing features.

On the other hand, the real-world experience with web apps is rather poor. They often have janky UIs, are slower than they should be, and don't work well unless you're using a specific browser or even a specific version of that browser.

As for the developer, they have to deal with byzantine frameworks and APIs, browser-specific quirks, and a particularly inadequate document model that's built into the core of the web technology.

The current web ecosystem is layered in the wrong way.

The web was originally designed as a global hyperlinked knowledge base composed of documents. It then grew more dynamic features from there, adding more flexible layout, animations, push protocols, scripting features, etc., and at one point we realized we could code entire applications in it.

The promises I mentioned above were also obvious, all the more so as Operating Systems failed to provide them. So it's no wonder web technologies have been broadly adopted to build and distribute applications. We're even at the point where we're seeing hacks added on top of them to make web apps look like native desktop ones (e.g. Electron).

But all of this comes at the price of layers and layers of accidental complexity, fragility and slowness. We're building applications on top of a tech stack grounded in a document-oriented platform that was never meant to do what we're making it do. It only succeeded because the OSes failed.

Flipping the Tech Stack

What if instead we flipped things upside down and had Operating Systems that could seamlessly download applications using a cross-platform format, and run them in a truly sandboxed way?
What if they gave users the choice to run one-shot apps, or to cache some apps for offline use (and maybe optimize them for better performance on the user's machine)?
What if they had a simple, understandable capability model that let users choose which operations each app is allowed to perform on their machine, which files it can access, which servers it can connect to, which input feeds it can capture?
What if this system could provide standardized ways of embedding views of one application into another, interlinking content, and sharing data across applications?

I personally think this would massively simplify web technology and infrastructure, enable developers to write better, more performant and more stable apps, and give users more freedom and agency over their relationship with computing.

Note that on top of such a system, the legitimately great idea of a worldwide decentralized knowledge base and expression platform consisting of dynamic mixed-media documents could still exist! It would consist of specific document formats displayed by different 'viewer apps' that would plug into the system like any other app.
I would even argue this part of the web would be better, because it would be much simpler to write custom viewer apps. Each viewer wouldn't have to carry all the weight and baggage of modern browsers, and could use the system's embedding and cross-linking capabilities to leverage other simple apps. For instance, we can envision opening a document containing tabular data in our everyday document reader, and being able to call on the features of a spreadsheet editor app on demand, to sort columns or plot graphs from them.

Orca is much much more modest than that!

Of course I wasn't going to reinvent the OS, let alone the web, from scratch in a jam week :) Orca was written as a way to test some ideas of what it could look like in terms of app distribution and sandboxing. It was also a way to explore some of the problems that need to be solved, such as process isolation, inter-process communication and delegated drawing. Finally, it's trying to hint at how we could 'put a foot in the door' early on, before coming up with entire browser or OS replacements.

What's next?

The next step would be to add a capability system that allows users to give an app access to system features (e.g. reading/writing files, making network connections, capturing camera/microphone feeds, etc.).
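Since no capability system exists yet, the following is only a speculative sketch of what per-app grants might look like. Every name in it (`Grants`, `readable_paths`, `allowed_hosts`, `camera`) is hypothetical and not part of Orca.

```python
import posixpath
from dataclasses import dataclass, field


@dataclass
class Grants:
    """Per-app permissions chosen by the user (all names here are hypothetical)."""
    readable_paths: set = field(default_factory=set)
    allowed_hosts: set = field(default_factory=set)
    camera: bool = False

    def can_read(self, path):
        # Normalize first so that "granted/../elsewhere" cannot escape a grant.
        path = posixpath.normpath(path)
        for root in self.readable_paths:
            root = posixpath.normpath(root)
            prefix = root.rstrip("/") + "/"
            if path == root or path.startswith(prefix):
                return True
        return False

    def can_connect(self, host):
        # Only hosts the user explicitly allowed are reachable.
        return host in self.allowed_hosts
```

The point of the sketch is that everything is denied by default: an app starts with empty grants, and each file, host or device has to be allowed explicitly by the user.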

Exposing a solid core library and a standard UI API to developers would also be greatly beneficial in easing the creation of new Orca apps. In particular, I think that a UI API that also creates a structured description of itself as a by-product would yield a great deal of value, and serve as a base for hyperlinking/embedding/accessibility features, automation, etc.

Not much visible progress on &orca today, but I did a lot of background work!

I now load and run real wasm modules in the tab processes, and compile shaders and call GLES from them.
I also wrote a crappy Python script to autogenerate (most of) the API binding code (some bindings are trickier to automate because they require knowledge of the semantics of the arguments, e.g. glShaderSource, so I write the binding code for those manually).
Oh and I also designed a simple messaging protocol between my processes, and can send input events to the tabs. Then if the wasm module provides a handler for that event (i.e. defines a function like OnMouseUp() or OnFrameResize()), this gets called automatically (i.e. no need to register input handlers).
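To give a rough idea of what a binding generator like the one mentioned above might do, here is a minimal sketch. It is not the actual script: the spec format, the generated stub signature, and the `wasm_arg` type with its `.i32` field are all made up for illustration. Only the two GLES function signatures are real.

```python
# Hypothetical spec format: (return type, function name, [(arg type, arg name), ...]).
SPECS = [
    ("void", "glClear", [("GLbitfield", "mask")]),
    ("void", "glViewport", [("GLint", "x"), ("GLint", "y"),
                            ("GLsizei", "width"), ("GLsizei", "height")]),
]


def generate_stub(ret, name, args):
    """Emit a C stub that unpacks wasm arguments and forwards them to the native call.

    Only handles scalar integer arguments; the stub shape is an assumption.
    """
    lines = ["void " + name + "_stub(wasm_arg* args, wasm_arg* result)", "{"]
    for index, (ctype, arg) in enumerate(args):
        lines.append("\t%s %s = (%s)args[%d].i32;" % (ctype, arg, ctype, index))
    call = "%s(%s)" % (name, ", ".join(arg for _, arg in args))
    if ret == "void":
        lines.append("\t" + call + ";")
    else:
        lines.append("\tresult->i32 = (int)" + call + ";")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    for spec in SPECS:
        print(generate_stub(*spec))
        print()
```

This mechanical approach works when every argument is a plain value. A function like glShaderSource takes pointers whose meaning depends on other arguments, which is why such bindings are harder to generate and end up hand-written.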

So now I can click to change the direction of rotation of my triangle. Phew! But to be fair, I warned you that there wasn't much eye candy today!
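The "handler by name" convention described above can be sketched in a few lines. This is an illustration only: a plain dict stands in for the wasm module's export table, and the handler arguments (mouse coordinates) are an assumption, since the real signatures of OnMouseUp() and OnFrameResize() are not given here.

```python
def dispatch_event(exports, event_name, *args):
    """Call the module's On<Event> export if it exists; otherwise do nothing.

    `exports` is a plain dict standing in for a wasm module's export table.
    Returns True when a handler ran.
    """
    handler = exports.get("On" + event_name)
    if handler is None:
        return False
    handler(*args)
    return True


# Example module state: a triangle whose rotation direction flips on click.
state = {"direction": 1}


def on_mouse_up(x, y):
    state["direction"] = -state["direction"]


# This module only exports OnMouseUp, so other events are silently ignored.
exports = {"OnMouseUp": on_mouse_up}
```

Calling `dispatch_event(exports, "MouseUp", 10, 20)` flips the direction, while `dispatch_event(exports, "FrameResize", 800, 600)` does nothing because no such export exists. The app never registers anything: defining the function is enough.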