The Caja Compiler is a tool for making third-party HTML, CSS and JavaScript safe to embed in your website.
What Caja does
Say you are running a social site where people can meet with their friends, but your users want to run games on the site. The games are code written by people you have never met and have no particular reason to trust. These games need to interact with the users via their UI. They might also want to interact with your site to post high scores. You have some idea of what interactions a game is allowed to have with your site: for example, it may only post one "high score" notification per day, but may have no other access to the user's resources on your site.
Clearly, just running untrusted code on your site exposes you and your users to all the bugs or malware that a game may harbor. You could put each game in its own <iframe> on a separate domain, but that means you would need to host a separate, safe domain for the games to be served from, and you incur the bother of having to implement cross-domain messaging between the games and your site, either via client-side hacks or server-side proxying.
Caja provides a better way. Using Caja, you can embed each game in a <div> directly within your site's page, and interact with it via normal JavaScript objects. Caja lets you do this safely and gives you a place to stand to enforce your policies (such as the "one notification per day" rule) via regular JavaScript code.
- Host page
- A web page that acts as a container for code it has no particular reason to trust. This page uses the Caja API to use that code safely and limit its authority.
- Host code
- Code running in the host page.
- Guest code
- Code that the host page includes from some external source it may or may not trust. This code needs to run with limited authority within the host page.
- Guest page
- For Caja, guest code is represented as a regular HTML page, possibly containing CSS and JavaScript. When an HTML page is used as guest code, it is called a guest page.
- Policy
- Your decisions about what you wish to allow each piece of guest code to do.
To use Caja, you arrange your host page such that Caja creates a secure boundary around the guest code. Among the objects in your host page, you construct some which act defensively -- in other words, their API assumes that their clients may be buggy or malicious. You give the guest code access to these objects, and subsequently, you and the guest code use these objects to cooperate. Caja provides a taming of your defensive objects to make sure that guest code can only use their published API.
- Defensive object
- An object in your host page that is constructed with caution to provide only a limited authority to its clients. Your host page grants limited authority to guest code by supplying it with the appropriate defensive objects. Taken together, the code in your defensive objects implements -- and thus enforces -- your policy.
- Taming
- Current JavaScript objects are not able to make themselves tamper-proof. Taming is the process of registering your defensive objects with Caja as you provide them to guest code. Caja ensures that only the published API of objects is available to guest code and guest code cannot modify the object in ways you did not intend.
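As an illustration of a defensive object, the following is a minimal sketch of the "one notification per day" policy in plain JavaScript. The names makeScoreReporter, postToSite and now are hypothetical and chosen for this example only. The sketch keeps its state in a closure and validates its input. Freezing the returned object is extra caution in plain JavaScript and does not replace the taming that Caja performs.

```javascript
// Hypothetical defensive object: accepts at most one high score per day.
// `postToSite` (what the host does with an accepted score) and `now`
// (a clock returning milliseconds) are injected by the host page.
function makeScoreReporter(postToSite, now = () => Date.now()) {
  const DAY_MS = 24 * 60 * 60 * 1000;
  let lastPostTime = -Infinity; // private state, held only in this closure

  function sendTopScore(score) {
    // Assume the caller may be buggy or malicious: validate the input.
    if (typeof score !== 'number' || !Number.isFinite(score)) {
      return false;
    }
    const t = now();
    if (t - lastPostTime < DAY_MS) {
      return false; // policy: one notification per day
    }
    lastPostTime = t;
    postToSite(score);
    return true;
  }

  // Expose only the published API; freeze it so members cannot be replaced.
  return Object.freeze({ sendTopScore });
}
```

The host page would create one such object per game and hand only that object to the guest code, so the policy lives entirely in code the host controls.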
How it works
To use Caja effectively, you need to know some basics about how it works. The following shows the major steps in how a page uses Caja. Not all of them have to happen in the exact order shown.
Include Caja, prepare target <div> and defensive objects
The host code includes the main Caja script, prepares the defensive objects, and constructs a <div> to receive the guest code. The Caja script establishes a connection to the Caja server as specified by its configuration. We supply a Caja server at http://caja.appspot.com/, or you can download our code and run your own.
Tame objects and establish defensive DOM boundary
The host code asks Caja to tame the defensive objects, and uses Caja to construct a DOM boundary within the <div> it has chosen.
Ask Caja to cajole and run the code
The host code asks Caja to run some guest code -- represented as HTML with CSS and JavaScript -- within the supplied <div>, supplying to that code the tamed defensive objects previously constructed.
In order to run the code safely, Caja must transform it to make it safe. It does this by sending a request to the Caja server, which GETs the code and returns the transformed code. We call the Caja transformation cajoling.
- Cajoling
- The process of making Web content -- HTML with CSS and JavaScript -- safe for inclusion in a host page running Caja. Cajoling involves adding inline checks to make sure the code does not break the invariants Caja needs, and ensuring that the code cannot refer to variables in the host page that are not explicitly given to it.
Caja then runs the code in the supplied <div> with the tamed defensive objects you specified.
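The steps above can be sketched from the host page's side roughly as follows. This is a sketch under stated assumptions, not a definitive reference: the method names (initialize, load, markFunction, tame, and the frame's code, api and run) are written as recalled from Caja's host-page API and should be checked against the Caja documentation. The function embedGuest and its parameters are hypothetical. The caja object is passed in as an argument, where a real host page would use the global created by the Caja script.

```javascript
// Sketch of the host-side flow. `caja` is normally the global provided by
// the Caja script; it is a parameter here so a stand-in can be supplied.
function embedGuest(caja, div, guestUrl, sendTopScore) {
  // Include/configure: tell Caja which Caja server to use.
  caja.initialize({ cajaServer: 'http://caja.appspot.com/' });

  // Build the DOM boundary inside the chosen <div>. The second argument is
  // assumed to be a URI policy; this sketch leaves it undefined.
  caja.load(div, undefined, function (frame) {
    // Tame the defensive function before handing it to guest code.
    caja.markFunction(sendTopScore);
    const tamedSendTopScore = caja.tame(sendTopScore);

    // Cajole and run the guest page, with the tamed function as an import.
    frame.code(guestUrl, 'text/html')
         .api({ sendTopScore: tamedSendTopScore })
         .run();
  });
}
```

A host page might call this once per game, passing the target <div>, the URL of the guest page, and the defensive function it wants that game to see.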
What the guest code sees
From the viewpoint of the guest code, it runs with what seems to be a W3C DOM compliant document object and an ECMAScript 5 compliant JavaScript virtual machine. Its document is confined to the boundaries of the supplied <div>, and its JavaScript globals, like Object and Array, are its own and do not affect code outside it.
The tamed defensive objects are visible to guest code as additional global variables in its top-level JavaScript context. For example, if the host page provided a game with a tamed function for reporting a top score under the name sendTopScore, the game code would simply see a sendTopScore global variable and could invoke it as a function in the usual manner.
- Imports
- The set of additional global variables provided to guest code allowing it to invoke the services of the host page. The host page specifies the names of the imports, and provides tamed defensive objects as their values.
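From the guest's side, using an import looks like ordinary JavaScript. The sketch below assumes the host provided an import named sendTopScore, as in the example above. The helper reportScore and the convention that the host function returns true when it accepts a score are hypothetical.

```javascript
// Guest-side sketch: report a score through the `sendTopScore` import.
function reportScore(score) {
  // `typeof` on an undeclared name does not throw, so this check is safe
  // even when the host page did not provide the import.
  if (typeof sendTopScore !== 'function') {
    return false;
  }
  // Assumed convention: the host function returns true on acceptance.
  return sendTopScore(score) === true;
}
```

Checking for the import before calling it lets the same guest page run under hosts that grant different sets of imports.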