Firefox WebExtensions: injecting, sending data and detecting AJAX


WebExtensions are the new standard for developing addons for the Firefox browser. With the release of Firefox 57 and Project Quantum in November 2017, Mozilla dropped support for legacy addon APIs like XUL. You might have noticed, since this move was heavily criticized and broke some popular addons, some of which have been discontinued. Others were ported late and in haste, with their authors still exploring the new API and figuring out what can be done and what is no longer provided.

The old XUL interface offered a lot of possibilities to manipulate the browser, both the interface and the routines working behind the scenes. Although I did not develop any addon for XUL, looking into the old source code of now defunct addons shows many XML hooks, injections and whatnot.

WebExtensions are different. First off, the API is largely compatible with the extension API of the Chrome browser. Addons released as WebExtensions should work in both browsers, since they share most of the API and its specification. Yet there are still divergences, and some features are still in development and will be released in later versions. The interface also allows for a very clean cut between the browser engine and the addon routines, complemented by a permission system that asks for the needed permissions at installation time.

For my university project CrowdFilter I created a WebExtension and implemented multiple functionalities. Version 2 dropped a lot of code and so now - as a recap - I'm going to close the chapter "Version 1" by reviewing what worked and what did not work as expected.

Good to know:

  • Manifest: entry point of the XPI package, containing metadata, permissions and scripts to be executed.
  • Content script: a script that is injected into tabs on specified URLs. Content scripts can manipulate the page DOM, but have restricted access to storage and XHR.
  • Background script: scripts that run inside their own context. They do not have any access to tab contents but can manipulate the addon's storage and fetch remote resources.

A good knowledge resource is the Mozilla Developer Network (MDN). There are a lot of code snippets in this article - have a look at the GitHub repository for the whole picture. Also have a look at a similar blog post by Christian Kaindl - he wrote in much more detail about the basics of WebExtensions.

manifest.json

The Manifest is the root of the addon. It holds meta data about the package, permissions, content and background scripts. Let's look inside the file! Below some basic metadata with author name, description and version. Icons can be SVG vector graphics, they will be scaled by Firefox. The manifest.json reference is pretty good, so I'll show only some parts of my Manifest.

First, I am using the self-hosted publishing method for my addon. This means the XPI file is not hosted on AMO (addons.mozilla.org) but on my own webspace. But how then does Firefox know if there is an update? Inside the applications key we put Firefox ("Gecko") specific data, providing a URL to a JSON file which is queried once a day - if you have automatic updates enabled.

  "applications": {
    "gecko": {
      "strict_min_version": "57",
      "update_url": "https://crowdfilter.bitkeks.eu/addon/updates.json"
    }
  },

For reference, this updates.json JSON file looks like the following:

{
    "addons": {
        "webextension@crowdfilter.bitkeks.eu": {
            "updates": [
                {
                    "version": "0.3.1",
                    "update_link": "https://crowdfilter.bitkeks.eu/addon/crowdfilter-0.3.1-an+fx.xpi",
                    "update_hash": "sha256:b4464e266bccf71d730e88232c27a62db32240a5e3b5454c305199ce3c592393"
                }
            ]
        }
    }
}

Jumping back to the Manifest. Below is the list of permissions the addon will ask the user for when it is installed. In this case my addon has access to the browser storage, can intercept requests, may manipulate the active tab and has permission to act on two URL patterns. You could also use <all_urls> and tabs, but I advise keeping the permissions to a minimum.

  "permissions": [
    "storage",
    "webRequest",
    "activeTab",
    "https://crowdfilter.bitkeks.eu/*",
    "https://twitter.com/*"
  ],

As mentioned above, WebExtensions know two different types of scripts: content scripts and background scripts. Background scripts are executed as soon as the addon is activated and the browser is running. Content scripts are injected into each tab whose URL matches the filter pattern list. Both JavaScript and CSS files can be injected, and we can decide when they are injected.

  "background": {
    "scripts": [
      "js/background/ajax_detector.js",
      "js/background/background.js"
    ]
  },

  "content_scripts": [
    {
      "matches": [ "https://twitter.com/*" ],
      "js": ["js/cf-injection.js"],
      "css": ["cf-style.css"],
      "run_at": "document_end"
    }
  ],

And to complete the file, options_ui lets us define content which is displayed inside the options page when you click on the addon in the browser's addon list. page_action and browser_action each provide a clickable icon: page_action inside the address bar, browser_action in the toolbar.

  "options_ui": {
    "page": "options.html",
    "browser_style": true
  },

  "page_action": {
    "default_title": "CrowdFilter",
    "default_icon": "icons/logo.svg"
  }
}

Background scripts

Stepping into the code, let's first have a look at background scripts. They run when the browser runs and the addon is enabled. Please note that I use some shortcuts; have a look at the MDN storage docs regarding storage handling.

const stGet = browser.storage.local.get;
const stSet = browser.storage.local.set;
const lang = browser.i18n.getUILanguage().startsWith("de") ? "de" : "en";
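
As a rough sketch of how such storage shortcuts are typically used: storage.local.get resolves to an object keyed by the requested names, so a value that was never stored simply shows up as a missing key. The helper name loadConfig, the storageArea parameter and the fallback value are my assumptions for illustration, not part of the addon.

```javascript
// Hypothetical helper: read the cached config from extension storage.
// `storageArea` defaults to browser.storage.local inside the addon; passing
// it in explicitly makes the function easy to try outside the browser.
async function loadConfig(storageArea = browser.storage.local) {
    // get() resolves to an object like { config: ... }; the key is absent
    // when nothing has been stored yet.
    const stored = await storageArea.get("config");
    // Assumed fallback: an empty filter list.
    return stored.config !== undefined ? stored.config : { filters: {} };
}
```

In a background script this could be called without arguments, e.g. loadConfig().then(config => { filters = config.filters; }).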

Fetching remote content and POSTing

As an example of fetching remote content, here's a snippet of code that fetches a remote config JSON file to be parsed. Have a look at the MDN docs for details of the Request object. This is also the first code snippet which introduces us to async code handling. Using then will haunt us many more times.

function fetchConfig() {
    let req = new Request(collectorHostname + "/config", {
        method: 'GET',
        headers: { 'Accept': 'application/json' },
        redirect: 'follow',
        referrer: 'client'
    });

    fetch(req).then(function(response) {
        // .json returns another promise
        return response.json();
    }).then(function(config) {
        stSet({config: config});
        filters = config.filters;
    }).catch(error => { console.log(error); });
}
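
The same flow reads a little flatter with async/await, and it is a good place to check response.ok, since fetch only rejects on network errors and not on HTTP error statuses. This is a sketch, not the addon's code: the function name fetchConfigAsync and the fetchImpl parameter are assumptions made for illustration.

```javascript
// Sketch of fetchConfig using async/await. `baseUrl` plays the role of
// collectorHostname; `fetchImpl` is a hypothetical parameter that defaults
// to the global fetch.
async function fetchConfigAsync(baseUrl, fetchImpl = fetch) {
    const response = await fetchImpl(baseUrl + "/config", {
        method: "GET",
        headers: { "Accept": "application/json" },
        redirect: "follow"
    });
    // fetch() resolves even for 404 or 500, so check the status explicitly.
    if (!response.ok) {
        throw new Error("Config request failed with status " + response.status);
    }
    // .json() returns another promise; returning it lets the caller await it.
    return response.json();
}
```

In the background script this could be used as fetchConfigAsync(collectorHostname).then(config => stSet({config: config})).catch(console.error).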

Next, we don't want to fetch something but send some JSON data package to a remote server. For this the payload JSON object will be wrapped into another JSON package and then POSTed to a remote endpoint which receives the data.

function sendData(payload) {
    let json_data = {
        timestamp: Date.now(),
        payload: payload
    };

    var req = new Request(collectorHostname + "/collect/sendto", {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(json_data),
        redirect: 'follow',
        referrer: 'client'
    });

    fetch(req).then(function(response) {
        return response.json();
    }).then((json) => {
        // Here the content of the response is available to be handled
    }).catch(error => { console.error(error); });
}

Detecting AJAX with request interception

Now, this was a special case I stumbled upon when I wanted to inject code into Twitter's pages. Twitter - like many other websites - does not execute a "normal" load of a new page when you click on a link, but instead replaces only some content of the current DOM with DOM elements from the requested page. This speeds up loading times and looks a lot smoother.

The problem is: even though the address bar changes the displayed URL, the WebExtension filter for content scripts is not matched if you do not make a direct page request by putting the URL into the address bar and pressing enter. To solve this, another background script is needed which intercepts requests on certain websites to look for AJAX requests that need code injection.

// Filters are loaded from a config and would look like this:
// filters = {
//    "github": "issues/[0-9]{1,10}\\??",
//    "twitter": "/status/[0-9]*(\\?conversation.*)?"
// }

function url_catcher(details) {
    const url = details.url;
    // Try each site-specific pattern; inject the script of the first match.
    for (const key of Object.keys(filters)) {
        const regexp = new RegExp(filters[key], "i");
        if (regexp.test(url)) {
            inject(key);
            break;
        }
    }
}

browser.webRequest.onCompleted.addListener(
    url_catcher,
    {  // Filter
        urls: [
            "https://github.com/*/*",
            "https://www.heise.de/forum/*",
            "https://twitter.com/*"
        ]
    }
);

To explain the above code we need to consider three steps:

  1. A request for a new resource is fired, probably as an XHR. After this request is done, the browser notifies our script, which listens on webRequest.onCompleted.
  2. If the URL of the AJAX request matches the listener's filter list urls, url_catcher is executed with the full request object.
  3. This request object is examined again, but in more detail - its URL is matched against regular expressions for very specific cases (here: loading a GitHub issue thread or a Tweet via its permalink).
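
The regex matching of step 3 can be tried in isolation, outside of any browser. The sketch below uses the two example patterns from the comment in the code above; the function name matchFilter and the example URLs are made up for illustration.

```javascript
// Returns the key of the first filter whose pattern matches the URL,
// or null when none matches. `filters` maps a site key to a regex string.
function matchFilter(url, filters) {
    for (const [key, pattern] of Object.entries(filters)) {
        if (new RegExp(pattern, "i").test(url)) {
            return key;
        }
    }
    return null;
}

const exampleFilters = {
    "github": "issues/[0-9]{1,10}\\??",
    "twitter": "/status/[0-9]*(\\?conversation.*)?"
};

// Hypothetical URLs, only the path structure matters here.
console.log(matchFilter("https://twitter.com/someuser/status/1234567890", exampleFilters)); // prints: twitter
console.log(matchFilter("https://github.com/someuser/somerepo/issues/42", exampleFilters)); // prints: github
console.log(matchFilter("https://twitter.com/someuser", exampleFilters));                   // prints: null
```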

If all three steps succeed, we know for sure that this was a request for a page we want to inject code into. This is done with the inject function, which receives the key of the matching regex. Inside the addon package are multiple injection scripts in /js/injectors/, e.g. github.js or twitter.js.

function inject(injector) {
    browser.tabs.executeScript({
        file: "/js/injectors/" + injector + ".js",
        runAt: "document_idle"
    }).then(function(result) {
    }, function(error) {
        console.error(error);
    });
}

The result of all this will be an injected script which is written specifically for the identified URL.
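
One detail worth noting: called without a tab ID, tabs.executeScript targets the active tab of the current window, while the request that triggered the match may belong to a different tab. The webRequest details object carries a tabId (set to -1 when the request is not related to a tab), so a variant of inject could pass it along. This is a sketch under that assumption and not the addon's code; injectIntoTab and the tabsApi parameter are hypothetical names.

```javascript
// Hypothetical variant of inject() that targets the tab the request came from.
// `tabsApi` defaults to browser.tabs inside the addon.
function injectIntoTab(injector, tabId, tabsApi = browser.tabs) {
    // webRequest reports tabId -1 for requests that are not tied to a tab.
    if (tabId < 0) {
        return Promise.resolve(null);
    }
    return tabsApi.executeScript(tabId, {
        file: "/js/injectors/" + injector + ".js",
        runAt: "document_idle"
    }).catch(error => {
        // Log the failure and resolve to null so callers need no extra handling.
        console.error(error);
        return null;
    });
}
```

Inside url_catcher this would be called as injectIntoTab(key, details.tabId).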

Content scripts

Injecting DOM elements

Let's for example use Twitter again. Say our ajax_detector.js background script detected an AJAX request which was loading a Tweet and we want to modify the DOM of this Tweet. The site-specific script which is called with tabs.executeScript looks like this:

var comment_element_id_prefix = null;
var comment_element_classes = ["permalink-tweet", "tweet"];
var injection_element_identifier = ".permalink-header";
var clicked_source = "twitter";

injectButton(injection_element_identifier);

As the function injectButton does not have to be site-specific, there is no need to put it into each site-specific script. It is instead loaded with the content script which was referenced in the Manifest for the URL pattern "https://twitter.com/*":

// Note: `d` is presumably a shortcut for `document`, defined elsewhere in the content script.
function injectButton(injection_element_identifier) {
    var els = d.querySelectorAll(injection_element_identifier);

    els.forEach(function(val, idx, obj) {
        let di = d.createElement("div");
        di.classList.add("cf-classifier");
        val.appendChild(di);

        let button = d.createElement("button");
        button.classList.add("cf-dontlike");
        di.appendChild(button);
        button.addEventListener("click", handleClickOpen);
        // ... (remaining element creation omitted)
    });
}

Screenshot showing the injected element on Twitter

I'm cutting the snippet here because the rest is normal JavaScript element creation and insertion. So, what did we do? A request matched a regex pattern and the site-specific script was executed inside the tab. The executed script used an already injected function, injectButton, with variables that need to be changed for each site - DOM element identifiers, CSS classes and so on. The screenshot shows how the result looked after manipulating the DOM of a Tweet.

Communication channels between background and tabs

To conclude the whole package, we have to establish an information exchange channel between the content scripts (running in tabs) and the background scripts (running in their own context). Why? Because each has its own permissions and feature set.

There are multiple options to handle messages; have a look at runtime.sendMessage and runtime.Port. I decided to use a one-sided message exchange, meaning my background script listens for messages by executing browser.runtime.onMessage.addListener(handleMessage); on startup, but it does not connect to content scripts by itself. If a content script needs data, it sends a message whose fields tell the background script what is being asked for, and the background script responds with the requested data.

The handler in the background script looks like this:

function handleMessage(message, sender, respond) {
    if (message.src == "popup") {
        // Handle request for client ID to display in popup
        if (message.msg == "getClientId") {
            respond({ msg: client_id });
        }
    }

    if (message.src == "injector") {
        if (message.cmd != null) {
            switch (message.cmd) {
                case "getClassifiers":
                    respond({ type: "getClassifiers", response: classifiers });
                    break;
            }
        }

        if (message.payload != null) {
            // Injector content script sends a payload to be saved in the remote database
            sendData(message.payload);
        }
    }
}

Which JSON structure you use for your messages (e.g. using .src or .cmd) is entirely up to you.
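
For completeness, the sending side in a content script could look roughly like this. runtime.sendMessage returns a promise that resolves with whatever the handler passed to respond. The function name requestClassifiers and the runtimeApi parameter are assumptions for illustration; the message fields mirror the handler above.

```javascript
// Hypothetical content-script helper asking the background script for classifiers.
// `runtimeApi` defaults to browser.runtime inside the addon.
async function requestClassifiers(runtimeApi = browser.runtime) {
    const reply = await runtimeApi.sendMessage({
        src: "injector",
        cmd: "getClassifiers"
    });
    // The handler answers with { type: "getClassifiers", response: classifiers }.
    // Anything else (including no reply at all) is treated as "no data".
    return reply && reply.type === "getClassifiers" ? reply.response : null;
}
```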

Propagating input from the options page to the background

The options page is also an HTML file provided by the addon, but the context in which it is executed (and calls JavaScript) is special since it is embedded into the addon manager. I solved this exchange by buffering things in the storage. For this the background script needs a listener: browser.storage.onChanged.addListener(handleStorageChange);. The function for this handler in the background script looks like this:

function handleStorageChange(changes, areaName) {
    // Handle TOR checkbox
    if (changes["useTor"] != undefined) {
        if (changes.useTor.newValue == true) {
            toggleTor(true);
            return;
        }
        toggleTor(false);
    }

    // Hack to buffer feedback comments.
    // Does not work with sendMessage in options page, so options page sets
    // a new value for the "setting", which triggers this function.
    // Second condition: handle ONLY new feedbacks, not removal.
    // storage.remove produces an object with no newValue.
    if (changes.feedback != null && changes.feedback.newValue != null) {
        let comment = changes.feedback.newValue;
        if (comment == "") return;
        sendFeedback(comment);
        browser.storage.local.remove("feedback")
            .then(null, error => { console.error(error); });
    }
}

Triggering the storage change can then be done from a script in the options page.
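
A minimal sketch of that options-page side, with a function name (submitFeedback) and a storageArea parameter that are my assumptions, not the addon's code. It writes the comment under the feedback key, which in turn fires storage.onChanged in the background script.

```javascript
// Hypothetical options-page helper: hand a feedback comment to the background
// script by writing it into storage. `storageArea` defaults to
// browser.storage.local inside the addon.
function submitFeedback(comment, storageArea = browser.storage.local) {
    const text = comment.trim();
    // The background handler ignores empty strings, so skip the write entirely.
    if (text === "") {
        return Promise.resolve(false);
    }
    // Resolves to true once the value has been written.
    return storageArea.set({ feedback: text }).then(() => true);
}
```

In the options page this could be wired to a form's submit event, passing the value of the comment field.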

The end, thanks for reading! Have a look at the Mozilla docs and Christian's blog entry (both linked above) to find more info about WebExtensions. Version 2 of my addon removed a lot of code; it is more maintainable and uses the context menu API to provide the functionality the button once did.