Script Blocking

A Collection of Interesting Ideas

This version:
https://explainers-by-googlers.github.io/script-blocking/
Issue Tracking:
GitHub
Inline In Spec
Editor:
(Google Inc.)

Abstract

User agents may block resource requests in order to protect users. This document suggests a hook in Fetch which would explain these behaviors and allow alignment on timing and impact.

1. Introduction

User agents generally make decisions about whether or not to load resources based on judgements about those resources' impact on user safety. Some of these decisions are widely agreed-upon, and have been codified as normative requirements in Fetch ("bad port" and Mixed Content restrictions, for example), while other decisions diverge between agents in a reflection of their unique and proprietary heuristics and judgements. User agents which rely upon Google’s Safe Browsing; Microsoft’s SmartScreen; or tracking protection lists from Disconnect, DuckDuckGo, etc. will all make different decisions about the specific set of resources they’ll refuse to load. It would be ideal, however, for those decisions to have a consistent impact when made. How are those decisions exposed to the web? How are they ordered vis-à-vis the standardized decisions discussed above? Are there properties we can harmonize and test?

This document aims to answer those questions in the smallest way possible, monkey-patching Fetch to provide an implementation-defined hook for blocking decisions, and sketching out a process by which widely agreed-upon categories of resource-blocking decisions could be tested at a high level of abstraction.

2. Infrastructure

For many of the blocking behaviors described above, user agents seem to have aligned on a pattern of applying well-defined blocking mechanisms ([CSP], [MIX], etc) first, only consulting a proprietary set of heuristics if the request would generally be allowed. Likewise, agents generally align on treating blockage as a network error, though some browsers will instead generate a synthetic response ("shim") for well-known resources to ensure compatibility.

We can support these behaviors by adding to [FETCH] an implementation-defined abstract operation that is called from Fetch § 4.1 Main fetch.

2.1. OverrideResponseForRequest(request)

The abstract operation OverrideResponseForRequest takes a request (request), and returns either a response or null.

This allows user agents to intervene on a given request by returning a response (either a network error or a synthetic response), or to allow the request to proceed by returning null.

By default, this operation has the following trivial implementation:

  1. Return null.

Note: This default implementation is expected to be overridden by a somewhat more complex implementation-defined algorithm. For example, a user agent might decide that its users' safety is best preserved by generally blocking requests to https://mikewest.org/, while shimming the widely-used resource https://mikewest.org/widget.js to avoid breakage. That implementation might look like the following:
  1. If request’s current url’s host’s registrable domain is "mikewest.org":

    1. If request’s current url’s path is « "widget.js" », then:

      1. Let body be [insert a byte sequence representing the shimmed content here].

      2. Return a new response with the following properties:

        type
          cors
        status
          200
        header list
          « Insert content-type, etc as appropriate here »
        ...
          ...
        body
          The result of getting body as a body.

    2. Return a network error.

  2. Return null.
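
The example algorithm above can be sketched in Python to make its three outcomes (shim, network error, null) concrete. This is a minimal illustration rather than an implementation of Fetch: the `Response` class, the `network_error` and `registrable_domain` helpers, the shim bytes, and the `Content-Type` value are all assumptions made for this sketch. In particular, `registrable_domain` naively keeps the last two labels of the host, where a real user agent would consult the Public Suffix List.

```python
from dataclasses import dataclass, field
from typing import Optional
from urllib.parse import urlsplit


@dataclass
class Response:
    """Toy stand-in for a Fetch response; not the real Fetch data model."""
    type: str = "default"
    status: int = 0
    header_list: list = field(default_factory=list)
    body: bytes = b""

    @property
    def is_network_error(self) -> bool:
        return self.type == "error"


def network_error() -> Response:
    # Fetch defines a network error as a response whose type is "error"
    # and whose status is 0.
    return Response(type="error", status=0)


def registrable_domain(host: str) -> str:
    # ASSUMPTION: naive "last two labels" rule, for illustration only.
    # A real implementation would consult the Public Suffix List.
    return ".".join(host.lower().split(".")[-2:])


def override_response_for_request(url: str) -> Optional[Response]:
    """Return a response to intervene, or None to let the request proceed."""
    parts = urlsplit(url)
    if registrable_domain(parts.hostname or "") == "mikewest.org":
        # The URL path list « "widget.js" » serializes as "/widget.js".
        if parts.path == "/widget.js":
            shim = b"/* shimmed widget */"  # placeholder shim content
            return Response(
                type="cors",
                status=200,
                # Illustrative header only; the source leaves this open.
                header_list=[("Content-Type", "text/javascript")],
                body=shim,
            )
        # Any other resource on the blocked site becomes a network error.
        return network_error()
    # Not a site this user agent intervenes on.
    return None
```

Under these assumptions, `override_response_for_request("https://mikewest.org/widget.js")` yields the shim, any other URL under `mikewest.org` yields a network error, and every other URL yields `None`, leaving the request to proceed as usual.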

2.2. Monkey-Patching Fetch

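Fetch will call into the abstract operation above via a step inserted into Fetch § 4.1 Main fetch, just after the existing step 7 (which performs CSP, MIX, and bad port checks). In the excerpt below, step 2 is the new step; steps 1 and 3 are existing Main fetch steps, shown for context: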

  1. If should request be blocked due to a bad port, should fetching request be blocked as mixed content, or should request be blocked by Content Security Policy returns blocked, then set response to a network error.

  2. If response is null, set response to the result of executing OverrideResponseForRequest on request.

  3. If request’s referrer policy is the empty string, then set request’s referrer policy to request’s policy container’s referrer policy.

This might not be the best spot for the new check. Perhaps it should move to the top of HTTP Fetch? [explainers-by-googlers/script-blocking Issue #2]
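
The ordering described in this section can be modelled with a short Python sketch: the standardized checks run first, and the implementation-defined hook is consulted only if they leave the response null. The function name `main_fetch_excerpt`, the dictionary-shaped request, and the string sentinel for a network error are assumptions made for illustration; none of them come from Fetch.

```python
from typing import Callable, Optional, Sequence

NETWORK_ERROR = "network error"  # toy sentinel, not a real Fetch response


def main_fetch_excerpt(
    request: dict,
    standardized_checks: Sequence[Callable[[dict], bool]],
    override_hook: Callable[[dict], Optional[str]],
) -> Optional[str]:
    """Model only the steps around the new hook; the rest of main fetch is omitted."""
    response = None

    # Existing step: bad port, Mixed Content, and CSP checks.
    # Each check returns True when it would block the request.
    if any(check(request) for check in standardized_checks):
        response = NETWORK_ERROR

    # New step: consult the implementation-defined hook only if the
    # standardized checks did not already produce a response.
    if response is None:
        response = override_hook(request)

    # A return value of None means the request proceeds to the remaining steps.
    return response


# Usage: a hook that records every request it is asked about.
consulted = []


def recording_hook(request: dict) -> Optional[str]:
    consulted.append(request["url"])
    return None


def blocked_by_csp(request: dict) -> bool:
    # Hypothetical policy that blocks a single example URL.
    return request["url"] == "https://blocked.example/script.js"


first = main_fetch_excerpt(
    {"url": "https://blocked.example/script.js"}, [blocked_by_csp], recording_hook
)
second = main_fetch_excerpt(
    {"url": "https://allowed.example/script.js"}, [blocked_by_csp], recording_hook
)
```

In this model the hook is never asked about the request that CSP already blocked, which is the property the Testing Considerations section proposes to verify across user agents.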

3. Testing Considerations

It would be ideal to verify the ordering of various restrictions that come into play via the patch to Fetch described above. Content Security Policy, Mixed Content (both blockage and upgrades), and port restrictions are all evaluated prior to checking in with any implementation-defined blockage oracle, and this behavior should be verifiable and consistent across user agents.

There’s likely no consistent way to do this for any and all blocking mechanisms, but specific categories of blocking behavior that have widespread agreement seem possible to test in a consistent way. As a potential path to explore, consider that Google’s Safe Browsing defines a small set of known-bad URLs (see https://testsafebrowsing.appspot.com/) that allow manual verification of browser behavior. Perhaps we could extend this notion to some set of high-level blockage categories that user agents seem to generally agree upon ("phishing", "malware", "unwanted software", "fingerprinting", etc), and define well-known test URLs for each within the WPT framework.

That is, phishing.web-platform.test could be added to user agents' lists of phishing sites, and represented within WPT via substitutions against {{domains[phishing]}}. We’d likely need some WebDriver API to make this possible, but it seems like a plausible approach that would allow us to verify ordering and replacement behaviors in a repeatable way.
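
To illustrate how such a substitution might resolve, the sketch below expands `{{domains[label]}}` placeholders against a label-to-host mapping. It is a simplified stand-in written for this document, not WPT's actual substitution code, and the `phishing` label is the hypothetical one proposed above rather than an existing WPT domain label.

```python
import re

# ASSUMPTION: hypothetical mapping. "phishing" is the label proposed in the
# text above; it is not an existing WPT domain label.
DOMAINS = {
    "www": "www.web-platform.test",
    "phishing": "phishing.web-platform.test",
}

# Matches placeholders of the form {{domains[label]}}.
_PLACEHOLDER = re.compile(r"\{\{domains\[([^\]]*)\]\}\}")


def substitute_domains(template: str, domains: dict = DOMAINS) -> str:
    """Expand {{domains[label]}} placeholders; unknown labels are left untouched."""
    def replace(match: "re.Match") -> str:
        label = match.group(1)
        return domains.get(label, match.group(0))

    return _PLACEHOLDER.sub(replace, template)


test_url = substitute_domains("https://{{domains[phishing]}}/script.js")
```

A test could then load `test_url` as a subresource and assert that the request fails in the way the user agent's blocking category prescribes.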

Note: Some blocking behaviors (blocking all top-level navigation to an origin, for example) might be difficult to test based only upon web-visible behavior, as network errors and cross-origin documents ought to be indistinguishable. We could rely upon leaks like frame counting, but ideally we’d treat that as a bug, not a feature we can rely upon.

4. Security Considerations

Blocking or shimming subresource requests can put pages into unexpected states that developers are unlikely to have tested or reasoned about. This can happen in any event, as pages might be unable to load specific resources for a variety of reasons (outages, timeouts, etc). Ideally developers would handle these situations gracefully, but user agents implementing resource blocking would be well-advised to take the risk seriously, and carefully evaluate resources' usage before taking action against them.

5. Privacy Considerations

Blocking resources has web-visible implications. If the set of resources blocked for one user differs from the set of resources blocked for another user (based, perhaps, on heuristics that take individual users' browsing behavior into account), that visible delta could be used as part of a global identifier (see e.g. "Information Leaks via Safari’s Intelligent Tracking Prevention" for a variant of this attack [INFORMATION-LEAKS]). User agents implementing resource blocking can avoid this risk by ensuring that the set of blocked resources is as uniform as possible across their userbase.
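
The risk can be made concrete with a small model: a page probes a fixed list of URLs and records which ones fail to load, and the resulting pattern is a bit vector that differs between users whose blocked sets differ. The probe URLs and per-user sets below are invented for illustration, and the function name `observable_block_pattern` is an assumption of this sketch.

```python
from typing import Iterable, Sequence, Tuple


def observable_block_pattern(
    probe_urls: Sequence[str], blocked_for_user: Iterable[str]
) -> Tuple[bool, ...]:
    """Return, per probe URL, whether a page would observe it as blocked."""
    blocked = set(blocked_for_user)
    return tuple(url in blocked for url in probe_urls)


# Hypothetical probe list chosen by an observing page.
probes = [
    "https://a.example/t.js",
    "https://b.example/t.js",
    "https://c.example/t.js",
]

# Per-user block sets that diverge, e.g. because they depend on browsing history.
user_one = observable_block_pattern(probes, {"https://a.example/t.js"})
user_two = observable_block_pattern(
    probes, {"https://a.example/t.js", "https://c.example/t.js"}
)

# A uniform block set yields the same pattern for every user.
uniform = {"https://b.example/t.js"}
uniform_one = observable_block_pattern(probes, uniform)
uniform_two = observable_block_pattern(probes, uniform)
```

With n probes, divergent block sets can produce up to 2^n distinct patterns, whereas a uniform block set produces a single pattern that reveals nothing about the individual user.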

Conformance

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

References

Normative References

[CSP]
Mike West; Antonio Sartori. Content Security Policy Level 3. URL: https://w3c.github.io/webappsec-csp/
[FETCH]
Anne van Kesteren. Fetch Standard. Living Standard. URL: https://fetch.spec.whatwg.org/
[HTML]
Anne van Kesteren; et al. HTML Standard. Living Standard. URL: https://html.spec.whatwg.org/multipage/
[INFRA]
Anne van Kesteren; Domenic Denicola. Infra Standard. Living Standard. URL: https://infra.spec.whatwg.org/
[MIX]
Emily Stark; Mike West; Carlos IbarraLopez. Mixed Content. URL: https://w3c.github.io/webappsec-mixed-content/
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://datatracker.ietf.org/doc/html/rfc2119

Informative References

[INFORMATION-LEAKS]
Artur Janc; et al. Information Leaks via Safari's Intelligent Tracking Prevention. URL: https://arxiv.org/pdf/2001.07421
[URL]
Anne van Kesteren. URL Standard. Living Standard. URL: https://url.spec.whatwg.org/
