Go-DOM - A headless browser written in Go.

Linda Hamilton
2024-11-07 18:28:02

Having too little to do sometimes results in a crazy idea. This time, it was to write a headless browser in Go, with a full DOM implementation and JavaScript support, by embedding the V8 engine.

It all started while I was writing an HTMX application; the need to test it made me curious whether there was a pure Go implementation of a headless browser.

Searching for "go headless browser" only turned up results about automating a headless browser, i.e. using a real browser like Chrome or Firefox in headless mode.

But nothing in pure Go.

So I started building one.

Why A Headless Browser in Go?

It may seem silly, because a purpose-built headless browser will never behave exactly like a real browser, and as such it can't really verify that your application works correctly in all the browsers you have decided to support. Nor does it give you nice features such as screenshots of the application when things stop working.

So why then?

The Need for Speed!

To work in an effective TDD loop, tests must be fast. Slow test execution discourages TDD, and you lose the efficiency benefits a fast feedback loop provides.

Using browser automation for this type of verification has severe overhead, and such tests are typically written after the code. As such, they no longer help you write the correct implementation, but are reduced to a maintenance burden after the fact that only occasionally detects a bug before your paying customers do.

The goal is to create a tool that supports a TDD process. To be usable, it needs to run in-process.

It needs to be written in Go.

Less Flaky Tests

Having the DOM in-process enables writing better wrappers on top of the DOM, which can help provide a less erratic interface for your tests, like testing-library does for JavaScript.

Rather than depending on CSS class names, element IDs, or DOM structure, you write your tests in a user-centric language, like this:

Type "me@example.com" in the textbox that has the label, "Email"

Or, in hypothetical code:

testing.GetElement(Query{
  Role: "textbox",
  // The accessibility "name" of a textbox _is_ the label
  Name: "Email",
}).Type("me@example.com") // "type" is a Go keyword, so the method is exported as Type

This test doesn't care how the label is implemented in the markup.

This decouples verification of behaviour from UI changes, but it does enforce that the text "Email" is associated with the input field in an accessible way. This couples the test to how users interact with the page, including those who rely on screen readers.

This achieves the most important aspect of TDD: writing tests coupled to concrete behaviour.[1]
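To illustrate why such a query survives markup changes, here is a minimal sketch of a role-and-name lookup over a toy element tree. Everything in it is an assumption made for illustration: the `Node` type, `FindTextbox`, and the two ways of labelling the field are hypothetical and are not Go-DOM's API. The accessible-name rule is reduced to `aria-label` plus `<label for>`, which is far simpler than the real algorithm.

```go
package main

import "fmt"

// Node is a hypothetical, deliberately tiny stand-in for a DOM element.
type Node struct {
	Tag      string
	Attrs    map[string]string
	Text     string
	Children []*Node
}

// walk visits n and all of its descendants in document order.
func walk(n *Node, visit func(*Node)) {
	visit(n)
	for _, c := range n.Children {
		walk(c, visit)
	}
}

// accessibleName is a simplified rule: aria-label wins, otherwise the text
// of the first <label for="..."> that points at the input's id.
func accessibleName(root, input *Node) string {
	if name := input.Attrs["aria-label"]; name != "" {
		return name
	}
	id := input.Attrs["id"]
	if id == "" {
		return ""
	}
	name := ""
	walk(root, func(n *Node) {
		if name == "" && n.Tag == "label" && n.Attrs["for"] == id {
			name = n.Text
		}
	})
	return name
}

// FindTextbox returns the first text-like <input> with the given accessible
// name, or nil if there is none.
func FindTextbox(root *Node, name string) *Node {
	var found *Node
	walk(root, func(n *Node) {
		if found != nil || n.Tag != "input" {
			return
		}
		if t := n.Attrs["type"]; t != "" && t != "text" && t != "email" {
			return
		}
		if accessibleName(root, n) == name {
			found = n
		}
	})
	return found
}

// LabelForm labels the field with a separate <label for="email"> element.
func LabelForm() *Node {
	return &Node{Tag: "form", Children: []*Node{
		{Tag: "label", Attrs: map[string]string{"for": "email"}, Text: "Email"},
		{Tag: "input", Attrs: map[string]string{"id": "email", "type": "email"}},
	}}
}

// AriaForm labels the field with an aria-label attribute instead.
func AriaForm() *Node {
	return &Node{Tag: "form", Children: []*Node{
		{Tag: "input", Attrs: map[string]string{"id": "user-email", "aria-label": "Email"}},
	}}
}

func main() {
	// The same query finds the field in both variants of the markup.
	for _, form := range []*Node{LabelForm(), AriaForm()} {
		fmt.Println(FindTextbox(form, "Email").Attrs["id"])
	}
}
```

The query stays the same for both forms. Only the markup differs, which is the kind of change a test written this way should tolerate.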

Although it's probably technically possible to write the same tests against an out-of-process browser, the benefit of native code is essential for the kind of random access to the DOM you most likely need for these types of helpers.

An example: JavaScript

To exemplify the type of test, I will use a similar example from JavaScript, also from an application using HTMX. The test verifies a general login flow, starting from a request for a page that requires authentication.

It's a bit long, as I've combined all setup and helper code in one test function here.

In simple terms the test does the following:

  1. Stub out the authentication function, simulating a successful response.
  2. Request a page that requires authentication.
  3. Verify that the browser redirects to the login page, and the browser URL is updated.[2]
  4. Fill out the form with the expected values, and submit.
  5. Verify that the browser redirects to the originally requested page, and it shows information for the stubbed user.

Internally, the test starts an HTTP server. Because this runs in the test process, mocking and stubbing of business logic is possible. The test uses jsdom to communicate with the HTTP server; jsdom both parses the HTML response into a DOM and executes client-side script in a sandbox that has been initialised, e.g. with window as the global scope.[3]

This enables writing tests of the HTTP layer for cases where validating the contents of the response alone is not enough; in this case, the test verifies that the response is processed by HTMX as intended.
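The in-process part of this setup can be sketched in Go with nothing but the standard library. This is a rough illustration under stated assumptions, not the test described above: `NewApp`, `Authenticator`, `RunLoginFlow`, the routes, and the cookie are all hypothetical. There is no DOM or script execution here, so it only shows the server running inside the test process with the business logic stubbed, and a client following the redirects.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/cookiejar"
	"net/http/httptest"
	"net/url"
	"strings"
)

// Authenticator is a hypothetical seam for the business logic a test stubs.
type Authenticator func(email, password string) (name string, ok bool)

// NewApp builds a small, made-up login flow around the given authenticator.
func NewApp(auth Authenticator) http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/private", func(w http.ResponseWriter, r *http.Request) {
		c, err := r.Cookie("user")
		if err != nil {
			// Not logged in: send the client to the login page.
			http.Redirect(w, r, "/login?next=/private", http.StatusSeeOther)
			return
		}
		fmt.Fprintf(w, "<h1>Welcome, %s</h1>", c.Value)
	})
	mux.HandleFunc("/login", func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			fmt.Fprint(w, `<form method="post"><input name="email"></form>`)
			return
		}
		name, ok := auth(r.FormValue("email"), r.FormValue("password"))
		if !ok {
			http.Error(w, "invalid credentials", http.StatusUnauthorized)
			return
		}
		http.SetCookie(w, &http.Cookie{Name: "user", Value: name, Path: "/"})
		// A real application must validate "next" before redirecting to it.
		http.Redirect(w, r, r.URL.Query().Get("next"), http.StatusSeeOther)
	})
	return mux
}

// FlowResult records what the client observed during the login flow.
type FlowResult struct {
	LoginURL  string // where the first request ended up
	FinalPath string // where the client ended up after logging in
	ShowsUser bool   // whether the final page names the stubbed user
}

// RunLoginFlow walks through the five steps listed above against an
// in-process server.
func RunLoginFlow() (FlowResult, error) {
	var res FlowResult
	// 1. Stub the authentication function with a successful response.
	stub := func(email, password string) (string, bool) { return "Jane", true }
	srv := httptest.NewServer(NewApp(stub))
	defer srv.Close()

	jar, err := cookiejar.New(nil)
	if err != nil {
		return res, err
	}
	client := &http.Client{Jar: jar}

	// 2 and 3. Request the protected page; the client follows the redirect.
	resp, err := client.Get(srv.URL + "/private")
	if err != nil {
		return res, err
	}
	resp.Body.Close()
	res.LoginURL = resp.Request.URL.RequestURI()

	// 4. Submit the login form to the page we were redirected to.
	form := url.Values{"email": {"me@example.com"}, "password": {"secret"}}
	resp, err = client.PostForm(resp.Request.URL.String(), form)
	if err != nil {
		return res, err
	}
	defer resp.Body.Close()

	// 5. We should be back on the original page, showing the stubbed user.
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return res, err
	}
	res.FinalPath = resp.Request.URL.Path
	res.ShowsUser = strings.Contains(string(body), "Welcome, Jane")
	return res, nil
}

func main() {
	res, err := RunLoginFlow()
	fmt.Println(res, err)
}
```

Because the handler and the stub live in the same process as the client, swapping the authenticator is an ordinary function argument rather than a separate deployment. What this sketch cannot do is check that client-side script handled the response, which is the gap an in-process DOM is meant to fill.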

But apart from waiting for some HTMX events, so as not to proceed too early (or too late), the test doesn't actually care about HTMX. In fact, if I remove HTMX from the form, resorting to classical redirects, the test still passes.

(If I remove the HTMX