Action Web Components Which Span the Server-Client Divide

HTML-first web development is all the rage with a growing cadre of application authors. Here's an exciting path you can take to a dependency-free, buildless-compatible architecture.

By Jared White

(Note: this content was originally intended for a new course in JavaScript programming, but with my course development on hold for the time being, it’s being released to the public for free. Enjoy!)

I want to preface this article with a little reminder of something which often gets overlooked in discussions like these:

There’s only one web API for executing JavaScript code when a particular tag appears on a page, and that’s the Custom Elements API. Yes, we’re talking about web components—but forget everything you know about “components” for this discussion because we’re going to dive into a very different pattern than you might be accustomed to.

First, a Little Backstory #

In the early 2000s, when I was but a wee lad learning how to build web applications as a greenhorn PHP developer, I happened across a fantastic little library called Xajax (snapshot from 2005).

Xajax brought the still-new concept of Ajax programming to PHP & JavaScript in an exciting way. You could write simple PHP functions on the backend, and then easily call those functions from your frontend. The response of a function could return various commands, things like “append this HTML to that element” or “delete those elements”. You could even write your own custom commands.

Meme of Leonardo DiCaprio pointing at a TV

Now before you start gesturing wildly and shouting how that’s just like [insert modern HTML-first library here], trust me—we’re getting there! 😄

In short order, I ended up taking over maintenance of Xajax, my very first experience with open source collaboration, and used it for a few years (even integrating it into my short-lived PHP fullstack web framework!) until I eventually gave in to public peer pressure and learned Ruby on Rails. (I kid, I kid…but you have to realize in the late 2000s, RoR was the new hotness and PHP seemed old and busted.)

Fast forward to 2019, when I heard about a fancy new library for Rails called StimulusReflex. It extended the functionality of the Stimulus library provided by Rails and utilized asynchronous web sockets via an interrelated dependency called CableReady to enable dynamic UI updates pushed from the backend to the frontend. CableReady lets your backend “functions” return various operations, things like “append this HTML to that element” or “delete those elements”. You can even write your own custom operations.

Yes, you can now point at the screen just like Leonardo. 😆

Meme of Leonardo DiCaprio pointing at a TV

I got heavily involved in the StimulusReflex community on Discord, utilizing the library in new Rails application development throughout 2020 and 2021, and even contributed some functionality for what seemed to be an exciting future update to CableReady called CableCar. This would let you return operations from any standard HTTP request/response, not just async functions running via web socket connections. My journey back to the “Xajax” paradigm was nearly complete…

…until Rails’ “new magic” was unveiled in the form of Turbo as part of the launch of HEY, Basecamp’s new email app.

Turbo at heart was an evolution (aka Turbo Drive) of the previous TurboLinks library which itself was a spiritual successor to PJAX (a worthy pattern all in itself, but one we don’t have time to delve into here). In addition to Turbo Drive and Turbo Frames, it came with a paradigm called Turbo Streams. Turbo Streams enables your backend responses or async background jobs to return various actions, things like “append this HTML to that element” or “delete those elements”. You can even write your own custom actions—not a sanctioned concept at the start but one which was eventually merged into the library due to high demand.

There we go again! 🤣

Meme of Leonardo DiCaprio pointing at a TV

Xajax Commands. CableReady Operations. Turbo Actions. Turns out these are all variations on a theme: letting your backend’s HTML rendering pipeline update your frontend dynamically, rather than needing your frontend to take on the responsibility of updating its own UI (which requires an order of magnitude more frontend complexity and introduces performance headaches due to requiring large bundles of JS code to execute in order to generate DOM and run business logic).

I’ll also note that Alpine.js’s Directives and htmx’s Attributes—while different in the sense they both offer an API that is “pull-based” rather than “push-based” (* wut)—offer a similar overall developer experience by putting HTTP and HTML at the center of the UI programming model.

Actions…But Vanilla? #

So we’ve presented these various libraries which can translate various backend “actions” (we’ll go with that generic term from here on) to client-side interactivity. But in keeping with the spirit of The Spicy Web, it’s time to muse on the following question:

How might we write our own actions using nothing but native JavaScript APIs?

We have a handful of initial feature requests to consider:

These actions should be HTML-centric and easy to generate using any backend framework or toolchain.
These actions should be easily processed as part of any request/response cycle or async mechanisms (Server-Sent Events or Web Sockets) without special casing action-specific requirements.
These actions should be easy to write and adapt for various application needs.

The first point has me leaning towards wanting to use HTML as the transport format rather than JSON. JSON necessitates jumping through a lot more hoops on the frontend to do anything useful with the incoming data, whereas HTML can be read, parsed, and injected into a web page with virtually no client-side smarts required.

The second point means we shouldn’t assume anything about the “context” in which actions are delivered to the client. In other words, we shouldn’t combine action execution with any particular function or pattern that was specifically tailored to transport processing.

And the third point means action “handlers” should be standalone, via some sort of fully modular and atomic API which isn’t tied to any singular application shape or build dependency.

So…where does that leave us? If HTML is our transport mechanism and we need actions to be universally applicable, it sounds like we’re saying we should define actions using HTML tags.

Which brings us back around to what I said at the top of the article:

There’s only one web API for executing JavaScript code when a particular tag appears on a page, and that’s the Custom Elements API.

Anything else is much more fiddly. We could try to use a MutationObserver, as do Stimulus and many other libraries looking for special attributes contained within HTML, but that’s a much harder prospect. We could require some kind of JavaScript code to call a function explicitly in order to process a chunk of incoming HTML, but that would require a tighter coupling between action processing and whatever code is pulling down HTML in the first place.

Here’s the great thing about web component technology: we know it will work anywhere, anytime. Literally the moment a custom HTML tag shows up in some DOM somewhere, our client-side connectedCallback function will execute. This turns out to be the perfect mechanism for what I am calling Action Web Components.

Similar to how an HTML Web Component in popular parlance is a custom element which wraps standard HTML and applies interactivity to those child elements when rendered on a page, an Action Web Component performs a one-time operation when it’s rendered on the page and then (not always but typically) removes itself from the DOM upon completion. These actions don’t themselves have any visual appearance—they aren’t content, but commands.

It’s telling that Turbo Stream Actions are built using the custom elements API. In that case, there’s one tag—<turbo-stream>—which is responsible for reading in attributes and using that data to determine which action to execute. We’re however going to go a step further in our quest to stay close-to-the-metal: we’ll stick with a simple 1:1 action type == tag name nomenclature, using an ac prefix.

Let’s explore our first example.

Example Action: Class Toggle #

Let’s say in response to a particular event on the frontend (say, a button click), you want to add a class to an element on the page with a certain ID.

Here’s how that would look in HTML, the idea being that this could be directly rendered out in a response via a fragment (or it could live embedded in a larger template partial):

<ac-toggle-class target="checkmark-12345" classname="success"></ac-toggle-class>

and this is how we would define the action in JavaScript as a web component:

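customElements.define("ac-toggle-class", class extends ActionWebComponent {
  connectedCallback() {
    // force stays undefined (a plain toggle) unless a force attribute is present
    let force = this.getAttribute("force") || undefined
    if (force) {
      // "true" forces the class on; any other value forces it off
      force = force === "true"
    }

    this.target.classList.toggle(this.getAttribute("classname"), force)
    // the one-time operation is done, so leave no trace in the DOM
    this.remove()
  }
})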

Our ActionWebComponent base class is extremely minimal, merely providing a few convenience methods such as using this.target to refer to the DOM element with the ID matching the tag’s target attribute. As you can see here, we get the CSS class via the classname attribute and then either toggle it by default or force it on or off via the presence of force="true" or force="false" attributes. The custom element removes itself as soon as it runs, leaving behind no trace in the DOM upon completion.
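The base class itself isn't shown here, so as a rough idea of how little it needs, here's a minimal sketch. It assumes `this.target` is nothing more than an ID lookup; the class in the demo repo may well differ. The `class {}` fallback is purely an assumption of this sketch so the file can also load outside a browser.

```javascript
// Sketch only: the real base class in the demo repo may look different.
// The `class {}` fallback is an assumption so this also loads outside a browser.
class ActionWebComponent extends (globalThis.HTMLElement ?? class {}) {
  // The element whose ID matches this tag's `target` attribute, or null.
  get target() {
    return this.ownerDocument.getElementById(this.getAttribute("target"))
  }
}
```

Because `target` is a getter, every action subclass can simply write `this.target` inside `connectedCallback` without any setup code.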

And…that’s it! It seems almost too simple to be true. Literally any place you might write <ac-toggle-class> in your codebase, your HTML now has the power to affect your UI.
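To make that concrete, here's one way the client side of a request might look. This is a sketch under assumptions: `injectFragment` and `fetchActions` are hypothetical helper names, and the `fetchFn` option exists only to make the function easy to test.

```javascript
// Hypothetical helpers; nothing here comes from the demo repo.

// Appending the markup is all it takes: any action tags inside it run their
// connectedCallback once they land in the document.
function injectFragment(html, container = document.body) {
  container.insertAdjacentHTML("beforeend", html)
}

// Fetch an HTML fragment from the server and hand it to injectFragment.
// `fetchFn` is injectable purely so the function can be tested without a network.
async function fetchActions(url, { container, fetchFn = fetch } = {}) {
  const response = await fetchFn(url, { headers: { Accept: "text/html" } })
  if (!response.ok) throw new Error(`Request failed with status ${response.status}`)
  injectFragment(await response.text(), container)
}
```

A click handler could then call something like `fetchActions("/todos/12345/complete")` (a made-up URL) and never need to know which actions the server decided to send back.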

There are numerous other such actions we could imagine…actions such as updating some HTML element with new children, scrolling to an element, focusing on an element, redirecting to another page after a short delay, or displaying a message in an alert.

Thankfully, you don’t need to imagine. I’ve already done the hard work for you! 🤯

Astro + Alpine.js + Action Web Components = 😍 #

I’ve put together a demo repo of a web application using the Astro framework. It uses Alpine.js as an easy way to handle JavaScript events like clicks in order to fetch HTML fragments, and Pico for good default styling. From there, we sprinkle in our Action Web Components to give Astro server → client superpowers. Advanced UI interactivity in mere lines of code? You betcha.

Demo Repo

Here are the actions I’ve written for this demo. You can find all of these in this JS file in the repo. And again, in case it’s not obvious—there’s absolutely nothing specific to Astro or Alpine.js about how Action Web Components work. I’m just using those tools for some helpful smarts to build the demo site, but you could just as easily go with a hand-rolled zero dependency static .html file and PHP backend. Party like it’s 1999. 🕺🏽

`ac-yellow-fade` aka the “Yellow Fade Technique” #

This one’s an oldie but goodie. When adding a new item on a page in a list or a grid or something to that effect, you temporarily highlight the new item. A yellow fade is still cool IMO, but there are many other design techniques you could use to draw the user’s attention to the new item. Perhaps a more generic take on this could be named ac-highlight.
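A yellow fade could be sketched with the Web Animations API along these lines. The color, duration, and class name are assumptions for illustration, not what the demo repo uses.

```javascript
// Sketch only: color, timing, and class name are assumptions.
class YellowFadeAction extends (globalThis.HTMLElement ?? class {}) {
  connectedCallback() {
    const target = this.ownerDocument.getElementById(this.getAttribute("target"))
    // Start at pale yellow and fade the background out to transparent.
    target?.animate(
      [{ backgroundColor: "#fff9c4" }, { backgroundColor: "transparent" }],
      { duration: 1500, easing: "ease-out" }
    )
    this.remove()
  }
}

// Register only in a browser, and only if the tag name is still free.
if (globalThis.customElements && !customElements.get("ac-yellow-fade")) {
  customElements.define("ac-yellow-fade", YellowFadeAction)
}
```

The server would then render `<ac-yellow-fade target="item-42"></ac-yellow-fade>` (a made-up ID) right after the markup for the new item.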

`ac-toggle-class` aka Switch a class on and off #

Use this one to toggle a CSS class on a particular element.

`ac-portal` aka Render content at the bottom of `<body>` #

This may be less useful now in the age of <dialog> which doesn’t require portal techniques, but there are still times when you may want to inject some content at the very end of <body>.
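A portal action can be tiny. The sketch below assumes the content to move is simply whatever sits inside the action tag; that markup shape is a guess, not necessarily the demo's.

```javascript
// Sketch only: assumes the portal's content is the action tag's own children.
class PortalAction extends (globalThis.HTMLElement ?? class {}) {
  connectedCallback() {
    // Snapshot the children first, since moving nodes mutates the live list.
    const nodes = Array.from(this.childNodes)
    this.ownerDocument.body.append(...nodes)
    this.remove()
  }
}

if (globalThis.customElements && !customElements.get("ac-portal")) {
  customElements.define("ac-portal", PortalAction)
}
```

One caveat: this relies on the children already existing when `connectedCallback` runs. That holds when a complete fragment is injected into the page, but it may not when the tag is streamed in as part of the initial document parse.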

`ac-scroll-to` aka Scroll the viewport to an element #

Maybe you have an Edit button which triggers an inline editing form…but what if the user can’t see nearly any part of the form because it’s below the viewport? You’ll likely want to scroll down to the form so it’s mostly or completely visible onscreen. This will do just that!
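Under the hood this can lean on the browser's own `scrollIntoView`. In this sketch the `block` attribute is a hypothetical option of my own invention, passed straight through to the browser.

```javascript
// Sketch only: the `block` attribute is a hypothetical option.
class ScrollToAction extends (globalThis.HTMLElement ?? class {}) {
  connectedCallback() {
    const target = this.ownerDocument.getElementById(this.getAttribute("target"))
    target?.scrollIntoView({
      behavior: "smooth",
      // "start", "center", "end", or "nearest"; defaults to "center" here.
      block: this.getAttribute("block") || "center",
    })
    this.remove()
  }
}

if (globalThis.customElements && !customElements.get("ac-scroll-to")) {
  customElements.define("ac-scroll-to", ScrollToAction)
}
```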

`ac-focus` aka Focus on a particular form field #

Perhaps in tandem with the scroll action, this lets you focus on an interactive element upon the user performing some task. BTW, in real-world usage I’ve also found adding a timeout of, say, 200ms is best if you’re also scrolling.
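Here's a sketch of how that timeout might be wired in. The `delay` attribute (in milliseconds) is an assumption of this example, not a documented part of the demo.

```javascript
// Sketch only: the `delay` attribute is a hypothetical option.
class FocusAction extends (globalThis.HTMLElement ?? class {}) {
  connectedCallback() {
    const target = this.ownerDocument.getElementById(this.getAttribute("target"))
    // Missing or non-numeric delay means "focus on the next timer tick".
    const delay = Number(this.getAttribute("delay")) || 0
    setTimeout(() => target?.focus(), delay)
    this.remove()
  }
}

if (globalThis.customElements && !customElements.get("ac-focus")) {
  customElements.define("ac-focus", FocusAction)
}
```

Paired with a scroll action, the markup might read `<ac-focus target="title-field" delay="200"></ac-focus>`, where the ID is made up.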

`ac-children` aka Replace a part of the DOM with a new HTML fragment #

This is a pretty common action, and something developers need to do often when programming using HTML-first techniques. If you’ve already used libraries like Turbo or htmx, this will feel entirely familiar!
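One plausible shape for this action, sketched under the assumption that the new content arrives wrapped in a `<template>` inside the action tag (the demo may deliver it differently):

```javascript
// Sketch only: assumes markup like
//   <ac-children target="todo-list"><template>...new HTML...</template></ac-children>
class ChildrenAction extends (globalThis.HTMLElement ?? class {}) {
  connectedCallback() {
    const target = this.ownerDocument.getElementById(this.getAttribute("target"))
    const template = this.querySelector("template")
    if (target && template) {
      // Swap out everything inside the target for a copy of the template's content.
      target.replaceChildren(template.content.cloneNode(true))
    }
    this.remove()
  }
}

if (globalThis.customElements && !customElements.get("ac-children")) {
  customElements.define("ac-children", ChildrenAction)
}
```

The `<template>` wrapper keeps the new markup inert until it's placed, so images don't load and nested actions don't fire twice. As with any action that reads its own children, this works when a complete fragment is injected; during the initial page parse the `<template>` may not exist yet when `connectedCallback` runs.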

`ac-redirect-to` aka Redirect to another URL after a short delay #

You’d be surprised how handy this action can be after various form submissions. I tend to use this technique a lot. (An immediate redirect doesn’t make much sense since you can trigger that server-side with an HTTP 302 status, but using a delay means you can notify the user of something, then after a few seconds take them to their next destination.)
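A delayed redirect might be sketched like this. The `href` and `delay` attribute names and the three-second default are assumptions for illustration.

```javascript
// Sketch only: attribute names and the default delay are assumptions.
class RedirectToAction extends (globalThis.HTMLElement ?? class {}) {
  connectedCallback() {
    const url = this.getAttribute("href")
    // Delay in milliseconds; falls back to 3 seconds when missing, zero, or invalid.
    const delay = Number(this.getAttribute("delay")) || 3000
    setTimeout(() => this.navigate(url), delay)
    this.remove()
  }

  // Split out so it can be overridden (or stubbed in a test).
  navigate(url) {
    location.assign(url)
  }
}

if (globalThis.customElements && !customElements.get("ac-redirect-to")) {
  customElements.define("ac-redirect-to", RedirectToAction)
}
```

The timer keeps running after the tag removes itself, so the action still leaves no trace in the DOM while the user reads whatever message came along with it.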

`ac-message-dialog` aka An attempt at a better `alert()` #

There are times when you need to display critical information to the user. I thought it would be helpful to design a simple dialog (using <dialog> of course!) and make that an action, but it’d probably also be helpful to write a custom JavaScript async function to do this so client-side code could also trigger such a dialog.
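Here's a sketch of how the async function and the action could share one implementation. `showMessage` and the `message` attribute are hypothetical names, and the real demo dialog is surely better looking than this.

```javascript
// Hypothetical helper: shows a modal <dialog> and resolves once it is closed.
function showMessage(text, doc = document) {
  return new Promise((resolve) => {
    const dialog = doc.createElement("dialog")
    const message = doc.createElement("p")
    message.textContent = text

    // A form with method="dialog" closes the dialog on submit without extra JS.
    const form = doc.createElement("form")
    form.setAttribute("method", "dialog")
    const button = doc.createElement("button")
    button.textContent = "OK"
    form.append(button)

    dialog.append(message, form)
    dialog.addEventListener("close", () => {
      dialog.remove()
      resolve()
    })
    doc.body.append(dialog)
    dialog.showModal()
  })
}

// The action is then a thin wrapper around the same function.
class MessageDialogAction extends (globalThis.HTMLElement ?? class {}) {
  connectedCallback() {
    showMessage(this.getAttribute("message"), this.ownerDocument)
    this.remove()
  }
}

if (globalThis.customElements && !customElements.get("ac-message-dialog")) {
  customElements.define("ac-message-dialog", MessageDialogAction)
}
```

Client-side code can `await showMessage("Saved!")` directly, while the server gets the same dialog by rendering `<ac-message-dialog message="Saved!"></ac-message-dialog>`.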

Guess what? There were a couple of additional actions I thought about writing for this demo, but didn’t…an exercise left for the reader! They were:

Toast Notification: showing a notification after a form save is a very common pattern…but notification components as well as the toast stack can also get very complicated. My recommendation: just use ~~Shoelace~~ Web Awesome!

Attribute Swaps: a pattern I increasingly like is an action which will swap out one or more attributes of an element, typically a web component. This is a pretty powerful combo, as you can write a web component which manages internal state and re-renders DOM accordingly, and with an attribute action you can just pass some new state along to the web component and let it handle the re-render. With an action like this, as well as the children action mentioned above, I almost never feel the need to reach for full-blown DOM morphing! (Which is a technique I happen to think borders on anti-pattern…)
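If you'd like a head start on the attribute idea, here's one possible sketch. The tag name `ac-attributes` and the convention of copying every attribute except `target` are my own assumptions, not something from the demo.

```javascript
// Sketch only: tag name and copy-everything-but-target convention are assumptions.
class AttributesAction extends (globalThis.HTMLElement ?? class {}) {
  connectedCallback() {
    const target = this.ownerDocument.getElementById(this.getAttribute("target"))
    if (target) {
      // Copy each attribute on the action tag over to the target element.
      for (const { name, value } of Array.from(this.attributes)) {
        if (name !== "target") target.setAttribute(name, value)
      }
    }
    this.remove()
  }
}

if (globalThis.customElements && !customElements.get("ac-attributes")) {
  customElements.define("ac-attributes", AttributesAction)
}
```

A web component that lists `count` in its `observedAttributes` would see `<ac-attributes target="cart" count="3"></ac-attributes>` (made-up names) as an ordinary `attributeChangedCallback` and re-render itself.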

Can you think of some additional actions you might like to reach for? (Maybe an ac-delete for removing a DOM node?)

A Word on Progressive Enhancement #

Inevitably when discussions arise of updating UI in a manner which requires JavaScript—even if HTML remains at the center of the approach—you’ll hear the common refrain: but what about progressive enhancement? And for good reason: our industry has fallen down hard when it comes to the best practice that you should be writing web applications in a multi-layered fashion: first HTML, then CSS, then JavaScript.

Unfortunately, some folks tend to overcorrect on this issue, and present progressive enhancement as first HTML, then CSS, then…well, you probably shouldn’t use JavaScript for anything but the slightest of polishing techniques which are totally optional. 🧐

I’m here to make a bold statement: progressive enhancement does not mean a complete JavaScript-optional experience. And before you start throwing tomatoes at me, I’ll simply crib from the MDN documentation on this topic:

Progressive enhancement is a design philosophy that provides a baseline of essential content and functionality to as many users as possible, while delivering the best possible experience only to users of the most modern browsers that can run all the required code.

The word progressive in progressive enhancement means creating a design that achieves a simpler-but-still-usable experience for users of older browsers and devices with limited capabilities, while at the same time being a design that progresses the user experience up to a more-compelling, fully-featured experience for users of newer browsers and devices with richer capabilities.

The most important phrase in this definition: “baseline of essential content and functionality”. And that’s up to you as the application developer…what is the baseline of essential content and functionality you want to make sure is available to the widest possible set of users?

In practical terms, it might mean that some key features of your application work fine without JavaScript, whereas other less-vital features do require JavaScript. Or perhaps you offer a simpler HTML form UX to baseline users and a much-improved interactive HTML+JS form UX to high-end users. See also: Graceful Degradation.

The reason this is all so important to understand is that as soon as you start introducing techniques like fetch, server-sent “actions”, etc., you immediately leave behind the world of no-JS, HTML-only interactions. Which isn’t a bad thing as we’ve just covered, but you should understand the tradeoffs. Action web components, libraries like Alpine or htmx or Turbo, etc., are great for building on top of considerate baseline experiences, but they shouldn’t comprise your only experiences or you will inevitably run into the sorts of issues progressive enhancement is intended to address. Construct your application architecture accordingly.

Conclusion: HTML-First Programming Works #

The reason so many people have fallen in love (again?) with HTML-first programming techniques in web development is because the levels of complexity, performance issues, accessibility concerns, and fragile build tooling required to build with JavaScript-first techniques (aka the React apps of the world) have proven to be wildly against “the grain of the web” for all but the most specific of application types. It may still be true that “nobody ever got fired for using React” (a riff off the old canard about IBM), but at the same time, “you probably don’t need React” is increasingly true for most web projects. (Just ask Zach…he knows!)

Which opens the door to all sorts of other interesting techniques and tools, some of which harken back to years past when the web was still understood to be founded on server-first, HTML-first principles. Today, we have the chance to embrace the best that the web platform has to offer all while continuing to evolve the types of architectures best suited to a dizzying array of application types.

I personally think Action Web Components offers an intriguing paradigm which is compatible with a near-infinite number of client/server topologies. It’s totally framework-independent, and save for a wee bit of client-side JavaScript, it’s totally language-independent as well. Which suits me, as someone who still primarily writes server applications in Ruby, just fine. Give it a try, and let me know what you think in The Spicy Web Discord!
