A powerful markup language parser with a jQuery-like fluent API, built from scratch in TypeScript for the Bun runtime.
STML (Structured Text Markup Language) is an XML-like language with enhanced features for modern development.
- Full STML Support: Handles elements, attributes, text, comments, processing instructions, and DOCTYPE declarations
- Capture Operators: Forward (`<tag (>`), backward (`<) tag>`), and sandwich (`<= tag =>`) capture for flexible document structuring
- Raw Content: Tags with a `!` suffix (`<tag!>`) preserve literal content without parsing
- Fragments: Reusable content blocks with `<#id>` syntax and transclusion via `<<#id>>`
- jQuery-like Fluent API: Intuitive method chaining for traversal and manipulation
- Namespace Support: Parses and preserves XML namespaces with prefixes
- Tree Structure: Builds a complete DOM-like tree structure
- CSS Selectors: Full CSS selector support including `.class` syntax via the `labels` attribute
- Error Handling: Provides detailed error messages with line and column information
- Utilities: Includes helper functions for traversing and manipulating the parsed tree
- Serialization: Serializes a parsed document back to a string, with STML or XML output
- Zero Dependencies: Built entirely from scratch with no external dependencies
bun install

import { S } from './src'; // jQuery-like API
import { parseStml, getElementsByTagName, getTextContent } from './src'; // Traditional API
// Basic STML
const stml = `
<book id="1" labels="fiction classic">
<title>Example Book</title>
<author>John Doe</author>
</book>
`;
const doc = S(stml);
console.log(doc.find('title').text()); // "Example Book"
console.log(doc.find('.classic').attr('id')); // "1"
// Capture Operators
const capture = `
<div (>This text is captured by div
<ul ((>
<li>Item 1</li>
<li>Item 2</li>
Text before<) wrapper>
<span>Before sandwich</span><= sandwich =><em>After sandwich</em>
`;
// Raw Content
const raw = `
<script!>
const x = 5; // No escaping needed
if (x < 10) { console.log("works!"); }
</script>
`;
// Traditional API
const document = parseStml(stml);
const title = getElementsByTagName(document, 'title')[0];
console.log(getTextContent(title)); // "Example Book"

const doc = S(stmlString);
// Traversal
doc.find('book') // Find all book elements
.filter('.classic') // Filter to classic books
.children('title') // Get title children
.text(); // Get text content
// Manipulation
doc.find('#myId')
.addClass('highlight')
.attr('data-modified', 'true')
.append('<note>New content</note>');

- `parseStml(stml: string)`: Parse an STML string into a document tree
- `S(stml: string | node)`: Create a fluent wrapper (alias for `stml()`)
- `getElementsByTagName(node, tagName)`: Find all elements with the given tag name
- `getElementById(node, id)`: Find an element by its ID attribute
- `getTextContent(node)`: Extract text content from a node
- `toStmlString(node, indent?)`: Serialize a node back to an STML string
- `validateStml(stml)`: Validate an STML string and return the validation result
The parser provides TypeScript types for all node types:
- `DocumentNode`: Root document node
- `ElementNode`: STML element with tag name and attributes
- `TextNode`: Text content
- `CommentNode`: XML comments
- `ProcessingInstructionNode`: Processing instructions like `<?xml ...?>`
- `DocTypeNode`: DOCTYPE declarations
- `AnyStmlNode`: Union type of all node types
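
To make the node types more concrete, here is a minimal sketch of how a tree like this can be modelled as a discriminated union and walked to collect text. The field names (`type`, `value`, `tagName`, `attributes`, `children`) and the helpers `ContentNode`, `AnyNode`, and `collectText` are assumptions made for illustration; the definitions exported from `./src` may differ.

```typescript
// Hypothetical node shapes: field names are assumed, not taken from the parser's source.
interface TextNode { type: 'text'; value: string }
interface CommentNode { type: 'comment'; value: string }
interface ElementNode {
  type: 'element';
  tagName: string;
  attributes: Record<string, string>;
  children: ContentNode[];
}
interface DocumentNode { type: 'document'; children: ContentNode[] }

type ContentNode = ElementNode | TextNode | CommentNode;
type AnyNode = DocumentNode | ContentNode;

// Recursively concatenate the text of a node, skipping comments.
function collectText(node: AnyNode): string {
  switch (node.type) {
    case 'text':
      return node.value;
    case 'comment':
      return '';
    case 'element':
    case 'document':
      return node.children.map((child) => collectText(child)).join('');
  }
}

const sampleDoc: DocumentNode = {
  type: 'document',
  children: [
    {
      type: 'element',
      tagName: 'book',
      attributes: { id: '1' },
      children: [
        { type: 'comment', value: ' a note ' },
        {
          type: 'element',
          tagName: 'title',
          attributes: {},
          children: [{ type: 'text', value: 'Example Book' }],
        },
      ],
    },
  ],
};

console.log(collectText(sampleDoc)); // "Example Book"
```

Switching on a shared `type` tag lets the compiler narrow each branch, so code that touches `children` only compiles for nodes that actually have them.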
The STML parser includes a powerful CLI tool for working with STML files:
# Format/pretty-print an STML file
bun run cli/stml.ts format input.stml
# Convert STML to XML
bun run cli/stml.ts convert input.stml -o output.xml
# Parse and show structure
bun run cli/stml.ts parse input.stml --json
# Query elements using CSS selectors
bun run cli/stml.ts query input.stml "div.class"
# Validate syntax
bun run cli/stml.ts validate input.stml
# Resolve transclusions
bun run cli/stml.ts resolve input.stml -o expanded.stml

Run the test suite:
bun test

Run the examples:
bun run example.ts # Traditional API
bun run fluent-example.ts # Fluent jQuery-like API

STML extends XML with powerful features while maintaining compatibility:
- Capture Operators: Restructure document trees with `<tag (>`, `<) tag>`, and `<= tag =>` syntax
- Raw Content: Use `<tag!>` for literal content without escaping
- Fragments: Define reusable content blocks with `<#id>` and transclude with `<<#id>>`
- `labels` attribute: Works like HTML's `class` attribute for CSS selectors
- Fluent API: jQuery-like interface for intuitive manipulation
- Enhanced selectors: Full CSS selector support optimized for tree structures
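
As a rough illustration of how a `.class` selector can be resolved against the `labels` attribute, the sketch below matches simple selectors of the form `tag`, `.label`, or `tag.label1.label2`. The `LabeledElement` shape and the `labelsOf` and `matchesSimple` helpers are hypothetical and written only for this example; the selector engine in `./src` is not shown here and likely handles far more.

```typescript
// Hypothetical element shape for illustration only.
interface LabeledElement {
  tagName: string;
  attributes: Record<string, string>;
}

// Split a whitespace-separated `labels` value into individual labels.
function labelsOf(el: LabeledElement): string[] {
  return (el.attributes['labels'] ?? '').split(/\s+/).filter(Boolean);
}

// Match a simple selector: an optional tag name followed by zero or more `.label` parts.
function matchesSimple(el: LabeledElement, selector: string): boolean {
  const [tag, ...wanted] = selector.split('.');
  if (tag && tag !== el.tagName) return false;
  const labels = labelsOf(el);
  return wanted.every((label) => labels.includes(label));
}

const classicBook: LabeledElement = {
  tagName: 'book',
  attributes: { id: '1', labels: 'fiction classic' },
};

console.log(matchesSimple(classicBook, '.classic'));     // true
console.log(matchesSimple(classicBook, 'book.fiction')); // true
console.log(matchesSimple(classicBook, 'book.poetry'));  // false
```

Treating `labels` as a whitespace-separated token list mirrors how HTML's `class` attribute is interpreted, which is why `.classic` matches an element whose `labels` value is `fiction classic`.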
Detailed documentation for each STML feature:
- Capture Operators - Forward, backward, and sandwich capture syntax
- Raw Content - Literal content without escaping
- Fragments & Transclusion - Reusable content blocks
- Namespaces - XML namespace support
- Fluent API - jQuery-like manipulation interface
The parser consists of four main components:
- Tokenizer (`src/tokenizer.ts`): Lexical analysis - breaks STML into tokens
- Parser (`src/parser.ts`): Syntactic analysis - builds tree structure from tokens
- Utils (`src/utils.ts`): Helper functions for working with the parsed tree
- Fluent API (`src/stml.ts`): jQuery-like wrapper for intuitive manipulation
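
The split between lexical and syntactic analysis can be sketched with a toy two-stage pipeline. This is not the code in `src/tokenizer.ts` or `src/parser.ts`: it assumes a tiny subset of the syntax (plain open tags, close tags, and text, with no attributes, capture operators, or raw content), and the `Token`, `TreeNode`, `tokenize`, and `parse` names are invented for the example.

```typescript
// Toy token and tree types (hypothetical; not the shapes used in src/).
type Token =
  | { kind: 'open'; name: string }
  | { kind: 'close'; name: string }
  | { kind: 'text'; value: string };

interface TreeNode {
  name: string;
  children: (TreeNode | string)[];
}

// Stage 1: lexical analysis. Splits the input into open tags, close tags, and text.
function tokenize(input: string): Token[] {
  const tokens: Token[] = [];
  const pattern = /<(\/?)([A-Za-z][\w-]*)>|([^<]+)/g;
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(input)) !== null) {
    const [, slash, name, text] = m;
    if (text !== undefined) {
      // Drop whitespace-only text between tags.
      if (text.trim()) tokens.push({ kind: 'text', value: text.trim() });
    } else if (slash) {
      tokens.push({ kind: 'close', name });
    } else {
      tokens.push({ kind: 'open', name });
    }
  }
  return tokens;
}

// Stage 2: syntactic analysis. Uses a stack of open elements to nest tokens into a tree.
function parse(tokens: Token[]): TreeNode {
  const root: TreeNode = { name: '#root', children: [] };
  const stack: TreeNode[] = [root];
  for (const token of tokens) {
    const top = stack[stack.length - 1]!;
    if (token.kind === 'text') {
      top.children.push(token.value);
    } else if (token.kind === 'open') {
      const node: TreeNode = { name: token.name, children: [] };
      top.children.push(node);
      stack.push(node);
    } else {
      if (stack.length === 1 || top.name !== token.name) {
        throw new Error(`Unexpected </${token.name}>`);
      }
      stack.pop();
    }
  }
  if (stack.length > 1) throw new Error('Unclosed element at end of input');
  return root;
}

const tokens = tokenize('<book><title>Example Book</title></book>');
console.log(tokens.length); // 5

const tree = parse(tokens);
console.log(JSON.stringify(tree));
// {"name":"#root","children":[{"name":"book","children":[{"name":"title","children":["Example Book"]}]}]}
```

Keeping the tokenizer ignorant of nesting and the parser ignorant of character-level details is what lets each stage stay small; utilities and a fluent wrapper can then operate purely on the resulting tree.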
MIT