Personal fork of `@rocket.chat/message-parser`, which converts chat messages into a structured AST. It has a dual-engine design: a PEG grammar reference implementation and a fast handwritten parser. Written in TypeScript, tree-shakable, 814 passing tests.
Disclaimer: This is a personal fork of the Rocket.Chat message-parser. Not affiliated with Rocket.Chat.
```sh
npm install @rocket.chat/message-parser
```

```ts
import { parse } from '@rocket.chat/message-parser';

const ast = parse('Hello **world**!');
```

```ts
const ast = parse('Hello **world**!', {
  colors: true,
  emoticons: true,
  katex: true,
  customDomains: ['intranet.example.com'],
});
```

```ts
// Use the PEG grammar as the parser engine (default is handwritten)
const ast = parse('Hello **world**!', { engine: 'peggy' });
```

| Param | Type | Default | Description |
|---|---|---|---|
| `message` | `string` | (none) | Raw chat message |
| `options` | `object` | `{}` | Parser options (see below) |

| Option | Type | Default | Description |
|---|---|---|---|
| `colors` | `boolean` | `false` | Parse `color:#RGB` / `color:#RRGGBB` |
| `emoticons` | `boolean` | `false` | Convert `:)` `:D` `<3` etc. to emoji |
| `katex` | `boolean` | `false` | Parse `$...$` / `$$...$$` KaTeX math |
| `customDomains` | `string[]` | `[]` | Extra TLDs for URL auto-linking |
| `engine` | `'peggy' \| 'handwritten'` | `'handwritten'` | Parser engine |

- `isNodeOfType(node, kind)`: type guard for AST nodes
- `Token`, `TokenKind`: token type definitions
- `tokenize(message)`: lexer-only pass (returns token stream)
- `Lexer`, `Parser`, `TokenStream`: internal classes

`parser` (function) and `MarkdownAST` (type) are deprecated aliases for `parse` and `Root`.

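To illustrate how a type guard such as `isNodeOfType(node, kind)` narrows a node while walking a tree, here is a minimal, self-contained sketch. The node shapes (`PLAIN_TEXT`, `BOLD`, `PARAGRAPH`) and the `plainText` helper are simplified stand-ins assumed for illustration; they are not taken from the package's real type definitions, and the guard below is a local re-implementation, not the exported one.

```ts
// Hypothetical, simplified node types (assumed for this sketch only).
type PlainText = { type: 'PLAIN_TEXT'; value: string };
type Bold = { type: 'BOLD'; value: Inline[] };
type Inline = PlainText | Bold;
type Paragraph = { type: 'PARAGRAPH'; value: Inline[] };
type AstNode = Inline | Paragraph;

// Local stand-in for a type guard: narrows `node` to the variant whose
// `type` tag equals `kind`.
function isNodeOfType<K extends AstNode['type']>(
  node: AstNode,
  kind: K,
): node is Extract<AstNode, { type: K }> {
  return node.type === kind;
}

// Hypothetical helper: flattens a node to its text content.
function plainText(node: AstNode): string {
  if (isNodeOfType(node, 'PLAIN_TEXT')) {
    return node.value; // narrowed to PlainText, so `value` is a string
  }
  // Remaining variants (Bold, Paragraph) both hold a list of inline children.
  return node.value.map((child) => plainText(child)).join('');
}
```

The guard keeps the traversal free of casts: once the check passes, the compiler knows which variant it is dealing with.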
| Feature | Syntax | Example |
|---|---|---|
| Quote | `>` prefix | `> hello` |
| Heading | `#` `##` `###` `####` prefix | `# Title` |
| Unordered list | `-` or `*` prefix | `- item` |
| Ordered list | `1.` prefix | `1. item` |
| Task list | `- [ ]` / `- [x]` prefix | `- [x] done` |
| Code fence | `` ``` `` triple backtick | `` ```ts ``` `` |
| Block spoiler | ` | |
| BigEmoji | 1–3 emoji only in message | 😀 🎉 |
| Feature | Syntax | Example |
|---|---|---|
| Bold | `**text**` | `**bold**` |
| Italic | `_text_` | `_italic_` |
| Strikethrough | `~text~` | `~strike~` |
| Inline code | `` `code` `` | `` `const x = 1` `` |
| Inline spoiler | `\|\|text\|\|` | `\|\|secret\|\|` |
| Link | `[text](url)` | `[Rocket.Chat](https://rc.chat)` |
| Image | | |
| Angle link | `<url\|label>` | `<https://rc.chat\|Chat>` |
| Mention (user) | `@username` | `@john` |
| Mention (channel) | `#channel` | `#general` |
| Emoji | `:emoji_name:` | `:rocket:` |
| Emoticon | Text-to-emoji | `:)` `:D` `:P` `<3` |
| KaTeX inline | `$...$` or `\(...\)` | `$E=mc^2$` |
| KaTeX block | `$$...$$` or `\[...\]` | `$$E=mc^2$$` |
| URL auto-link | Bare URL | `https://rocket.chat` |
| Email auto-link | `user@domain` | `user@rocket.chat` |
| Color | `color:#RRGGBB` | `color:#ff0000` |
| Phone | `+1234567890` | `+1234567890` |
| Escaped char | `\*` `\_` `\~` | `\*not italic\*` |
| Line break | Two trailing spaces + `\n` | `line 1··\nline 2` |

Timestamps use the `<t:...>` syntax and accept Unix timestamps, ISO 8601 dates, and time-only values:

- `<t:1234567890>`: Unix timestamp
- `<t:2025-07-22T10:00:00.000Z>`: ISO 8601
- `<t:10:00:t>`: time-only (HH:MM or HH:MM:SS)

Optional format modifier:
| Format | Description | Example |
|---|---|---|
| `t` | Short time | 12:00 AM |
| `T` | Long time | 12:00:00 AM |
| `d` | Short date | 12/31/2020 |
| `D` | Long date | Thursday, December 31, 2020 |
| `f` | Full date and time | Thursday, December 31, 2020 12:00 AM |
| `F` | Full date and time (long) | Thursday, December 31, 2020 12:00:00 AM |
| `R` | Relative time | 1 year ago |

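As a rough illustration of how the value and the optional format modifier can be told apart, the sketch below splits a `<t:...>` token with plain string and regex checks. `parseTimestampToken` and its result shape are hypothetical names chosen for this example; the package's real lexer may classify these tokens differently.

```ts
type TimestampKind = 'unix' | 'iso' | 'time';

// Hypothetical result shape (not the package's AST node).
interface TimestampToken {
  kind: TimestampKind;
  value: string;
  format: string | undefined; // one of t, T, d, D, f, F, R when present
}

const FORMAT_MODIFIERS = new Set(['t', 'T', 'd', 'D', 'f', 'F', 'R']);

function parseTimestampToken(input: string): TimestampToken | null {
  const match = /^<t:(.+)>$/.exec(input);
  if (!match) return null;

  let value = match[1] ?? '';
  let format: string | undefined;

  // A trailing ":<letter>" is treated as the format modifier.
  const lastColon = value.lastIndexOf(':');
  if (lastColon !== -1 && FORMAT_MODIFIERS.has(value.slice(lastColon + 1))) {
    format = value.slice(lastColon + 1);
    value = value.slice(0, lastColon);
  }

  if (/^\d+$/.test(value)) return { kind: 'unix', value, format };
  if (/^\d{2}:\d{2}(:\d{2})?$/.test(value)) return { kind: 'time', value, format };
  if (/^\d{4}-\d{2}-\d{2}T/.test(value) && !Number.isNaN(Date.parse(value))) {
    return { kind: 'iso', value, format };
  }
  return null; // not a recognised timestamp form in this sketch
}
```

Checking the trailing segment against the known modifier letters first is what keeps the colons inside ISO and time-only values from being mistaken for a modifier separator.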
Two parser implementations:
- PEG grammar (`engine: 'peggy'`): the reference implementation, a formal grammar in `grammar.pegjs`. Comprehensive but slower.
- Handwritten (`engine: 'handwritten'`, default): hand-coded recursive descent parser. Faster, designed for real-time rendering.

Both produce identical ASTs (verified by 814 tests across both engines). Use 'peggy' when you need strict spec compliance; use 'handwritten' for performance-sensitive rendering.
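One way to check this kind of cross-engine agreement is a small differential harness that feeds the same inputs to two parse functions and reports where the outputs diverge. The sketch below is generic and self-contained: `findMismatches` and the `ParseFn` type are hypothetical names, and the comparison uses `JSON.stringify`, which assumes both engines emit object keys in the same order.

```ts
// Any function that turns a message into some AST-like value.
type ParseFn = (message: string) => unknown;

// Returns the inputs for which the two engines disagree.
function findMismatches(
  inputs: string[],
  reference: ParseFn,
  candidate: ParseFn,
): string[] {
  return inputs.filter(
    (input) => JSON.stringify(reference(input)) !== JSON.stringify(candidate(input)),
  );
}
```

With the real package, the two functions could be thin wrappers such as `(m) => parse(m, { engine: 'peggy' })` and `(m) => parse(m, { engine: 'handwritten' })`, using the `engine` option documented above.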
The handwritten parser includes fixes and improvements not present in the original PEG grammar:
- CommonMark intraword underscore rule: `_` does not open emphasis when preceded by a word character. `text_hello_` and `some_snake_case_text` remain plain text rather than being parsed as italic.
- Deterministic strike/italic ordering: when `~` and `_` compete at the same nesting level, strike wins predictably. PEG's backtracking can produce inconsistent results depending on input length and token positions.
- Performance: ~45x faster on average (5.6x–265x range depending on message type). The handwritten parser avoids PEG's backtracking overhead, especially on emoji, mentions, and code blocks.

Run `yarn bench:compare` to see the full per-category breakdown.

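The intraword underscore rule described above can be sketched as a small predicate over the character that precedes the delimiter. This is an illustrative approximation, not the code in `src/parser/Parser.ts`: `canOpenEmphasis` and `isWordChar` are hypothetical names, and "word character" is assumed here to mean ASCII letters and digits only.

```ts
// Assumption for this sketch: a word character is an ASCII letter or digit.
function isWordChar(ch: string | undefined): boolean {
  return ch !== undefined && /[A-Za-z0-9]/.test(ch);
}

// Returns true when the underscore at `index` may open emphasis,
// i.e. it is not immediately preceded by a word character.
function canOpenEmphasis(text: string, index: number): boolean {
  if (text[index] !== '_') return false;
  return !isWordChar(text[index - 1]);
}
```

Under this check, the underscores inside `some_snake_case_text` never qualify as openers, while the leading underscore of `_italic_` does.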
```sh
yarn bench          # full suite
yarn bench:parser   # parser throughput
yarn bench:lexer    # lexer throughput
yarn bench:compare  # handwritten vs PEG comparison
yarn bench:size     # bundle size
```

- Grammar bugs: edit `grammar.pegjs`, then regenerate with `yarn build:grammar`
- Parser bugs: edit `src/parser/Parser.ts`
- Lexer bugs: edit files in `src/lexer/`
- Tests go in `tests/`; run with `yarn test`

Before submitting a PR, ensure no regressions against the reference grammar:
```sh
yarn test
```

Known limitations:

- Nested lists are unsupported
- The `URL` rule doesn't allow whitespace, `(`, or `)` in bare URLs