@chronoai/toolkit/output-parser

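A pipeline-style parsing utility for AI output. It composes a series of parsing steps into a pipeline that extracts structured data from raw text step by step. It does not depend on the engine and can be used in any context.

To assemble prompt formats, use @chronoai/toolkit/output-format.

Installation

Included in @chronoai/toolkit; no separate installation is needed.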

import {
  createPipeline,
  trim,
  replace,
  stripFences,
  markdownSection,
  extractCodeBlocks,
  matchRegex,
  keyValues,
  xmlTag,
  looseJson,
  splitLines,
  splitBy,
  custom,
  // Engine integration
  OutputParserFeature,
} from '@chronoai/toolkit/output-parser';

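Why a pipeline?

Real AI output often needs several rounds of processing: first strip the markdown code fence, then extract a specific XML tag, and finally parse the JSON string into an object. A pipeline lets you decouple each round into an independent step function and compose the steps as needed, which keeps the flow clear and controllable.

Core concepts

PipelineContext

All steps in a pipeline share one context object: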

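interface PipelineContext {
  raw: string;                    // original input (never changes)
  text: string;                   // working text (steps may modify it)
  data: Record<string, unknown>;  // extracted data, accumulated across steps
  items?: string[];               // fragments from splitLines()/splitBy(); set only in array mode
}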

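Each step receives the current context and returns a new one. raw always holds the original input unchanged, text is the working text that steps may modify, and data gradually accumulates the extracted results.

Creating a pipeline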

const pipeline = createPipeline([
  stripFences(),
  xmlTag('tool_calls', { required: false }),
  looseJson({ required: false }),
]);

const { data, raw, text } = pipeline.parse(aiRawOutput);

createPipeline(steps) takes an array of steps and returns an object with a parse(raw) method. Calling parse runs the steps in order.
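
To make the data flow concrete, here is a minimal sketch of how such a pipeline could be wired up. It is not the library's source: `createPipelineSketch`, `trimStep`, and `lengthStep` are hypothetical names, and the sketch assumes that every step is a plain function from context to context.

```typescript
// Hypothetical re-implementation for illustration only; not the library's code.
interface PipelineContext {
  raw: string;
  text: string;
  data: Record<string, unknown>;
}

type PipelineStep = (ctx: PipelineContext) => PipelineContext;

function createPipelineSketch(steps: PipelineStep[]) {
  return {
    parse(raw: string): PipelineContext {
      // Start from a fresh context; raw is never reassigned afterwards.
      const initial: PipelineContext = { raw, text: raw, data: {} };
      // Feed the output of each step into the next one, in order.
      return steps.reduce((ctx, step) => step(ctx), initial);
    },
  };
}

// Two toy steps (assumed names) that show how text and data evolve.
const trimStep: PipelineStep = (ctx) => ({ ...ctx, text: ctx.text.trim() });
const lengthStep: PipelineStep = (ctx) => ({
  ...ctx,
  data: { ...ctx.data, length: ctx.text.length },
});

const sketchResult = createPipelineSketch([trimStep, lengthStep]).parse("  hello  ");
console.log(sketchResult.raw);            // "  hello  " (unchanged)
console.log(sketchResult.text);           // "hello"
console.log(sketchResult.data["length"]); // 5
```

Because each step returns a new object instead of mutating its input, the order of the array is the only thing that determines the result.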

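Built-in steps

`trim()` and `replace(searchValue, replaceValue)`

Basic text utilities: trim() removes leading and trailing whitespace from ctx.text, and replace() performs a text replacement.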

createPipeline([
  replace(/^[^\n]*\n/, ''), // drop the first line
  trim(),
])

`stripFences()`

Strips a wrapping markdown code fence (for example, ```json ... ``` → the inner text). Operates on ctx.text.

createPipeline([stripFences()]).parse('```json\n{"a":1}\n```');
// ctx.text → '{"a":1}'
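
The stripping logic can be approximated with a single regular expression. The sketch below is an assumption about the behaviour, not the library's implementation; `stripFencesSketch` and `FENCE_TICKS` are hypothetical names, and text that is not wrapped in exactly one fence is returned unchanged.

```typescript
// Three backticks, written as escapes so the sample is easy to embed in docs.
const FENCE_TICKS = "\u0060\u0060\u0060";

// Hypothetical helper; the library's actual matching rules may differ.
function stripFencesSketch(text: string): string {
  // Opening fence with an optional language tag, the body, then a closing fence.
  const fence = new RegExp(
    "^\\s*" + FENCE_TICKS + "[^\\n]*\\n([\\s\\S]*?)\\n?" + FENCE_TICKS + "\\s*$"
  );
  const match = fence.exec(text);
  // Leave the text untouched when it is not wrapped in a single fence.
  return match ? match[1] : text;
}

const fencedInput = FENCE_TICKS + 'json\n{"a":1}\n' + FENCE_TICKS;
console.log(stripFencesSketch(fencedInput));  // {"a":1}
console.log(stripFencesSketch("plain text")); // plain text
```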

`markdownSection(heading, opts?)`

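Extracts the content under the given Markdown heading and writes it to ctx.data[heading]. Matching starts on the line after the heading and stops at the next heading of the same or a higher level, or at the end of the document.

Parameter	Type	Default	Description
`heading`	`string`	-	Heading text (without `#`, e.g. `'Thinking'`)
`opts.required`	`boolean`	`true`	Whether to throw an error when the section is missing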

const pipeline = createPipeline([
  markdownSection('Thinking', { required: false }),
  markdownSection('Answer'),
]);

pipeline.parse(`
## Thinking
Step 1...
## Answer
The final result.
`);
// data.Thinking → 'Step 1...'
// data.Answer   → 'The final result.'
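
The stop rule (same-level or higher-level heading, or end of document) can be sketched as a line scan. This is an illustrative reading of the description above, with a hypothetical function name `markdownSectionSketch`; it does not try to ignore `#` lines inside code blocks.

```typescript
// Hypothetical helper: returns the section body, or undefined when the heading is absent.
function markdownSectionSketch(text: string, heading: string): string | undefined {
  let level = 0; // depth of the matched heading; 0 means "not found yet"
  const body: string[] = [];
  for (const line of text.split("\n")) {
    const m = /^(#{1,6})\s+(.*?)\s*$/.exec(line);
    if (level === 0) {
      // Still searching for the requested heading.
      if (m && m[2] === heading) level = m[1].length;
      continue;
    }
    // Stop at the next heading of the same or a higher level (fewer or equal #).
    if (m && m[1].length <= level) break;
    body.push(line);
  }
  return level === 0 ? undefined : body.join("\n").trim();
}

const sectionDoc = "\n## Thinking\nStep 1...\n### Detail\nmore\n## Answer\nThe final result.\n";
console.log(markdownSectionSketch(sectionDoc, "Thinking")); // Step 1...\n### Detail\nmore
console.log(markdownSectionSketch(sectionDoc, "Answer"));   // The final result.
console.log(markdownSectionSketch(sectionDoc, "Missing"));  // undefined
```

A deeper heading (`### Detail`) stays inside the section, while the sibling `## Answer` ends it.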

`extractCodeBlocks(opts?)`

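Extracts every Markdown code block that matches the filter and writes them, as an array of strings, to ctx.data[opts.key].

Parameter	Type	Default	Description
`opts.lang`	`string`	`undefined`	Language filter (e.g. `'python'`); omit to extract all blocks
`opts.key`	`string`	`'codeBlocks'`	Key under which the result is stored in `data`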

const pipeline = createPipeline([
  extractCodeBlocks({ lang: 'python', key: 'python_scripts' })
]);
// data.python_scripts → ["print('hello')", "x = 1"]
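
The example above omits its input, so here is a self-contained sketch of the extraction with a concrete input. `extractCodeBlocksSketch` and `BLOCK_TICKS` are hypothetical names, and the exact-match language filter is an assumption about how `opts.lang` behaves.

```typescript
// Three backticks, written as escapes so the sample is easy to embed in docs.
const BLOCK_TICKS = "\u0060\u0060\u0060";

// Hypothetical helper: collects the bodies of fenced blocks, optionally filtered by language.
function extractCodeBlocksSketch(text: string, lang?: string): string[] {
  const pattern = new RegExp(
    BLOCK_TICKS + "([^\\n]*)\\n([\\s\\S]*?)\\n?" + BLOCK_TICKS,
    "g"
  );
  const blocks: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(text)) !== null) {
    const tag = m[1].trim(); // language tag after the opening fence, may be empty
    if (lang === undefined || tag === lang) blocks.push(m[2]);
  }
  return blocks;
}

const blocksInput =
  "Intro\n" +
  BLOCK_TICKS + "python\nprint('hello')\n" + BLOCK_TICKS + "\n" +
  "Some text\n" +
  BLOCK_TICKS + "js\nlet y = 2\n" + BLOCK_TICKS + "\n" +
  BLOCK_TICKS + "python\nx = 1\n" + BLOCK_TICKS;

console.log(extractCodeBlocksSketch(blocksInput, "python")); // [ "print('hello')", "x = 1" ]
console.log(extractCodeBlocksSketch(blocksInput).length);    // 3
```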

`matchRegex(pattern, keys, opts?)`

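Runs the regular expression against ctx.text and writes the contents of its capture groups ((...)) to ctx.data, mapped to keys in order.

Parameter	Type	Default	Description
`pattern`	`RegExp`	-	Regular expression (must contain capture groups)
`keys`	`string[]`	-	Key names; the length should match the number of capture groups
`opts.required`	`boolean`	`true`	Whether to throw an error when the match fails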

const pipeline = createPipeline([
  matchRegex(/Score:\s*(\d+).*?Reason:\s*(.*)/is, ['score', 'reason'])
]);

pipeline.parse('Here is the evaluation.\nScore: 95\nReason: Excellent work');
// data.score  → '95'
// data.reason → 'Excellent work'
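
The mapping from capture groups to keys is positional. A minimal sketch of that mapping follows; `matchRegexSketch` is a hypothetical name, and the pattern uses `[\s\S]` instead of the `s` flag only to stay compatible with older compile targets.

```typescript
// Hypothetical helper: maps capture group i + 1 to keys[i]; returns undefined on no match.
function matchRegexSketch(
  text: string,
  pattern: RegExp,
  keys: string[]
): Record<string, string> | undefined {
  const m = pattern.exec(text);
  if (!m) return undefined;
  const out: Record<string, string> = {};
  for (let i = 0; i < keys.length; i++) {
    // Group 0 is the whole match, so the first key reads group 1.
    const value = m[i + 1];
    if (value !== undefined) out[keys[i]] = value;
  }
  return out;
}

const evaluation = "Here is the evaluation.\nScore: 95\nReason: Excellent work";
const scored = matchRegexSketch(
  evaluation,
  /Score:\s*(\d+)[\s\S]*?Reason:\s*(.*)/i,
  ["score", "reason"]
);
console.log(scored); // { score: '95', reason: 'Excellent work' }
```

Captured values are always strings; converting `'95'` to a number would be a job for a later step such as custom().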

`keyValues(opts?)`

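Parses simple "key separator value" lines into object properties and shallow-merges them into ctx.data. Useful for collecting, as a flat set, fields that the AI writes freely.

Parameter	Type	Default	Description
`opts.separator`	`string`	`':'`	Key-value separator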

const pipeline = createPipeline([
  keyValues({ separator: ':' })
]);

pipeline.parse('Name: Alice\nStatus: Active\nIgnoredLine');
// data.Name   → 'Alice'
// data.Status → 'Active'
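
One plausible way to implement this line parsing is to split each line at the first occurrence of the separator. The sketch below assumes that behaviour, along with trimming of keys and values and silent skipping of lines without a separator; `keyValuesSketch` is a hypothetical name.

```typescript
// Hypothetical helper: parses "key<separator>value" lines into a flat object.
function keyValuesSketch(text: string, separator: string = ":"): Record<string, string> {
  const out: Record<string, string> = {};
  for (const line of text.split("\n")) {
    const idx = line.indexOf(separator);
    if (idx <= 0) continue; // no separator, or an empty key: skip the line
    const key = line.slice(0, idx).trim();
    // Split at the first separator only, so values may contain it (e.g. URLs).
    const value = line.slice(idx + separator.length).trim();
    if (key) out[key] = value;
  }
  return out;
}

const kvParsed = keyValuesSketch("Name: Alice\nStatus: Active\nIgnoredLine\nSite: http://example.com");
console.log(kvParsed);
// { Name: 'Alice', Status: 'Active', Site: 'http://example.com' }
```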

`xmlTag(name, opts?)`

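Extracts the content of the given XML tag from ctx.text and writes it to ctx.data[name].

Parameter	Type	Default	Description
`name`	`string`	-	Tag name
`opts.required`	`boolean`	`true`	Whether to throw an error when the tag is missing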

const pipeline = createPipeline([
  xmlTag('saved_memories', { required: false }),
  xmlTag('response'),
]);

pipeline.parse(`
<saved_memories>["a","b"]</saved_memories>
<response>你好！</response>
`);
// data.saved_memories → '["a","b"]'
// data.response       → '你好！'
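
For simple, attribute-free tags the extraction can be sketched with a non-greedy regular expression. This is an assumption about the behaviour rather than the library's code: `xmlTagSketch` is a hypothetical name, it returns only the first occurrence, and it trims the inner text.

```typescript
// Hypothetical helper: returns the trimmed inner text of the first <tag>...</tag> pair.
function xmlTagSketch(text: string, tag: string): string | undefined {
  // Escape regex metacharacters so the tag name is matched literally.
  const safe = tag.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp("<" + safe + ">([\\s\\S]*?)</" + safe + ">");
  const m = pattern.exec(text);
  return m ? m[1].trim() : undefined;
}

const xmlInput = '\n<saved_memories>["a","b"]</saved_memories>\n<response>\nHello!\n</response>\n';
console.log(xmlTagSketch(xmlInput, "saved_memories")); // ["a","b"]
console.log(xmlTagSketch(xmlInput, "response"));       // Hello!
console.log(xmlTagSketch(xmlInput, "tool_calls"));     // undefined
```

The extracted value is still a string; parsing `'["a","b"]'` into an array needs a further step.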

`looseJson(opts?)`

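Parses ctx.text as loose JSON and shallow-merges the result into ctx.data. It tolerates:

Trailing commas: { "a": 1, }
Line comments: // ...
Block comments: /* ... */

Parameter	Type	Default	Description
`opts.required`	`boolean`	`true`	Whether to throw an error when parsing fails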

const pipeline = createPipeline([
  stripFences(),
  looseJson(),
]);

pipeline.parse('```json\n{ "answer": "hello", }\n```');
// data.answer → 'hello'
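
One way to tolerate these three deviations is to normalise the text with a small string-aware scanner and then hand it to JSON.parse. The sketch below illustrates that idea only; `looseJsonSketch` is a hypothetical name, and the library may use a different technique.

```typescript
// Hypothetical helper: strips comments and trailing commas outside strings, then parses.
function looseJsonSketch(text: string): unknown {
  let out = "";
  let inString = false;
  let i = 0;
  while (i < text.length) {
    const ch = text[i];
    if (inString) {
      out += ch;
      if (ch === "\\" && i + 1 < text.length) {
        out += text[i + 1]; // copy the escaped character verbatim
        i += 2;
        continue;
      }
      if (ch === '"') inString = false;
      i++;
    } else if (ch === '"') {
      inString = true;
      out += ch;
      i++;
    } else if (ch === "/" && text[i + 1] === "/") {
      // Line comment: skip up to (not including) the end of the line.
      while (i < text.length && text[i] !== "\n") i++;
    } else if (ch === "/" && text[i + 1] === "*") {
      // Block comment: skip past its closing marker, or to the end if unterminated.
      const end = text.indexOf("*/", i + 2);
      i = end === -1 ? text.length : end + 2;
    } else if (ch === "}" || ch === "]") {
      // Drop a trailing comma (plus whitespace) emitted just before this closer.
      out = out.replace(/,\s*$/, "") + ch;
      i++;
    } else {
      out += ch;
      i++;
    }
  }
  return JSON.parse(out);
}

const messyJson = '{\n  // note\n  "url": "http://a/b", /* c */\n  "list": [1, 2,],\n}';
console.log(JSON.stringify(looseJsonSketch(messyJson)));
// {"url":"http://a/b","list":[1,2]}
```

Tracking whether the scanner is inside a string matters: without it, the `//` in `"http://a/b"` would be mistaken for a comment.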

`splitLines(opts?)`

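Splits ctx.text on newlines (\n) into an array of strings and writes it to ctx.items. Every later step then runs on each element in turn, and the results are collected into ctx.data.items (one data object per element).

Parameter	Type	Default	Description
`opts.filter`	`boolean`	`true`	Filters out empty lines; set to `false` to keep them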

const pipeline = createPipeline([
  splitLines(),
  looseJson(),
]);

pipeline.parse('{"a":1}\n{"b":2}\n{"c":3}');
// data.items → [{ a: 1 }, { b: 2 }, { c: 3 }]

`splitBy(separator, opts?)`

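Splits ctx.text on an arbitrary character or string and writes the array to ctx.items. As with splitLines, later steps run on each element in turn.

Parameter	Type	Default	Description
`separator`	`string`	-	Separator, e.g. `','` or `'---'`
`opts.filter`	`boolean`	`true`	Filters out blank fragments; set to `false` to keep them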

const pipeline = createPipeline([
  splitBy('---'),
  xmlTag('item'),
]);

pipeline.parse('<item>A</item>---<item>B</item>---<item>C</item>');
// data.items → [{ item: 'A' }, { item: 'B' }, { item: 'C' }]

`custom(fn)`

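A custom step. The function you pass receives the current context and returns a partial set of context fields, which is shallow-merged into the context.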

custom(ctx => ({
  data: {
    ...ctx.data,
    tool_calls: ctx.data.tool_calls
      ? JSON.parse(ctx.data.tool_calls as string)
      : undefined,
  },
}))

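Array mode

When splitLines() or splitBy() appears in a pipeline, the pipeline enters array mode:

The split step cuts ctx.text into ctx.items: string[]
All later steps no longer operate on the whole text; each runs on every item separately (each item has its own sub-context { raw: item, text: item, data: {} })
After an element has passed through all later steps, its data is collected into the final ctx.data.items array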

const pipeline = createPipeline([
  stripFences(),    // operate on the whole text first
  splitLines(),     // ← array mode starts here
  looseJson(),      // parse each line as JSON
]);

const { data } = pipeline.parse(`
  { "name": "Alice", "score": 90 }
  { "name": "Bob",   "score": 85 }
`);
// data.items → [
//   { name: 'Alice', score: 90 },
//   { name: 'Bob',   score: 85 },
// ]

Note: a pipeline supports only one split. Once in array mode, nested splitting is not supported.
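
The switch into array mode can be pictured as a runner that, once a step has produced items, applies the remaining steps to a fresh sub-context per item. The sketch below is a hypothetical model of that behaviour (`runSketch`, `splitLinesSketch`, and `jsonStep` are illustrative names), not the library's implementation.

```typescript
// Hypothetical model of array mode; names and structure are assumptions.
interface Ctx {
  raw: string;
  text: string;
  data: Record<string, unknown>;
  items?: string[];
}
type Step = (ctx: Ctx) => Ctx;

// Split step: cuts the working text into non-empty lines.
const splitLinesSketch: Step = (ctx) => ({
  ...ctx,
  items: ctx.text.split("\n").filter((line) => line.trim() !== ""),
});

// Per-item step: parses the item's text as strict JSON and merges it into data.
const jsonStep: Step = (ctx) => ({
  ...ctx,
  data: { ...ctx.data, ...JSON.parse(ctx.text) },
});

function runSketch(steps: Step[], raw: string): Ctx {
  let ctx: Ctx = { raw, text: raw, data: {} };
  for (let i = 0; i < steps.length; i++) {
    ctx = steps[i](ctx);
    if (ctx.items) {
      // Array mode: the remaining steps run once per item, each on its own sub-context.
      const rest = steps.slice(i + 1);
      const perItem = ctx.items.map((item) => {
        let sub: Ctx = { raw: item, text: item, data: {} };
        for (const step of rest) sub = step(sub);
        return sub.data;
      });
      return { ...ctx, data: { ...ctx.data, items: perItem } };
    }
  }
  return ctx;
}

const arrayResult = runSketch([splitLinesSketch, jsonStep], '{"a":1}\n\n{"b":2}');
console.log(JSON.stringify(arrayResult.data["items"])); // [{"a":1},{"b":2}]
```

Because the runner returns as soon as the per-item steps finish, a second split step would only ever see a single item, which matches the one-split restriction noted above.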

Complete example: JSON wrapped in XML

A typical parsing pipeline for the XML-wrapped format produced by output-format:

import { createPipeline, stripFences, xmlTag, custom } from '@chronoai/toolkit/output-parser';

const pipeline = createPipeline([
  stripFences(),
  xmlTag('saved_memories', { required: false }),
  xmlTag('tool_calls', { required: false }),
  xmlTag('response'),
  custom(ctx => ({
    data: {
      ...ctx.data,
      saved_memories: ctx.data.saved_memories
        ? JSON.parse(ctx.data.saved_memories as string)
        : [],
      tool_calls: ctx.data.tool_calls
        ? JSON.parse(ctx.data.tool_calls as string)
        : [],
    },
  })),
]);

// Raw AI output
const rawOutput = `
<saved_memories>
[{"text": "用户喜欢简洁回答"}]
</saved_memories>
<tool_calls>
[{"name": "search", "args": {"query": "天气"}}]
</tool_calls>
<response>
今天北京晴，25°C。
</response>
`;

const { data } = pipeline.parse(rawOutput);
// data.saved_memories → [{ text: '用户喜欢简洁回答' }]
// data.tool_calls     → [{ name: 'search', args: { query: '天气' } }]
// data.response       → '今天北京晴，25°C。'

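Feature usage (engine integration)

If you use the ChronoAI engine, OutputParserFeature mounts a pipeline directly on a timeline: whenever a new value is written to the source timeline, the pipeline runs automatically and the parsed result (data) is written to timeline.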

agent.use(OutputParserFeature, {
  source: 'ai-response',
  timeline: 'ai-parsed',
  pipeline: [
    stripFences(),
    xmlTag('response'),
    xmlTag('tool_calls', { required: false }),
  ],
});

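source can reference any timeline declared in this Agent, including timelines from other toolkits and user-defined ones.

To handle several independent response streams, register the Feature multiple times; each registration is an independent instance: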

agent.use(OutputParserFeature, {
  source: 'ai-response',
  timeline: 'ai-parsed',
  pipeline: [stripFences(), xmlTag('response')],
});

agent.use(OutputParserFeature, {
  source: 'summary-response',
  timeline: 'summary-parsed',
  pipeline: [stripFences(), looseJson()],
});

Type exports

Type	Description
`PipelineContext`	Pipeline context: `{ raw, text, data, items? }`
`PipelineStep`	Step function type: `(ctx: PipelineContext) => PipelineContext`
`OutputParserFeatureConfig`	Configuration type for `OutputParserFeature`

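Timelines

External timelines

OutputParserFeature uses the following timelines:

Name	Value type	Description
`config.source` (user-specified, referenced)	`string`	Source timeline being watched; a new value triggers the pipeline
`config.timeline` (user-specified, declared)	`Record<string, unknown>`	Destination for `data` after the pipeline runs; available to other Features

Internal timelines

None.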

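Configuration parameters

`OutputParserFeature` configuration

The config argument of agent.use(OutputParserFeature, config) supports the following parameters:

Parameter	Type	Required	Default	Description
`source`	`string`	Yes	-	Name of the source timeline to watch (reference mode; may cross toolkits and Features)
`timeline`	`string`	Yes	-	Name of the target timeline that receives the parsed result (declaration mode)
`pipeline`	`PipelineStep[]`	Yes	-	List of pipeline steps, run in order
`retain`	`number | 'all'`	No	`1`	Frame retention policy for the target timeline: `1` keeps only the latest frame, `'all'` keeps the full history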

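`createPipeline(steps)` step parameters

The steps argument of createPipeline(steps) is a PipelineStep[], built by combining the following built-in step functions in order:

`trim()`

No parameters. Removes leading and trailing whitespace from ctx.text.

`replace(searchValue, replaceValue)`

Parameter	Type	Description
`searchValue`	`string | RegExp`	Value to search for
`replaceValue`	`string`	Replacement value

`stripFences()`

No parameters. Strips a wrapping ``` code fence; operates on ctx.text.

`markdownSection(heading, opts?)`

Parameter	Type	Default	Description
`heading`	`string`	-	Markdown heading text (without `#`)
`opts.required`	`boolean`	`true`	Throw an error when the section is missing

`extractCodeBlocks(opts?)`

Parameter	Type	Default	Description
`opts.lang`	`string`	-	Extract only code blocks in this language
`opts.key`	`string`	`'codeBlocks'`	Key under which the result is stored in `data`

`matchRegex(pattern, keys, opts?)`

Parameter	Type	Default	Description
`pattern`	`RegExp`	-	Regular expression (must contain capture groups)
`keys`	`string[]`	-	Key names, matched to the capture groups in order
`opts.required`	`boolean`	`true`	Throw an error when the match fails

`keyValues(opts?)`

Parameter	Type	Default	Description
`opts.separator`	`string`	`':'`	Key-value separator

`xmlTag(name, opts?)`

Parameter	Type	Default	Description
`name`	`string`	-	XML tag name
`opts.required`	`boolean`	`true`	Throw an error when the tag is missing

`looseJson(opts?)`

Parameter	Type	Default	Description
`opts.required`	`boolean`	`true`	Throw an error when parsing fails

`splitLines(opts?)`

Parameter	Type	Default	Description
`opts.filter`	`boolean`	`true`	Filter out empty lines

`splitBy(separator, opts?)`

Parameter	Type	Default	Description
`separator`	`string`	-	Separator
`opts.filter`	`boolean`	`true`	Filter out blank fragments

`custom(fn)`

Parameter	Type	Description
`fn`	`(ctx: PipelineContext) => Partial<PipelineContext>`	Custom transform function; its return value is shallow-merged into the context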
