

GitHub - unifiedjs/unified: ☔ Text processing umbrella: Parse / Transform / Comp...
source link: https://github.com/unifiedjs/unified

Announcing the unified collective!
Read more about it on Medium »
unified is an interface for processing text using syntax trees. It’s what powers remark, retext, and rehype, but it also allows for processing between multiple syntaxes.
unified enabled new exciting projects like Gatsby to pull in markdown, MDX to embed JSX, and Prettier to format it. It’s used to check code for Storybook, debugger.html (Mozilla), and opensource.guide (GitHub).
- To read about what we’re up to, follow us on Medium and Twitter
- For a less technical and more practical introduction to unified, visit unified.js.org and try its introductory Guides
- To help us out, see contributing.md, or become a backer or sponsor on Open Collective
Installation
npm:

    npm install unified
Usage

    var unified = require('unified')
    var markdown = require('remark-parse')
    var remark2rehype = require('remark-rehype')
    var doc = require('rehype-document')
    var format = require('rehype-format')
    var html = require('rehype-stringify')
    var report = require('vfile-reporter')

    unified()
      .use(markdown)
      .use(remark2rehype)
      .use(doc)
      .use(format)
      .use(html)
      .process('# Hello world!', function(err, file) {
        console.error(report(err || file))
        console.log(String(file))
      })

Yields:

    no issues found
    <!doctype html>
    <html lang="en">
      <head>
        <meta charset="utf-8">
        <meta name="viewport" content="width=device-width, initial-scale=1">
      </head>
      <body>
        <h1>Hello world!</h1>
      </body>
    </html>

Description
unified is an interface for processing text using syntax trees. Syntax trees are a representation understandable to programs. Those programs, called plugins, take these trees and modify them, amongst other things. To get to the syntax tree from input text there’s a parser. To get from that back to text there’s a compiler. This is the process of a processor.

    | ....................... process() ......................... |
    | ......... parse() ..... | run() | ..... stringify() ....... |

              +--------+                     +----------+
    Input ->- | Parser | ->- Syntax Tree ->- | Compiler | ->- Output
              +--------+          |          +----------+
                                  X
                                  |
                           +--------------+
                           | Transformers |
                           +--------------+

Processors
Every processor implements another processor. To create a new processor invoke another processor. This creates a processor that is configured to function the same as its ancestor. But when the descendant processor is configured in the future it does not affect the ancestral processor.
When processors are exposed from a module (for example, unified itself) they should not be configured directly, as that would change their behaviour for all module users. Those processors are frozen and they should be invoked to create a new processor before they are used.
Node
The syntax trees used in unified are Unist nodes: plain JavaScript objects with a type property. The semantics of those types are defined by other projects.
There are several utilities for working with these nodes.
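As a rough illustration, a tree can be written by hand as nested objects. The node types below (`root`, `paragraph`, `text`) borrow mdast naming, and `visit` is a hypothetical helper written for this sketch; real projects would normally reach for one of the unist utilities instead.

```javascript
// A hand-written tree: every node is a plain object with a `type`.
var tree = {
  type: 'root',
  children: [
    {type: 'paragraph', children: [{type: 'text', value: 'Hello'}]},
    {type: 'paragraph', children: [{type: 'text', value: 'world'}]}
  ]
}

// Hypothetical helper: call `visitor` for every node of the given type.
function visit(node, type, visitor) {
  var children = node.children || []

  if (node.type === type) {
    visitor(node)
  }

  children.forEach(function(child) {
    visit(child, type, visitor)
  })
}

var values = []

visit(tree, 'text', function(node) {
  values.push(node.value)
})

console.log(values)
```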
List of Processors
The following projects process different syntax trees. They parse text to their respective syntax tree and they compile their syntax trees back to text. These processors can be used as-is, or their parsers and compilers can be mixed and matched with unified and other plugins to process between different syntaxes.
List of Plugins
The below plugins work with unified, unrelated to what flavour the syntax tree is in:
- unified-diff: Ignore messages for unchanged lines in Travis
See remark, rehype, and retext for lists of their plugins.
File
When processing documents, metadata is often gathered about that document. VFile is a virtual file format which stores data and handles metadata and messages for unified and its plugins.
There are several utilities for working with these files.
Configuration
To configure a processor invoke its use method, supply it a plugin, and optionally settings.
Integrations
unified can integrate with the file-system through unified-engine. On top of that, CLI apps can be created with unified-args, Gulp plugins with unified-engine-gulp, and Atom Linters with unified-engine-atom.
A streaming interface is provided through unified-stream.
Programming interface
The API gives access to processing metadata (such as lint messages) and supports multiple passed through files:

    var unified = require('unified')
    var markdown = require('remark-parse')
    var styleGuide = require('remark-preset-lint-markdown-style-guide')
    var remark2retext = require('remark-retext')
    var english = require('retext-english')
    var equality = require('retext-equality')
    var remark2rehype = require('remark-rehype')
    var html = require('rehype-stringify')
    var report = require('vfile-reporter')

    unified()
      .use(markdown)
      .use(styleGuide)
      .use(
        remark2retext,
        unified()
          .use(english)
          .use(equality)
      )
      .use(remark2rehype)
      .use(html)
      .process('*Emphasis* and _importance_, you guys!', function(err, file) {
        console.error(report(err || file))
        console.log(String(file))
      })

Yields:
1:16-1:28 warning Emphasis should use `*` as a marker emphasis-marker remark-lint
1:34-1:38 warning `guys` may be insensitive, use `people`, `persons`, `folks` instead gals-men retext-equality
⚠ 2 warnings
<p><em>Emphasis</em> and <em>importance</em>, you guys!</p>
Processing between syntaxes
The processors can be combined in two modes.
Bridge mode transforms the syntax tree from one flavour (the origin) to another (the destination). Then, transformations are applied on that tree. Finally, the origin processor continues transforming the original syntax tree.
Mutate mode also transforms the syntax tree from one flavour to another. But then the origin processor continues transforming the destination syntax tree.
In the previous example (“Programming interface”), remark-retext is used in bridge mode: the origin syntax tree is kept after retext is done; whereas remark-rehype is used in mutate mode: it sets a new syntax tree and discards the original.
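One way to picture the difference is through what the connecting transformer returns. The following is a toy sketch, not the code of remark-retext or remark-rehype: `toDestination`, `bridge`, `mutate`, and `run` are hypothetical names, and the node types are invented.

```javascript
// Hypothetical conversion from an "origin" tree to a "destination" tree.
function toDestination(tree) {
  return {
    type: 'destinationRoot',
    children: tree.children.map(function(node) {
      return {type: 'destinationNode', value: node.value}
    })
  }
}

// Bridge: hand the destination tree to something else and return nothing,
// so the origin tree stays in use afterwards.
function bridge(inspect) {
  return function transformer(tree) {
    inspect(toDestination(tree))
  }
}

// Mutate: return the destination tree, so it replaces the origin tree.
function mutate() {
  return function transformer(tree) {
    return toDestination(tree)
  }
}

// Minimal runner: a returned node replaces the current tree.
function run(tree, transformers) {
  return transformers.reduce(function(current, transformer) {
    return transformer(current) || current
  }, tree)
}

var origin = {type: 'originRoot', children: [{type: 'originNode', value: 'hi'}]}
var seen = []

var afterBridge = run(origin, [
  bridge(function(tree) {
    seen.push(tree.type)
  })
])
var afterMutate = run(origin, [mutate()])

console.log(afterBridge.type, afterMutate.type)
```

After the bridge, later steps still see the origin tree; after the mutation, they see only the destination tree.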
API
processor()
Object describing how to process text.
Returns
Function: New unfrozen processor which is configured to function the same as its ancestor. But when the descendant processor is configured in the future it does not affect the ancestral processor.
The following example shows how a new processor can be created (from the remark processor) and linked to stdin(4) and stdout(4).

    var remark = require('remark')
    var concat = require('concat-stream')

    process.stdin.pipe(concat(onconcat))

    function onconcat(buf) {
      var doc = remark()
        .processSync(buf)
        .toString()

      process.stdout.write(doc)
    }

processor.use(plugin[, options])
Configure the processor to use a plugin and optionally configure that plugin with options.
Signatures
- processor.use(plugin[, options])
- processor.use(preset)
- processor.use(list)
Parameters
- plugin (Plugin)
- options (*, optional): Configuration for plugin
- preset (Object): Object with an optional plugins (set to list), and/or an optional settings object
- list (Array): List of plugins, presets, and pairs (plugin and options in an array)
Returns
processor: The processor on which use is invoked.
use cannot be called on frozen processors. Invoke the processor first to create a new unfrozen processor.
There are many ways to pass plugins to .use(). The below example gives an overview.

    var unified = require('unified')

    unified()
      // Plugin with options:
      .use(plugin, {})
      // Plugins:
      .use([plugin, pluginB])
      // Two plugins, the second with options:
      .use([plugin, [pluginB, {}]])
      // Preset with plugins and settings:
      .use({plugins: [plugin, [pluginB, {}]], settings: {position: false}})
      // Settings only:
      .use({settings: {position: false}})

    function plugin() {}
    function pluginB() {}

processor.parse(file|value)
Parse text to a syntax tree.
Parameters
- file (VFile): Or anything which can be given to vfile()
Returns
Node: Syntax tree representation of input.
parse freezes the processor if not already frozen.
parse does not apply transformers from the run phase to the syntax tree.
The below example shows how the parse function can be used to create a syntax tree from a file.

    var unified = require('unified')
    var markdown = require('remark-parse')

    var tree = unified()
      .use(markdown)
      .parse('# Hello world!')

    console.log(tree)

Yields:

    {
      type: 'root',
      children: [
        {
          type: 'heading',
          depth: 1,
          children: [Array],
          position: [Position]
        }
      ],
      position: {
        start: { line: 1, column: 1, offset: 0 },
        end: { line: 1, column: 15, offset: 14 }
      }
    }

processor.Parser
Function handling the parsing of text to a syntax tree. Used in the parse phase in the process and invoked with a string and VFile representation of the document to parse.
Parser can be a normal function in which case it must return a Node: the syntax tree representation of the given file.
Parser can also be a constructor function (a function with keys in its prototype) in which case it’s invoked with new. Instances must have a parse method which is invoked without arguments and must return a Node.
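The two shapes can be compared side by side. This sketch is independent of unified: `functionParser`, `ConstructorParser`, and the dispatcher `callParser` are hypothetical, and detecting a constructor through `Object.keys` on the prototype is an assumption made for illustration, not necessarily how the library decides.

```javascript
// Parser as a normal function: receives the document as a string (and the
// file, unused here) and returns a node.
function functionParser(doc, file) {
  return {type: 'root', children: [{type: 'text', value: doc}]}
}

// Parser as a constructor: it has a key (`parse`) on its prototype, is
// created with `new`, and `parse` is then called without arguments.
function ConstructorParser(doc, file) {
  this.doc = doc
  this.file = file
}

ConstructorParser.prototype.parse = function() {
  return {type: 'root', children: [{type: 'text', value: this.doc}]}
}

// Hypothetical dispatcher mirroring the two cases described above.
function callParser(Parser, doc, file) {
  var isConstructor = Object.keys(Parser.prototype || {}).length > 0

  return isConstructor ? new Parser(doc, file).parse() : Parser(doc, file)
}

var fromFunction = callParser(functionParser, 'alpha', null)
var fromConstructor = callParser(ConstructorParser, 'alpha', null)

console.log(fromFunction, fromConstructor)
```

Both calls produce the same tree, so which shape to pick is mostly a matter of whether the parser needs to keep state between steps.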
processor.stringify(node[, file])
Compile a syntax tree to text.
Parameters
- node (Node)
- file (VFile, optional): Or anything which can be given to vfile()
Returns
string: String representation of the syntax tree file.
stringify freezes the processor if not already frozen.
stringify does not apply transformers from the run phase to the syntax tree.
The below example shows how the stringify function can be used to generate a file from a syntax tree.

    var unified = require('unified')
    var html = require('rehype-stringify')
    var h = require('hastscript')

    var tree = h('h1', 'Hello world!')

    var doc = unified()
      .use(html)
      .stringify(tree)

    console.log(doc)

Yields:
<h1>Hello world!</h1>
processor.Compiler
Function handling the compilation of a syntax tree to text. Used in the stringify phase in the process and invoked with a Node and VFile representation of the document to stringify.
Compiler can be a normal function in which case it must return a string: the text representation of the given syntax tree.
Compiler can also be a constructor function (a function with keys in its prototype) in which case it’s invoked with new. Instances must have a compile method which is invoked without arguments and must return a string.
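A compiler is usually installed from inside a plugin. The sketch below shows one possible shape, assuming the plugin is invoked with the processor as its context; `stringifyText`, `collect`, and the stand-in `fakeProcessor` object are hypothetical and exist only so the example runs without unified.

```javascript
// Plugin that installs a function compiler on the processor it is invoked on.
function stringifyText() {
  this.Compiler = compiler

  // Receives the tree (and the file, unused here) and returns a string.
  function compiler(node, file) {
    return collect(node)
  }
}

// Concatenate the `value` of every literal node, depth first.
function collect(node) {
  if (typeof node.value === 'string') {
    return node.value
  }

  return (node.children || []).map(collect).join('')
}

// Stand-in for a processor, only to exercise the plugin here.
var fakeProcessor = {}
stringifyText.call(fakeProcessor)

var text = fakeProcessor.Compiler(
  {
    type: 'root',
    children: [
      {type: 'text', value: 'Hello '},
      {type: 'emphasis', children: [{type: 'text', value: 'world'}]}
    ]
  },
  null
)

console.log(text)
```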
processor.run(node[, file][, done])
Transform a syntax tree by applying plugins to it.
Parameters
- node (Node)
- file (VFile, optional): Or anything which can be given to vfile()
- done (Function, optional)
Returns
Promise if done is not given. Rejected with an error, or resolved with the resulting syntax tree.
run freezes the processor if not already frozen.
function done(err[, node, file])
Invoked when transformation is complete. Either invoked with an error or a syntax tree and a file.
Parameters
- err (Error): Fatal error
- node (Node)
- file (VFile)
Example
The below example shows how the run function can be used to transform a syntax tree.

    var unified = require('unified')
    var references = require('remark-reference-links')
    var u = require('unist-builder')

    var tree = u('root', [
      u('paragraph', [
        // mdast link nodes store their target in `url`, not `href`.
        u('link', {url: 'https://example.com'}, [u('text', 'Example Domain')])
      ])
    ])

    unified()
      .use(references)
      .run(tree, function(err, tree) {
        if (err) throw err
        console.log(tree)
      })

Yields:

    {
      type: 'root',
      children: [
        { type: 'paragraph', children: [Array] },
        { type: 'definition', identifier: '1', title: undefined, url: 'https://example.com' }
      ]
    }

processor.runSync(node[, file])
Transform a syntax tree by applying plugins to it.
If asynchronous plugins are configured an error is thrown.
Parameters
- node (Node)
- file (VFile, optional): Or anything which can be given to vfile()
Returns
Node: The given syntax tree.
runSync freezes the processor if not already frozen.
processor.process(file|value[, done])
Process the given representation of a file as configured on the processor. The process invokes parse, run, and stringify internally.
Returns
Promise if done is not given. Rejected with an error or resolved with the resulting file.
process freezes the processor if not already frozen.
The below example shows how the process function can be used to process a file with promises, whether plugins are asynchronous or not.

    var unified = require('unified')
    var markdown = require('remark-parse')
    var remark2rehype = require('remark-rehype')
    var doc = require('rehype-document')
    var format = require('rehype-format')
    var html = require('rehype-stringify')

    unified()
      .use(markdown)
      .use(remark2rehype)
      .use(doc)
      .use(format)
      .use(html)
      .process('# Hello world!')
      .then(
        function(file) {
          console.log(String(file))
        },
        function(err) {
          console.error(String(err))
        }
      )

Yields:

    <!doctype html>
    <html lang="en">
      <head>
        <meta charset="utf-8">
        <meta name="viewport" content="width=device-width, initial-scale=1">
      </head>
      <body>
        <h1>Hello world!</h1>
      </body>
    </html>

function done(err, file)
Invoked when the process is complete. Invoked with a fatal error, if any, and the VFile.
Parameters
- err (Error, optional): Fatal error
- file (VFile)
The below example shows how the process function can be used to process a file with a callback, whether plugins are asynchronous or not.

    var unified = require('unified')
    var parse = require('remark-parse')
    var stringify = require('remark-stringify')
    var github = require('remark-github')
    var report = require('vfile-reporter')

    unified()
      .use(parse)
      .use(github)
      .use(stringify)
      .process('@mention', function(err, file) {
        console.error(report(err || file))
        console.log(String(file))
      })

Yields:
no issues found
[**@mention**](https://github.com/blog/821)
processor.processSync(file|value)
Process the given representation of a file as configured on the processor. The process invokes parse, run, and stringify internally.
If asynchronous plugins are configured an error is thrown.
Parameters
- file (VFile)
- value (string): String representation of a file
Returns
VFile: Virtual file with modified contents.
processSync freezes the processor if not already frozen.
The below example shows how the processSync function can be used to process a file if all plugins are known to be synchronous.

    var unified = require('unified')
    var markdown = require('remark-parse')
    var remark2rehype = require('remark-rehype')
    var doc = require('rehype-document')
    var format = require('rehype-format')
    var html = require('rehype-stringify')

    var processor = unified()
      .use(markdown)
      .use(remark2rehype)
      .use(doc)
      .use(format)
      .use(html)

    console.log(processor.processSync('# Hello world!').toString())

Yields:

    <!doctype html>
    <html lang="en">
      <head>
        <meta charset="utf-8">
        <meta name="viewport" content="width=device-width, initial-scale=1">
      </head>
      <body>
        <h1>Hello world!</h1>
      </body>
    </html>

processor.data(key[, value])
Get or set information in an in-memory key-value store accessible to all phases of the process. An example is a list of HTML elements which are self-closing, which is needed when parsing, transforming, and compiling HTML.
Parameters
- key (string): Identifier
- value (*, optional): Value to set. Omit if getting key
Returns
- processor: If setting, the processor on which data is invoked
- *: If getting, the value at key
Setting information with data cannot occur on frozen processors. Invoke the processor first to create a new unfrozen processor.
The following example shows how to get and set information:

    var unified = require('unified')

    unified()
      .data('alpha', 'bravo')
      .data('alpha') // => 'bravo'

processor.freeze()
Freeze a processor. Frozen processors are meant to be extended and not to be configured or processed directly.
Once a processor is frozen it cannot be unfrozen. New processors functioning just like it can be created by invoking the processor.
It’s possible to freeze processors explicitly, by calling .freeze(), but .parse(), .run(), .stringify(), and .process() call .freeze() to freeze a processor too.
Returns
Processor: The processor on which freeze is invoked.
The following example, index.js, shows how rehype prevents extensions to itself:

    var unified = require('unified')
    var parse = require('rehype-parse')
    var stringify = require('rehype-stringify')

    module.exports = unified()
      .use(parse)
      .use(stringify)
      .freeze()

The below example, a.js, shows how that processor can be used and configured.

    var rehype = require('rehype')
    var format = require('rehype-format')
    // ...

    rehype()
      .use(format)
    // ...

The below example, b.js, shows a similar looking example which operates on the frozen rehype interface. If this behaviour was allowed it would result in unexpected behaviour so an error is thrown. This is invalid:

    var rehype = require('rehype')
    var format = require('rehype-format')
    // ...

    rehype
      .use(format)
    // ...

Yields:
~/node_modules/unified/index.js:440
throw new Error(
^
Error: Cannot invoke `use` on a frozen processor.
Create a new processor first, by invoking it: use `processor()` instead of `processor`.
at assertUnfrozen (~/node_modules/unified/index.js:440:11)
at Function.use (~/node_modules/unified/index.js:172:5)
at Object.<anonymous> (~/b.js:6:4)
Plugin
unified plugins change the way the applied-on processor works in the following ways:
- They modify the processor: such as changing the parser, the compiler, or linking it to other processors
- They transform syntax tree representation of files
- They modify metadata of files
Plugins are a concept. They materialise as attachers.
move.js:

    module.exports = move

    function move(options) {
      var expected = (options || {}).extname

      if (!expected) {
        throw new Error('Missing `extname` in options')
      }

      return transformer

      function transformer(tree, file) {
        if (file.extname && file.extname !== expected) {
          file.extname = expected
        }
      }
    }

index.js:

    var unified = require('unified')
    var parse = require('remark-parse')
    var remark2rehype = require('remark-rehype')
    var stringify = require('rehype-stringify')
    var vfile = require('to-vfile')
    var report = require('vfile-reporter')
    var move = require('./move')

    unified()
      .use(parse)
      .use(remark2rehype)
      .use(move, {extname: '.html'})
      .use(stringify)
      .process(vfile.readSync('index.md'), function(err, file) {
        console.error(report(err || file))
        if (file) {
          vfile.writeSync(file) // Written to `index.html`.
        }
      })

function attacher([options])
An attacher is the thing passed to use. It configures the processor and in turn can receive options.
Attachers can configure processors, such as by interacting with parsers and compilers, linking them to other processors, or by specifying how the syntax tree is handled.
Context
The context object is set to the processor on which the attacher is invoked.
Parameters
- options (*, optional): Configuration
Returns
transformer: Optional.
Attachers are invoked when the processor is frozen: either when .freeze() is called explicitly, or when .parse(), .run(), .stringify(), or .process() is called for the first time.
function transformer(node, file[, next])
Transformers modify the syntax tree or metadata of a file. A transformer is a function which is invoked each time a file is passed through the transform phase. If an error occurs (either because it’s thrown, returned, rejected, or passed to next), the process stops.
The transformation process in unified is handled by trough, see its documentation for the exact semantics of transformers.
Parameters
- node (Node)
- file (VFile)
- next (Function, optional)
Returns
- Error: Can be returned to stop the process
- Node: Can be returned and results in further transformations and stringifys to be performed on the new tree
- Promise: If a promise is returned, the function is asynchronous, and must be resolved (optionally with a Node) or rejected (optionally with an Error)
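The three kinds of return value can be sketched as three small transformers. They are called directly here rather than through unified, and the function names, the `section` node type, and the 10 ms delay are all hypothetical choices made for illustration.

```javascript
// Returns an Error to stop the process when the tree is empty;
// returns nothing (undefined) otherwise, which keeps the tree as it is.
function requireContent(tree, file) {
  if (!tree.children || tree.children.length === 0) {
    return new Error('Expected a non-empty tree')
  }
}

// Returns a new node: later transformers and the compiler would
// work on this tree instead of the one passed in.
function wrap(tree, file) {
  return {
    type: 'root',
    children: [{type: 'section', children: tree.children}]
  }
}

// Returns a promise: the transformer counts as asynchronous and is done
// when the promise settles (resolved here without a value).
function delay(tree, file) {
  return new Promise(function(resolve) {
    setTimeout(resolve, 10)
  })
}

var empty = {type: 'root', children: []}
var full = {type: 'root', children: [{type: 'text', value: 'hi'}]}

var failure = requireContent(empty, null)
var untouched = requireContent(full, null)
var wrapped = wrap(full, null)
var pending = delay(full, null)

console.log(failure.message, untouched, wrapped.children[0].type)
```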
function next(err[, tree[, file]])
If the signature of a transformer includes next (third argument), the function may finish asynchronously, and must invoke next().
Parameters
- err (Error, optional): Stop the process
- node (Node, optional): New syntax tree
- file (VFile, optional): New virtual file
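A callback-style transformer might look like the sketch below. `countChildren` is a hypothetical transformer invoked by hand here; storing the count on `tree.data` and deferring with `setImmediate` are choices made only for this illustration.

```javascript
// Declaring `next` as third parameter marks the transformer as one that
// signals completion itself, possibly at a later moment.
function countChildren(tree, file, next) {
  if (!tree || !Array.isArray(tree.children)) {
    // Failure: pass an error to stop the process.
    next(new Error('Expected a tree with children'))
    return
  }

  // Success: finish later and hand back the (same) tree and file.
  setImmediate(function() {
    tree.data = {count: tree.children.length}
    next(null, tree, file)
  })
}

var result = {}

countChildren(
  {type: 'root', children: [{type: 'text', value: 'a'}]},
  null,
  function(err, tree) {
    result.err = err
    result.tree = tree
    console.log(tree.data.count)
  }
)

var failure

countChildren(null, null, function(err) {
  failure = err
})
```

Every code path ends in exactly one call to `next()`; forgetting it on one branch would leave the process waiting.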
Preset
Presets provide a potentially sharable way to configure processors. They can contain multiple plugins and optionally settings as well.
Example
preset.js:

    exports.settings = {bullet: '*', fences: true}

    exports.plugins = [
      require('remark-preset-lint-recommended'),
      require('remark-comment-config'),
      require('remark-preset-lint-markdown-style-guide'),
      [require('remark-toc'), {maxDepth: 3, tight: true}],
      require('remark-github')
    ]

index.js:

    var remark = require('remark')
    var vfile = require('to-vfile')
    var report = require('vfile-reporter')
    var preset = require('./preset')

    remark()
      .use(preset)
      .process(vfile.readSync('index.md'), function(err, file) {
        console.error(report(err || file))

        if (file) {
          vfile.writeSync(file)
        }
      })

Contribute
unified is built by people just like you!
Check out contributing.md for ways to get started.
This project has a Code of Conduct. By interacting with this repository, organisation, or community you agree to abide by its terms.
Want to chat with the community and contributors? Join us in spectrum!
Have an idea for a cool new utility or tool? That’s great! If you want feedback, help, or just to share it with the world you can do so by creating an issue in the unifiedjs/ideas repository!
Acknowledgments
Preliminary work for unified was done in 2014 for retext and inspired by ware. Further incubation happened in remark. The project was finally externalised in 2015 and published as unified. The project was authored by @wooorm.
Although unified has since moved its plugin architecture to trough, thanks to @calvinfo, @ianstormtaylor, and others for their work on ware, which was a huge initial inspiration.
License