A PHP library that parses Microsoft Word .docx manuscripts and builds structured JATS (Journal Article Tag Suite) XML ready for scholarly publishing platforms and indexing services.
DOCX → Parse → Build → JATS XML
JATS Engine is the conversion core behind the Wizdam publishing ecosystem. It takes a Microsoft Word .docx manuscript and automatically builds a fully structured JATS XML document, the industry-standard format for journal article interchange (ANSI/NISO Z39.96).
The engine is designed specifically to integrate with Open Journal Systems (OJS) 2.x via its native DAO layer, but its modular architecture allows it to be adapted for any PHP-based publishing workflow.
| Builder | Responsibility |
|---|---|
| MetadataBuilder | Reads article, author, journal, and issue data from OJS 2.x DAOs and builds the JATS `<front>` element, including journal meta, article meta, publication history, and citation list. |
| BodyBuilder | Opens the .docx archive, parses the WordprocessingML body, and builds the JATS `<body>` with full section hierarchy, tables, figures, math, and inline formatting. |
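
The section hierarchy that BodyBuilder produces can be pictured as a stack-based nesting of headings. The following is only a minimal sketch under stated assumptions, not the library's actual code: the function name `buildSections` and the `['level' => ..., 'text' => ...]` block format are hypothetical, chosen purely for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Nest a flat list of blocks into JATS <sec> elements (illustrative only).
 * Each block is ['level' => int, 'text' => string]: level 0 is a body
 * paragraph, level 1..n is a heading of that depth. This input format is
 * an assumption, not the engine's internal representation.
 */
function buildSections(DOMDocument $dom, array $blocks): DOMElement
{
    $body = $dom->createElement('body');
    // Stack of [level, element]; <body> sits at level 0 and is never popped.
    $stack = [[0, $body]];

    foreach ($blocks as $block) {
        $level = $block['level'];

        if ($level === 0) {
            // Paragraph: attach to the innermost open section.
            $p = $dom->createElement('p');
            $p->appendChild($dom->createTextNode($block['text']));
            $stack[count($stack) - 1][1]->appendChild($p);
            continue;
        }

        // Heading: close every open section at the same or a deeper level.
        while (count($stack) > 1 && $stack[count($stack) - 1][0] >= $level) {
            array_pop($stack);
        }

        $sec = $dom->createElement('sec');
        $title = $dom->createElement('title');
        $title->appendChild($dom->createTextNode($block['text']));
        $sec->appendChild($title);

        $stack[count($stack) - 1][1]->appendChild($sec);
        $stack[] = [$level, $sec];
    }

    return $body;
}

$dom = new DOMDocument('1.0', 'UTF-8');
$body = buildSections($dom, [
    ['level' => 1, 'text' => 'Introduction'],
    ['level' => 0, 'text' => 'This study examines...'],
    ['level' => 2, 'text' => 'Background'],
    ['level' => 1, 'text' => 'Methods'],
]);
$dom->appendChild($body);
echo $dom->saveXML($body), PHP_EOL;
```

Here "Background" ends up nested inside "Introduction", while "Methods" closes both and starts a new top-level section.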
| Parser | Responsibility |
|---|---|
| TextParser | Detects heading levels via Word outline styles, parses paragraph content recursively (deep-diving through textboxes, shapes, and alternate content wrappers), and preserves bold/italic/underline formatting. |
| TableParser | Converts Word tables into JATS `<table-wrap>` elements, including header detection, colspan/rowspan merging, and structured `<thead>`/`<tbody>` output. |
| MathParser | Transforms Office Math Markup Language (OMML) into MathML using XSLT, then wraps it as JATS `<inline-formula>` or `<disp-formula>`. |
| ImageHandler | Extracts images from the .docx zip, converts legacy EMF/WMF metafiles to PNG via PHP Imagick, and generates JATS `<graphic>` references. |
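
To make the TextParser row more concrete, here is a rough sketch of heading detection and inline formatting using only PHP's DOM extension. It is an assumption-laden illustration rather than the engine's implementation: `headingLevel` and `runsToJats` are hypothetical names, the fallback on `HeadingN` style ids is a guess at a sensible heuristic, and only direct child runs are handled (no textboxes, shapes, or alternate content).

```php
<?php
declare(strict_types=1);

const W_NS = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main';

/** Return the heading level (1-9) of a <w:p>, or 0 for a normal paragraph. */
function headingLevel(DOMElement $p, DOMXPath $xp): int
{
    // w:outlineLvl is zero-based: 0 means a level-1 heading, 9 means body text.
    $lvl = $xp->evaluate('string(w:pPr/w:outlineLvl/@w:val)', $p);
    if (preg_match('/^[0-8]$/', $lvl)) {
        return (int) $lvl + 1;
    }
    // Assumed fallback: built-in style ids such as "Heading2".
    $style = $xp->evaluate('string(w:pPr/w:pStyle/@w:val)', $p);
    return preg_match('/^Heading([1-9])$/', $style, $m) ? (int) $m[1] : 0;
}

/** Convert the direct child runs of a <w:p> into a JATS <p>. */
function runsToJats(DOMElement $p, DOMXPath $xp, DOMDocument $out): DOMElement
{
    $para = $out->createElement('p');
    // Wrapping order is innermost first, so bold ends up outermost.
    $map = ['w:u' => 'underline', 'w:i' => 'italic', 'w:b' => 'bold'];

    foreach ($xp->query('w:r', $p) as $run) {
        $node = $out->createTextNode($xp->evaluate('string(w:t)', $run));
        foreach ($map as $prop => $tag) {
            // A property is "on" unless its w:val explicitly switches it off.
            $on = $xp->evaluate(
                "boolean(w:rPr/{$prop}[not(@w:val='0' or @w:val='false' or @w:val='none')])",
                $run
            );
            if ($on) {
                $wrap = $out->createElement($tag);
                $wrap->appendChild($node);
                $node = $wrap;
            }
        }
        $para->appendChild($node);
    }
    return $para;
}

$xml = '<w:document xmlns:w="' . W_NS . '"><w:body>'
     . '<w:p><w:pPr><w:pStyle w:val="Heading2"/></w:pPr><w:r><w:t>Methods</w:t></w:r></w:p>'
     . '<w:p><w:r><w:t xml:space="preserve">Yield was </w:t></w:r>'
     . '<w:r><w:rPr><w:b/><w:i/></w:rPr><w:t>significant</w:t></w:r></w:p>'
     . '</w:body></w:document>';

$src = new DOMDocument();
$src->loadXML($xml);
$xp = new DOMXPath($src);
$xp->registerNamespace('w', W_NS);
$out = new DOMDocument('1.0', 'UTF-8');

foreach ($xp->query('//w:body/w:p') as $p) {
    echo headingLevel($p, $xp), ' ', $out->saveXML(runsToJats($p, $xp, $out)), PHP_EOL;
}
```

The first paragraph is reported as a level-2 heading; the second is a normal paragraph whose bold-italic run is wrapped in JATS `<bold>` and `<italic>` elements.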
| Software | Version |
|---|---|
| PHP | ≥ 8.1 |
| PHP Extensions | zip, xsl, dom, imagick (optional, for EMF/WMF conversion) |
| OJS | 2.4.x (for native DAO integration) |
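
The requirements above can be checked before installation with a few lines of PHP. This is a small sketch, not something shipped with the package; the function name `checkRequirements` and the shape of its return value are assumptions made for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Compare a PHP version and a list of loaded extensions against the
 * requirements table. Returns ['errors' => [...], 'warnings' => [...]].
 */
function checkRequirements(string $phpVersion, array $loaded): array
{
    $result = ['errors' => [], 'warnings' => []];

    if (version_compare($phpVersion, '8.1.0', '<')) {
        $result['errors'][] = "PHP >= 8.1 required, found {$phpVersion}";
    }
    foreach (['zip', 'xsl', 'dom'] as $ext) {
        if (!in_array($ext, $loaded, true)) {
            $result['errors'][] = "Missing required extension: {$ext}";
        }
    }
    // imagick is optional; it is only needed for EMF/WMF conversion.
    if (!in_array('imagick', $loaded, true)) {
        $result['warnings'][] = 'imagick not loaded: EMF/WMF conversion unavailable';
    }
    return $result;
}

// Check the running interpreter.
$loaded = array_map('strtolower', get_loaded_extensions());
$report = checkRequirements(PHP_VERSION, $loaded);
foreach (array_merge($report['errors'], $report['warnings']) as $line) {
    echo $line, PHP_EOL;
}
```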
```bash
composer require wizdam/jats-engine
```

```php
<?php

use Wizdam\JatsEngine\Builders\MetadataBuilder;
use Wizdam\JatsEngine\Builders\BodyBuilder;

$articleId = 123;
$docxPath = '/path/to/manuscript.docx';

// 1. Create DOM document with JATS root
$dom = new DOMDocument('1.0', 'UTF-8');
$root = $dom->createElement('article');
$root->setAttribute('xmlns:xlink', 'http://www.w3.org/1999/xlink');
$root->setAttribute('dtd-version', '1.1');
$dom->appendChild($root);

// 2. Build front matter from OJS database
$metadataBuilder = new MetadataBuilder($articleId);
$metadataBuilder->buildFront($dom);

// 3. Build body from DOCX
$bodyBuilder = new BodyBuilder();
$bodyBuilder->setArticleId($articleId);
$bodyBuilder->setDocxPath($docxPath);
$bodyBuilder->buildBody($dom);

// 4. Output JATS XML
echo $dom->saveXML();
```

```xml
<?xml version="1.0" encoding="UTF-8"?>
<article xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="1.1">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Journal of Applied Sciences</journal-title>
      </journal-title-group>
      <issn publication-format="print">1234-5678</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Solar Panel Adoption in Rural Java</article-title>
      </title-group>
      <contrib-group>...</contrib-group>
      <pub-date date-type="pub">
        <year>2026</year>
      </pub-date>
    </article-meta>
  </front>
  <body>
    <sec id="s1">
      <title>Introduction</title>
      <p>This study examines...</p>
    </sec>
  </body>
</article>
```

```text
┌──────────────────────────────────────────────────┐
│                 Wizdam Editorial                 │
│       (OJS 2.x based publishing platform)        │
│                                                  │
│  ┌─────────────┐    ┌─────────────────────────┐  │
│  │ Submission  │───▶│       JATS Engine       │  │
│  │   (DOCX)    │    │  MetadataBuilder        │  │
│  └─────────────┘    │  BodyBuilder            │  │
│                     │  Parsers/Docx/*         │  │
│                     └────────────┬────────────┘  │
│                                  │               │
│                                  ▼               │
│                     ┌─────────────────────────┐  │
│                     │     JATS XML Output     │  │
│                     │   (Ready for PubMed,    │  │
│                     │    CrossRef, DOAJ)      │  │
│                     └─────────────────────────┘  │
└──────────────────────────────────────────────────┘
```
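
Before handing the generated XML to a downstream service, a quick structural check can catch an empty or partial build. The sketch below is not part of the engine and is not a substitute for validating against the JATS DTD; the function name `missingJatsParts` and the particular XPath checks are assumptions picked to match the sample output shown earlier.

```php
<?php
declare(strict_types=1);

/** Return the names of expected JATS parts that are absent from the document. */
function missingJatsParts(DOMDocument $dom): array
{
    $xp = new DOMXPath($dom);
    // Illustrative checks only; JATS elements here are in no namespace.
    $checks = [
        'front'         => '/article/front',
        'article-title' => '/article/front/article-meta/title-group/article-title',
        'body'          => '/article/body',
    ];

    $missing = [];
    foreach ($checks as $name => $path) {
        if ($xp->query($path)->length === 0) {
            $missing[] = $name;
        }
    }
    return $missing;
}

// A document whose body was never built.
$dom = new DOMDocument();
$dom->loadXML(
    '<article dtd-version="1.1"><front><article-meta><title-group>'
    . '<article-title>Solar Panel Adoption in Rural Java</article-title>'
    . '</title-group></article-meta></front></article>'
);
print_r(missingJatsParts($dom));
```

For this input the function reports that only `body` is missing.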
Contributions are welcome! Please review our Contributing Guidelines before submitting a pull request.
- Fork the repository
- Create your feature branch (`git checkout -b feature/new-parser`)
- Commit your changes (`git commit -m 'Add new parser'`)
- Push to the branch (`git push origin feature/new-parser`)
- Open a Pull Request
This project follows the Contributor Covenant Code of Conduct.
Do not publicly disclose vulnerabilities.
- Report to: security@sangia.org
- Response time: Within 48 hours
- Advisories: GitHub Security Advisories
Full details: SECURITY.md
This project is licensed under the GNU General Public License v3.0 (GPL-3.0).
| Permission | Condition |
|---|---|
| ✅ Free to use (commercial & non-commercial) | |
| ✅ Free to modify & redistribute | |
| Attribution | Reference |
|---|---|
| JATS Standard | ANSI/NISO Z39.96 (Journal Article Tag Suite) |
| OMML2MML XSLT | Office Math to MathML transformation stylesheet |
| Lead Developer | Rochmady (mokesano) |
| Ecosystem | Wizdam Editorial |
| Sangia Publishing House | sangia.org |
Built with ❤️ for the scholarly publishing community
© 2026 Rochmady. Licensed under GPL-3.0.