Koha ISBD Cataloging Assistant is a Koha staff-client plugin that helps catalogers find and fix ISBD punctuation problems in MARC21 bibliographic records. It adds a side panel to Koha's cataloging editor, validates fields with deterministic rules, explains expected punctuation, can block saves when serious findings remain, and can optionally call an AI provider for guided cataloging help.
The rule engine is the source of truth. AI is optional, advisory, and constrained by the deterministic MARC21/ISBD rules. The plugin helps catalogers work faster; it does not replace cataloger judgment, local policy, authority control, or final review.
- Who This Is For
- What The Plugin Does
- What The Plugin Does Not Do
- ISBD And MARC21 Basics
- ISBD Areas Covered By The Plugin
- Requirements
- Installation From KPZ
- Local Koha Override Warning
- First Run Checklist
- Daily Use
- Configuration Reference
- AI Setup
- AI Prompts
- AI Safety And Context
- AI Tuning And Limits
- ISBD Guardrails
- Training Guide And Internship Mode
- Custom Rules
- Coverage Report
- Troubleshooting
- Testing
- Building From Source
- Limitations
- Donations
- Project Notes
- License
This plugin is for Koha libraries that catalog MARC21 bibliographic records and expect ISBD-style punctuation in descriptive fields. It is useful for catalogers, cataloging supervisors, trainers, interns, and Koha administrators who need consistent punctuation checks inside the normal staff-client workflow.
The plugin is ISBD-first, but it can be used in AACR2, RDA, and local MARC21 workflows where the library still records or displays ISBD punctuation.
- Adds an ISBD assistant panel to cataloguing/addbiblio.pl.
- Validates MARC fields and subfields while a record is being edited.
- Shows findings with severity, expected value, and apply/undo controls.
- Applies deterministic punctuation fixes when allowed.
- Supports ghost text and AI Assist for guided suggestions.
- Tracks guide progress for training workflows.
- Lets administrators restrict trainees through internship mode.
- Lets administrators add local JSON rules without editing plugin code.
- Provides a coverage report that shows which active MARC framework fields are covered, excluded, or missing deterministic rules.
The plugin does not decide bibliographic description by itself. It does not replace source-of-information review, authority records, subject cataloging policy, classification policy, URL maintenance, local field policy, or a cataloger's final judgment.
It also does not freely rewrite controlled headings, identifiers, URLs, coordinates, structured notes, or local fields. Those areas are guarded because automatic punctuation edits can damage meaning, authority control, access, or local data.
MARC21 records are made of fields. A field has a three-digit tag such as 245, 260, 264, or 300. Many fields have indicators and subfields. A subfield has a one-character code such as $a, $b, or $c.
Example:
245 14 $a The great Gatsby $b a novel $c F. Scott Fitzgerald
In this example:
- 245 is the title statement field.
- The two characters after the tag are indicators.
- $a is the title proper.
- $b is other title information.
- $c is the statement of responsibility.
ISBD uses prescribed punctuation to show the boundaries between areas and elements. That punctuation matters because it tells readers and systems how to interpret the description. A colon, slash, semicolon, comma, equals sign, or period is not just decoration; it often marks a change from one descriptive element to another.
The most important practical rule is that boundary punctuation must live in one place only. Some boundaries are stored as a prefix on the current subfield. Other boundaries are stored as a suffix on the previous subfield. The plugin follows the baseline rule pack in Koha/Plugin/Cataloging/AutoPunctuation/rules/isbd_baseline.json.
Examples:
245$a The great Gatsby
245$b : a novel
245$c / F. Scott Fitzgerald.
For common title boundaries, the baseline rules use prefix-on-current. The colon belongs at the start of 245$b, and the responsibility slash belongs at the start of 245$c.
260$a New York :
260$b Scribner,
260$c 1925.
For publication data, some punctuation is suffix-on-previous. The colon after the place can be attached to 260$a, and the comma before the date can be attached to 260$b.
300$a xii, 180 pages :
300$b illustrations ;
300$c 23 cm.
For physical description, the colon and semicolon mark the boundary between extent, other physical details, and dimensions. The plugin avoids adding the same boundary punctuation twice.
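The one-place rule can be illustrated with a short Python sketch. This is not the plugin's code: the function names and the list-of-pairs record shape are assumptions made for illustration. It joins subfield values that already carry their own boundary punctuation and flags a boundary that is stored twice, once as a suffix on the previous subfield and again as a prefix on the current one.

```python
# Hypothetical sketch, not the plugin's implementation.
BOUNDARY_MARKS = (":", "/", ";", ",", "=")


def find_doubled_boundaries(subfields):
    """Return codes of subfields whose leading mark repeats the previous suffix.

    `subfields` is a list of (code, value) pairs in record order.
    """
    doubled = []
    for (_, prev_value), (code, value) in zip(subfields, subfields[1:]):
        prev_mark = prev_value.rstrip()[-1:]
        this_mark = value.lstrip()[:1]
        if prev_mark in BOUNDARY_MARKS and this_mark == prev_mark:
            doubled.append(code)
    return doubled


def join_subfields(subfields):
    """Join subfield values with single spaces, as a display line would."""
    return " ".join(value.strip() for _, value in subfields)


title = [("a", "The great Gatsby"), ("b", ": a novel"), ("c", "/ F. Scott Fitzgerald.")]
physical_bad = [("a", "xii, 180 pages :"), ("b", ": illustrations ;"), ("c", "23 cm.")]

print(join_subfields(title))                   # The great Gatsby : a novel / F. Scott Fitzgerald.
print(find_doubled_boundaries(title))          # []
print(find_doubled_boundaries(physical_bad))   # ['b']
```

In the second record the colon is stored both at the end of 300$a and at the start of 300$b, which is the kind of duplication the plugin is described as avoiding.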
Repeated subfields are targeted with tag, field occurrence, subfield code, and subfield_index. That is why a suggestion can apply to the second $a in a repeated field without changing the first $a.
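A minimal Python sketch of this kind of targeting follows. The `Target` class, the `replace_target` function, the record shape, and the 0-based counting are all assumptions for illustration; the plugin's own data structures may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    tag: str             # MARC tag, e.g. "260"
    occurrence: int      # 0-based index among fields with this tag (assumed convention)
    code: str            # subfield code, e.g. "a"
    subfield_index: int  # 0-based index among subfields with this code in the field


def replace_target(record, target, new_value):
    """Replace exactly one subfield value; return True if the target was found.

    `record` is a list of (tag, subfields) where subfields is a list of [code, value].
    """
    seen_fields = 0
    for tag, subfields in record:
        if tag != target.tag:
            continue
        if seen_fields == target.occurrence:
            seen_codes = 0
            for sub in subfields:
                if sub[0] != target.code:
                    continue
                if seen_codes == target.subfield_index:
                    sub[1] = new_value
                    return True
                seen_codes += 1
            return False
        seen_fields += 1
    return False


record = [("260", [["a", "New York ;"], ["a", "London"], ["b", "Scribner,"], ["c", "1925."]])]
# Fix only the second $a in the first 260 field.
replace_target(record, Target("260", 0, "a", 1), "London :")
print(record[0][1][:2])   # [['a', 'New York ;'], ['a', 'London :']]
```

Counting by code within the field, rather than by absolute position, keeps the target stable when unrelated subfields are added or removed around it.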
The audit source for this project is the IFLA ISBD 2011 Consolidated Edition 2021 Update PDF included in this repository as ISBD_Update 2021 to Consolidated ed 2011.pdf. The detailed coverage matrix is in docs/ISBD_COVERAGE.md.
Area 0, content form and media type, is partial/handoff. MARC 336, 337, and 338 can provide useful context, but production process is not fully represented by standard MARC fields. The plugin does not generate a complete Area 0 display unless a library adds local mapping rules.
Area 1, title and statement of responsibility, is automated for common 245 and selected 246 patterns. This includes title proper, other title information, responsibility statements, and dependent title punctuation where deterministic.
Area 2, edition area, is automated for 250$a and 250$b where terminal punctuation and edition-statement boundaries are deterministic.
Area 3, material- or type-specific area, is mixed. 254$a, 255$a, and 362$a have deterministic checks. Coordinates, projection, equinox, magnitude, and some material-specific details are guarded because internal punctuation depends on the data structure.
Area 4, publication, production, distribution, manufacture, and copyright, is automated for common 260 and 264 $a, $b, and $c patterns. For 264, the second indicator is used to distinguish production, publication, distribution, manufacture, and copyright. Manufacture details in $e, $f, and $g are guarded where parenthetical grouping cannot be inferred safely.
Area 5, physical description, is automated for common 300$a, $b, $c, $e, $f, and $g boundaries.
Area 6, series, is automated for transcribed series fields such as 440 and 490 where deterministic. Controlled series access points in 8XX are treated conservatively because authority control governs them.
Area 7, notes, is automated for simple note punctuation in selected 5XX fields. Structured notes are guarded when punctuation is part of an internal syntax.
Area 8, resource identifier and terms of availability, is automated for fields such as 020, 022, 024, and 028 where punctuation can be safely removed or normalized. Identifiers are not treated like free prose.
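For Area 4, the 264 second-indicator values are defined by MARC21, so the distinction can be expressed as a simple lookup. The Python sketch below is illustrative only; the function name `function_of_264` is hypothetical and is not taken from the plugin.

```python
# MARC21 field 264, second indicator: function of entity.
FUNCTION_BY_INDICATOR2 = {
    "0": "production",
    "1": "publication",
    "2": "distribution",
    "3": "manufacture",
    "4": "copyright notice date",
}


def function_of_264(indicator2):
    """Return the 264 statement type, or "unknown" for an undefined value."""
    return FUNCTION_BY_INDICATOR2.get(indicator2, "unknown")


print(function_of_264("1"))   # publication
print(function_of_264("4"))   # copyright notice date
print(function_of_264(" "))   # unknown
```

A rule engine could use a lookup like this to decide which punctuation rule applies to a given 264 field.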
- Koha 25.11 or newer.
- Koha plugins enabled.
- Staff permissions to upload, configure, and run plugins.
- A MARC21 bibliographic framework.
- Optional AI features require an OpenAI or OpenRouter API key.
- API key storage requires Koha encryption_key configured in koha-conf.xml.
Recommended before installation:
- Test first on a staging Koha instance.
- Back up Koha configuration and database data.
- Confirm who is allowed to configure plugins.
- Decide whether save blocking should be enabled immediately or only after staff training.
Build the KPZ package from the repository root:
bash scripts/build_kpz.sh
The expected package path is:
dist/Koha_ISBD_Cataloging_Assistant-1.0.0.kpz
Install it in Koha:
- Open the Koha staff client.
- Go to Administration.
- Open Plugins.
- Open Manage plugins.
- Upload the generated KPZ file.
- Enable the plugin if Koha does not enable it automatically.
- Open the plugin configure page.
- Save the initial configuration.
- Open the plugin tool page if you want the standalone tool view.
The normal cataloging interface is cataloguing/addbiblio.pl. That is where the side panel appears.
The files Auth.pm, Handler.pm, and run.pl at the repository root are not normal plugin files. They are local Koha override references for installations that need dispatch or authentication changes before plugin API calls return predictable JSON responses.
They map to Koha core paths:
- Auth.pm maps to /usr/share/koha/lib/C4/Auth.pm.
- Handler.pm maps to /usr/share/koha/lib/Koha/Plugins/Handler.pm.
- run.pl maps to /usr/share/koha/intranet/cgi-bin/plugins/run.pl.
Use these files only when your Koha deployment needs them. They are risky because Koha upgrades can change login CSRF handling, session handling, permission checks, plugin method dispatch, or request parameter handling. Copying an older override over a newer Koha file can break staff login or plugin dispatch.
The repository includes a helper script for explicit backup, restore, and apply operations:
bash scripts/kohafilesbackup.sh backup
bash scripts/kohafilesbackup.sh apply
bash scripts/kohafilesbackup.sh restore
Safe workflow:
- Back up the installed Koha files before applying any override.
- Compare the repository file with the installed Koha version.
- Merge only the minimal dispatch/auth change that is needed.
- Test staff login.
- Test plugin configuration save.
- Test plugin API calls from cataloguing/addbiblio.pl.
- After every Koha upgrade, compare again before reapplying anything.
Do not overwrite your .bak recovery files with repository copies. Those backups are how you recover if an override becomes incompatible with the installed Koha version.
- Open the plugin configure page.
- Confirm Enable ISBD Intellisense is on.
- Confirm Enable live validation is on.
- Leave Auto-Apply Fixes off for the first test.
- Leave Block Save on Errors off until staff understand the findings.
- Review required fields for your local MARC framework.
- Review excluded tags and local-field settings.
- Save configuration.
- Open cataloguing/addbiblio.pl.
- Confirm the ISBD assistant panel appears.
- Open or create a test bibliographic record.
- Enter a simple 245, 260 or 264, and 300.
- Confirm findings appear in the side panel.
- Apply one suggestion.
- Undo the suggestion.
- Save only after confirming the record still matches local cataloging policy.
The side panel watches the cataloging form. When live validation is enabled, the panel shows findings as fields change. A finding normally includes the affected tag, subfield, severity, message, current value, and expected value.
Use Apply when the suggestion is correct. Use Undo if the edit is not wanted. When auto-apply is enabled, deterministic punctuation fixes may be applied without pressing each individual button, so use that setting only after staff are comfortable with the rule behavior.
Ghost text and AI Assist are guidance tools. They can help explain a punctuation issue or suggest cataloging values, but deterministic rules and cataloger review still control what is applied.
If save blocking is enabled, unresolved ERROR findings or required-field guardrails can stop the record from being saved. Resolve the findings or change configuration if the rule does not match local policy.
enabled, shown as Enable ISBD Intellisense, defaults to on. It controls whether the cataloging-page assistant runs. Beginners should leave it on. Turn it off only to disable the plugin without uninstalling it.
auto_apply_punctuation, shown as Auto-Apply Fixes, defaults to off. It allows deterministic fixes to be applied automatically. Beginners should leave it off. Turn it on only after testing the rules against local records.
default_standard defaults to ISBD. It records the active cataloging standard label. Keep ISBD unless future versions add supported alternatives.
enforce_isbd_guardrails defaults to on. It keeps protected fields from being freely rewritten and ensures AI punctuation suggestions cannot override deterministic checks. Leave it on.
enable_live_validation defaults to on. It validates as staff edit. Turn it off only if the cataloging page becomes too slow in a local environment.
block_save_on_error defaults to off. It blocks saving when serious findings remain. Use it after staff training and local rule review. Turning it on too early can interrupt cataloging.
required_fields defaults to:
0030,0080,040*,040c,942c,100a,245a,260c,300a,050a
Tokens are comma-separated. A token such as 245a means field 245 subfield $a. 040* means a 040 field with any subfield. Control-field positions are represented by normalized tokens such as 0030 and 0080. Review this list for your framework because not every library requires the same access points or classification fields.
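The token format can be sketched in Python as follows. This is an illustration under stated assumptions, not the plugin's parser: the function names are hypothetical, and reading the trailing character of a control-field token as a position is inferred from the description above.

```python
DEFAULT_REQUIRED = "0030,0080,040*,040c,942c,100a,245a,260c,300a,050a"


def parse_required_fields(setting):
    """Split the comma-separated setting into (tag, selector) pairs.

    The first three characters are read as the tag and the rest as the
    selector: a subfield code, "*" for any subfield, or (assumed here) a
    control-field position for tags below 010.
    """
    pairs = []
    for raw in setting.split(","):
        token = raw.strip()
        if len(token) < 4:
            continue  # skip empty or malformed tokens in this sketch
        pairs.append((token[:3], token[3:]))
    return pairs


def missing_subfields(pairs, present):
    """Return tokens whose tag/subfield is absent from `present`.

    `present` maps a tag to the set of subfield codes found in the record.
    Control-field tokens are skipped because this sketch only checks subfields.
    """
    missing = []
    for tag, selector in pairs:
        if tag < "010":
            continue
        codes = present.get(tag, set())
        found = bool(codes) if selector == "*" else selector in codes
        if not found:
            missing.append(tag + selector)
    return missing


pairs = parse_required_fields(DEFAULT_REQUIRED)
record_codes = {"040": {"a", "c"}, "245": {"a"}, "300": {"a"}}
print(missing_subfields(pairs, record_codes))   # ['942c', '100a', '260c', '050a']
```

The output shows why the default list deserves local review: a record without 100, 260, or 050 would be reported as incomplete even where local policy allows that.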
excluded_tags defaults to empty. Add tags here when local policy says the plugin should ignore them. This is useful for fields that are managed by another workflow.
strict_coverage_mode defaults to off. It makes missing rule coverage more visible. Use it for audits and release testing, not for a first production rollout.
enable_local_fields defaults to off. It prevents local 9XX fields from being checked unless explicitly allowed. Keep it off unless your library has documented local punctuation rules.
local_fields_allowlist defaults to empty. Use it with local fields, for example 9XX,952a, when you want only specific local tags or subfields included.
enable_guide defaults to on. It enables the step-by-step ISBD training guide.
guide_users defaults to empty. Selected users are excluded from the guide. Use this for experienced staff who do not need training prompts.
guide_exclusion_list defaults to empty. It is a manual exclusion list for guide behavior when user selection is not enough.
The progress table updates as staff use the guide. The configure page can filter progress and export visible rows as CSV or Excel.
internship_mode defaults to off. It applies trainee restrictions to selected users.
internship_users defaults to empty. Select the staff accounts that should be treated as trainees.
internship_exclusion_list defaults to empty. Use this field as an additional comma-separated trainee user list when the selector is not enough. Users listed here receive the same internship restrictions as users selected in internship_users.
intern_allow_assistant_toggle defaults to off. Trainees cannot toggle the assistant unless this is enabled.
intern_allow_autoapply_toggle defaults to off. Trainees cannot toggle auto-apply unless this is enabled.
intern_allow_cataloging_panel defaults to on. Trainees can show the cataloging assistant panel by default.
intern_allow_ai_assist_toggle defaults to off. Trainees cannot toggle AI Assist unless this is enabled.
intern_allow_panel_apply_actions defaults to off. Trainees cannot apply or undo panel suggestions unless this is enabled.
intern_allow_ai_cataloging defaults to off. Trainees cannot make AI cataloging requests unless this is enabled.
intern_allow_ai_punctuation defaults to off. Trainees cannot make AI punctuation requests unless this is enabled.
intern_allow_ai_apply_actions defaults to off. Trainees cannot apply AI panel actions, including subjects, classification, call numbers, or AI punctuation patches, unless this is enabled.
debug_mode defaults to off. It writes plugin debug messages to Koha server logs with the prefix AutoPunctuation debug:. Turn it on only while diagnosing a problem.
ai_debug_include_raw_response defaults to off. It can include sanitized raw provider text for debugging. Use it carefully because AI output can contain record text.
AI is optional. The plugin works without an AI provider.
Supported providers:
- OpenRouter, the default and recommended provider in the configure page.
- OpenAI.
Basic setup:
- Configure Koha encryption_key in koha-conf.xml.
- Open the plugin configure page.
- Enable AI Assist.
- Choose OpenRouter or OpenAI.
- Enter the API key.
- Select or refresh the model list.
- Save configuration.
- Use Test connection.
The configure page stores API keys server-side. A stored key is not printed back into the password field. Use Clear key when you need to remove it.
AI feature toggles:
- ai_enable defaults to on. It enables the AI panel features when provider setup is complete.
- ai_punctuation_explain defaults to on. It allows field-level punctuation guidance.
- ai_subject_guidance defaults to on. It allows subject-heading guidance.
- ai_callnumber_guidance defaults to on. It allows call-number guidance.
- lc_class_target defaults to 050$a. Change it if your library records LC classification in another target.
Model selection is provider-specific. The configure page remembers OpenRouter and OpenAI model choices separately where possible.
ai_prompt_default is the punctuation guidance prompt. It should keep the model focused on explaining and suggesting punctuation-compatible changes.
ai_prompt_cataloging is the cataloging prompt for classification and subject suggestions. It supports {{source_text}}.
Customize prompts only when you have a clear local policy need. Do not remove instructions that require JSON shape, punctuation-only behavior for punctuation patches, deterministic-rule compliance, or safe handling of record content. A weaker prompt can increase rejected patches or unsafe advice, but it cannot bypass server-side guardrails.
Prompt length is limited by ai_prompt_max_length, which defaults to 16384 characters.
ai_redaction_rules defaults to:
9XX,952,5XX
These rules reduce what record data is sent to the provider. Keep local fields redacted unless your library has approved sending them.
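As a rough illustration, tag-pattern redaction could look like the Python sketch below. It assumes that X in a rule stands for any single character of the tag and that a matching field is dropped entirely from the payload; both points are assumptions, and the function names are hypothetical.

```python
def tag_matches(pattern, tag):
    """Match a 3-character tag against a pattern where X is a wildcard."""
    if len(pattern) != 3 or len(tag) != 3:
        return False
    return all(p in ("X", "x") or p == t for p, t in zip(pattern, tag))


def redact_fields(fields, rules="9XX,952,5XX"):
    """Return only the (tag, value) pairs that no redaction rule matches."""
    patterns = [p.strip() for p in rules.split(",") if p.strip()]
    return [
        (tag, value)
        for tag, value in fields
        if not any(tag_matches(p, tag) for p in patterns)
    ]


fields = [
    ("245", "The great Gatsby"),
    ("500", "Donor note"),
    ("949", "Local processing note"),
    ("952", "Item data"),
]
print(redact_fields(fields))   # [('245', 'The great Gatsby')]
```

With the default rules, 952 is already covered by 9XX, so listing it separately only matters if the 9XX rule is later removed.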
ai_redact_856_querystrings defaults to on. It removes query strings from 856$u URLs before AI requests. Keep it on because query strings can contain tokens, session data, or patron-related parameters.
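Query-string removal of this kind can be sketched with Python's standard `urllib.parse` module. The function name `strip_query` is hypothetical, and the plugin's own handling (for example of URL fragments) may differ.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url):
    """Return the URL without its query string; other parts are kept as-is."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", parts.fragment))


print(strip_query("https://example.org/item?session=abc123"))   # https://example.org/item
print(strip_query("https://example.org/item"))                  # https://example.org/item
```

This sketch keeps the fragment, so a deployment that stores sensitive values after `#` would need a stricter rule.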
ai_context_mode defaults to tag_only. Available modes are:
- tag_only: send only the active tag context.
- tag_plus_neighbors: include neighboring fields for more context.
- full: include the redacted full record.
Start with tag_only. Use broader context only when cataloging guidance needs it and local privacy policy allows it.
ai_payload_preview defaults to off. It lets administrators inspect the AI payload before sending. Turn it on when validating redaction behavior or debugging provider requests.
AI punctuation patches must use the supported patch operation, target the exact live field and repeated subfield, match the original text, and pass deterministic guardrails before they can be applied.
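The checks listed above can be approximated in a few lines of Python. This is a sketch under assumptions, not the plugin's validator: the patch keys `op`, `original`, and `value` are hypothetical, `replace_subfield` is borrowed from the custom-rule example in this document, and "punctuation-only" is approximated by comparing the text with ASCII punctuation and whitespace removed.

```python
import string

ALLOWED_OPS = {"replace_subfield"}  # assumed; the op name appears in the custom-rule example


def skeleton(text):
    """Drop ASCII punctuation and whitespace so only the wording remains."""
    drop = set(string.punctuation) | set(string.whitespace)
    return "".join(ch for ch in text if ch not in drop)


def patch_is_acceptable(patch, live_value):
    """Return (ok, reason) for a proposed punctuation patch."""
    if patch.get("op") not in ALLOWED_OPS:
        return False, "unsupported operation"
    if patch.get("original") != live_value:
        return False, "original text does not match the live subfield"
    if skeleton(patch.get("value", "")) != skeleton(live_value):
        return False, "patch changes more than punctuation"
    return True, "ok"


good = {"op": "replace_subfield", "original": "a novel", "value": ": a novel"}
bad = {"op": "replace_subfield", "original": "a novel", "value": ": a story"}
print(patch_is_acceptable(good, "a novel"))   # (True, 'ok')
print(patch_is_acceptable(bad, "a novel"))    # (False, 'patch changes more than punctuation')
```

Comparing against the live value guards against applying a patch that was computed for text the cataloger has since edited.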
Defaults:
- ai_reasoning_effort: low.
- ai_timeout: 60 seconds.
- ai_max_output_tokens: 10000.
- ai_temperature: 0.1.
- ai_confidence_threshold: 0.85.
- ai_rate_limit_per_minute: 30.
- ai_cache_ttl_seconds: 300.
- ai_cache_max_entries: 1000.
- ai_retry_count: 1.
- ai_circuit_breaker_threshold: 5.
- ai_circuit_breaker_timeout: 45 seconds.
- ai_circuit_breaker_window_seconds: 180.
- ai_circuit_breaker_failure_rate: 0.5.
- ai_circuit_breaker_min_samples: 6.
- ai_openrouter_response_format: off by default.
Recommended beginner/admin choices:
- Keep temperature low for consistent cataloging advice.
- Keep the rate limit conservative until you understand provider costs.
- Keep cache enabled to reduce repeat requests.
- Increase timeout only if the selected model regularly needs more time.
- Change reasoning effort only for models that support it.
- Treat circuit breaker settings as operational safety controls. They stop repeated provider failures from slowing cataloging.
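To show how circuit breaker settings of this kind typically interact, here is a simplified Python sketch. It is not the plugin's implementation, and how the plugin actually combines the threshold, window, failure rate, and minimum samples is an assumption here: the sketch opens the breaker when failures in the window reach the threshold, or when enough samples exist and the failure rate is reached, and keeps it open for the timeout.

```python
from collections import deque


class CircuitBreaker:
    """Simplified breaker; defaults mirror the documented setting values."""

    def __init__(self, threshold=5, timeout=45, window=180,
                 failure_rate=0.5, min_samples=6):
        self.threshold = threshold
        self.timeout = timeout
        self.window = window
        self.failure_rate = failure_rate
        self.min_samples = min_samples
        self.events = deque()   # (timestamp, succeeded) pairs inside the window
        self.opened_at = None

    def allow(self, now):
        """Return True if a provider call may be attempted at time `now`."""
        if self.opened_at is not None and now - self.opened_at < self.timeout:
            return False
        if self.opened_at is not None:
            # Timeout elapsed: close the breaker and start with a clean window.
            self.opened_at = None
            self.events.clear()
        return True

    def record(self, now, succeeded):
        """Record one call outcome and open the breaker if limits are hit."""
        self.events.append((now, succeeded))
        while self.events and now - self.events[0][0] > self.window:
            self.events.popleft()
        failures = sum(1 for _, ok in self.events if not ok)
        samples = len(self.events)
        rate_hit = samples >= self.min_samples and failures / samples >= self.failure_rate
        if failures >= self.threshold or rate_hit:
            self.opened_at = now


breaker = CircuitBreaker()
for second in range(5):
    breaker.record(second, succeeded=False)
print(breaker.allow(now=10))   # False: opened at second 4, still inside the 45 s timeout
print(breaker.allow(now=49))   # True: timeout elapsed, calls are allowed again
```

Timestamps are passed in explicitly so the behavior can be checked without waiting on a real clock.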
While a cached response is within its TTL (ai_cache_ttl_seconds, 300 seconds by default), a repeated AI request returns the cached result quickly without a fresh provider call.
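The caching behavior can be pictured with a small TTL cache. The Python sketch below is illustrative only: the class name, the oldest-first eviction policy, and the explicit `now` argument are assumptions, while the default values mirror ai_cache_ttl_seconds and ai_cache_max_entries.

```python
class TTLCache:
    """Tiny time-to-live cache with oldest-first eviction (assumed policy)."""

    def __init__(self, ttl_seconds=300, max_entries=1000):
        self.ttl = ttl_seconds
        self.max_entries = max_entries
        self.entries = {}  # key -> (stored_at, value); dicts keep insertion order

    def get(self, key, now):
        """Return the cached value, or None if absent or expired."""
        item = self.entries.get(key)
        if item is None:
            return None
        stored_at, value = item
        if now - stored_at >= self.ttl:
            del self.entries[key]
            return None
        return value

    def put(self, key, value, now):
        """Store a value, evicting the oldest entry when the cache is full."""
        if key not in self.entries and len(self.entries) >= self.max_entries:
            oldest = next(iter(self.entries))
            del self.entries[oldest]
        self.entries[key] = (now, value)


cache = TTLCache()
cache.put("245$b explain", "Add a colon before other title information.", now=0)
print(cache.get("245$b explain", now=120) is not None)   # True: served from cache
print(cache.get("245$b explain", now=300))               # None: TTL expired
```

A practical consequence is that changing a prompt or model may not change the answer for an identical request until the cached entry expires.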
Deterministic rules are the source of truth. AI cannot override them.
The plugin intentionally guards:
- Controlled headings and authority-controlled access points.
- URLs and electronic access fields such as 856.
- Coordinates, projections, and other structured cartographic data.
- Local 9XX fields unless explicitly enabled.
- Structured notes where punctuation has local or field-specific meaning.
- Standard identifiers where punctuation is not prose punctuation.
This is why some fields are shown as partial, handoff, or guarded in the coverage matrix. A safe assistant should avoid damaging data it cannot understand deterministically.
The guide is for learning and reinforcement. It can remain enabled for new catalogers while experienced staff are excluded through guide_users.
Internship mode is stricter. It lets supervisors choose trainee accounts and decide whether those users can toggle the assistant, toggle auto-apply, show the panel, use AI, apply panel actions, or apply AI-generated actions.
Suggested rollout:
- Enable the guide.
- Keep auto-apply off.
- Keep save blocking off.
- Put trainees in internship mode.
- Allow panel visibility but restrict apply actions at first.
- Review progress exports during training.
- Gradually allow apply/undo after staff demonstrate consistency.
Custom rules are JSON stored in the configure page. The value may be {} or an object with a rules array.
Custom rules are appended to the baseline rule pack; they do not replace baseline rules. Avoid duplicating a built-in rule unless the duplication is intentional and has been tested.
Example local rule:
{
  "rules": [
    {
      "id": "CUSTOM_949A_LOCAL_NOTE_PERIOD",
      "tag": "949",
      "subfields": ["a"],
      "severity": "INFO",
      "rationale": "Local policy: locally owned 949$a processing notes should end with terminal punctuation.",
      "checks": [
        {
          "type": "punctuation",
          "suffix": ".",
          "end_in": [".", "?", "!"],
          "severity": "INFO",
          "message": "Add terminal punctuation to local processing note."
        }
      ],
      "fixes": [
        {
          "label": "Apply local note punctuation",
          "patch": [
            {
              "op": "replace_subfield",
              "value_template": "{{expected}}"
            }
          ]
        }
      ],
      "examples": [
        {
          "before": "Local processing note",
          "after": "Local processing note."
        }
      ]
    }
  ]
}
This example assumes local fields are enabled and local_fields_allowlist includes 949 or 949a.
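One plausible reading of the punctuation check in this rule is sketched below in Python. The interpretation is an assumption: a value passes if it already ends in one of the `end_in` marks, and otherwise the expected value is the trimmed text plus `suffix`. The function name is hypothetical.

```python
def check_terminal_punctuation(value, check):
    """Return the expected value, or None when the value already complies."""
    text = value.rstrip()
    if any(text.endswith(mark) for mark in check["end_in"]):
        return None
    return text + check["suffix"]


check = {"type": "punctuation", "suffix": ".", "end_in": [".", "?", "!"]}
print(check_terminal_punctuation("Local processing note", check))      # Local processing note.
print(check_terminal_punctuation("Is this shelved locally?", check))   # None
```

The first call reproduces the before/after pair in the rule's `examples` array, which is a convenient way to sanity-check a rule before importing it.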
Supported check types include punctuation, separator, no_terminal_punctuation, spacing, normalize_punctuation, and fixed_field. Severities are ERROR, WARNING, and INFO.
Safe rule-writing guidance:
- Use a unique id.
- Target the narrowest tag and subfield that matches local policy.
- Prefer explicit tag and subfields over broad regexes.
- Avoid duplicating baseline rules unless you intend to change local behavior.
- Test custom rules on sample records before production use.
- Export rules before major edits.
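Because custom rules are merged after the baseline rules rather than replacing them, a custom rule that conflicts with a baseline rule leaves both active and can produce confusing or contradictory advice for the same subfield.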
The configure page supports importing and exporting JSON rules.
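Before importing a rule file, a quick lint pass can catch the most common mistakes. The Python sketch below is a hypothetical helper, not part of the plugin. It checks only what this section states (unique ids, the three severities, and the listed check types) and, since the list of check types is introduced with "include", an unsupported-type message should be read as a prompt to double-check rather than a definitive error.

```python
import json

SEVERITIES = {"ERROR", "WARNING", "INFO"}
CHECK_TYPES = {"punctuation", "separator", "no_terminal_punctuation",
               "spacing", "normalize_punctuation", "fixed_field"}


def lint_custom_rules(text, baseline_ids=()):
    """Return a list of problems found in a custom-rules JSON string."""
    problems = []
    data = json.loads(text)
    if not isinstance(data, dict):
        return ["top level must be a JSON object"]
    seen = set(baseline_ids)
    for index, rule in enumerate(data.get("rules", [])):
        rule_id = rule.get("id")
        label = rule_id or f"rule #{index}"
        if not rule_id:
            problems.append(f"{label}: missing id")
        elif rule_id in seen:
            problems.append(f"{label}: duplicate id")
        seen.add(rule_id)
        if rule.get("severity") not in SEVERITIES:
            problems.append(f"{label}: unknown severity")
        for check in rule.get("checks", []):
            if check.get("type") not in CHECK_TYPES:
                problems.append(f"{label}: unsupported check type {check.get('type')!r}")
    return problems


sample = '{"rules": [{"id": "R1", "severity": "NOTICE", "checks": [{"type": "spacing"}]}]}'
print(lint_custom_rules("{}"))      # []
print(lint_custom_rules(sample))    # ['R1: unknown severity']
```

Passing the ids of the baseline rule pack as `baseline_ids` would also flag a custom rule that reuses a built-in id.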
The coverage report compares active MARC framework fields with the active rule pack.
Statuses:
- Covered: at least one deterministic rule matches the field/subfield.
- Excluded: the field is intentionally ignored by settings or guardrails.
- Not covered: no active deterministic rule covers it.
Missing coverage does not always mean a bug. It may mean the field requires authority control, source inspection, local policy, or structured syntax that should not be automated.
The report also provides recommended rule stubs. Treat those stubs as starting points, not production-ready policy.
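The three statuses can be sketched as a simple classification. The Python below is illustrative and not the plugin's report code; the function names, the set-based inputs, and the choice to test exclusion before coverage are assumptions.

```python
def coverage_status(tag, code, rule_targets, excluded_tags):
    """Classify one framework subfield as Covered, Excluded, or Not covered."""
    if tag in excluded_tags:
        return "Excluded"
    if (tag, code) in rule_targets:
        return "Covered"
    return "Not covered"


def coverage_report(framework, rule_targets, excluded_tags):
    """Map each framework (tag, code) pair to its coverage status."""
    return {
        f"{tag}${code}": coverage_status(tag, code, rule_targets, excluded_tags)
        for tag, code in framework
    }


framework = [("245", "a"), ("856", "u"), ("520", "a")]
report = coverage_report(framework, rule_targets={("245", "a")}, excluded_tags={"856"})
print(report)   # {'245$a': 'Covered', '856$u': 'Excluded', '520$a': 'Not covered'}
```

Entries that come back as Not covered are the ones to review against local policy before deciding whether a custom rule is appropriate.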
Panel does not appear:
- Confirm the plugin is installed and enabled.
- Confirm Enable ISBD Intellisense is on.
- Confirm you are on cataloguing/addbiblio.pl.
- Hard-refresh the browser.
- Check browser console errors.
- Check Koha server logs.
Configuration will not save or API calls fail:
- Confirm staff session is valid.
- Log out and back in.
- Check CSRF/session errors in browser and Koha logs.
- If local overrides are installed, compare Auth.pm, Handler.pm, and run.pl against the current Koha version.
- Restore backups if staff login or plugin dispatch broke after an override.
AI is disabled or unavailable:
- Confirm AI Assist is enabled.
- Confirm Koha encryption_key exists.
- Confirm an API key is stored.
- Confirm provider is correct.
- Refresh the model list.
- Use Test connection.
- Check firewall/proxy access from the Koha server.
- Check provider quota, rate limits, and API status.
AI model list fails:
- Check the key and provider.
- Check Koha logs for ai_models errors.
- Confirm the Koha server can reach OpenRouter or OpenAI.
AI request fails repeatedly:
- The circuit breaker may be open after repeated failures.
- Wait for the circuit breaker timeout.
- Lower request rate.
- Try a different model.
- Check provider errors and Koha logs.
Save is blocked:
- Open the assistant panel.
- Resolve ERROR findings.
- Check required-field findings.
- Confirm the field is not intentionally absent under local policy.
- Adjust required_fields or save-blocking configuration only when the rule does not match local practice.
Suggestion targets the wrong repeated subfield:
- Refresh the record page.
- Run validation again.
- Confirm the finding includes the correct repeated subfield target.
- Report the tag, occurrence, subfield, subfield_index, current value, and expected value if it still mis-targets.
Koha upgrade broke the plugin:
- Disable local overrides if staff login or plugin dispatch fails.
- Restore backups made by scripts/kohafilesbackup.sh.
- Recompare the repository override files with the upgraded Koha files.
- Reapply only the minimal dispatch/auth change after testing.
Useful debug evidence:
- Koha version.
- Plugin version.
- Browser console error.
- Koha server log excerpt.
- A redacted MARC field example.
- The setting values for live validation, save blocking, excluded tags, local fields, AI provider, and context mode.
- Sanitized raw AI response only when ai_debug_include_raw_response is needed.
Run the documentation and rule checks from the repository root:
node tests/docs_examples.js
node tests/guide_consistency.js
node tests/rules_engine_regression.js
perl tests/rules_backend_regression.pl
Build the package:
bash scripts/build_kpz.sh
Confirm the artifact exists:
dist/Koha_ISBD_Cataloging_Assistant-1.0.0.kpz
Some Perl compile checks require Koha modules in @INC; run those inside a Koha environment.
Build a local KPZ package from a checkout of the project:
bash scripts/build_kpz.sh
The build writes the installable plugin package to:
dist/Koha_ISBD_Cataloging_Assistant-1.0.0.kpz
Before sharing a package with another Koha site, run the tests in the previous section and install the KPZ in a staging Koha instance.
- The plugin does not replace cataloger judgment.
- The plugin does not replace authority control.
- ISBD Area 0 production process is only partial/handoff unless local mapping is added.
- Complex structured notes are intentionally conservative.
- Local fields are excluded unless explicitly enabled.
- Custom rules can conflict with baseline rules.
- Browser behavior depends on Koha cataloging form markup exposing stable MARC tag and subfield controls.
- AI results depend on provider availability, model behavior, rate limits, and local privacy policy.
If this plugin is useful to your library, donations would help support ongoing maintenance and testing.
- Non-crypto donation: Here
- BTC: 19JSzRPB5qp3TKZVBeVUR8xmgntxKui5cc
- ETH (ERC20): 0x5cc9f67d0f8328a46b9f9e12a1cfbf1a379e5947
- USDT (ERC20): 0x5cc9f67d0f8328a46b9f9e12a1cfbf1a379e5947
- USDC (ERC20): 0x5cc9f67d0f8328a46b9f9e12a1cfbf1a379e5947
- LTC: LesDgPh9BVp8SgbXqk8GbyCzHwnrgn7tDv
This plugin is a refactor of Koha_AACR2_Cataloging_Assistant onto an ISBD-first foundation. Upstream-friendly Koha plugin hooks for dispatch/auth behavior would make the integration easier to maintain across upgrades.
GPL-3.0. See LICENSE.