18 changes: 12 additions & 6 deletions .github/workflows/README.md
@@ -11,20 +11,26 @@ Currently retrieves:
- Software module list from [modules-list](https://github.com/nesi/modules-list).
- Glossary, spellcheck dictionary and snippets from [nesi-wordlist](https://github.com/nesi/nesi-wordlist)

It then runs [link_apps_pages.py](#link_apps_pagespy).
It then runs [compile_tags.py](#compile_tagspy).

All modified files are added to a new branch called `new-assets` and merged into main.

In theory, all this could be done at deployment, but I wanted to make sure that changes to these remote files didn't break anything.

## [link_apps_pages.py](link_apps_pages.py)
## [compile_tags.py](compile_tags.py)

A Python script used to add a link to the appropriate documentation to [modules-list.json](../../docs/assets/module-list.json).
Replaces the old `link_apps_pages.py`.

The script checks all titles of input files, and sets the `support` key to be equal to the pages url.
It also adds whatever tags are on that page to the `domains` key.
Validates page tags against the canonical vocabulary in [`docs/assets/tags.yml`](../../docs/assets/tags.yml), writes two compiled indexes, and links app pages to the module list:

_One day I would like to simplify this whole thing._
- **`docs/assets/tag-index.json`** — maps each canonical tag to the list of pages that carry it. Used by the `pages_with_tag()` macro at render time.
- **`docs/assets/module-list.json`** — updated with support-page URLs and canonical domain tags for each application.

Any tag not present in `tags.yml` (as a key or alias) produces a CI warning and is omitted from the index.
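
As a rough illustration of the index shape described above, the sketch below builds a tiny tag index in memory. The page paths, titles and tags are made up for the example; only the `{"title": ..., "path": ...}` entry layout is taken from `compile_tags.py`, and `build_tag_index` is a hypothetical helper name.

```python
import json

# Hypothetical pages: (path relative to the docs root, title, canonical tags).
pages = [
    ("Batch_Computing/Using_GPUs.md", "Using GPUs", ["gpu", "slurm"]),
    ("Batch_Computing/Hardware.md", "Hardware", ["gpu"]),
]


def build_tag_index(pages):
    """Map each tag to the list of pages carrying it, skipping duplicate entries."""
    index = {}
    for path, title, tags in pages:
        for tag in tags:
            entry = {"title": title, "path": path}
            bucket = index.setdefault(tag, [])
            if entry not in bucket:
                bucket.append(entry)
    return index


tag_index = build_tag_index(pages)
print(json.dumps(tag_index, indent=4))
```

Each key of the resulting dictionary is a tag, and each value is the list of pages that a lookup for that tag would return.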

### Tag vocabulary

Tags are defined in [`docs/assets/tags.yml`](../../docs/assets/tags.yml). Each entry has a canonical key (snake\_case), a display label, and optional aliases. Pages should always use canonical keys; aliases are accepted for backwards compatibility but are normalised at compile time.
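
The normalisation described above can be sketched as follows. This is a minimal example under assumptions: the vocabulary entries and the `label` field name are invented for illustration, and only the `aliases` key and the lower-cased lookup mirror what `compile_tags.py` does.

```python
# Hypothetical vocabulary in the shape tags.yml is described as having:
# canonical key -> display label and optional aliases.
vocab = {
    "file_transfer": {"label": "File transfer", "aliases": ["data transfer"]},
    "slurm": {"label": "Slurm", "aliases": None},
}


def build_alias_map(vocab):
    """Return a lower-cased lookup from every key or alias to its canonical key."""
    alias_map = {}
    for canonical, entry in vocab.items():
        alias_map[canonical.lower()] = canonical
        for alias in (entry.get("aliases") or []):
            alias_map[alias.lower()] = canonical
    return alias_map


def normalise(tag, alias_map):
    """Return the canonical key for a tag, or None if the tag is unknown."""
    return alias_map.get(str(tag).lower())


alias_map = build_alias_map(vocab)
print(normalise("Data Transfer", alias_map))
print(normalise("Slurm", alias_map))
print(normalise("tmpdir", alias_map))
```

An alias or a differently cased key resolves to its canonical key, while a tag that appears nowhere in the vocabulary resolves to `None`, which is the case that triggers a warning.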

## [checks.yml](checks.yml)

11 changes: 10 additions & 1 deletion .github/workflows/checks.yml
@@ -162,15 +162,24 @@ jobs:
        run: |
          shopt -s globstar extglob
          python3 checks/run_slurm_lint.py ${{needs.get.outputs.filelist}}
  tagcheck:
    name: Check tags
    runs-on: ubuntu-24.04
    steps:
      - uses: actions/checkout@v6
      - run: pip3 install pyyaml
      - run: python3 .github/workflows/compile_tags.py

  testBuild:
    name: Test build
    if: ${{github.event_name != 'workflow_dispatch' || inputs.testBuild}}
    runs-on: ubuntu-24.04
    needs: get
    needs: [get, tagcheck]
    steps:
      - uses: actions/checkout@v6
        with:
          fetch-depth: 0
      - run: pip3 install -r requirements.txt
      - run: python3 .github/workflows/compile_tags.py
      - run: ./checks/run_test_build.py
      - run: export NO_MKDOCS_2_WARNING="1"; python3 checks/run_aria_check.py
109 changes: 109 additions & 0 deletions .github/workflows/compile_tags.py
@@ -0,0 +1,109 @@
#!/usr/bin/env python3

"""
Validates page tags, links app pages to the module list, and writes compiled indexes.

Run during CI lint checks to surface tag warnings.
Run on deploy to ensure tag-index.json and module-list.json are up to date before building.

Replaces: link_apps_pages.py
"""

import os
import re
import json
import yaml
import sys
from pathlib import Path


TAGS_VOCAB_PATH = os.getenv("TAGS_VOCAB_PATH", "docs/assets/tags.yml")
TAG_INDEX_PATH = os.getenv("TAG_INDEX_PATH", "docs/assets/tag-index.json")
MODULE_LIST_PATH = os.getenv("MODULE_LIST_PATH", "docs/assets/module-list.json")
DOC_ROOT = os.getenv("DOC_ROOT", "docs")
APPS_PAGES_PATH = os.getenv("APPS_PAGES_PATH", "Software/Available_Applications")
BASE_URL = os.getenv("BASE_URL", "https://www.docs.nesi.org.nz")


def load_vocabulary(path):
    vocab = yaml.safe_load(open(path))
    alias_map = {}
    for canonical, entry in vocab.items():
        alias_map[canonical.lower()] = canonical
        for alias in (entry.get("aliases") or []):
            alias_map[alias.lower()] = canonical
    return vocab, alias_map


def parse_frontmatter(path):
    content = path.read_text()
    match = re.match(r"---\n([\s\S]*?)---", content)
    if not match:
        return None
    return yaml.safe_load(match.group(1)) or {}


def title_from_path(md_file):
    name = md_file.stem.replace("_", " ")
    return name[0].upper() + name[1:]


vocab, alias_map = load_vocabulary(TAGS_VOCAB_PATH)
module_list = json.load(open(MODULE_LIST_PATH))

tag_index = {canonical: [] for canonical in vocab}
warnings = 0

for md_file in sorted(Path(DOC_ROOT).rglob("*.md")):
    rel = str(md_file.relative_to(DOC_ROOT))
    meta = parse_frontmatter(md_file)

    if meta is None:
        print(f"::warning file={md_file},title=meta.parse::Meta block missing or malformed.")
        warnings += 1
        continue

    raw_tags = meta.get("tags") or []
    title = meta.get("title") or title_from_path(md_file)
    canonical_tags = []

    for tag in raw_tags:
        canonical = alias_map.get(str(tag).lower())
        if canonical is None:
            print(f"::warning file={md_file},title=tag.unknown::Unknown tag '{tag}' on '{title}'. Add to {TAGS_VOCAB_PATH} or use an existing alias.")
            warnings += 1
        else:
            entry = {"title": title, "path": rel}
            if entry not in tag_index[canonical]:
                tag_index[canonical].append(entry)
            canonical_tags.append(canonical)

    # For app pages: update support URL and merge canonical tags into module domains.
    is_app_page = str(md_file.relative_to(DOC_ROOT)).startswith(APPS_PAGES_PATH)
    if is_app_page and md_file.name != "index.md":
        app = meta.get("title") or title_from_path(md_file)
        if app in module_list:
            page_link = f"{BASE_URL}/{APPS_PAGES_PATH}/{app}"
            existing = module_list[app].get("support", "")
            if existing and existing != page_link:
                print(f"::warning file={md_file},title=docpath.change::Support URL for '{app}' changed from '{existing}' to '{page_link}'.")
            module_list[app]["support"] = page_link
            for canonical in canonical_tags:
                if canonical not in module_list[app]["domains"]:
                    module_list[app]["domains"].append(canonical)
        else:
            print(f"::warning file={md_file},title=missing.module::'{md_file.name}' has no corresponding module in {MODULE_LIST_PATH}.")
            warnings += 1

tag_index = {k: v for k, v in tag_index.items() if v}

with open(TAG_INDEX_PATH, "w") as f:
    f.write(json.dumps(tag_index, indent=4))

with open(MODULE_LIST_PATH, "w") as f:
    f.write(json.dumps(module_list, indent=4))

print(f"tag-index.json: {len(tag_index)} tags, {sum(len(v) for v in tag_index.values())} entries.")
print(f"module-list.json: updated support URLs and domains for app pages.")
if warnings:
    print(f"::warning::{warnings} warning(s) issued. Review and address before merging.")
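
To see what the front matter regular expression in `parse_frontmatter` captures, here is a small standalone sketch. It reuses the same pattern but leaves out the YAML parsing step so that it runs with the standard library only; `extract_front_matter` and the sample text are hypothetical.

```python
import re

# Same pattern as in parse_frontmatter: an opening '---' line, then a lazy
# capture of everything up to the next '---'.
FRONT_MATTER = re.compile(r"---\n([\s\S]*?)---")


def extract_front_matter(text):
    """Return the raw front matter block, or None if the text does not start with one."""
    match = FRONT_MATTER.match(text)
    return match.group(1) if match else None


sample = "---\ntags:\n- slurm\n- gpu\n---\n\nBody text.\n"
print(repr(extract_front_matter(sample)))
print(extract_front_matter("No front matter here."))
```

Because `re.match` anchors at the start of the string, a page whose first line is not `---` yields `None`, which corresponds to the "Meta block missing or malformed" warning in the script.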
2 changes: 2 additions & 0 deletions .github/workflows/deploy.yml
@@ -51,6 +51,8 @@ jobs:
        run: pip install -r requirements.txt
      - name: Fetch Remote Files
        run: bash .github/fetch_includes.sh
      - name: Compile tag index and link app pages
        run: python3 .github/workflows/compile_tags.py
      - name: Build documentation
        run: |
          mkdocs build --clean --quiet
@@ -2,8 +2,7 @@
description: A page sharing the details of reduced support hours over Easter and ANZAC break
created_at: '2024-03-20T01:58:22Z'
tags:
- easter
- holidays
- announcement
title: Accessing REANNZ HPC Support during the Easter and ANZAC holidays
search:
boost: 0.1
@@ -4,8 +4,8 @@ status: new
search:
boost: 10
tags:
- identity
- email
- access
- slurm
---

## What is happening
7 changes: 3 additions & 4 deletions docs/Announcements/Known_Issues_HPC3.md
@@ -1,10 +1,9 @@
---
created_at: 2025-04-28
description: List of features currently missing from Mahuika (HPC3).
tags:
- hpc3
- refresh
- mahuika
tags:
- release_notes
- announcement
---

Below is a list of issues that we're actively working on. We hope to have these resolved soon. This is intended to be a temporary page.
3 changes: 2 additions & 1 deletion docs/Announcements/Release_Notes/index.md
@@ -1,6 +1,7 @@
---
created_at: '2021-02-23T19:52:34Z'
tags: []
tags:
- release_notes
title: Release Notes
---

7 changes: 3 additions & 4 deletions docs/Announcements/Slurm_Job_email.md
@@ -1,10 +1,9 @@
---
created_at: 2026-02-11
description: Email from Slurm Jobs now available
tags:
- hpc3
- email
- mahuika
tags:
- release_notes
- slurm
---

Sending email from Slurm jobs is now available on Mahuika. Here is an example of the Slurm parameters required to send email:
2 changes: 1 addition & 1 deletion docs/Batch_Computing/Batch_Computing_Guide.md
@@ -3,7 +3,7 @@ created_at: 2025-12-19
description: Guide to batch computing
tags:
- slurm
- ondemand
- interactive
---

Batch jobs can be submitted via several methods. The most basic is a [simple Slurm job](#slurm-job-basics).
2 changes: 1 addition & 1 deletion docs/Batch_Computing/Checking_resource_usage.md
@@ -2,7 +2,7 @@
created_at: '2022-02-15T01:13:51Z'
tags:
- slurm
- accounting
- account
status: deprecated
---

12 changes: 4 additions & 8 deletions docs/Batch_Computing/Fair_Share.md
@@ -1,14 +1,10 @@
---
created_at: '2019-02-05T03:58:21Z'
tags:
- accounting
- Slurm
- Fairshare
- Fair Share
- Job priority
- Long queue time
- Queing
- long wait time
- account
- slurm
- fairshare
- troubleshooting
description: How balancing your workload lets you make the most of your allocation.
---

1 change: 0 additions & 1 deletion docs/Batch_Computing/Hardware.md
@@ -3,7 +3,6 @@ created_at: '2022-06-13T04:54:38Z'
description: This page below outlines the available hardware.
tags:
- gpu
- compute
---

A list of the currently available hardware.
3 changes: 1 addition & 2 deletions docs/Batch_Computing/Job_Arrays.md
@@ -1,10 +1,9 @@
---
created_at: 2025-12-09
description: How to utilise job arrays.
tags:
tags:
- slurm
- parallel
- array
---


6 changes: 3 additions & 3 deletions docs/Batch_Computing/Job_Limits.md
@@ -1,9 +1,9 @@
---
created_at: 2025-07-17
description: What limits are there on running jobs.
tags:
- Slurm
- accounting
tags:
- slurm
- account
---

These are open for review if you find any of them unreasonable or inefficient.
6 changes: 3 additions & 3 deletions docs/Batch_Computing/Job_prioritisation.md
@@ -1,9 +1,9 @@
---
created_at: '2018-05-17T23:35:36Z'
description: What factors are used to determine a job's priority.
tags:
- Slurm
- accounting
tags:
- slurm
- account
---

Each queued job has a priority score. Jobs start when sufficient
1 change: 0 additions & 1 deletion docs/Batch_Computing/SLURM-Best_Practice.md
@@ -2,7 +2,6 @@
created_at: '2019-01-18T01:56:15Z'
tags:
- slurm
- tips
title: 'SLURM: Best Practice'
description: Some tips on how to get more out of the job scheduler.
---
6 changes: 1 addition & 5 deletions docs/Batch_Computing/Temporary_directories.md
@@ -1,11 +1,7 @@
---
created_at: '2023-07-21T04:10:04Z'
tags:
tags:
- storage
- tmpdir
- tmp
- temp
- localscratch
description: How temporary files are utilised on the REANNZ cluster.
---

8 changes: 4 additions & 4 deletions docs/Batch_Computing/Using_GPUs.md
@@ -2,7 +2,7 @@
created_at: '2020-04-19T22:59:58Z'
tags:
- gpu
- Slurm
- slurm
---

This page provides generic information about how to access GPUs through the Slurm scheduler.
@@ -307,6 +307,6 @@ To record the GPU utilisation and GPU memory, see [Measuring GPU efficiency afte

## Application and toolbox specific support pages

See the [Supported Applications](../Software/Available_Applications/index.md) for more information on what softwares have GPU support, as well as programming toolkits:

- [NVIDIA GPU Containers](../Software/Containers/NVIDIA_GPU_Containers.md)
{% for p in pages_with_tag("gpu") %}
- [{{ p.title }}]({{ p.path }})
{% endfor %}
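
The `pages_with_tag()` macro itself is not shown in this part of the diff. A minimal sketch of how such a macro could read `tag-index.json` is given below, assuming the site uses the mkdocs-macros plugin and its `define_env` hook. The function bodies and the `load_tag_index` helper are hypothetical, not the project's actual implementation.

```python
import json
from pathlib import Path


def load_tag_index(index_path="docs/assets/tag-index.json"):
    """Load the compiled tag index, or return an empty index if it has not been built."""
    path = Path(index_path)
    if not path.exists():
        return {}
    return json.loads(path.read_text())


def pages_with_tag(tag, index=None):
    """Return the list of {"title", "path"} entries for a canonical tag."""
    if index is None:
        index = load_tag_index()
    return index.get(tag, [])


def define_env(env):
    """Hook called by mkdocs-macros (assumed here) to register macros for templates."""
    env.macro(pages_with_tag)
```

With an index such as `{"gpu": [{"title": "Hardware", "path": "Batch_Computing/Hardware.md"}]}`, the template loop above would emit one list item per entry. Since `compile_tags.py` stores paths relative to the docs root, a page in a subdirectory may need to adjust them before using them as relative links.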
7 changes: 2 additions & 5 deletions docs/Data_Transfer/Checksums.md
@@ -1,11 +1,8 @@
---
created_at: '2020-01-14T22:10:50Z'
tags:
tags:
- checksum
- md5
- sha
- hash
- digest
- announcement
---

Applying a *checksum function* to a file will return its *message digest* (also simply referred to as a _checksum_), which is akin to a digital fingerprint.
2 changes: 2 additions & 0 deletions docs/Data_Transfer/Data_Transfer_Overview.md
@@ -1,5 +1,7 @@
---
created_at: '2018-11-20T22:41:32Z'
tags:
- file_transfer
---

!!! prerequisite
4 changes: 2 additions & 2 deletions docs/Data_Transfer/Data_Transfer_Using_MobaXterm.md
@@ -1,8 +1,8 @@
---
created_at: 2026-02-04
description: How to copy files to the REANNZ HPC using MobaXterm.
tags:
- data transfer
tags:
- file_transfer
title: MobaXterm (Windows)
---
