SafetyLM - Open Source WHS AI

███████╗ █████╗ ███████╗███████╗████████╗██╗   ██╗██╗     ███╗   ███╗
██╔════╝██╔══██╗██╔════╝██╔════╝╚══██╔══╝╚██╗ ██╔╝██║     ████╗ ████║
███████╗███████║█████╗  █████╗     ██║    ╚████╔╝ ██║     ██╔████╔██║
╚════██║██╔══██║██╔══╝  ██╔══╝     ██║     ╚██╔╝  ██║     ██║╚██╔╝██║
███████║██║  ██║██║     ███████╗   ██║      ██║   ███████╗██║ ╚═╝ ██║
╚══════╝╚═╝  ╚═╝╚═╝     ╚══════╝   ╚═╝      ╚═╝   ╚══════╝╚═╝     ╚═╝

The open-source AI that WHS practitioners can actually trust.

Grounded in Australian & New Zealand WHS law. Aligned to the frameworks professionals actually use. Transparent about every source.

v0.1.0 building in public · AU + NZ · runs 100% local

help

guide       how it works, the principles, and the docs

why         the problem, the vision, and what it is not

roadmap     the phased build-in-public plan

pricing     free, open-source, self-hosted

contribute  the three ways to help

talk        reach a human - Neet

help        list the commands

clear       clear the scrollback

demo        a walkthrough (coming soon)

license     how SafetyLM is licensed

ask         where to ask a question

changelog   what has shipped so far

eval        the SafetyLM-Eval benchmark

neetsingh   Neet's personal site (new tab)

store       stickers and merch (coming soon)

Guide

how it works

SafetyLM uses Retrieval-Augmented Generation (RAG): instead of trusting a model's memory, it retrieves the relevant WHS documents at query time, reasons over what's actually in front of it, then cites them.

query      a WHS practitioner asks a jurisdiction-specific question
  │
  ▼
analyse    detect jurisdiction + hazard
  │
  ▼
retrieve   hybrid semantic + keyword search,
             filtered to the right jurisdiction          ◀── corpus
  │
  ▼
rerank     cross-encoder precision pass
  │
  ▼
assemble   WHS reasoning rules + retrieved sources
  │
  ▼
reason     local open-weight LLM - no API, no cloud
  │
  ▼
answer     grounded response + a Sources block, each
             citation linked, with a currency caveat

corpus     indexed once · primary AU/NZ sources only:
             legislation · regulations · codes of
             practice · Body of Knowledge

Everything runs locally and open - no API dependency, no per-query cost, full data sovereignty.
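
The analyse and retrieve steps above can be sketched in a few lines. This is a minimal illustration, not the project's pipeline: the `Chunk` record, the `JURISDICTION_TERMS` lookup, the placeholder corpus, and the `alpha` weighting are all assumptions, and character-trigram overlap stands in for real embedding similarity. The cross-encoder rerank step is omitted.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    doc: str           # placeholder title, not a real corpus entry
    jurisdiction: str
    text: str


# Assumed lookup for the "analyse" step; a real detector would cover every AU/NZ jurisdiction.
JURISDICTION_TERMS = {
    "nsw": "NSW",
    "new south wales": "NSW",
    "wa": "WA",
    "western australia": "WA",
}

# Hypothetical mini-corpus used only to exercise the functions below.
CORPUS = [
    Chunk("Example NSW instrument", "NSW", "a pcbu must manage psychosocial hazards at work"),
    Chunk("Example WA instrument", "WA", "a pcbu must manage psychosocial hazards at work"),
    Chunk("Example NSW code", "NSW", "guarding requirements for plant and machinery"),
]


def tokens(s):
    return re.findall(r"[a-z0-9]+", s.lower())


def trigrams(s):
    joined = " ".join(tokens(s))
    return {joined[i:i + 3] for i in range(len(joined) - 2)}


def detect_jurisdiction(query):
    """Return a jurisdiction code if the query names one as a whole word, else None."""
    q = query.lower()
    for term, code in JURISDICTION_TERMS.items():
        if re.search(rf"\b{re.escape(term)}\b", q):
            return code
    return None


def keyword_score(query, text):
    """Fraction of query tokens that also appear in the text."""
    q, t = set(tokens(query)), set(tokens(text))
    return len(q & t) / len(q) if q else 0.0


def semantic_score(query, text):
    """Stand-in for embedding similarity: Jaccard overlap of character trigrams."""
    a, b = trigrams(query), trigrams(text)
    return len(a & b) / len(a | b) if (a | b) else 0.0


def retrieve(query, corpus, k=2, alpha=0.5):
    """Filter to the detected jurisdiction, then rank by a weighted hybrid score."""
    jurisdiction = detect_jurisdiction(query)
    # Assumption: with no jurisdiction detected, search everything.
    pool = [c for c in corpus if jurisdiction is None or c.jurisdiction == jurisdiction]
    ranked = sorted(
        pool,
        key=lambda c: alpha * semantic_score(query, c.text)
        + (1 - alpha) * keyword_score(query, c.text),
        reverse=True,
    )
    return jurisdiction, ranked[:k]
```

Filtering before ranking means a chunk from another jurisdiction can never outrank a local one, however similar its wording. A stricter design might refuse to answer until a jurisdiction is known, rather than falling back to the whole corpus.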

Target experience (illustrative): you ask "What are a PCBU's primary duties for psychosocial hazards in NSW?" - SafetyLM answers with the duty under the Work Health and Safety Act 2011 (NSW), notes that "health" expressly includes psychological health, references the relevant Code of Practice, and ends with a Sources block plus a reminder to verify currency with SafeWork NSW.

principles

Domain depth over breadth - one regulatory environment done properly beats ten done shallowly. AU/NZ WHS, end to end.
Source transparency - every response surfaces which document, which jurisdiction, and when it was last reviewed.
Jurisdiction precision - the right jurisdiction is a first-class filter; a WA query retrieves WA instruments, and flags where WA's Work Health and Safety Act 2020 varies from the model WHS laws.
Framework alignment - reasons through ICAM, bowtie, critical-control logic, and the WHS duty hierarchy, not just about them.
Conservative confidence - calibrated to express uncertainty rather than confabulate. "I couldn't find a specific source" is a feature.
Open methodology - corpus criteria, retrieval design, and the evaluation benchmark are all published, so others can reproduce and critique.
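
The source-transparency and conservative-confidence principles can be illustrated together. The sketch below is an assumption about shape only: the `Source` fields, the `MIN_SCORE` threshold, and the wording of the caveat are hypothetical, and the example URL and review date are placeholders.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Source:
    title: str
    jurisdiction: str
    last_reviewed: date
    url: str
    score: float  # retrieval relevance in [0, 1]; hypothetical field


MIN_SCORE = 0.6  # assumed cut-off below which a source is not treated as grounding


def render_answer(draft, sources, regulator):
    """Attach a Sources block and currency caveat, or abstain if nothing grounds the draft."""
    grounded = [s for s in sources if s.score >= MIN_SCORE]
    if not grounded:
        # Abstaining is the intended behaviour, not an error path.
        return "I couldn't find a specific source for this question."
    lines = [draft, "", "Sources"]
    for i, s in enumerate(grounded, 1):
        lines.append(
            f"[{i}] {s.title} ({s.jurisdiction}), "
            f"last reviewed {s.last_reviewed.isoformat()} - {s.url}"
        )
    lines.append(f"Verify currency with {regulator} before relying on this answer.")
    return "\n".join(lines)


# Placeholder record: the URL and review date are invented for illustration.
EXAMPLE = Source(
    title="Work Health and Safety Act 2011",
    jurisdiction="NSW",
    last_reviewed=date(2024, 1, 1),
    url="https://example.org/whs-act-2011",
    score=0.9,
)
```

Calling `render_answer(draft, [], "SafeWork NSW")` returns the abstention message. Passing `[EXAMPLE]` returns the draft followed by a numbered Sources block and the currency reminder.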

docs

repo          github.com/whosneet/SafetyLM

vision        docs/00-vision.md - goals, users, success criteria

architecture  docs/01-architecture.md - system + data flow

rag           docs/04-rag-pipeline.md - retrieval, reranking, prompt

evaluation    docs/05-evaluation.md - benchmark + scoring

roadmap       docs/06-phased-roadmap.md - phases + acceptance criteria

governance    docs/08-governance.md - licensing, liability, attribution

The full doc index lives in the repo.

license

code           Apache-2.0

docs + data    CC BY 4.0 - including the benchmark dataset

model weights  obtained by users from the model provider, under that provider's own licence

corpus         source documents remain under original Crown copyright / open-access terms; attribution preserved

Pricing

Free. Open-source. Self-host SafetyLM at no cost - no accounts, no API fees, runs 100% local.

A future hosted option: TBD.

demo.mov

A short walkthrough will land with the Phase 5 interface. The system is mid-build - follow the changelog for progress.

License

SafetyLM uses a layered licensing model: code, docs and data, model weights, and corpus are each covered separately.

code           Apache-2.0 - permissive, commercial use allowed

docs + data    CC BY 4.0 - including the evaluation benchmark dataset

model weights  obtained by users from the model provider, under that provider's own licence

corpus         source documents remain under original Crown copyright / open-access terms; attribution preserved

Attribution is required when redistributing docs or data under CC BY 4.0. Cite SafetyLM and link to the repo.

governance     docs/08-governance.md - full licensing, liability, and attribution policy

Talk to a human

github    github.com/whosneet

linkedin  linkedin.com/in/whosneet

email     contact@safetylm.ai

issues    open an issue on SafetyLM - the best way to start a conversation

Built by Avneet (Neet) Singh - WHS practitioner (COHSProf), building the tool he wanted to exist.

Ask a question

Best places to ask about SafetyLM:

GitHub Discussions
Open an issue

Why SafetyLM?

General AI fails WHS practitioners in ways a non-expert never catches. SafetyLM is the open-source alternative built for the job.

the problem

When a WHS practitioner asks a general-purpose AI to interpret legislation, draft a Safe Work Method Statement (SWMS), or analyse an incident, the failures tend to take four forms:

Hallucinated citations - confidently inventing section numbers and regulations that don't exist.
Jurisdiction confusion - quoting NSW regulations in a Western Australian context, where the Work Health and Safety Act 2020 (WA) and its own regulations apply instead.
Generic advice - a one-size-fits-all hierarchy-of-controls template where a bowtie analysis or ICAM investigation was needed.
Model-vs-jurisdiction blindness - unable to distinguish the model WHS Act from the specific variations each state and territory enacted.

In a safety-critical domain, a confident wrong answer erodes trust and can contribute to poor decisions. No open-source AI is built and grounded specifically for AU/NZ WHS practice. SafetyLM exists to fill that gap.

the vision

Every answer traces back to a real document, in the right jurisdiction, with a currency caveat.

When the system can't find a grounded source, it says so - rather than generating something plausible.

That honesty is the product.

"Grounded + cited" means: retrieved from primary sources at query time, filtered to your jurisdiction, and surfaced with the document, the jurisdiction, and when it was last reviewed - never a black box.

who it is for

primary    WHS consultants doing cross-jurisdictional research · early-career practitioners who need a reliable starting point · small businesses without a dedicated safety team · students working toward Cert IV, Diploma, or postgraduate WHS qualifications.

secondary  WHS software vendors embedding domain AI · researchers studying AI in occupational health & safety · organisations wanting to self-host a WHS AI on their own infrastructure.

what it is not

Setting expectations is part of earning trust. SafetyLM is not:

a replacement for professional WHS advice or a qualified practitioner.
a legal interpretation service.
a general-purpose chatbot.
a compliance checker that guarantees legislative currency - the corpus has a published date, and users must verify the current version with the regulator.

These are design decisions that shape how the system responds, not disclaimers buried in fine print.

Roadmap

Built in public, phase by phase. Each phase has explicit acceptance criteria in the repo.

[✓] done     0 · Planning & documentation - architecture, corpus, evaluation, governance design

[»] next     1 · Corpus build - a catalogued manifest of every AU/NZ WHS source, processed

[ ] planned  2 · Embedding & vector store - semantic + keyword retrieval, jurisdiction-filtered

[ ] planned  3 · RAG pipeline & system prompt - first end-to-end: query in → cited answer out

[ ] planned  4 · Benchmark evaluation - a 500+ question WHS benchmark + scored results, published

[ ] planned  5 · Interface & public launch - a clean chat UI, install guide, public release

[~] future   6 · v2 - fine-tuning - weights & LoRA adapters that internalise WHS reasoning

A standout deliverable regardless of outcome: SafetyLM-Eval - 500+ validated WHS questions with ground-truth answers, published under CC BY 4.0, so any WHS AI (open or commercial) can be measured against it.

Changelog

What has actually shipped. The forward plan lives in the roadmap.

v0.1.0  Planning & documentation complete - architecture, corpus criteria, evaluation design, and governance published in the repo.

site    This reference site, built in the open.

Corpus build is in progress. Follow the roadmap for what is next.

SafetyLM-Eval

A purpose-built evaluation benchmark for WHS AI systems: 500+ validated questions with ground-truth answers, published openly so any model can be measured.

what it is

SafetyLM-Eval is a dataset of 500+ WHS questions covering AU/NZ jurisdictions, hazard types, duty holder roles, and regulatory instruments. Each question has a verified ground-truth answer drawn from primary sources.

questions  500+ validated WHS questions with ground-truth answers

coverage   AU/NZ jurisdictions, hazard types, duty hierarchy, codes of practice

licence    CC BY 4.0 - free to use, share, and adapt with attribution

purpose    benchmark any WHS AI, open or commercial, against a consistent bar
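
To make "measured against a consistent bar" concrete, here is one way a benchmark item and a citation check could look. The actual dataset structure and scoring live in docs/05-evaluation.md; the `EvalItem` fields and the precision/recall metric below are assumptions chosen for illustration, not the published methodology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalItem:
    question: str
    jurisdiction: str
    ground_truth: str
    expected_sources: frozenset  # titles a correct answer should cite (hypothetical field)


def citation_scores(item, cited):
    """Return (precision, recall) of the cited titles against the expected ones."""
    cited = set(cited)
    hits = cited & item.expected_sources
    precision = len(hits) / len(cited) if cited else 0.0
    recall = len(hits) / len(item.expected_sources) if item.expected_sources else 0.0
    return precision, recall


# Illustrative item; the ground-truth wording is a placeholder, not a dataset entry.
ITEM = EvalItem(
    question="What are a PCBU's primary duties for psychosocial hazards in NSW?",
    jurisdiction="NSW",
    ground_truth="placeholder ground-truth answer",
    expected_sources=frozenset({"Work Health and Safety Act 2011 (NSW)"}),
)
```

Low precision would surface invented citations, while low recall would surface answers that miss the governing instrument. Scoring the answer text against the ground truth is a separate problem that this sketch does not attempt.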

why it matters

Without a domain-specific benchmark, WHS AI quality is unmeasurable. SafetyLM-Eval gives practitioners and researchers a shared yardstick, published regardless of how SafetyLM itself performs.

The benchmark is a standalone contribution, valuable even if you are not using SafetyLM.

docs       docs/05-evaluation.md - methodology, scoring, and dataset structure

neetsingh.com

Opens in a new tab: neetsingh.com

Store

No store yet. Project stickers and a little merch are planned - nothing to buy today.

Contribute

SafetyLM is open to contribution in three areas:

Corpus - propose missing AU/NZ WHS source documents (with complete metadata and a verified URL).
Evaluation - contribute jurisdiction- or hazard-specific questions to the benchmark dataset.
Code - improve the pipeline, retrieval, or interface.
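
For corpus proposals, "complete metadata and a verified URL" could be checked before a human review with something like the sketch below. The field names in `REQUIRED_FIELDS` are guesses, since contribution guidelines arrive in Phase 5, and the check only confirms that the URL is well-formed. Confirming that it resolves to the right document still needs a person or a network request.

```python
from urllib.parse import urlparse

# Assumed metadata fields; the real manifest schema may differ.
REQUIRED_FIELDS = ("title", "jurisdiction", "instrument_type", "url", "retrieved_on")


def validate_entry(entry):
    """Return a list of problems with a proposed corpus entry (empty list means it passes)."""
    problems = [
        f"missing {field}"
        for field in REQUIRED_FIELDS
        if not str(entry.get(field, "")).strip()
    ]
    raw_url = str(entry.get("url", "")).strip()
    if raw_url:
        parsed = urlparse(raw_url)
        if parsed.scheme != "https" or not parsed.netloc:
            problems.append("url must be an absolute https URL")
    return problems


# Placeholder proposal; the URL and date are invented for illustration.
PROPOSAL = {
    "title": "Example code of practice",
    "jurisdiction": "NSW",
    "instrument_type": "code of practice",
    "url": "https://example.org/code-of-practice",
    "retrieved_on": "2024-01-01",
}
```

Returning a list of problems rather than raising on the first one lets a contributor fix everything in a single pass.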

If you're a WHS practitioner, your domain judgement is the most valuable contribution of all. Open an issue to start a conversation.

Full guidelines arrive with the public launch in Phase 5.