VOAF-M: turning audit trails into training data.

Vigil Engineering · Feb 22, 2026 · 9 min read

VOAF-M is an extension of the Verifiable Output Audit Format that produces training-ready data for personal AI models, locally, without sending anything to a cloud provider. The motivation is direct: audit data is the cleanest training data a user will ever own. The infrastructure question is how to make it usable for fine-tuning without compromising the verifiability properties that make audit data valuable in the first place.

This is an engineering-authored post. The product strategy that VOAF-M enables is described elsewhere. Here we focus on the format and the mechanics.

If audit data cannot be verified, it is not audit. If training data was collected without the user's consent, it is not personal. VOAF-M is the format that holds both properties at once.

The two modes of VOAF

VOAF, in its base form, is a cryptographic chain that records every AI agent interaction the user routes through Vigil. Each entry contains a structured representation of the request, the response, the policy decisions, the detection signals, and the resulting action, with a hash linking it to the previous entry and a signature binding the entry to the user's local key.
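
To make the linking concrete, here is a minimal sketch of appending to a hash-linked, signed chain. It is an illustration, not the VOAF wire format: the field names (prev_hash, body, entry_hash), the GENESIS_HASH marker, and the append_entry helper are all assumptions, and HMAC-SHA256 stands in for whatever signature scheme the real chain uses.

```python
import hashlib
import hmac
import json

GENESIS_HASH = "0" * 64  # hypothetical marker meaning "no previous entry"


def canonical_bytes(obj: dict) -> bytes:
    """Serialize deterministically so the same entry always hashes the same."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def append_entry(chain: list, body: dict, key: bytes) -> dict:
    """Append a new entry that is linked to the hash of the previous one."""
    prev_hash = chain[-1]["entry_hash"] if chain else GENESIS_HASH
    payload = {"prev_hash": prev_hash, "body": body}
    entry_hash = hashlib.sha256(canonical_bytes(payload)).hexdigest()
    # Assumption: HMAC is a stand-in; a real chain would likely use an
    # asymmetric signature made with the user's local key.
    signature = hmac.new(key, entry_hash.encode("ascii"), hashlib.sha256).hexdigest()
    entry = {**payload, "entry_hash": entry_hash, "signature": signature}
    chain.append(entry)
    return entry
```

Because each entry's hash covers the previous entry's hash, changing any earlier entry changes every hash after it.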

The base format is optimized for audit. It is designed to be read by a verifier, walked end-to-end, and confirmed against an independent signing root. The reference verifier, vigil-verify, runs locally and produces a deterministic verification result. The format is human-readable JSON with cryptographic envelopes around each entry.
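
The walk a verifier performs can be sketched as follows. This is not the vigil-verify implementation; the entry layout, the helper names, and the use of HMAC-SHA256 in place of a real signature check are assumptions made for illustration. The point is the shape of the check: same input, same result, with the first failing entry identified.

```python
import hashlib
import hmac
import json

GENESIS_HASH = "0" * 64  # hypothetical marker meaning "no previous entry"


def entry_digest(prev_hash: str, body: dict) -> str:
    """Hash the canonical form of an entry's link and body."""
    payload = {"prev_hash": prev_hash, "body": body}
    data = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def sign(entry_hash: str, key: bytes) -> str:
    # Assumption: HMAC stands in for the real signature scheme.
    return hmac.new(key, entry_hash.encode("ascii"), hashlib.sha256).hexdigest()


def verify_chain(chain: list, key: bytes) -> dict:
    """Walk the chain end-to-end and report the first failure, if any."""
    expected_prev = GENESIS_HASH
    for index, entry in enumerate(chain):
        if entry["prev_hash"] != expected_prev:
            return {"ok": False, "failed_at": index, "reason": "broken link"}
        if entry_digest(entry["prev_hash"], entry["body"]) != entry["entry_hash"]:
            return {"ok": False, "failed_at": index, "reason": "hash mismatch"}
        if not hmac.compare_digest(sign(entry["entry_hash"], key), entry["signature"]):
            return {"ok": False, "failed_at": index, "reason": "bad signature"}
        expected_prev = entry["entry_hash"]
    return {"ok": True, "failed_at": None, "reason": None}
```

The result depends only on the chain and the key, which is what makes it deterministic.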

VOAF-M is the same chain, projected through a transformation that produces training-ready records. The chain itself does not change. The base VOAF entries are unmodified. VOAF-M is a derivation, not a replacement. A user who exports VOAF-M is exporting a view of the same chain, formatted for fine-tuning consumption.

This separation matters because it means the audit property is preserved regardless of whether the user ever uses VOAF-M for training. The chain is whole. The view is optional.
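
One way to picture "a view of the same chain" is a pure function from a base entry to a training record. The sketch below assumes a hypothetical base-entry layout (entry_hash, signature, and a body with timestamp, request, response, and user_feedback) and produces a reduced record; the function names are illustrative, not part of Vigil.

```python
import json


def project_entry(entry: dict) -> dict:
    """Build a reduced VOAF-M-style record without mutating the base entry."""
    body = entry["body"]  # assumption: hypothetical base-entry layout
    return {
        "id": entry["entry_hash"],
        "timestamp": body["timestamp"],
        "input": {"role": "user", "content": body["request"]},
        "output": {"role": "assistant", "content": body["response"]},
        "annotations": {"user_feedback": body["user_feedback"]},
        "verification": {"signature": entry["signature"]},
    }


def export_voaf_m(chain: list) -> str:
    """Return JSONL text: one projected record per line."""
    return "\n".join(json.dumps(project_entry(e), sort_keys=True) for e in chain)
```

The projection only reads from the chain. Running it zero times or a hundred times leaves the base entries byte-for-byte identical.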

The format

VOAF-M output is JSONL: one record per line, each record a self-contained JSON object suitable for ingestion by standard fine-tuning pipelines. The structure is:

{
  "id": "<voaf entry hash>",
  "timestamp": "<ISO 8601>",
  "context": {
    "session_id": "<opaque session identifier>",
    "preceding_turns": [...],
    "user_state": {...}
  },
  "input": {
    "role": "user",
    "content": "<request content>",
    "metadata": {...}
  },
  "output": {
    "role": "assistant",
    "content": "<response content>",
    "actions": [...],
    "metadata": {...}
  },
  "annotations": {
    "policy_decisions": [...],
    "detection_signals": {...},
    "user_feedback": "<accepted | corrected | rejected>",
    "correction": "<corrected output if applicable>"
  },
  "verification": {
    "voaf_chain_hash": "<hash>",
    "signature": "<signature>",
    "signing_root": "<root identifier>"
  }
}

The fields decompose along three axes.

The instruction-tuning surface is the input and output fields. These are the prompt-completion pairs that fine-tuning pipelines consume directly. Standard supervised fine-tuning over JSONL data ingests these without modification.
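
As a rough illustration, a loader can pull prompt-completion pairs out of a VOAF-M file in a few lines. The iter_pairs name and the prompt/completion output keys are assumptions; adjust them to whatever your pipeline expects.

```python
import json
from typing import Iterable, Iterator


def iter_pairs(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one prompt-completion pair per non-empty JSONL line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        yield {
            "prompt": record["input"]["content"],
            "completion": record["output"]["content"],
        }
```

Since an open file is an iterable of lines, iter_pairs can be called directly on a file handle for an exported JSONL file.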

The context window is the context.preceding_turns and context.user_state fields. These provide the multi-turn context that determines whether a given input-output pair makes sense in isolation. Many user interactions only have meaning in light of what came before. VOAF-M captures this without forcing the fine-tuning pipeline to reconstruct it from raw chain data.
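
A minimal sketch of using that context, assuming each item in preceding_turns is an object with role and content keys (the schema above elides the exact shape, so treat this as a guess). The to_messages helper is hypothetical.

```python
def to_messages(record: dict) -> list:
    """Flatten a record into a chat-style message list, oldest turn first."""
    # Assumption: each preceding turn looks like {"role": ..., "content": ...}.
    turns = record.get("context", {}).get("preceding_turns", [])
    messages = [{"role": t["role"], "content": t["content"]} for t in turns]
    messages.append(
        {"role": record["input"]["role"], "content": record["input"]["content"]}
    )
    messages.append(
        {"role": record["output"]["role"], "content": record["output"]["content"]}
    )
    return messages
```

With the earlier turns in front, a request like "Make it shorter" is paired with the draft it refers to.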

The annotation layer is the annotations field. This is where VOAF-M departs from a generic conversation log. Every entry carries a record of what the user actually thought of the response. If the user accepted the output and used it, the annotation reflects that. If the user corrected it, both the original and the correction are present. If the user rejected it entirely, that is recorded.

This last layer is what makes VOAF-M training data, not just conversation history. Fine-tuning a personal model on conversation history teaches the model to mimic past behavior, including the parts the user did not like. Fine-tuning on annotated VOAF-M teaches the model to converge on the user's actual preferences, weighted by their actual feedback.
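
One possible way to act on the annotations during supervised fine-tuning is to let the feedback decide what the model is asked to reproduce. The policy below is an assumption for illustration, not something the format prescribes: accepted outputs are kept, corrected outputs are replaced by the correction, and rejected outputs are dropped.

```python
from typing import Optional


def training_target(record: dict) -> Optional[str]:
    """Pick the text to train toward, or None to skip the record."""
    annotations = record["annotations"]
    feedback = annotations["user_feedback"]
    if feedback == "accepted":
        return record["output"]["content"]
    if feedback == "corrected":
        # Train on what the user wanted, not on what the model said.
        return annotations["correction"]
    if feedback == "rejected":
        return None
    raise ValueError(f"unknown user_feedback: {feedback!r}")
```

A preference-tuning setup could instead keep corrected records as pairs, with the correction as the preferred output and the original as the dispreferred one.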

Why the format is verifiable

The training records are derived from VOAF chain entries. Each VOAF-M record carries the chain hash and signature of its source entry. A consumer of the training data, including the user themselves, can confirm that:

The record was derived from a legitimate VOAF entry that the user authored.
The entry has not been modified between its original signing and its appearance in the training set.
The annotations are consistent with what was recorded at the time, not with what the user (or someone with access to the user's machine) would prefer to claim now.
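
The three checks above can be sketched as a single function. Everything here is assumed for illustration: the base-entry layout, the chain_by_hash index, the idea that annotations live inside the signed body, and HMAC-SHA256 as a stand-in for the real signature scheme.

```python
import hashlib
import hmac
import json


def canonical(obj) -> bytes:
    """Deterministic serialization, so hashes are reproducible."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def check_record(record: dict, chain_by_hash: dict, key: bytes) -> list:
    """Return a list of problems; an empty list means the record checks out."""
    entry = chain_by_hash.get(record["id"])
    if entry is None:
        return ["no source entry with this hash"]
    problems = []
    # 1. The source entry still hashes to the value the record points at.
    payload = {"prev_hash": entry["prev_hash"], "body": entry["body"]}
    if hashlib.sha256(canonical(payload)).hexdigest() != record["id"]:
        problems.append("source entry was modified after signing")
    # 2. The signature carried by the record matches the entry hash.
    expected = hmac.new(key, record["id"].encode("ascii"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, record["verification"]["signature"]):
        problems.append("signature does not match")
    # 3. The annotations match what the signed entry recorded at the time.
    if record["annotations"] != entry["body"]["annotations"]:
        problems.append("annotations differ from the signed entry")
    return problems
```

Returning a list rather than a boolean lets a training pipeline log why a record was excluded.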

This matters in three scenarios.

The user's own confidence in their training data. A personal model trained on VOAF-M records is trained on records the user can confirm are genuine. A record seeded into the training set after the fact, or an annotation edited retroactively, fails verification against the chain, so adversarial examples and revised feedback cannot enter the corpus unnoticed.

Enterprise context. An organization fine-tuning a model on the audit trail of an agent's behavior in production wants the training data to represent what the agent actually did, not what someone after the fact wishes the agent had done. The verification chain prevents both accidental and deliberate tampering.

Cross-organization audit, in regulated environments. A regulator examining a fine-tuned model can request the training corpus and verify that the corpus matches the audit trail of the system being modeled. The model's behavior becomes traceable to the underlying interactions that produced it.

Why training data, specifically

We get asked why we built this for training rather than for analytics, summarization, or other downstream uses of audit data. Three reasons.

Training is the use case where data quality matters most. Analytics tolerates noise. Summarization tolerates approximation. Fine-tuning a model on noisy data produces a noisy model, and the noise compounds through subsequent inference. Audit data that has been verified end-to-end is the quietest training data a user can produce, because every entry has been signed at the moment of authorship and cannot be silently modified.

Personal models are the natural endpoint of this product line. Vigil's product strategy converges on a layer that knows the user well enough to predict, prevent, and personalize at a level no cloud model can match unless the user surrenders the training data to the provider. VOAF-M is the format that makes this possible without the surrender. The user retains the training data on their machine. The training run can happen locally or on a cloud provider that the user selects, with the data scoped to that single run.

Existing fine-tuning formats do not solve this. OpenAI's JSONL format, Anthropic's message format, and the various open-source instruction-tuning formats are all designed for cloud providers receiving training data from customers. None of them carry the verification fields that make the data trustworthy beyond the moment of upload. A training format designed around personal data ownership has different requirements.

What VOAF-M does not do

VOAF-M does not encrypt the training data. The chain is signed but not encrypted. If you transmit a VOAF-M file to a third party, the third party can read it. Encryption is a separate concern, handled by transport security if you choose to move the data, and by disk encryption if it stays local. Verifiability and confidentiality are orthogonal.

VOAF-M does not select training records for you. The export is the full chain, projected. If you want to train on a subset (one date range, one project, one type of interaction), you filter the JSONL before passing it to your fine-tuning pipeline. We do not build the filter into the format because filter semantics are use-case specific.
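
For example, a date-range filter over the exported JSONL takes only a few lines. This sketch assumes each timestamp carries a timezone offset or a trailing "Z"; the helper names are illustrative.

```python
import json
from datetime import datetime, timezone


def parse_ts(value: str) -> datetime:
    # Older Python versions reject a trailing "Z" in fromisoformat,
    # so normalize it to an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def filter_by_date(lines, start: datetime, end: datetime):
    """Yield JSONL lines whose timestamp falls in [start, end)."""
    for line in lines:
        if not line.strip():
            continue
        record = json.loads(line)
        # Assumption: timestamps are timezone-aware, like start and end.
        if start <= parse_ts(record["timestamp"]) < end:
            yield line.rstrip("\n")
```

The filter passes lines through untouched instead of re-serializing them, so the kept records are exactly the text that was exported.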

VOAF-M does not produce the fine-tuned model. It produces training data. The training itself happens in whatever pipeline you use. We have tested with standard PEFT and LoRA pipelines on Apple Silicon with MLX. The format is designed to be pipeline-agnostic.

The export tiers

Because users have different comfort levels with what an export looks like, the Vigil product surfaces three export modes:

VOAF audit (JSON) exports the raw chain entries with signatures. This is the format for handing data to a regulator, an auditor, or your own forensic review. Verifiable, complete, formal.

VOAF-M training-ready (JSONL) exports the projection described in this post. Suitable for direct ingestion by fine-tuning pipelines.

Conversations (Markdown ZIP) exports human-readable transcripts of the underlying interactions, with metadata stripped, signatures removed, and the cryptographic envelope discarded. This is the format for reviewing your own history, sharing a specific session with someone, or migrating your conversation context into another tool that does not understand VOAF.

The three exports are derived from the same chain. The chain is the source of truth. The exports are views.

Why this matters now

Most users have not built a personal model layer yet. Two developments bring it within reach: the cost of fine-tuning a useful 7B model has dropped to single-digit dollars per run, and Apple Silicon now runs LoRA adapters on consumer hardware at usable speeds. The bottleneck has shifted from compute to data quality.

Users who have spent six months running every AI request through a Vigil chain have, at the end of those six months, a clean, verified, annotated training corpus that they own and that nobody else has. This is the asymmetric advantage of local-first audit. The data accumulates as a side effect of the security function. By the time the user is ready to fine-tune, the corpus is already there.

VOAF-M is the format that makes the corpus usable.
