From model card to cryptographic receipt.

Not "we have logs." Not "here's our process doc." Cryptographic proof. Third-party verifiable. Zero data egress.

Why evaluations don’t travel

Red-teaming, adversarial testing, and safety evaluations produce real artifacts inside the lab — transcripts, eval harness outputs, model-behavior traces. The gap is at the handoff: those artifacts weren’t designed to be independently verified at the regulator, the customer, or the standards body. PDFs and screenshots carry the claim; nothing carries the proof.

The EU AI Act requires "appropriate logging." The White House voluntary commitments mention "red teaming." None of them say what counts as proof.

This gap exists industry-wide: evaluations produced inside the lab rarely travel intact to the regulator or customer.

What proof actually looks like

GLACIS creates cryptographic evidence that your safety testing happened — without your test data ever leaving your environment.

1

You run your evaluation

Red team test, safety eval, whatever. GLACIS wraps the call and captures what happened.

2

We hash, you keep

Your prompts and outputs are hashed locally. Only the cryptographic commitment leaves your environment.

3

Third party witnesses

An independent witness network timestamps and signs. Anchored in a transparency log. Verifiable forever.

from glacis import Glacis

glacis = Glacis()

# Your red team evaluation
receipt = glacis.attest(
    service_id="safety-eval",
    operation_type="red_team_test",
    input={"prompt": adversarial_prompt},   # Hashed locally, never sent
    output={"response": model_output},       # Hashed locally, never sent
    metadata={
        "model": "llama-3-70b",
        "test_suite": "harmbench",
        "evaluator": "safety-team"
    }
)

# Share this with auditors, regulators, the public
print(receipt.verification_url)
# → https://glacis.io/verify/att_7f3k...

What you can prove

Red team testing happened

Cryptographic evidence that adversarial prompts were actually evaluated by your model, at a specific time, with specific results.

Auditors can verify without seeing your test data.

Model cards are real

Your safety claims link to verifiable attestations. "Tested on HarmBench" becomes a checkable fact, not a marketing claim.

Model cards with teeth.

Experiments are reproducible

Prove you ran this exact model on this exact data at this exact time. Timestamped, witnessed, logged.

For papers, audits, or your own records.

Data lineage is clean

Attest the provenance of your training data without exposing the data itself. Ready for dataset audits.

Prove what went in without showing what went in.

Your data never leaves.

Not "we don't store it." Not "we delete it after." It never leaves your environment at all.

Prompts and outputs

SHA-256 hash only. The content stays with you.

Training data

Never transmitted. Ever.

Model weights

Never transmitted. Ever.

Like notarizing a document without the notary reading it.

You did the safety work.
Now prove it.

Open source SDK. No commitment to start. Production-grade cryptography.

$ pip install glacis