From Model Card to Cryptographic Receipt

Why evaluations don’t travel

Red-teaming, adversarial testing, and safety evaluations produce real artifacts inside the lab — transcripts, eval harness outputs, model-behavior traces. The gap is at the handoff: those artifacts weren’t designed to be independently verified at the regulator, the customer, or the standards body. PDFs and screenshots carry the claim; nothing carries the proof.

The EU AI Act requires "appropriate logging." The White House voluntary commitments mention "red teaming." None of them say what counts as proof.

This gap exists industry-wide: evaluations produced inside the lab rarely travel intact to the regulator or customer.

What proof actually looks like

GLACIS creates cryptographic evidence that your safety testing happened — without your test data ever leaving your environment.

1

You run your evaluation

Red team test, safety eval, whatever. GLACIS wraps the call and captures what happened.

2

We hash, you keep

Your prompts and outputs are hashed locally. Only the cryptographic commitment leaves your environment.

3

Third party witnesses

An independent witness network timestamps and signs. Anchored in a transparency log. Verifiable forever.

from glacis import Glacis

glacis = Glacis()

# Your red team evaluation
receipt = glacis.attest(
    service_id="safety-eval",
    operation_type="red_team_test",
    input={"prompt": adversarial_prompt},   # Hashed locally, never sent
    output={"response": model_output},       # Hashed locally, never sent
    metadata={
        "model": "llama-3-70b",
        "test_suite": "harmbench",
        "evaluator": "safety-team"
    }
)

# Share this with auditors, regulators, the public
print(receipt.verification_url)
# → https://glacis.io/verify/att_7f3k...

What you can prove

Red team testing happened

Cryptographic evidence that adversarial prompts were actually evaluated by your model, at a specific time, with specific results.

Auditors can verify without seeing your test data.

Model cards are real

Your safety claims link to verifiable attestations. "Tested on HarmBench" becomes a checkable fact, not a marketing claim.

Model cards with teeth.

Experiments are reproducible

Prove you ran this exact model on this exact data at this exact time. Timestamped, witnessed, logged.

For papers, audits, or your own records.

Data lineage is clean

Attest the provenance of your training data without exposing the data itself. Ready for dataset audits.

Prove what went in without showing what went in.

Your data never leaves.

Not "we don't store it." Not "we delete it after." It never leaves your environment at all.

Prompts and outputs

SHA-256 hash only. The content stays with you.

Training data

Never transmitted. Ever.

Model weights

Never transmitted. Ever.

Like notarizing a document without the notary reading it.

You did the safety work.
Now prove it.

Open source SDK. No commitment to start. Production-grade cryptography.

Get the SDK View on GitHub

$ pip install glacis

Navigate

Solutions

Industries

From model card
to cryptographic receipt.

Why evaluations don’t travel

What proof actually looks like

You run your evaluation

We hash, you keep

Third party witnesses

What you can prove

Red team testing happened

Model cards are real

Experiments are reproducible

Data lineage is clean

Your data never leaves.

You did the safety work.
Now prove it.

From model card to cryptographic receipt.

Why evaluations don’t travel

What proof actually looks like

You run your evaluation

We hash, you keep

Third party witnesses

What you can prove

Red team testing happened

Model cards are real

Experiments are reproducible

Data lineage is clean

Your data never leaves.

You did the safety work.Now prove it.

From model card
to cryptographic receipt.

You did the safety work.
Now prove it.