The American Bar Association's Formal Opinion 512 on Generative Artificial Intelligence Tools, issued in July 2024, lays out the ethical framework U.S. lawyers operate under when they use AI. It covers six obligations under the Model Rules: competence, confidentiality, communication, candor toward the tribunal, supervisory responsibilities, and fees.
The opinion is written for lawyers. It tells them what they have to do. It does not tell engineers what they have to build.
But every one of the six obligations has technical implications that fall on the AI tool. If your product makes it impossible (or even hard) for a lawyer to comply with one of these duties, your product gets banned by the firm's IT committee or, worse, ends up cited in the next sanctions opinion. If your product makes compliance straightforward, it becomes the firm's preferred vendor.
This guide is the engineering translation. Each section takes one ABA obligation, identifies the technical surface area it touches, and gives a concrete implementation pattern. It also covers the state-by-state landscape, since ABA opinions are not binding and the actual rules vary by jurisdiction.
Why this matters now
In Q1 2026 alone, U.S. courts imposed roughly $145,000 in sanctions for AI-related filing errors. Oregon issued a $110,000 sanction, Nebraska handed down the first license suspension tied to AI fabrications, and in April 2026 Sullivan & Cromwell apologized to a federal bankruptcy judge for a brief containing AI-generated citations.
Every one of those incidents is, at root, an ABA 512 violation: a lawyer failed to verify AI output (competence), failed to disclose AI use to the tribunal (candor), or failed to supervise the work product the tool produced. The defense in each case rested on what the lawyer could reconstruct about how the AI was used. Tools that produce audit trails make that defense possible. Tools that do not produce audit trails make the lawyer the easy target.
Firms have noticed. AI procurement at most large firms now runs through a 50-question vendor security assessment, and ABA 512 alignment is on it. Building this in is no longer optional for products selling into BigLaw or AmLaw 100.
The six obligations, translated
1. Competence (Model Rule 1.1)
ABA's requirement. Lawyers must understand the capabilities and limitations of the AI tools they use. They cannot blindly rely on outputs.
Engineering translation. Your tool needs to give lawyers the information they need to spot failures. Three specific things:
- Confidence signals. When the model is uncertain, surface that. Do not present low-confidence output the same way you present high-confidence output. Visual treatment, explicit "verify this citation" prompts, or refusal-with-explanation when confidence falls below a threshold. A sketch follows this list.
- Source visibility. Every AI output that draws on external authority should show which authorities. Not as a footnote the lawyer can ignore, but as a structural part of the UI that makes "what did this come from" the obvious next click.
- Eval coverage transparency. Document what your tool has been evaluated on and what it has not. Harvey publishes information about its BigLaw Bench performance. Spellbook publishes its accuracy on standard contract clauses. This is not marketing fluff; it is competence documentation that lawyers cite when their use of the tool gets challenged.
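A minimal sketch of the confidence-signal idea from the first bullet, assuming a per-span confidence score is available from the model or a downstream verifier. The threshold values and treatment labels are illustrative, not any vendor's API:

```python
from dataclasses import dataclass

VERIFY_THRESHOLD = 0.85  # illustrative cutoff; tune against your own eval data

@dataclass
class DraftSpan:
    text: str
    confidence: float  # score in [0, 1] from the model or a verifier

def presentation(span: DraftSpan) -> str:
    """Map confidence to a visibly different treatment for the lawyer."""
    if span.confidence >= VERIFY_THRESHOLD:
        return "normal"               # standard rendering
    if span.confidence >= 0.5:
        return "flag-verify"          # highlighted, with a "verify this citation" prompt
    return "refuse-with-explanation"  # withhold the output and say why
```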
The competence obligation is also why public, domain-specific evals matter as a category. A lawyer who can point to "this tool scored X on the standard legal grounding benchmark" has a stronger competence defense than a lawyer who can point to "the vendor told me it works."
2. Confidentiality (Model Rule 1.6)
ABA's requirement. Lawyers must make reasonable efforts to prevent inadvertent or unauthorized disclosure of client information. The opinion specifically calls out that if the AI tool retains user input, lawyers should locate a more advanced tool, negotiate stronger confidentiality protections, or only input non-sensitive information.
Engineering translation. This is the obligation with the most technical surface area, and the one where most legal AI products differentiate.
- Zero data retention (ZDR). Your tool must not retain the prompts and outputs containing client data. This means contracting with model providers that offer ZDR endpoints (Anthropic, OpenAI, and Google all do for enterprise) and routing every request through those endpoints. If your gateway or routing layer does not enforce ZDR provider selection, you have a confidentiality leak.
- No-train clauses. Vendor agreements with your model providers must include explicit no-train language. The model provider cannot use the law firm's prompts and outputs to improve models for other customers. This is an MSA-level negotiation, not a technical setting.
- PII and identifier redaction. Even with ZDR, defense in depth means stripping client identifiers before they leave your environment when the matter does not require them. Names, account numbers, social security numbers, case numbers. A redaction layer at the gateway is the standard pattern.
- Tenant isolation. Multi-tenant SaaS legal AI must isolate firm data. One firm's prompts and outputs never end up in another firm's retrieval context, even by accident. Test for this with an explicit cross-tenant retrieval test in your CI; a sketch follows this list.
- Encryption at rest and in transit. Table stakes for any enterprise SaaS. Document the encryption standards in your SOC 2 report.
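The cross-tenant retrieval test from the tenant-isolation bullet might look like the sketch below. `retrieval_index` stands in for your retrieval layer (assumed here to be a pytest fixture); the canary-string assertion pattern is the point, not the specific API:

```python
def test_no_cross_tenant_retrieval(retrieval_index):
    # Seed two tenants with distinctive, non-overlapping canary documents
    retrieval_index.ingest(tenant="firm-a", doc_id="a-1",
                           text="FIRM-A-CANARY acquisition memo")
    retrieval_index.ingest(tenant="firm-b", doc_id="b-1",
                           text="FIRM-B-CANARY litigation hold")

    # Query as firm B using firm A's canary phrase
    hits = retrieval_index.search(tenant="firm-b", query="FIRM-A-CANARY")

    # Nothing from firm A may ever appear in firm B's retrieval context
    assert all(hit.tenant == "firm-b" for hit in hits)
    assert not any("FIRM-A-CANARY" in hit.text for hit in hits)
```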
A working pattern using a gateway that handles routing, redaction, and audit:
```python
# Routing pattern that enforces ZDR + redaction
import respan
from respan.gateway import Gateway

client = Gateway(
    routing_policy="zdr-only",   # only providers with ZDR contracts
    pii_redaction="aggressive",  # strip client identifiers before send
    no_train_required=True,      # fail closed if no-train is not contractually guaranteed
    audit_log=True,              # full request log persisted for compliance
)

@respan.trace(matter_id_param="matter_id", privilege="attorney-client")
def draft_motion(query, exhibits, matter_id):
    response = client.complete(
        model="claude-opus",
        messages=[{"role": "user", "content": query}],
        metadata={"matter_id": matter_id},
    )
    return response
```

The privilege tag on the trace decorator is what lets compliance review the audit log without breaking privilege. Logs containing privileged content get access-controlled, not exposed.
3. Communication (Model Rule 1.4)
ABA's requirement. Lawyers must communicate with clients about AI use, especially when AI use materially affects the engagement (cost, scope, or risk).
Engineering translation. This is mostly a product decision, but engineering supports it.
- Per-matter usage logging. Your tool needs to track which matters used AI and how. When a client asks "did you use AI on my case," the firm needs to be able to answer with specifics, not vibes. See the sketch after this list.
- Disclosure UI. Help lawyers add AI use disclosure to engagement letters and matter notes. A template that pulls from the matter's actual AI usage is more useful than a boilerplate that says "we may use AI."
- Output provenance. When a paragraph in a draft was AI-generated and edited by a lawyer, that history should be visible to the lawyer (and exportable on request). Not surfaced to the client by default, but available if disclosure becomes necessary.
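A minimal sketch of the per-matter usage logging from the first bullet, which can back both the client answer and the disclosure template. The field names and in-memory store are illustrative assumptions; a production system would persist this alongside the audit log:

```python
import datetime as dt
from collections import defaultdict

usage_log: dict[str, list[dict]] = defaultdict(list)

def record_usage(matter_id: str, task: str, model: str) -> None:
    usage_log[matter_id].append({
        "task": task,    # e.g. "summarize deposition transcript"
        "model": model,
        "at": dt.datetime.now(dt.timezone.utc).isoformat(),
    })

def disclosure_summary(matter_id: str) -> str:
    """Answer "did you use AI on my case" with specifics, not vibes."""
    entries = usage_log[matter_id]
    if not entries:
        return "No AI tools were used on this matter."
    tasks = sorted({e["task"] for e in entries})
    return (f"AI assistance was used {len(entries)} times on this matter, "
            f"for: {', '.join(tasks)}.")
```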
4. Candor Toward the Tribunal (Model Rules 3.1, 3.3, 8.4(c))
ABA's requirement. Lawyers must not present false or misleading material to a court. If they discover after the fact that material in a filing was false, they have to correct it. Some local rules now require proactive disclosure of AI use in filings.
Engineering translation. This is where audit trails become hard requirements rather than nice-to-haves.
- Filing-grade audit logs. For any AI output that ends up in a court filing, you need a complete reconstruction of how it was generated. Query, retrieval set, model version, prompt template, full output, post-processing applied, lawyer edits. Persisted with cryptographic integrity (hash chains or signed entries) so the audit cannot be altered after the fact. A minimal hash-chain sketch follows this list.
- Citation verification logs. Specifically for citations, log the verification status at the time of generation. A cite that passed existence and alignment checks but was later determined to be fabricated should show up in the audit as "verification passed at time T1, regression discovered at time T2" with the reason. This is the difference between a defensible position and a sanctionable one.
- Proactive disclosure helpers. Some courts now require parties to disclose AI use. Your tool should make it easy to generate a court-compliant disclosure statement from the matter's AI usage log. This is product-engineering work, not a feature lawyers will build themselves.
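A minimal sketch of the hash-chain pattern, using only the standard library. Field names are illustrative, and a production system would also sign entries and anchor the chain outside the application's own storage:

```python
import hashlib
import json
import time

def append_entry(log: list[dict], event: dict) -> dict:
    """Append an audit event whose hash covers the previous entry's hash."""
    body = {
        "ts": time.time(),
        "event": event,  # query, retrieval set, model version, edits, ...
        "prev_hash": log[-1]["hash"] if log else "genesis",
    }
    body["hash"] = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode()
    ).hexdigest()
    log.append(body)
    return body

def verify_chain(log: list[dict]) -> bool:
    """Any after-the-fact alteration breaks every downstream hash."""
    prev = "genesis"
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        expected = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()
        ).hexdigest()
        if entry["hash"] != expected or entry["prev_hash"] != prev:
            return False
        prev = entry["hash"]
    return True
```

The citation-verification history from the second bullet falls out naturally: "verification passed at T1" and "regression discovered at T2" are simply two chained events with their own timestamps.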
The Charlotin database, which tracks legal AI hallucination cases, has now logged more than 1,353 incidents globally. The pattern in nearly every case is the same: the lawyer cannot reconstruct what the AI did, and so cannot defend why the bad cite ended up in the brief. Tools that produce reconstructable traces protect their users in exactly this scenario.
5. Supervisory Responsibilities (Model Rules 5.1, 5.3)
ABA's requirement. Managerial and supervisory lawyers must establish policies governing AI use. Senior lawyers are responsible for the work of juniors, including AI-assisted work.
Engineering translation. Your tool needs to support supervision as a workflow primitive, not as an afterthought.
- Approval gates. Multi-step AI agents (M&A due diligence, fund formation, complex motion drafting) should support explicit human-in-the-loop checkpoints. The agent pauses at defined milestones and waits for senior lawyer review before continuing. Skipping the gate without explicit override should be impossible. A sketch follows this list.
- Reviewer attribution. Every AI output in a draft should track who reviewed it and when. A junior associate accepts an AI-generated clause; a senior partner reviews and either accepts or revises. Both events are logged. When the brief gets challenged, the supervisory chain is reconstructable.
- Policy enforcement. Firms set policies (no AI for criminal defense, no AI without partner review for M&A above $X). Your product should let firm administrators encode those policies and enforce them automatically. A user who tries to use AI on a matter type the firm has restricted should hit a hard block, not a polite warning.
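A sketch of a hard approval gate, assuming a hypothetical `review_queue` that records senior sign-offs and a hypothetical `execute` step runner. The property worth copying is that there is no code path past the gate without an approval on record:

```python
APPROVAL_GATES = {"draft_filing", "final_motion"}  # firm-defined checkpoints

class ApprovalPending(Exception):
    """Raised to pause the workflow until a senior lawyer signs off."""

def run_step(step: str, matter_id: str, review_queue) -> None:
    if step in APPROVAL_GATES and not review_queue.approved(matter_id, step):
        review_queue.request(matter_id, step)  # notify the reviewing partner
        raise ApprovalPending(f"'{step}' awaits senior review on {matter_id}")
    execute(step, matter_id)  # hypothetical step executor
```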
6. Fees (Model Rule 1.5)
ABA's requirement. Fees and disbursements must be reasonable. Lawyers billing hourly must bill for actual time worked. Lawyers passing AI costs through to clients must do so as legitimate disbursements, not general overhead.
Engineering translation. Cost attribution.
- Per-matter cost tracking. Every API call, every model token, every storage operation should attribute to a matter. Firms cost-allocate AI infrastructure across matters; your billing data should make that allocation accurate rather than approximate. A cost-attribution sketch follows this list.
- Cost transparency to lawyers. When a lawyer runs a complex agent (a long-horizon due diligence agent, for example), they should see what it cost. Not necessarily during the interaction, but accessible later when they need to justify the disbursement.
- Pricing model alignment. This is a product decision rather than engineering, but worth flagging. Per-seat pricing aligns with hourly billing. Per-action or per-matter pricing aligns with flat-fee billing. Tools that pretend their pricing model does not affect the lawyer's billing decisions are creating downstream conflicts.
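A minimal sketch of the per-matter cost attribution from the first bullet. The per-token prices are placeholders, not real rate cards, and a gateway would normally capture this automatically:

```python
from collections import defaultdict

PRICE_PER_1K = {"input": 0.015, "output": 0.075}  # placeholder $/1K tokens

matter_costs: dict[str, float] = defaultdict(float)

def attribute_call(matter_id: str, input_tokens: int, output_tokens: int) -> float:
    """Attribute one API call's cost to its matter."""
    cost = (input_tokens / 1000) * PRICE_PER_1K["input"] \
         + (output_tokens / 1000) * PRICE_PER_1K["output"]
    matter_costs[matter_id] += cost
    return cost

# Later, when a lawyer needs to justify the disbursement:
# f"Matter 2026-0042 AI spend: ${matter_costs['2026-0042']:.2f}"
```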
The state-by-state landscape
ABA opinions are not binding. They are guidance that state bars cite when issuing their own ethics opinions. As of mid-2026, the state landscape is fragmented. A non-exhaustive map:
- Florida issued early guidance (2024) tracking the ABA framework closely.
- North Carolina 2024 Formal Ethics Opinion 1 codified specific requirements for confidentiality and supervision when using AI.
- Illinois Supreme Court AI Policy (effective January 2025) requires courts and counsel to comply with specific AI-use disclosure requirements.
- District of Columbia Bar Ethics Opinion 388 (April 2024) covered confidentiality and competence in AI use.
- Kentucky Bar Association Ethics Opinion KBA E-457 (March 2024) addressed similar territory.
- Michigan State Bar JI-155 (October 2023) addressed judicial use of AI.
- New Jersey issued preliminary guidelines for lawyers' AI use.
- California, New York, and Texas have not yet issued comprehensive ABA 512-equivalent opinions, but draft guidance is circulating in their respective bar committees.
For tools selling nationally, the practical implication is to build to the strictest interpretation. If Illinois requires AI use disclosure in court filings and California does not, your tool should support disclosure as a feature; the California user simply does not enable it. Building two products (a strict-state version and a permissive-state version) is more expensive than a single configurable product.
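As a sketch of the single-configurable-product approach, jurisdiction rules can live in one table with a strict default. The flags below are illustrative assumptions; check them against current local rules rather than copying them:

```python
JURISDICTION_RULES = {
    "IL": {"court_filing_disclosure": True},   # per the Illinois policy above
    "CA": {"court_filing_disclosure": False},  # feature exists; the firm enables it
}

STRICT_DEFAULT = {"court_filing_disclosure": True}

def disclosure_required(jurisdiction: str) -> bool:
    # Unmapped jurisdictions fall back to the strictest interpretation
    return JURISDICTION_RULES.get(jurisdiction, STRICT_DEFAULT)["court_filing_disclosure"]
```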
The European Union's AI Act, generally applicable from August 2026, adds another layer for any tool with EU deployment. Most legal AI aimed at law firms is not classified as "high-risk" under the Act, but transparency and record-keeping requirements still apply. The audit trail you build for ABA 512 candor obligations doubles as your AI Act conformity record. There is no good reason to build these separately.
A compliance-aligned architecture
Putting it all together, the architecture for a legal AI tool that aligns with ABA 512 from day one looks like this.
[User: Lawyer]
|
v
[Authentication + Firm Policy Check]
| (denies if matter type restricted by firm)
v
[Tracing layer: matter_id, attorney, privilege tag]
|
v
[PII Redaction Gateway]
|
v
[Model Routing: ZDR providers only]
|
v
[Generation + Citation Grounding]
|
v
[Verification: existence + alignment]
|
v
[Output + Source UI + Confidence Signals]
|
v
[Lawyer Review + Edit Tracking]
|
v
[Audit Log: full reconstruction, signed]
Each layer maps to an obligation. Authentication and policy check serves supervisory responsibilities. Tracing serves candor. The redaction gateway and ZDR routing serve confidentiality. Output UI and confidence signals serve competence. Edit tracking serves communication and supervision. The audit log serves all of them.
In Respan, the tracing and gateway pieces compose like this:
```python
import respan
from respan.gateway import Gateway

# Firm-level policy enforced at the gateway
gateway = Gateway(
    routing_policy="zdr-only",
    pii_redaction="aggressive",
    audit_log=True,
    firm_policies=load_firm_policies("acme-firm"),
)

@respan.trace(
    matter_id_param="matter_id",
    attorney_param="attorney_id",
    privilege="attorney-client",
    require_review=["draft_filing", "final_motion"],  # gates that need human approval
)
def legal_workflow(matter_id, attorney_id, query):
    # All outputs from this workflow inherit the privilege tag
    # Audit log captures the full chain
    # Approval gates pause the workflow at named checkpoints
    response = gateway.complete(...)
    return response
```

The decorators handle the boring compliance plumbing (audit logs, privilege tagging, approval gates) so the application logic can focus on the legal task. The audit log is queryable per matter, per attorney, per date range; the standard discovery and sanctions-defense use cases are direct queries against this index.
What firms ask for in vendor reviews
If you are selling legal AI into a BigLaw firm in 2026, the security review will ask roughly these questions. Have answers ready.
- Do you offer ZDR with all upstream model providers? (Required.)
- Are there any cases in which firm prompts or outputs are retained? (Be specific. Eval data, debugging data, customer support copies.)
- Is there a no-train clause in your MSA with each model provider? (Show the clauses.)
- Can the firm export full audit logs for any matter on demand? (Format, retention period, encryption.)
- Can the firm administrator configure policies (which matter types allow AI, which require senior approval, which prohibit specific tool usage)?
- What is your SOC 2 status? Can we see the latest report?
- Where is the data stored? Is there a U.S.-only or EU-only hosting option?
- How do you handle subpoenas or government requests for firm data? (Notify-the-firm policy or not?)
- What is the breach notification SLA?
Building these in is meaningfully cheaper than retrofitting them. ABA 512 is the architecture, not a checklist you add at the end.
How Respan fits
ABA 512 compliance is fundamentally an instrumentation problem: every prompt, retrieval, citation, and lawyer edit has to be reconstructable, and the only way to get that cleanly is to make it the substrate underneath your legal AI app. Respan is built to be that substrate.
- Tracing: every legal AI workflow captured as one connected trace, from authentication and policy check through redaction, ZDR routing, generation, citation grounding, verification, and lawyer review. Auto-instrumented for LangChain, LlamaIndex, Vercel AI SDK, CrewAI, AutoGen, OpenAI Agents SDK. For ABA 512, tracing is the candor obligation in code: when a sanctioned cite shows up in a brief, the matter-tagged, privilege-tagged trace is what lets the firm reconstruct exactly what happened and when.
- Evals: ten built-in evaluators (faithfulness, citation accuracy, refusal correctness, harmfulness) plus LLM-as-judge and custom Python evaluators. Production traffic flows directly into datasets. CI-aware experiments block regressions on fabricated citations, hallucinated holdings, broken pin cites, and confidentiality leaks before deploys ship.
- Gateway: 500+ models behind an OpenAI-compatible interface, semantic caching, fallback chains, per-customer spending caps. The gateway is where you enforce ZDR-only routing, no-train providers, aggressive PII and client-identifier redaction, and per-matter cost attribution, so confidentiality and fee obligations get handled at the routing layer rather than scattered across application code.
- Prompt management: versioned registry, dev/staging/prod environments with approval workflows, A/B testing in production with one-click rollback. Disclosure templates, motion-drafting prompts, due diligence agent prompts, and citation verification prompts all belong in the registry so the firm can audit which version of a prompt produced which filing.
- Monitors and alerts: citation verification failure rate, ZDR routing violations, cross-tenant retrieval hits, approval gate skips, per-matter cost overruns. Slack, email, PagerDuty, webhook. A spiking citation verification failure rate is a leading indicator of the next sanctions opinion, and you want to know within minutes, not when the judge finds out.
A reasonable starter loop for legal AI builders:
- Instrument every LLM call with Respan tracing including matter_id, attorney_id, privilege tag, retrieval set, citation verification status, and lawyer edit spans.
- Pull 200 to 500 production motions, memos, and contract review outputs into a dataset and label them for citation accuracy, holding fidelity, and confidentiality compliance.
- Wire two or three evaluators that catch the failure modes you most fear (fabricated citations, misstated holdings, leaked client identifiers); a sketch of the first follows this list.
- Put your motion-drafting, due diligence, and disclosure prompts behind the registry so you can version, A/B, and roll back without a deploy.
- Route through the gateway so ZDR routing, no-train enforcement, and per-matter cost attribution are guaranteed at the network boundary rather than trusted to application code.
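A sketch of the citation-existence evaluator from step 3. `lookup_citation` is a hypothetical stand-in for your citator or docket API; the fail-closed contract is the part worth copying:

```python
def lookup_citation(cite: str) -> bool:
    """Hypothetical stand-in for a citator/docket existence check."""
    raise NotImplementedError

def citation_exists_eval(citations: list[str]) -> dict:
    """Score an output's citations; one fabricated cite fails the run."""
    missing = [c for c in citations if not lookup_citation(c)]
    return {
        "score": 1.0 - len(missing) / max(len(citations), 1),
        "passed": not missing,  # fail closed
        "missing": missing,
    }
```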
Without this loop, the next $110,000 Oregon-style sanction or the next Sullivan & Cromwell apology to a federal judge lands on a tool that cannot produce an audit trail, and the firm's defense collapses to "the vendor told me it works."
To wire any of the patterns above on Respan, start tracing for free, read the docs, or talk to us.
Related reading
- Why Legal AI Still Hallucinates Citations: the specific failure mode that drives the candor obligation
- Building a Citation Grounding Eval for Legal AI: eval methodology that supports the competence obligation
- Building an AI Contract Review Agent: applying the architecture in code
- How Legal AI Teams Build LLM Apps in 2026: pillar overview
Get the checklist. Download the ABA 512 Engineering Compliance Checklist, a vendor-review-aligned worksheet that maps each ABA obligation to the technical controls auditors look for, plus sample audit log schemas. To talk through your specific compliance architecture, book a call with our team.
