AI for in-house legal teams: contract review and RAG

The legal department of a mid-sized manufacturing company receives between 40 and 120 contracts for review each month: NDAs, supplier agreements, amendments, orders. Most are standard documents in which the lawyer looks for deviations from the company template. Reading them page by page takes dozens of hours each month, which can be shortened if AI performs the first pass and flags the areas that need attention.

At Cashcrown, we’ve been studying these patterns for several years. Below, we describe what works, where the hard limits are, and what an architecture that doesn’t create regulatory risk looks like.

Triage and contract review

Triage is the first step: before the lawyer opens the file, the system classifies the document by type (NDA, framework agreement, order, amendment), priority, and degree of deviation from the template.

The classifier assigns the document to a category based on content, not the file name. This is important because suppliers often name contracts ambiguously or upload them as scans without titles.
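As an illustration of classifying by content rather than file name, here is a minimal sketch. The keyword lists and the `classify` function are hypothetical; a production classifier would more likely be a trained model, but the input contract is the same: the text (for scans, the OCR output), never the file name.

```python
# Hypothetical keyword lists per document type (illustrative only).
KEYWORDS = {
    "NDA": ["confidential", "non-disclosure", "disclosing party"],
    "framework agreement": ["framework", "call-off", "general terms"],
    "order": ["purchase order", "quantity", "delivery date"],
    "amendment": ["amendment", "hereby amended", "annex"],
}


def classify(text: str) -> tuple[str, float]:
    """Return (document type, share of keyword hits) using content only."""
    lowered = text.lower()
    hits = {
        doc_type: sum(lowered.count(keyword) for keyword in keywords)
        for doc_type, keywords in KEYWORDS.items()
    }
    total = sum(hits.values())
    if total == 0:
        # Nothing recognizable: leave the decision to a human.
        return "unknown", 0.0
    best = max(hits, key=hits.get)
    return best, hits[best] / total


if __name__ == "__main__":
    sample = "The Disclosing Party shall keep all Confidential information confidential."
    print(classify(sample))
```

A document that returns "unknown" or a low share would go to manual triage instead of being forced into a category.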

After classification, the system compares clauses with the company’s template for that document type. Deviations are added to a review list with the page number and risk level assessment (low / medium / high based on predefined criteria). The lawyer sees first what truly differs from the standard, not the entire text from the beginning.
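The comparison step can be sketched as follows. This is a simplified illustration that assumes clauses are already split and keyed by name; the `Deviation` type, the `HIGH_RISK_CLAUSES` set, and the similarity thresholds are hypothetical stand-ins for the predefined criteria, and plain text similarity from the standard library replaces whatever comparison a real system uses.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional


@dataclass
class Deviation:
    clause: str
    page: Optional[int]  # None when the clause is missing from the contract
    similarity: float
    risk: str


# Hypothetical criteria: any deviation in these clauses counts as high risk.
HIGH_RISK_CLAUSES = {"liability", "penalties"}


def find_deviations(template, contract, threshold=0.9):
    """template: {clause name: text}; contract: {clause name: (text, page)}."""
    deviations = []
    for name, template_text in template.items():
        if name not in contract:
            deviations.append(Deviation(name, None, 0.0, "high"))
            continue
        text, page = contract[name]
        similarity = SequenceMatcher(None, template_text, text).ratio()
        if similarity >= threshold:
            continue  # close enough to the template, nothing to review
        if name in HIGH_RISK_CLAUSES:
            risk = "high"
        elif similarity < 0.6:
            risk = "medium"
        else:
            risk = "low"
        deviations.append(Deviation(name, page, round(similarity, 2), risk))
    return deviations


if __name__ == "__main__":
    template = {"liability": "Liability is capped at the contract value."}
    contract = {"liability": ("Liability is unlimited.", 4)}
    for deviation in find_deviations(template, contract):
        print(deviation)
```

The output is a review list, not a verdict: each entry only says where the text differs and how the predefined criteria rank it.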

The hard limit here: AI flags deviations but does not assess whether the deviation is acceptable in the context of that specific business relationship, negotiation history, and company priorities. That legal assessment belongs to the lawyer.

Clause extraction and obligation tracking

Data extraction from contracts is one of the most mature AI use cases in law. Contracts have predictable structures, repetitive clauses, and defined fields that need to be extracted: effective dates, termination deadlines, contractual penalties, value thresholds, contracting parties, jurisdiction.

For in-house teams, the most valuable output is a register of obligations and deadlines. After processing the contract portfolio, the system creates a table with expiration dates, termination deadlines, and recurring obligations (e.g., quarterly reporting requirements). An alert is sent to the lawyer or process owner 30-60 days before the deadline.
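A register with an alert window can be sketched in a few lines. The `Obligation` fields and the `due_alerts` function are hypothetical names; the 60-day default corresponds to the upper end of the 30-60 day range mentioned above.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Obligation:
    contract_id: str
    description: str
    deadline: date
    owner: str  # lawyer or process owner who receives the alert


def due_alerts(register, today, lead_days=60):
    """Return obligations whose deadline falls within the next lead_days days."""
    horizon = today + timedelta(days=lead_days)
    upcoming = [o for o in register if today <= o.deadline <= horizon]
    return sorted(upcoming, key=lambda o: o.deadline)


if __name__ == "__main__":
    register = [
        Obligation("C-001", "termination notice deadline", date(2025, 3, 1), "legal"),
        Obligation("C-002", "quarterly report", date(2025, 6, 1), "procurement"),
    ]
    for item in due_alerts(register, today=date(2025, 1, 15)):
        print(item.contract_id, item.deadline, item.owner)
```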

A few caveats we observe in pilots:

Deadlines stated in words rather than dates (e.g., “six months from the signing date”) require separate calculation logic and often need verification, as the signing date may not be in the digital document.
Contracts with conditional expiration clauses (e.g., “the agreement expires when condition X is met”) are flagged by the system instead of being interpreted independently. This is correct behavior.
Amendments without the full text of the modified document result in incomplete extraction. The index must include the main agreement and all amendments linked as a single source document.
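The first caveat, deadlines expressed relative to the signing date, can be illustrated with a small sketch. The function names are hypothetical, and clamping to the last day of the month is an assumption made for the example; how a period is actually counted depends on the governing law and the contract wording, which is one more reason these results need verification.

```python
import calendar
from datetime import date
from typing import Optional


def add_months(start: date, months: int) -> date:
    """Add calendar months, clamping to the last day of the target month."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def resolve_deadline(signing_date: Optional[date], months: int) -> dict:
    """Compute 'N months from the signing date', or flag it for the lawyer."""
    if signing_date is None:
        # The signing date is not in the digital document: do not guess.
        return {"deadline": None, "status": "needs_verification"}
    return {"deadline": add_months(signing_date, months), "status": "computed"}


if __name__ == "__main__":
    print(resolve_deadline(date(2024, 8, 31), 6))
    print(resolve_deadline(None, 6))
```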

The lawyer verifies extraction for clauses marked as “low confidence” and approves the register before data is entered into the deadline tracking system.

RAG on company documents: internal policy assistant

In-house legal teams also answer internal questions: whether an NDA can be signed with a company from jurisdiction X, what the company policy is on non-compete clauses, what the internal procurement regulations say about approval thresholds.

This is repetitive and distracting work. RAG on company documents (policies, contract templates, compliance guidelines) allows employees and managers to get answers to general questions without involving a lawyer, with references to specific documents and paragraphs.

Key technical requirements:

Source citation is mandatory. An answer without a reference to a specific document and paragraph signals that the model is generating general knowledge instead of citing company documents. Such results should go to an escalation queue, not to the user.

Guardrails block questions about legal interpretation. If the question is “can we terminate the contract in this situation,” the system reports what the document says and escalates the part that requires legal assessment to the lawyer. RAG answers “what our documents say,” not “what we should do.”
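A very rough sketch of such a guardrail is shown below. The pattern list and the `route_question` function are hypothetical, and keyword matching is used only to keep the example short; a real guardrail would more likely rely on a classifier evaluated on actual employee questions.

```python
import re

# Hypothetical phrasings that suggest a request for legal assessment.
INTERPRETATION_PATTERNS = [
    r"\bcan we\b",
    r"\bshould we\b",
    r"\bare we allowed\b",
    r"\bis it legal\b",
    r"\bwhat should\b",
]


def route_question(question: str) -> str:
    """Decide whether an answer needs an escalation to the lawyer."""
    lowered = question.lower()
    if any(re.search(pattern, lowered) for pattern in INTERPRETATION_PATTERNS):
        # Report what the documents say, then hand the assessment to the lawyer.
        return "answer_with_escalation"
    return "answer_from_documents"


if __name__ == "__main__":
    print(route_question("Can we terminate the contract in this situation?"))
    print(route_question("What is the approval threshold in the procurement regulations?"))
```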

Hallucinations are particularly dangerous in a legal context. An answer that looks like a quote from a policy but is a model fabrication can lead to incorrect decisions. The pattern we use: every answer includes a verbatim quote from the original fragment and allows the user to click through to the source. If the fragment doesn’t exist in the index, the system responds “I don’t know” and escalates.
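The verbatim-quote check can be sketched as a post-processing step on the model's answer. The `Fragment` type and the `verify_answer` function are hypothetical names; the sketch assumes the retrieved fragments are available alongside the quote the model claims to cite.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    doc_id: str
    paragraph: str
    text: str


def normalize(text: str) -> str:
    """Collapse whitespace and case so line breaks do not defeat the match."""
    return " ".join(text.split()).lower()


def verify_answer(quote: str, fragments: list) -> dict:
    """Accept the answer only if the quote appears verbatim in a retrieved fragment."""
    needle = normalize(quote)
    if needle:
        for fragment in fragments:
            if needle in normalize(fragment.text):
                source = f"{fragment.doc_id}, {fragment.paragraph}"
                return {"status": "ok", "quote": quote, "source": source}
    # No matching fragment: refuse and send the question to the escalation queue.
    return {"status": "escalate", "message": "I don't know"}


if __name__ == "__main__":
    fragments = [
        Fragment("procurement-policy", "par. 4.2",
                 "Purchases above EUR 10,000 require approval by the CFO."),
    ]
    print(verify_answer("require approval by the CFO", fragments))
    print(verify_answer("require approval by the CEO", fragments))
```

The policy text in the usage example is invented for illustration.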

Task comparison: what AI does independently, what requires a lawyer

Task	AI Role	Lawyer Role
Document classification and triage	independently (with log)	sample verification
Detecting deviations from template	independently (with flag)	assessing deviation acceptability
Extracting dates, deadlines, penalties	independently (with confidence score)	verifying low-confidence and conditions
Obligation register and alerts	independently	quarterly register approval
Answers to policy questions	independently (with citation)	escalation of interpretation questions
Legal risk assessment	no	yes, always
Legal advice and contract approval	no	yes, always
Negotiations and representation	no	yes, always

This table reflects the human-oversight pattern we have applied repeatedly in our pilots. AI provides the material; humans decide wherever the decision has legal or financial significance.

GDPR and DPIA for documents containing personal data

Company contracts contain personal data of the parties: names, job titles, PESEL numbers in employment or B2B contracts, contact details, sometimes salary information in amendments. Processing this data through an AI system requires a legal basis and technical measures.

Two requirements without which we don’t launch a pilot:

PII masking before indexing. Personally identifiable information is masked or tokenized before document fragments enter the vector database. The model sees “NATURAL_PERSON_1” instead of a specific name. The token-to-real-data mapping is stored outside the index, with access control and operation logs.
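A minimal sketch of masking before indexing follows. It assumes that person names arrive from an upstream detection step (the `known_names` parameter is hypothetical) and that PESEL numbers can be found as 11-digit sequences without checksum validation; both are simplifications. The returned mapping is what would be stored outside the index.

```python
import re

# 11 consecutive digits; this sketch does not validate the PESEL checksum.
PESEL_PATTERN = re.compile(r"\b\d{11}\b")


def mask_pii(text, known_names):
    """Return (masked text, token-to-value mapping to be stored outside the index)."""
    mapping = {}

    for i, name in enumerate(known_names, start=1):
        if name in text:
            token = f"NATURAL_PERSON_{i}"
            text = text.replace(name, token)
            mapping[token] = name

    pesel_tokens = {}

    def replace_pesel(match):
        value = match.group(0)
        if value not in pesel_tokens:
            # The same number always gets the same token.
            pesel_tokens[value] = f"PESEL_{len(pesel_tokens) + 1}"
            mapping[pesel_tokens[value]] = value
        return pesel_tokens[value]

    return PESEL_PATTERN.sub(replace_pesel, text), mapping


if __name__ == "__main__":
    masked, mapping = mask_pii(
        "Jan Kowalski (PESEL 12345678901) signs for the supplier.",
        ["Jan Kowalski"],
    )
    print(masked)
    print(mapping)
```

Only the masked text would be chunked and embedded; the mapping needs its own access control and operation log.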

Isolation per contract or per project. The index for a client’s or project’s contracts is physically separated from others. A question asked in the context of one project does not access documents from another.
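The access rule behind isolation can be sketched as below. The `ProjectIndexes` class is hypothetical and models only the rule that a query is always scoped to one project; the physical separation itself (separate databases or collections) is a deployment decision that this in-memory example does not show, and the substring search stands in for vector retrieval.

```python
class ProjectIndexes:
    """Keeps one fragment store per project and never searches across them."""

    def __init__(self):
        self._indexes = {}

    def add(self, project_id: str, fragment: str) -> None:
        self._indexes.setdefault(project_id, []).append(fragment)

    def search(self, project_id: str, term: str) -> list:
        # Only the index of the requested project is ever consulted.
        fragments = self._indexes.get(project_id, [])
        return [f for f in fragments if term.lower() in f.lower()]


if __name__ == "__main__":
    indexes = ProjectIndexes()
    indexes.add("project-a", "Supplier A delivers within 30 days.")
    indexes.add("project-b", "Supplier B pays a penalty for late delivery.")
    print(indexes.search("project-a", "penalty"))  # no hits from project-b
    print(indexes.search("project-b", "penalty"))
```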

If the system is to process employment contracts, medical data, or other special categories, a DPIA is required before launch. For standard commercial contracts with contact details, an entry in the record of processing activities and privacy notices for data subjects are sufficient.

Detailed regulatory obligations for AI systems processing company documents are discussed in the article AI Act and GDPR 2026. Technical patterns for GDPR compliance in AI implementations can be found in data governance for AI.

Pilot: how to start without risk

The more successful pilots we’ve observed started with one narrow use case instead of trying to automate the entire department at once.

A good starting point is one highly repetitive document type, e.g., NDAs from new suppliers. Pilot scope: classification (whether it’s an NDA), extraction of 5-7 fields (parties, date, duration, confidentiality clauses, jurisdiction), comparison with the company’s NDA template, and deviation detection.

For the first 4-8 weeks, the lawyer verifies 100% of AI results. This allows calibrating confidence thresholds, identifying types of deviations the model misses, and building a golden set for quality evaluation.
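One way to use the fully verified results for calibration is sketched below. The function name and the target precision are hypothetical; the sketch assumes each AI result carries a confidence score and the lawyer's verdict on whether it was correct.

```python
def calibrate_threshold(results, target_precision=0.98):
    """results: list of (confidence, was_correct) pairs from lawyer verification.

    Returns the lowest confidence threshold at which the accepted results
    reach the target precision, or None if no threshold does.
    """
    for threshold in sorted({confidence for confidence, _ in results}):
        accepted = [ok for confidence, ok in results if confidence >= threshold]
        if accepted and sum(accepted) / len(accepted) >= target_precision:
            return threshold
    return None


if __name__ == "__main__":
    verified = [(0.95, True), (0.9, True), (0.8, False), (0.7, True), (0.6, False)]
    print(calibrate_threshold(verified, target_precision=1.0))
```

Results below the chosen threshold would keep going to the lawyer as “low confidence”; the incorrect ones are also natural candidates for the golden set.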

The article AI for law firms describes a similar implementation pattern for external environments; many confidentiality and piloting recommendations apply directly to in-house teams. A broader architecture for document analysis, including due diligence, is covered in AI for document analysis.

For in-house pilots, it’s worth agreeing early with the IT department on the access model: who has access to the index, how operations are logged, and whether data processing agreements with infrastructure providers have been signed. The data processing agreement pattern is discussed in AI data processing agreement.

FAQ

Can AI independently approve contracts?

No. Contract approval is a legal and business decision that belongs to humans. AI prepares the material: classifies the document, extracts clauses, flags deviations from the template, and highlights risks. The final decision to sign or reject a contract requires assessing the transaction context, relationship with the counterparty, and company priorities—things AI doesn’t know and can’t evaluate.

How does AI handle Polish-language contracts and legal terminology?

Modern multilingual models support Polish without additional fine-tuning. Extraction precision for Polish contracts is higher when the vector database contains documents from the same domain and company templates. For highly specialized clauses (e.g., public procurement or banking law), it’s worth evaluating recall on a test set from your own documents before production deployment.

What to do when AI misses a clause that’s in the document?

This risk should be measured as recall on a pre-labeled test set. If recall for critical clauses (deadlines, penalties) drops below 90-95%, the cause is usually either the way the document is split into chunks or non-standard wording in the clause. The fix is to add training examples for the classifier or to adjust chunking to your documents’ structure. For as long as the system is in operation, verification of critical clauses should remain with the lawyer.
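Measuring recall per clause type on a labeled test set takes only a few lines. In this sketch the labels and predictions are sets of (document id, clause type) pairs; that representation and the function name are assumptions made for the example.

```python
from collections import defaultdict


def recall_by_clause_type(gold, predicted):
    """gold, predicted: sets of (doc_id, clause_type) pairs.

    Recall = clauses found by the system / clauses present in the labeled set.
    """
    present = defaultdict(int)
    found = defaultdict(int)
    for item in gold:
        clause_type = item[1]
        present[clause_type] += 1
        if item in predicted:
            found[clause_type] += 1
    return {clause_type: found[clause_type] / present[clause_type] for clause_type in present}


if __name__ == "__main__":
    gold = {("c1", "deadline"), ("c2", "deadline"), ("c1", "penalty")}
    predicted = {("c1", "deadline"), ("c1", "penalty"), ("c3", "penalty")}
    print(recall_by_clause_type(gold, predicted))
```

Extra predictions that are not in the labeled set, such as the third one in the usage example, affect precision rather than recall.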

Can an in-house legal department use a cloud API, or is self-hosting required?

It depends on document sensitivity. Contracts containing personal data or covered by NDAs and trade secrets should be processed locally or with PII masking before being sent to an external API. For internal documents without personal data, a cloud API is acceptable if the provider has signed a data processing agreement and declares zero prompt retention. The decision should be agreed with the DPO before launch.

How long does it take to implement a pilot for an in-house team?

A pilot for one document type usually takes 4-6 weeks: one week for ingesting and indexing a test set (100-300 contracts), one week for configuring guardrails and calibrating confidence thresholds, and 2-4 weeks for verifying results with lawyers. Expanding to additional document types and integrating with a contract management system (CLM or DMS) takes 2-3 months depending on scope. Process readiness for automation is assessed using the readiness assessment tool.
