AI for handling complaints and returns: faster, but legally compliant

An electronics store handles 1,200 complaints monthly. Half arrive on weekends and evenings when the team is understaffed. A six-month history analysis shows the average first-response time is 38 hours, and 18% of cases escalate due to imprecise initial classification. Implementing an AI agent for classification and preliminary handling can reduce the first-response time to 2-4 hours and cut escalations by 40-60%, provided automation boundaries are correctly designed.

What AI classifies and how it checks eligibility#

Every complaint or return submission has several dimensions that the classifier determines simultaneously.

Case type is the first dimension. The model distinguishes complaints under warranty (physical defect, non-conformity with description), manufacturer’s warranty, statutory 14-day return period for online purchases, and returns outside the legal timeframe or scope. These categories have different handling paths and consumer rights, so misclassification leads to incorrect solution proposals.

Deadline and eligibility are the most legally challenging dimensions. AI checks the purchase date from the order database, submission date, and resulting deadline. In consumer sales (since 1 January 2023), the "non-conformity with the contract" regime under the Consumer Rights Act applies: the trader is liable for 2 years from delivery, and the presumption that the non-conformity existed at the time of delivery covers that entire 2-year period (Art. 43c of the Consumer Rights Act). The Civil Code warranty (rękojmia) period applies to B2B sales. The statutory return period without cause is 14 days from receipt for distance selling. The model does not interpret the law independently; it verifies facts (dates, product condition, sales channel) against hardcoded rules built by a lawyer.
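The date checks described above can be sketched as plain, deterministic rules that sit outside the model. This is a minimal sketch, not the article's implementation: `OrderFacts`, `check_deadlines`, and the field names are hypothetical, and the day-counting convention (deadline day included, no weekend adjustment) is an assumption a lawyer would need to confirm. The B2B period is deliberately left undecided because the text does not state it.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

WITHDRAWAL_DAYS = 14   # statutory return window for distance sales
LIABILITY_YEARS = 2    # trader liability for non-conformity in consumer sales


@dataclass(frozen=True)
class OrderFacts:
    """Facts pulled from the order database (hypothetical structure)."""
    delivery_date: date
    submission_date: date
    online_purchase: bool
    consumer_sale: bool


def add_years(d: date, years: int) -> date:
    """Shift a date by whole years, mapping 29 Feb to 28 Feb when needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def check_deadlines(facts: OrderFacts) -> dict:
    """Verify dates against hardcoded rules; no legal interpretation here."""
    withdrawal_deadline = facts.delivery_date + timedelta(days=WITHDRAWAL_DAYS)
    within_liability: Optional[bool] = None  # B2B: left to the lawyer's rules
    if facts.consumer_sale:
        liability_deadline = add_years(facts.delivery_date, LIABILITY_YEARS)
        within_liability = facts.submission_date <= liability_deadline
    return {
        "days_since_delivery": (facts.submission_date - facts.delivery_date).days,
        "within_withdrawal_period": (
            facts.online_purchase and facts.submission_date <= withdrawal_deadline
        ),
        "within_liability_period": within_liability,
        "regime": "consumer_non_conformity" if facts.consumer_sale else "civil_code_warranty",
    }
```

Keeping these checks in ordinary code means the result is reproducible and auditable, and the model only supplies the facts that feed them.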

Consumer request is another dimension. The consumer may demand repair, replacement, price reduction, or contract termination, and has the right to change the request under statutory conditions. The classifier reads the request from the submission text and flags inconsistencies between the request and eligibility (e.g., demanding a refund for a product under manufacturer’s warranty that doesn’t meet warranty conditions).

Evidence and gaps. The agent checks whether the submission contains the required elements: defect description, date of discovery, and request. If key information is missing, it sends a personalized request for completion rather than rejecting the case. This is critical from a GDPR and legal perspective: refusal due to incompleteness must be preceded by a request for supplementation.
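A completeness check of this kind might look like the sketch below. The field names and the `next_step` function are hypothetical; the limit of two completion requests before handing over to a person follows the rule given in the table and FAQ of this article. Note that no branch rejects the case.

```python
REQUIRED_FIELDS = ("defect_description", "discovery_date", "request")
MAX_COMPLETION_REQUESTS = 2  # after this, a human takes over


def find_missing(submission: dict) -> list:
    """Return required fields that are absent or blank."""
    return [
        field for field in REQUIRED_FIELDS
        if not str(submission.get(field) or "").strip()
    ]


def next_step(submission: dict, requests_sent: int = 0) -> dict:
    """Decide what happens next; an incomplete case is never rejected."""
    missing = find_missing(submission)
    if not missing:
        return {"action": "classify", "missing_info": []}
    if requests_sent >= MAX_COMPLETION_REQUESTS:
        return {"action": "manual_queue", "missing_info": missing}
    return {"action": "request_completion", "missing_info": missing}
```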

Pipeline architecture: from submission to proposal#

The pattern for e-commerce with 500-2,000 complaints monthly is a structured output pipeline with a legal rules layer.

Step 1: Ingestion and normalization. Submissions from every channel (form, email, chat) are standardized into a common structure: submission text, channel metadata, order ID if provided. OCR processes defect photos and PDF documents (invoices, confirmations).

Step 2: Data enrichment. The agent retrieves from the order database: purchase date, delivery date, order value, customer’s complaint history, sales channel (online or in-store, as this affects rights). Without this data, eligibility classification is impossible.

Step 3: Classification with validation. The model returns JSON: { "claim_type": "warranty|manufacturer_warranty|statutory_return|no_eligibility", "days_since_purchase": ..., "within_legal_deadline": true|false, "customer_request": "repair|replacement|price_reduction|refund", "missing_info": [...], "confidence": 0.0-1.0 }. The schema is validated; low confidence (below 0.70) or missing claim_type automatically routes the case to a manual queue.
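The validation step can be sketched with the standard library alone. The allowed values and the 0.70 threshold come from the schema above; the function names and the choice to send every schema violation (not only a missing `claim_type`) to the manual queue are assumptions made for this example.

```python
import json
from typing import Optional

CLAIM_TYPES = {"warranty", "manufacturer_warranty", "statutory_return", "no_eligibility"}
REQUESTS = {"repair", "replacement", "price_reduction", "refund"}
CONFIDENCE_THRESHOLD = 0.70


def validate_classification(raw: str) -> tuple:
    """Parse model output and return (data or None, list of schema errors)."""
    try:
        data: Optional[dict] = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return None, ["top-level value must be an object"]

    errors = []
    if data.get("claim_type") not in CLAIM_TYPES:
        errors.append("claim_type missing or unknown")
    if data.get("customer_request") not in REQUESTS:
        errors.append("customer_request missing or unknown")
    if not isinstance(data.get("within_legal_deadline"), bool):
        errors.append("within_legal_deadline must be a boolean")
    days = data.get("days_since_purchase")
    if isinstance(days, bool) or not isinstance(days, int) or days < 0:
        errors.append("days_since_purchase must be a non-negative integer")
    if not isinstance(data.get("missing_info"), list):
        errors.append("missing_info must be a list")
    conf = data.get("confidence")
    if isinstance(conf, bool) or not isinstance(conf, (int, float)) or not 0.0 <= conf <= 1.0:
        errors.append("confidence must be a number between 0.0 and 1.0")
    return data, errors


def route(raw: str) -> str:
    """Send invalid or low-confidence results to the manual queue."""
    data, errors = validate_classification(raw)
    if data is None or errors:
        return "manual_queue"
    if data["confidence"] < CONFIDENCE_THRESHOLD:
        return "manual_queue"
    return "auto_assist"
```

Failing closed matters here: anything the validator cannot confirm ends up with a person rather than in an automated reply.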

Step 4: Solution proposal. Based on the classification result, the system generates a proposal aligned with the decision tree built by a lawyer. Warranty cases within the deadline with a described defect receive a repair or replacement proposal with a schedule. Statutory returns within 14 days receive return instructions with eligibility confirmation. Cases outside eligibility go to a human with context; no automatic refusal is issued.
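As an illustration, the decision tree can be expressed as a small mapping from a validated classification to a draft action. This is a simplified sketch: the action names, the `product_excluded_from_return` flag, and the `human_review` field are hypothetical, and a real tree agreed with a lawyer would have more branches. The fallback branch escalates; it never refuses.

```python
def propose_solution(c: dict, product_excluded_from_return: bool = False) -> dict:
    """Map a validated classification to a draft action and a review rule."""
    if c.get("missing_info"):
        return {"action": "request_completion", "human_review": False}

    claim = c["claim_type"]
    in_time = c["within_legal_deadline"]

    if claim == "warranty" and in_time:
        # The consultant approves the proposal before it is sent.
        return {"action": "propose_repair_or_replacement", "human_review": True}
    if claim == "statutory_return" and in_time:
        if product_excluded_from_return:
            return {"action": "escalate_to_human", "human_review": True}
        return {"action": "send_return_instructions", "human_review": False}
    if claim == "manufacturer_warranty":
        return {"action": "redirect_to_service", "human_review": False}

    # Everything else, including cases outside eligibility, goes to a person.
    return {"action": "escalate_to_human", "human_review": True}
```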

Step 5: Human handoff. The consultant sees a brief: classification, justification, AI proposal, customer history, legal signals. They approve, modify, or reject the proposal. Denial decisions always come from a human.

Table: submission type vs AI action vs when human intervenes#

Submission Type	AI Action	When Human Intervenes
Warranty within deadline, defect described, clear request	Proposal for repair or replacement + response template	Approval of proposal before sending
Statutory return within 14 days (online purchase)	Return instructions + eligibility confirmation	Only if product excluded from return rights (e.g., software)
Manufacturer’s warranty, form in service portal	Redirect to service + contact details, timeline info	If customer disputes the referral to service
Missing key information	Request for completion (defect description, date, request)	If customer does not complete after 2 attempts
Claim outside warranty period (over 2 years)	Explanation of statutory limits, paid service offer	Always in case of doubts or disputes over dates
Case that would end in a refusal (AI may not refuse)	Escalation to human with context	Always; AI does not issue denial decisions
Disputed case, threat of court or UOKiK action	Immediate priority escalation	Always, no auto-response
Low classifier confidence (below 0.70)	Manual queue with classification context	Always

Automation boundaries: what AI cannot do#

The biggest legal risk in complaint automation is a denial decision issued without human oversight. The Consumer Rights Act and Civil Code grant consumers specific rights, and violations by an automated system can lead to UOKiK complaints, lawsuits, and reputational costs.

AI should not independently:

reject complaints citing lack of eligibility,
decide that a defect is the user’s fault (this requires expert assessment),
apply presumptions unfavorable to the consumer without human verification,
send responses in disputed cases or when the customer threatens escalation to UOKiK, ombudsman, or court.

The correct pattern is: AI prepares a proposal with justification, and a human approves the decision. Response time still shortens from 38 hours to 4-8 hours, because the consultant receives a ready-made brief instead of a raw submission.
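One way to enforce these boundaries is a guard that runs before any message leaves the system. The sketch below is only illustrative: the action names are hypothetical, and the keyword pattern is a crude stand-in for whatever dispute detection a production system would use (it covers only the English terms named above and would need Polish equivalents at minimum).

```python
import re

# Actions the AI may draft but never send on its own (hypothetical names).
HUMAN_ONLY_ACTIONS = {"deny_complaint", "attribute_fault_to_user"}

# Naive dispute signal: mentions of UOKiK, the ombudsman, or court action.
DISPUTE_PATTERN = re.compile(r"\b(uokik|ombudsman|court|lawsuit)\b", re.IGNORECASE)


def requires_human(customer_text: str, proposed_action: str) -> bool:
    """Return True when a person must decide before anything is sent."""
    if proposed_action in HUMAN_ONLY_ACTIONS:
        return True
    return bool(DISPUTE_PATTERN.search(customer_text))
```

A guard like this is a backstop, not a replacement for the approval step: it catches the cases where an automated reply must not go out under any circumstances.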

Human oversight is not just an ethical requirement here; it is a legal one. Poland’s Consumer Protection Act, EU Directive 2019/771, and AI Act guidelines state that automated decisions in consumer matters must be subject to human challenge.

GDPR and data in the complaint process#

Complaints contain personal data: name, address, order number, problem description, sometimes product photos with background data. Several GDPR requirements are particularly relevant here.

Legal basis for processing is contract performance (Art. 6(1)(b) GDPR) for data necessary to handle the complaint. Excess data (e.g., lengthy personal circumstances descriptions) should be deleted or anonymized before saving to the system.

Retention. Complaint documentation is stored for 3-5 years for evidentiary purposes (claims limitation). Data beyond this scope must be deleted. If a complaint is used as training data for an AI model, a DPIA and anonymization are required.

PII masking. If using an external API (cloud LLM) for classification, submission data should undergo PII masking before sending to the model. Order number, email, and name can be masked with tokens before API calls and restored on the application side.
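Token-based masking with restoration on the application side could be sketched as follows. Everything here is an assumption for illustration: the `ORD-` order number format is invented, names are masked only when the application already knows them from the order record, and the email pattern is deliberately simple. Real submissions would need broader detection (addresses, phone numbers).

```python
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
ORDER_RE = re.compile(r"\bORD-\d{6}\b")  # hypothetical order number format


def mask_pii(text: str, known_names=()) -> tuple:
    """Replace PII with tokens; return the masked text and the token map."""
    mapping = {}
    counters = {}

    def substitute(kind: str, value: str) -> str:
        # Reuse the same token when a value appears more than once.
        for token, original in mapping.items():
            if original == value:
                return token
        counters[kind] = counters.get(kind, 0) + 1
        token = f"[{kind}_{counters[kind]}]"
        mapping[token] = value
        return token

    masked = EMAIL_RE.sub(lambda m: substitute("EMAIL", m.group()), text)
    masked = ORDER_RE.sub(lambda m: substitute("ORDER", m.group()), masked)
    for name in known_names:
        if name:
            masked = masked.replace(name, substitute("NAME", name)) if name in masked else masked
    return masked, mapping


def unmask(text: str, mapping: dict) -> str:
    """Restore original values in text that came back from the model."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text
```

The token map stays inside the application and is never sent to the external API, so the model only ever sees placeholders.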

An alternative is self-hosting the classification model, which eliminates the data transfer issue. At a scale above 2,000 complaints monthly, self-hosting costs become justified.

Pilot: how to start without risk#

Deploying a complaint classifier in shadow mode eliminates the risk of incorrect decisions at launch. For the first 4-8 weeks, the system classifies and proposes solutions, but the consultant still makes every decision independently. Comparing AI decisions with consultant decisions builds data for model calibration and reveals categories where AI consistently errs or proposes solutions misaligned with the terms.

A good pilot starts with 3 steps:

Collect 300-500 historical complaints with labels (type, decision, justification). This is training data and a benchmark for model evaluation.
Define a decision tree with a lawyer for each case type. The model is only as good as the rules it implements.
Launch shadow mode with a dashboard comparing AI proposals to consultant decisions. Acceptance threshold before switching to auto-assist: over 85% agreement for a given case type.
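The comparison behind such a dashboard can be as simple as an agreement rate per case type. In this sketch the record fields (`claim_type`, `ai_proposal`, `consultant_decision`) are hypothetical; the 85% threshold is the one given above. It does not account for sample size, so a type with only a handful of cases should be read with caution.

```python
from collections import defaultdict

AGREEMENT_THRESHOLD = 0.85  # "over 85% agreement" per case type


def agreement_by_type(records: list) -> dict:
    """Share of shadow-mode cases where AI and consultant chose the same action."""
    totals = defaultdict(int)
    matches = defaultdict(int)
    for record in records:
        claim_type = record["claim_type"]
        totals[claim_type] += 1
        if record["ai_proposal"] == record["consultant_decision"]:
            matches[claim_type] += 1
    return {t: matches[t] / totals[t] for t in totals}


def ready_for_auto_assist(records: list) -> dict:
    """Flag the case types whose agreement rate exceeds the threshold."""
    return {
        claim_type: rate > AGREEMENT_THRESHOLD
        for claim_type, rate in agreement_by_type(records).items()
    }
```

Evaluating each case type separately lets a store switch on auto-assist for, say, statutory returns while warranty cases stay in shadow mode.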

More on designing agents for multi-step processes: multi-step AI agent planning and AI customer service automation.

FAQ#

Can AI independently decide to reject a complaint?#

No. Denial decisions must always be approved by a human. AI prepares a proposal with justification, but the consultant makes the final decision. Automatic denial without human oversight may violate statutory consumer rights and expose the seller to legal liability before UOKiK and courts.

How long does it take to implement AI for complaint handling?#

A shadow-mode pilot with a few-shot classifier can launch in 3-6 weeks if you have historical labeled data and a ready decision tree agreed with a lawyer. Full implementation with CRM integration, metrics, and escalation procedures takes 8-14 weeks. Most time is spent preparing data and defining rules, not programming the model.

What data do I need to train a complaint classifier?#

A minimum of 200-500 historical complaints per case type with labels (category, decision, justification). Before using data for training or fine-tuning, personal data (name, address, order number) must be anonymized or a DPIA conducted. If using an external API for classification, apply PII masking before sending data.

How does AI handle complaints when the customer doesn’t provide an order number?#

The agent sends a personalized request for the order number or details enabling transaction identification. It does not reject the case due to missing data but suspends classification until completion. If the customer does not respond after 2 contact attempts, the case goes to a manual queue with communication history.

Does AI for complaints fall under the AI Act?#

AI systems supporting consumer complaint handling should be analyzed for AI Act risk classification. If the system affects decisions regarding consumer rights (denial, granting benefits), it may require operational transparency and the ability to challenge decisions. In practice, the human-in-the-loop principle for denial decisions meets these requirements regardless of the final risk classification. Consult a lawyer before implementation.

