AI for energy and utilities: forecasting, anomaly detection, documentation

A district heating operator with hundreds of nodes, an energy distributor with meter and substation telemetry, a compliance team drowning in regulator reports. These are scenes we see regularly. The common denominator: there’s plenty of data (from sensors, SCADA, meters, and billing systems), but translating it into timely decisions is tough. That’s where AI makes sense. The same operator might be tempted by the promise of a “self-regulating grid,” but that’s a different conversation, one about supply stability, legal liability, and critical infrastructure. Below, we separate one from the other, honestly and without promises we can’t keep.

Demand and load forecasting: realistic accuracy ranges

Load forecasting predicts how much energy or heat customers will consume in the coming hours and days, based on historical data, weather, calendars, and customer profiles. It’s one of the most researched AI applications in the industry, which is precisely why we should talk about it honestly.

Accuracy depends on the horizon and input data quality. Short-term forecasts (next few hours) can be very accurate; weekly forecasts carry much greater uncertainty because they depend on weather, which we also don’t know precisely. We don’t promise a single number—we provide ranges and show how to calculate them on your data.

Forecast horizon or case	Typical error (MAPE, indicative range)	Key influencing factors	Supported decision
Short-term (1-24 h)	usually ~2-6%	weather, daily profile	balancing, energy procurement
Medium-term (1-7 days)	usually ~5-12%	weather forecast, calendar	source operation planning
Non-typical customer profile	wider, data-dependent	historical representativeness	tariffication, capacity allocation

These are indicative ranges from literature and our observations, not a guarantee. The real error can only be measured on your history. A prerequisite is extracting data from the various sources and organizing it; how to structure such data is described in preparing company data for AI. Without clean, time-stamped history, the model has nothing to learn from.
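To make "measured on your history" concrete, here is a minimal sketch of computing MAPE per forecast horizon from a backtest. The record layout (horizon in hours, actual, forecast) and the sample numbers are our assumptions for illustration, not real results; zero actual values are skipped because the percentage error is undefined there.

```python
from collections import defaultdict


def mape(actual, forecast):
    """Mean absolute percentage error in percent; pairs with actual == 0 are skipped."""
    pairs = [(a, f) for a, f in zip(actual, forecast) if a != 0]
    if not pairs:
        raise ValueError("no non-zero actual values to score")
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


def mape_by_horizon(records):
    """records: iterable of (horizon_hours, actual, forecast) rows from a backtest."""
    grouped = defaultdict(lambda: ([], []))
    for horizon, actual, forecast in records:
        grouped[horizon][0].append(actual)
        grouped[horizon][1].append(forecast)
    return {h: mape(a, f) for h, (a, f) in sorted(grouped.items())}


# Hypothetical backtest rows: (horizon in hours, actual MW, forecast MW)
backtest = [
    (1, 100.0, 98.0), (1, 120.0, 123.0),
    (24, 100.0, 92.0), (24, 120.0, 129.0),
]
errors = mape_by_horizon(backtest)
for horizon, value in errors.items():
    print(f"{horizon:>3} h ahead: MAPE {value:.2f}%")
```

Reporting the error separately for each horizon, rather than as one blended figure, is what lets a planner see how quickly accuracy degrades as the forecast reaches further out.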

Anomaly detection in sensor data

The second strong area is detecting anomalies in telemetry streams: sudden spikes in consumption, atypical voltage profiles, leaks in heating networks, suspicions of illegal consumption or meter failure. The model learns what “normal” operation looks like for a measurement point and flags deviations.

It’s important not to overpromise. AI doesn’t “know” it’s a leak—it signals that the reading profile deviates from the pattern in a way that historically correlated with problems. The decision to dispatch a crew, disconnect a customer, or initiate an inspection is made by a human, because they bear the consequences of error. The system acts as a sieve: it screens thousands of points and flags those worth attention.
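One simple way to express "deviates from the pattern" is a robust z-score against the point's own history. The sketch below uses the median and the median absolute deviation (MAD); this is a stand-in technique chosen for illustration, not a description of any particular production model, and the readings and the 3.5 threshold are assumptions.

```python
from statistics import median


def robust_scores(history, new_readings):
    """Score new readings against one measurement point's history (median and MAD)."""
    med = median(history)
    mad = median([abs(x - med) for x in history])
    if mad == 0:
        raise ValueError("history has no spread; deviations cannot be scaled")
    # 1.4826 makes the MAD comparable to a standard deviation for normal data
    return [(x - med) / (1.4826 * mad) for x in new_readings]


def flag_for_review(history, new_readings, threshold=3.5):
    """Return readings whose absolute score exceeds the threshold, for a human to check."""
    scores = robust_scores(history, new_readings)
    return [(x, round(s, 2)) for x, s in zip(new_readings, scores) if abs(s) > threshold]


# Hypothetical hourly consumption for one measurement point
history = [10.0, 10.5, 9.8, 10.2, 10.1, 9.9, 10.3]
flagged = flag_for_review(history, [10.4, 14.0])
print(flagged)
```

The function only returns candidates for review. It says nothing about the cause, which is exactly the division of labor described above.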

Two honest caveats are key here. First, the sensitivity threshold is a business decision: a lower threshold catches more anomalies but generates more false alarms and overwhelms dispatchers; a higher one cuts false alarms but risks missing real events. Second, the model drifts (seasons change, tariffs change, customer structures change), so without monitoring, its effectiveness quietly declines. How to measure this is described in monitoring AI agent quality. The same “AI screens, human decides” pattern is also shown in AI for ticket classification and routing.
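The threshold trade-off can be put in front of decision-makers as a small table. The sketch below assumes you have historical anomaly scores labelled after the fact as real events or not; the scores and labels here are invented for illustration.

```python
def threshold_tradeoff(scored_events, thresholds):
    """For each threshold, count false alarms and missed real events.

    scored_events: list of (anomaly_score, was_real_event) pairs.
    """
    rows = []
    for t in thresholds:
        false_alarms = sum(1 for s, real in scored_events if s >= t and not real)
        missed = sum(1 for s, real in scored_events if s < t and real)
        rows.append((t, false_alarms, missed))
    return rows


# Hypothetical labelled history: (score, confirmed as a real problem?)
labelled = [(1.2, False), (2.8, False), (3.1, True),
            (3.6, False), (4.5, True), (6.0, True)]
table = threshold_tradeoff(labelled, [2.5, 3.5, 5.0])
for t, false_alarms, missed in table:
    print(f"threshold {t}: {false_alarms} false alarms, {missed} missed events")
```

On this made-up sample, raising the threshold from 2.5 to 5.0 takes false alarms from 2 to 0 and missed events from 0 to 2. Which point on that curve is acceptable depends on dispatcher capacity and the cost of a missed event, not on the model.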

Documentation and compliance automation

The least flashy but often most cost-effective area is administration and reporting. Energy and utilities are regulated industries: reports to the Energy Regulatory Office (URE), network technical documentation, inspection protocols, certificates, customer correspondence, settlements. Much of this involves manually copying numbers and searching documents.

Here, AI excels at data extraction: it reads values from forms, protocols, and supplier invoices—including scans and photos thanks to OCR—and transfers them in structured form to the system. A language model can also draft the first version of a change report or network data summary—with the caveat that a legally binding document is signed by a human.

The boundary is simple: AI prepares and organizes, the human approves documents with regulatory consequences. An agent can gather data from multiple systems and draft a report, but legal compliance responsibility remains with the person, not the model.
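As a rough illustration of "AI prepares, the human approves," the sketch below pulls a meter reading out of OCR text and marks it for review when it fails a plausibility check. The document layout, the regular expression, and the `ExtractedReading` type are hypothetical; real forms vary widely and need their own patterns.

```python
import re
from dataclasses import dataclass


@dataclass
class ExtractedReading:
    meter_id: str
    value_kwh: float
    needs_review: bool


# Hypothetical layout: "Meter <id>" followed later by "Reading: <number> kWh"
PATTERN = re.compile(
    r"Meter\s+(?P<meter>[A-Z0-9-]+).*?Reading:\s*(?P<value>\d+(?:[.,]\d+)?)\s*kWh",
    re.S,
)


def extract_reading(ocr_text, previous_kwh):
    """Return a structured reading, or None when nothing could be recognised."""
    match = PATTERN.search(ocr_text)
    if match is None:
        return None  # route the document to manual entry
    value = float(match.group("value").replace(",", "."))
    # A cumulative meter should not run backwards; flag it instead of "fixing" it
    needs_review = value < previous_kwh
    return ExtractedReading(match.group("meter"), value, needs_review)


sample = "Inspection protocol\nMeter EM-1042\nReading: 15234,7 kWh\n"
ok = extract_reading(sample, previous_kwh=15000.0)
suspicious = extract_reading(sample, previous_kwh=16000.0)
print(ok, suspicious, sep="\n")
```

The point of `needs_review` is that an implausible value is never silently corrected: it goes to a person together with the source document.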


Ticket classification and customer service

The fourth practical area is ticket flow: outages, billing complaints, connection requests, tariff inquiries. A classifier can sort tickets by type and urgency and route them to the right team—power outages go to dispatch immediately, while billing questions go to customer service.

This realistically shortens response time, but the same discipline applies as above. Classification supports routing, not judgment: for tickets that may involve safety (gas smell, electrocution risk), the system escalates to a human with the highest priority, rather than trying to “handle” the issue itself. The mechanics of such routing are broken down in AI for ticket classification and routing, and the screening logic for sensor data is also discussed in AI for logistics and warehousing and AI for industrial production.
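A minimal sketch of that routing discipline: a safety rule runs before the classifier's label is used, so a possible safety issue always goes to a human first. The keyword list, queue names, and the `predicted_type` input (assumed to come from a classifier that is not shown) are all hypothetical.

```python
# Hypothetical phrases; a real deployment would maintain these with dispatchers
SAFETY_KEYWORDS = ("gas smell", "smell of gas", "electrocution", "sparking")

# Hypothetical mapping from predicted ticket type to team queue
ROUTES = {
    "outage": "dispatch",
    "billing": "customer_service",
    "connection": "connections_team",
    "tariff": "customer_service",
}


def route_ticket(text, predicted_type):
    """Route a ticket; possible safety issues bypass the classifier entirely."""
    lowered = text.lower()
    if any(keyword in lowered for keyword in SAFETY_KEYWORDS):
        return {"queue": "human_escalation", "priority": "highest"}
    # Unknown labels fall back to manual triage instead of a guessed queue
    queue = ROUTES.get(predicted_type, "manual_triage")
    priority = "high" if predicted_type == "outage" else "normal"
    return {"queue": queue, "priority": priority}


print(route_ticket("Invoice amount looks wrong", "billing"))
print(route_ticket("Strong gas smell in the stairwell", "billing"))
```

Note that the second ticket is escalated even though the classifier labelled it as billing. The override is deliberately crude: for safety cases, a false escalation is cheaper than a missed one.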

AI Act and critical infrastructure: where support ends

Precision is required here, because energy and water networks are critical infrastructure—errors have consequences for people’s safety and supply continuity.

As long as AI supports decisions—forecasts for planners, anomaly alerts for dispatchers, report drafts, ticket routing—and a human remains in the loop, legal and operational risk is limited. The problem starts when the system takes actions affecting the grid: changes operating parameters, disconnects customers, or controls power distribution without oversight.

The AI Act identifies as high-risk, among others, AI systems used as safety components in the management and operation of critical infrastructure, including energy and water supply. If AI plays such a role, obligations kick in: technical documentation, risk assessment, human oversight with real intervention capability, logging, and quality monitoring over time. What this means in practice for Polish companies is detailed in AI Act and RODO obligations in 2026.

Three principles we always apply:

Human in the loop for network-critical decisions. AI signals and proposes; switching, disconnections, and parameter changes are approved by dispatchers.
Logging and model quality oversight. Seasonal drift is the rule, not the exception—without monitoring, effectiveness declines unnoticed.
Audit before deployment. Before the system touches operational processes, it’s worth going through an AI assistant security audit to check permissions, data access, and misuse scenarios.
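The first two principles can be sketched as an approval gate with an audit trail: the model may propose, but nothing executes without a named dispatcher, and every step is logged. The class and method names are hypothetical, and a real system would write to durable, tamper-evident storage instead of an in-memory list.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Proposal:
    action: str
    reason: str
    approved_by: Optional[str] = None


@dataclass
class ApprovalGate:
    audit_log: list = field(default_factory=list)

    def _log(self, event, proposal, actor):
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "event": event,
            "action": proposal.action,
            "actor": actor,
        })

    def propose(self, action, reason):
        proposal = Proposal(action, reason)
        self._log("proposed", proposal, "model")
        return proposal

    def approve(self, proposal, dispatcher):
        proposal.approved_by = dispatcher
        self._log("approved", proposal, dispatcher)

    def execute(self, proposal):
        if proposal.approved_by is None:
            self._log("blocked", proposal, "system")
            raise PermissionError("network-critical action requires dispatcher approval")
        self._log("executed", proposal, proposal.approved_by)
        return f"executed: {proposal.action}"


gate = ApprovalGate()
proposal = gate.propose("reduce setpoint at node 17", "pressure anomaly score 6.1")
try:
    gate.execute(proposal)  # not yet approved, so this is blocked and logged
except PermissionError as err:
    print("blocked:", err)
gate.approve(proposal, "dispatcher_on_duty")
result = gate.execute(proposal)
events = [entry["event"] for entry in gate.audit_log]
print(result, events)
```

Logging the blocked attempt as well as the approved one matters for the audit: it shows that oversight was enforced, not merely available.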

How to start sensibly

An honest deployment sequence looks less spectacular than the slides promise. First, pick one narrowly defined problem with a measurable cost: most often short-term load forecasting, anomaly detection on one signal type, or automating one report. Then verify that clean historical data exists to train the model. Next, run a pilot alongside the current process (AI flags and forecasts, the human still decides) to measure effectiveness on real data before anything operates independently. Only when the numbers add up do we gradually expand, with oversight maintained where grid stability is at stake.

This isn’t a shortcut. But it’s a path that doesn’t end with an expensive system no one trusts because it once misforecast on the day it cost the most.

FAQ

How accurate is AI demand forecasting?

It depends primarily on the horizon and data quality. Short-term forecasts (from the next few hours up to a day ahead) can be very accurate, with errors usually in the single-digit percentages; weekly forecasts carry greater uncertainty because they depend on weather, which we also don’t know precisely. We provide ranges, and the real error is measured on your history; we don’t promise a single number from a presentation.

Can AI autonomously control the grid or disconnect customers?

Not in our approach. Decisions critical to the grid (switching, parameter changes, disconnections) require human oversight because they involve critical infrastructure and safety. AI flags anomalies and proposes, but the dispatcher approves. Full autonomy in such processes isn’t just an operational risk; it’s also an area where a system may qualify as high-risk under the AI Act.

Will our telemetry and customer data go to the cloud?

That depends on the architecture you choose, and it’s a decision, not a necessity. Sensitive data (telemetry, customer data, network parameters) can stay in the company’s infrastructure using locally run models, or go to a provider that guarantees EU data location and signs a data processing agreement. It’s worth making this decision consciously at the start, as it shapes the entire project and its GDPR (RODO) requirements.

Is an AI system in energy subject to the AI Act as high-risk?

Sometimes yes. If AI is a safety component in the management or operation of critical infrastructure (energy or water supply), it may be a high-risk system under the AI Act—with obligations for documentation, risk assessment, and human oversight. Purely administrative systems, like report generation or document OCR, usually don’t fall into this category, but classification should be confirmed for the specific use case.

How long does anomaly detection deployment take?

The honest answer is “it depends on the data,” not the model. If telemetry is already flowing and there’s a history of events (failures, leaks, fraud), a pilot on one signal type usually takes a few weeks to a few months. If data needs to be organized first and labeled examples collected, the preparation phase can take longer than the modeling itself—and no promise can shorten it.
