SDG 7 · affordable and clean energy
SDG 13 · climate action

EV Charging Demand Optimisation

in-progress
Since: October 2025 · Region: GB · SDGs: 7 + 13

An end-to-end ML pipeline that will forecast GB grid carbon intensity and EV charging demand, then schedule individual charging sessions to cut emissions and cost. The carbon forecast is built and live as its own API; the EV session model and the optimiser are next. The project runs as two parallel tracks: a local pipeline for rapid ML iteration, and a Databricks-plus-Kafka track that shows how the same system would run at production scale.

01 · why

Why carbon-aware charging is worth the effort

Carbon intensity on the GB grid can vary by roughly a factor of three within a day. A car that always charges from midnight to 06:00 will mostly charge when wind is plentiful, but only because its fixed window happens to line up with those hours. A car that picks its window from a forecast can do better, especially on days when the low-carbon hours fall outside the usual overnight slot. The same idea extends to cost when prices are time-of-use.
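
To make the comparison concrete, here is a minimal sketch of picking a contiguous charging window from a forecast instead of using a fixed one. The function names and the eight-slot forecast are hypothetical and the numbers are made up for illustration; they are not output from the live forecast.

```python
from typing import Sequence


def window_mean(forecast: Sequence[float], start: int, n_slots: int) -> float:
    """Mean carbon intensity (gCO2/kWh) over n_slots half-hours from start."""
    window = forecast[start:start + n_slots]
    if len(window) != n_slots:
        raise ValueError("window runs past the end of the forecast")
    return sum(window) / n_slots


def best_window_start(forecast: Sequence[float], n_slots: int) -> int:
    """Index of the contiguous window with the lowest mean intensity."""
    candidates = range(len(forecast) - n_slots + 1)
    return min(candidates, key=lambda s: window_mean(forecast, s, n_slots))


# Hypothetical forecast for eight half-hour slots, NOT real grid data.
forecast = [210.0, 190.0, 170.0, 160.0, 150.0, 110.0, 90.0, 100.0]
fixed_start = 0  # the habitual slot the car always uses
n_slots = 3      # the session needs three half-hours at full power

chosen_start = best_window_start(forecast, n_slots)
fixed_mean = window_mean(forecast, fixed_start, n_slots)
chosen_mean = window_mean(forecast, chosen_start, n_slots)
```

On this toy input the fixed window averages 190 gCO2/kWh and the forecast-picked window averages 100 gCO2/kWh. The gap on real days depends entirely on how far the low-carbon hours drift from the habitual slot.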

The question this project answers: given a probabilistic forecast of carbon intensity and a model of when a given charger is likely to have a session, what is the optimal charging schedule? The forecast is already running; the session model and the linear program that consumes both are the next two epics.

02 · pipeline

The pipeline, end to end

Five stages, each a small unit that does one job and can be tested in isolation. The carbon forecast and the EV session model are independent; both feed into the optimiser.

01 · INGEST · done

Data collection

Carbon Intensity API, generation mix, Open-Meteo weather, planned ACN session data.

httpx · REST
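
A minimal sketch of what one collector might look like. The endpoint URL and the nested response shape (`data` → `from` / `intensity.forecast` / `intensity.actual`) reflect my understanding of the public Carbon Intensity API and should be treated as assumptions; the function names and the sample payload are hypothetical, and the repo's `collectors/*.py` may be organised differently.

```python
from datetime import datetime, timezone
from typing import Any

# Assumed national-intensity endpoint of the public Carbon Intensity API.
API_URL = "https://api.carbonintensity.org.uk/intensity"


def parse_intensity(payload: dict[str, Any]) -> list[dict[str, Any]]:
    """Flatten the nested records into one row per half-hour period."""
    rows = []
    for record in payload["data"]:
        # Timestamps are assumed to look like "2025-10-01T12:00Z" (UTC).
        start = datetime.strptime(record["from"], "%Y-%m-%dT%H:%MZ")
        rows.append({
            "period_start": start.replace(tzinfo=timezone.utc),
            "forecast": record["intensity"]["forecast"],
            "actual": record["intensity"]["actual"],
        })
    return rows


def fetch_current_intensity() -> list[dict[str, Any]]:
    """Call the API with httpx; imported lazily so parsing works offline."""
    import httpx

    response = httpx.get(API_URL, timeout=10.0)
    response.raise_for_status()
    return parse_intensity(response.json())


# Illustrative payload in the assumed shape; the values are made up.
sample = {"data": [{
    "from": "2025-10-01T12:00Z",
    "to": "2025-10-01T12:30Z",
    "intensity": {"forecast": 120, "actual": 118, "index": "moderate"},
}]}
rows = parse_intensity(sample)
```

Keeping the parser separate from the HTTP call means it can be unit-tested against a saved payload without touching the network.
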
02 · FEATURES · done

30-min windows

Settlement-period alignment, weather join, lag features at t-1, t-2, t-48, t-336, rolling means, calendar features.

pandas · DuckDB
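
The lags are counted in half-hour settlement periods, so t-48 is the same period yesterday and t-336 is the same period last week. The sketch below shows the idea with plain Python lists to stay dependency-free; the project itself uses pandas and DuckDB, and the function names here are hypothetical.

```python
from datetime import datetime
from typing import Optional, Sequence

LAGS = (1, 2, 48, 336)  # in half-hours: 30 min, 1 h, 1 day, 1 week


def floor_to_half_hour(ts: datetime) -> datetime:
    """Snap a timestamp to the start of its 30-minute settlement period."""
    minute = 30 if ts.minute >= 30 else 0
    return ts.replace(minute=minute, second=0, microsecond=0)


def lag_features(series: Sequence[float], t: int) -> dict[str, Optional[float]]:
    """Lagged values for index t; None where the history is too short."""
    return {f"lag_{k}": (series[t - k] if t - k >= 0 else None) for k in LAGS}


def rolling_mean(series: Sequence[float], t: int, window: int) -> Optional[float]:
    """Mean of the `window` values strictly before t, so t never leaks in."""
    if t < window:
        return None
    return sum(series[t - window:t]) / window


# Stand-in series where the value equals its index, to make the lags visible.
series = [float(i) for i in range(400)]
features = lag_features(series, 336)
aligned = floor_to_half_hour(datetime(2025, 10, 1, 12, 47, 13))
```

Excluding the current period from the rolling mean is the design choice that matters: including it would leak the target into its own feature.
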
03 · FORECAST · live

LightGBM quantile

P10, P50, P90 with pinball loss. Time-series CV. Persistence and seasonal-naive baselines. SHAP analysis for explainability.

LightGBM · MLflow
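
For reference, a dependency-free sketch of the evaluation pieces named above: pinball loss for one quantile, the two naive baselines, and an expanding-window split. The split shown is one common form of time-series CV, assumed here for illustration; the page does not say which scheme the project uses, and all function names are hypothetical.

```python
from typing import Iterator, Sequence


def pinball_loss(actual: Sequence[float], predicted: Sequence[float],
                 quantile: float) -> float:
    """Mean pinball (quantile) loss for one quantile level in (0, 1)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    total = 0.0
    for y, q_hat in zip(actual, predicted):
        diff = y - q_hat
        # Under-prediction costs `quantile`, over-prediction `1 - quantile`.
        total += quantile * diff if diff >= 0 else (quantile - 1) * diff
    return total / len(actual)


def persistence_forecast(history: Sequence[float]) -> float:
    """Next value is assumed equal to the last observed value."""
    return history[-1]


def seasonal_naive_forecast(history: Sequence[float], season: int = 48) -> float:
    """Next value is assumed equal to the value one season (48 slots) ago."""
    return history[-season]


def expanding_window_splits(n_samples: int, n_folds: int,
                            test_size: int) -> Iterator[tuple[int, int]]:
    """Yield (train_end, test_end); train is [0, train_end), test follows it."""
    for fold in range(n_folds):
        test_end = n_samples - (n_folds - 1 - fold) * test_size
        train_end = test_end - test_size
        if train_end <= 0:
            raise ValueError("not enough samples for this many folds")
        yield train_end, test_end


median_loss = pinball_loss([100.0, 100.0], [90.0, 120.0], quantile=0.5)
splits = list(expanding_window_splits(n_samples=10, n_folds=3, test_size=2))
```

At the 0.5 quantile the pinball loss is half the mean absolute error, which is why the toy example gives 7.5 for errors of 10 and 20. Every test fold sits strictly after its training data, so no future observations reach the model.
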
04 · SESSIONS + LP · planned

Demand + schedule

GMM over EV session arrivals and energy draw. Linear program picks the half-hour slots that minimise carbon for each session.

scikit-learn · LP
05 · API · in-progress

FastAPI

Forecast endpoint is live on Cloud Run. Optimise and session-prior endpoints are planned.

FastAPI · Cloud Run
03 · components

Three components, decoupled

Each component can be developed, tested, and (eventually) deployed independently. The carbon forecast is already live as its own API; the session model and the optimiser will sit alongside it once built.

Carbon forecast · live

GB Carbon Intensity Forecast API

Half-hourly probabilistic forecast of grid carbon intensity. LightGBM quantile regression for P10, P50, and P90, served as a public FastAPI on Cloud Run. The forecast is what the optimiser reasons over.

See the deep-dive →
Session model · planned

EV session behaviour

A Gaussian Mixture Model over EV session arrival times and energy draw, intended to be fitted on open ACN charging data. The output will be a probabilistic prior over when a given charger is likely to be active. Not started yet.
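
Since this component is not started, the following is only a sketch of the intended shape, using scikit-learn's `GaussianMixture`. The data is synthetic (two made-up clusters standing in for morning and evening arrivals), and the two-component choice and the feature layout are assumptions, not decisions the project has made.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Synthetic stand-in for ACN sessions, NOT real data.
# Columns: arrival hour of day, energy delivered (kWh).
morning = rng.normal(loc=[8.5, 12.0], scale=[1.0, 3.0], size=(200, 2))
evening = rng.normal(loc=[18.0, 25.0], scale=[1.5, 5.0], size=(200, 2))
sessions = np.vstack([morning, evening])

# Two components is an assumption; in practice it would be chosen by
# something like BIC on the real data.
gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
gmm.fit(sessions)

candidate = np.array([[18.0, 24.0]])  # arrive at 18:00, draw 24 kWh

# Log-density of the candidate session under the fitted prior.
log_density = gmm.score_samples(candidate)

# Soft assignment of the candidate to each mixture component.
responsibilities = gmm.predict_proba(candidate)
```

One caveat worth keeping in mind: hour of day is circular, and a plain Gaussian treats 23:30 and 00:30 as far apart, so sessions that straddle midnight would need some re-encoding before fitting.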

Optimiser · planned

Linear-programming scheduler

For a given session and forecast, picks the half-hour slots that minimise expected carbon (and, later, time-of-use cost). The benchmark to beat is dumb charging: plug in, pull full power until the battery is full. Solver library not yet chosen.
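
In its simplest form the problem is: minimise the sum of c_t · x_t subject to the x_t summing to the required energy E and 0 ≤ x_t ≤ x_max, where c_t is the forecast intensity in slot t and x_t the energy drawn. With only those constraints, filling the lowest-intensity slots first gives the same answer as the LP, so the sketch below uses that greedy fill instead of a solver. It is a stand-in for illustration, not the planned optimiser; the function names and the forecast values are hypothetical.

```python
from typing import Sequence


def dumb_schedule(n_slots: int, energy_kwh: float,
                  max_kwh_per_slot: float) -> list[float]:
    """Baseline: full power from plug-in until the energy target is met."""
    schedule, remaining = [], energy_kwh
    for _ in range(n_slots):
        draw = min(max_kwh_per_slot, remaining)
        schedule.append(draw)
        remaining -= draw
    if remaining > 1e-9:
        raise ValueError("session too short to deliver the requested energy")
    return schedule


def carbon_aware_schedule(forecast: Sequence[float], energy_kwh: float,
                          max_kwh_per_slot: float) -> list[float]:
    """Fill the lowest-intensity slots first, up to the per-slot cap."""
    schedule = [0.0] * len(forecast)
    remaining = energy_kwh
    for slot in sorted(range(len(forecast)), key=lambda i: forecast[i]):
        draw = min(max_kwh_per_slot, remaining)
        schedule[slot] = draw
        remaining -= draw
    if remaining > 1e-9:
        raise ValueError("session too short to deliver the requested energy")
    return schedule


def emissions_g(forecast: Sequence[float], schedule: Sequence[float]) -> float:
    """Total emissions in grams: intensity (gCO2/kWh) times energy (kWh)."""
    return sum(c * x for c, x in zip(forecast, schedule))


# Hypothetical P50 forecast for a six-slot session, NOT real data.
forecast = [200.0, 180.0, 120.0, 90.0, 100.0, 250.0]
energy_kwh = 7.0        # energy the session must deliver
max_kwh_per_slot = 3.5  # e.g. a 7 kW charger for half an hour

dumb = emissions_g(forecast, dumb_schedule(len(forecast), energy_kwh, max_kwh_per_slot))
smart = emissions_g(forecast, carbon_aware_schedule(forecast, energy_kwh, max_kwh_per_slot))
```

On this toy input dumb charging emits 1330 g and the carbon-aware schedule 665 g. The greedy shortcut stops being valid once constraints couple the slots (ramp limits, minimum run lengths, cost terms), which is where a real LP solver earns its place.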

04 · implementations

Two parallel tracks

The local track is the primary development environment: run everything end-to-end on a laptop, iterate fast on the ML. The Databricks track is the same pipeline at scale, with regional models trained in parallel and the cloud-native version of each component. A separate brief in the repo covers a Kafka-on-UpCloud plus GCP design (BigQuery, Cloud Run, Dataflow) as the production target.

| Stage | Local | Databricks |
| --- | --- | --- |
| Ingest | collectors/*.py writing Parquet (done) | Bronze Delta tables, 14 UK regions (done) |
| Features | pandas, DuckDB (done) | Silver and Gold Delta with weather features (done) |
| Training | Single GB model, LightGBM quantile, time-series CV (done) | 14 regional models via applyInPandas, retrained with weather features (done) |
| Tracking | MLflow with pinball loss, baselines, SHAP (done) | MLflow + Databricks Asset Bundles (planned) |
| Serving | FastAPI on Cloud Run; forecast endpoint live, optimise endpoint planned | Databricks Model Serving (planned) |
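
The regional training row relies on Spark's grouped-map pattern: `groupBy(...).applyInPandas(...)` hands each region's rows to an ordinary pandas function, so the 14 regions can be processed in parallel. The sketch below shows only that wiring. The column names, the schema, and `summarise_region` are hypothetical, and the per-region function just computes a summary row where the real pipeline would fit a LightGBM quantile model.

```python
import pandas as pd

# Output schema for applyInPandas, as a DDL string (hypothetical columns).
RESULT_SCHEMA = "region string, n_rows long, mean_intensity double"


def summarise_region(pdf: pd.DataFrame) -> pd.DataFrame:
    """Runs once per region; a real version would train a model here."""
    return pd.DataFrame({
        "region": [pdf["region"].iloc[0]],
        "n_rows": [len(pdf)],
        "mean_intensity": [float(pdf["intensity"].mean())],
    })


def fan_out(spark_df):
    """Apply summarise_region to each region's rows on a Spark DataFrame."""
    return spark_df.groupBy("region").applyInPandas(
        summarise_region, schema=RESULT_SCHEMA
    )


# Local check without Spark: the same function over a pandas groupby.
# The region names and values are made up.
local = pd.DataFrame({
    "region": ["north", "north", "south"],
    "intensity": [100.0, 140.0, 90.0],
})
local_result = pd.concat(
    [summarise_region(group) for _, group in local.groupby("region")],
    ignore_index=True,
)
```

Because the per-region function is plain pandas, it can be developed and tested on the local track and then reused unchanged on Databricks.
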
05 · development method

What I built, and what an agent built

I started this project to apply my ML skills to an energy problem I care about. The ML code, the choice of approach, and the experiments are mine. The data collection pipeline and most of the feature engineering were built using a Ralph Loop, an autonomous multi-agent pattern where an AI agent reads a PRD, implements one task at a time, runs the tests, commits, and iterates until the backlog is empty. Handing the plumbing off has let me spend my time on the parts of the project I am best placed to do: choosing models, evaluating them, and reasoning about the optimisation.

I have written my own implementation of the same pattern, Ralphzilla, which is what I now reach for when I want to run this kind of agentic build loop on a new project.

06 · stack & source

Stack and source

Everything here is open and reproducible. The repo contains both implementations and a separate brief covering a more ambitious cloud-native architecture I am working towards.

Local

Python 3.11 · uv · httpx · pandas · DuckDB · LightGBM · scikit-learn · SHAP · PuLP · FastAPI · MLflow

Databricks

PySpark · Delta Lake · Medallion architecture · applyInPandas · MLflow · Model Registry · Model Serving · Unity Catalog