Is your AI model drifting silently in production? Are latency spikes going unnoticed? Most ML engineers deploy models with zero observability — and then wonder why accuracy degrades or users complain about slow predictions. This tutorial shows you how to build a complete real-time monitoring stack using InfluxDB, Telegraf, and Grafana (the TIG stack), going from zero to live dashboards in under 30 minutes.
No paid tools, no Kubernetes required — just Docker, Python, and three open-source projects that thousands of teams use in production.
Why AI Model Monitoring Matters
Deploying a model is only the beginning. Without monitoring, you are flying blind. Here are the four problems that hit every production AI system:
- Model drift — Your model was 95% accurate on launch day. Three months later, real-world data has shifted and accuracy dropped to 78%. Nobody noticed because there are no metrics.
- Latency spikes — Inference normally takes 20ms, but under load it spikes to 500ms. Users feel it. You do not find out until someone files a support ticket.
- Resource waste — Memory leaks and CPU overuse are invisible without tooling. You are burning money on infrastructure you cannot optimize.
- Silent failures — The worst one. Models can return wrong predictions without raising any errors. No exception, no crash log — just quietly incorrect results going to real users.
Production AI needs observability. Just like you would never run a web server without logs and metrics, you should not run an AI model without monitoring.
Meet the TIG Stack
Three tools, each with one job, working together:
| Tool | Role | Port |
|---|---|---|
| InfluxDB | Time-series database — stores metrics with nanosecond timestamps | 8086 |
| Telegraf | Data collection agent — receives metrics from your app and forwards to InfluxDB | 8080 |
| Grafana | Visualization and alerting — queries InfluxDB and renders live dashboards | 3000 |
Together they form the TIG Stack — battle-tested, used by thousands of teams worldwide, and completely free.
Architecture — How Data Flows
The data pipeline is simple and linear:
- Your AI model (Python) generates metrics — latency, accuracy, batch size, memory usage.
- Telegraf listens on an HTTP endpoint, collects the metrics, and forwards them to InfluxDB on a regular flush interval.
- InfluxDB stores every data point with a nanosecond-precision timestamp.
- Grafana queries InfluxDB using the Flux query language, renders live dashboards, and evaluates alert conditions continuously.
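For a concrete picture of what moves through this pipeline, here is a single prediction's metrics as one line of InfluxDB line protocol — measurement name, tags, fields, and a nanosecond timestamp (the names match the instrumentation in Step 5; the timestamp is an illustrative value):

```
ai_model_metrics,model=my_model_v1 latency_ms=18.7,accuracy=0.94 1718000000000000000
```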
Prerequisites
- Docker Desktop installed and running — everything runs in containers for a clean, reproducible setup.
- Python 3.8+ — for the model instrumentation code.
- A terminal — PowerShell, Bash, or CMD.
- An AI model — scikit-learn, TensorFlow, PyTorch, or any model with a `predict()` method. If you do not have one, the included `monitor.py` has a demo model for testing.
Step 1 — Install InfluxDB
Run InfluxDB 2.7 in a Docker container with persistent data volumes:
```bash
docker run -d -p 8086:8086 \
  --name influxdb \
  -v influxdb-data:/var/lib/influxdb2 \
  -v influxdb-config:/etc/influxdb2 \
  influxdb:2.7
```
Verify it is running:
```bash
docker ps
```
Open http://localhost:8086 in your browser. You should see the InfluxDB welcome screen.
Step 2 — Configure InfluxDB
- Click Get Started.
- Create admin credentials — save these.
- Set Organization to `myorg`.
- Set Bucket to `ai_metrics`.
- Click Continue, then Configure Later.
- Go to Data → API Tokens → Generate Token → All Access Token.
- Copy and save your API token immediately — you will not see it again.
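Before moving on, you can sanity-check the token and org from Python using the same client library we install in Step 5 (a quick sketch — substitute the token you just saved):

```python
from influxdb_client import InfluxDBClient

# credentials from Step 2
client = InfluxDBClient(url="http://localhost:8086", token="YOUR_API_TOKEN", org="myorg")
print(client.ping())  # True if the server is up (ping does not check the token)
# listing buckets requires a valid token; the output should include "ai_metrics"
print([b.name for b in client.buckets_api().find_buckets().buckets])
```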
Step 3 — Set Up the Docker Network
Containers need to communicate by name. Create a shared Docker network and connect InfluxDB to it:
```bash
mkdir ai-monitoring && cd ai-monitoring
docker network create monitoring
docker network connect monitoring influxdb
```
Now Telegraf can reach InfluxDB at http://influxdb:8086 instead of needing an IP address.
Step 4 — Configure and Run Telegraf
Create a `telegraf.conf` file in your project folder with this minimal configuration:

```toml
[agent]
  interval = "10s"
  flush_interval = "10s"

[[outputs.influxdb_v2]]
  urls = ["http://influxdb:8086"]
  token = "YOUR_API_TOKEN_HERE"
  organization = "myorg"
  bucket = "ai_metrics"

[[inputs.http_listener_v2]]
  service_address = ":8080"
  paths = ["/metrics"]
  data_format = "influx"
```

Replace `YOUR_API_TOKEN_HERE` with the token from Step 2.
What this config does:
- `[agent]` — Collects every 10 seconds and flushes to InfluxDB every 10 seconds.
- `[[outputs.influxdb_v2]]` — Sends data to InfluxDB using the container network name.
- `[[inputs.http_listener_v2]]` — Listens on port 8080 at `/metrics` for incoming data in InfluxDB line protocol format.
Start the Telegraf container:
```bash
# Linux/macOS
docker run -d --name telegraf --network monitoring \
  -v $(pwd)/telegraf.conf:/etc/telegraf/telegraf.conf:ro \
  telegraf
```

```powershell
# Windows PowerShell
docker run -d --name telegraf --network monitoring `
  -v ${PWD}/telegraf.conf:/etc/telegraf/telegraf.conf:ro `
  telegraf
```
Verify with `docker ps` — you should see both `influxdb` and `telegraf` running.
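To confirm the listener accepts data before wiring up your model, you can post one hand-written test point in line protocol and then look for it in InfluxDB's Data Explorer (a quick sketch using the `requests` library; port and path match the telegraf.conf above):

```python
import time
import requests

# one test point: measurement,tags fields timestamp(ns)
line = f"ai_model_metrics,model=test latency_ms=12.5 {time.time_ns()}"
resp = requests.post("http://localhost:8080/metrics", data=line, timeout=5)
print(resp.status_code)  # expect a 2xx response (204 by default)
```

Remember that Telegraf only flushes every 10 seconds, so allow a moment before checking the Data Explorer.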
Step 5 — Instrument Your AI Model with Python
This is where the magic happens. Install the InfluxDB Python client:
```bash
pip install influxdb-client psutil
```
Here is the full monitoring wrapper from the companion repository. It works with any model that has a `predict()` method:

```python
import os
import time

import psutil
from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS

# connection settings — use the org, bucket, and API token from Step 2
INFLUXDB_URL = "http://localhost:8086"
INFLUXDB_TOKEN = "YOUR_API_TOKEN"
INFLUXDB_ORG = "myorg"
INFLUXDB_BUCKET = "ai_metrics"
MODEL_NAME = "my_model_v1"

client = InfluxDBClient(url=INFLUXDB_URL, token=INFLUXDB_TOKEN, org=INFLUXDB_ORG)
write_api = client.write_api(write_options=SYNCHRONOUS)  # blocking write, one point per call


def predict(model, input_data, ground_truth=None):
    """Run model.predict(input_data) and emit metrics to InfluxDB."""
    start = time.time()
    error_occurred = False
    try:
        result = model.predict(input_data)
    except Exception as e:
        # the model failed — record the error instead of crashing the caller
        error_occurred = True
        result = None
        print(f"[monitor] Prediction error: {e}")

    latency_ms = (time.time() - start) * 1000  # wall-clock inference time
    memory_mb = psutil.Process(os.getpid()).memory_info().rss / 1024 / 1024  # process RSS
    batch_size = len(input_data) if hasattr(input_data, "__len__") else 1

    point = (
        Point("ai_model_metrics")
        .tag("model", MODEL_NAME)
        .field("latency_ms", latency_ms)
        .field("batch_size", batch_size)
        .field("memory_mb", memory_mb)
        .field("error_rate", 1.0 if error_occurred else 0.0)
    )
    # accuracy is only computable when the caller supplies labels
    if ground_truth is not None and result is not None:
        correct = sum(p == t for p, t in zip(result, ground_truth))
        accuracy = correct / len(ground_truth)
        point = point.field("accuracy", accuracy)

    write_api.write(bucket=INFLUXDB_BUCKET, record=point)
    return result
```
How to Use It
Replace your existing `model.predict(data)` call with the wrapped version:

```python
from monitor import predict
import joblib

model = joblib.load("my_model.pkl")
result = predict(model, X, ground_truth=y_true)
```
That is it. Your existing code barely changes, and every prediction now emits real-time metrics to InfluxDB.
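One detail worth noting: `monitor.py` writes straight to InfluxDB with the client library, so in this setup Telegraf's HTTP listener is an alternative ingestion path rather than a required hop. If you would rather route metrics through Telegraf as in the architecture diagram, here is a minimal sketch of a hypothetical helper that posts the same fields as line protocol to the listener from Step 4:

```python
import time
import requests


def send_via_telegraf(latency_ms, batch_size, memory_mb, error_occurred, model="my_model_v1"):
    # line protocol: measurement,tags fields timestamp(ns); the "i" suffix marks an integer field
    line = (
        f"ai_model_metrics,model={model} "
        f"latency_ms={latency_ms},batch_size={batch_size}i,"
        f"memory_mb={memory_mb},error_rate={1.0 if error_occurred else 0.0} "
        f"{time.time_ns()}"
    )
    requests.post("http://localhost:8080/metrics", data=line, timeout=2)
```

Either path lands in the same `ai_metrics` bucket, so the Grafana queries below work unchanged.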
Metrics Tracked Automatically
| Metric | Field Name | What It Measures |
|---|---|---|
| Inference latency | `latency_ms` | Time per prediction in milliseconds |
| Model accuracy | `accuracy` | Correct predictions / total (requires ground truth) |
| Batch size | `batch_size` | Number of inputs per call |
| Memory usage | `memory_mb` | Process RSS memory — detects leaks |
| Error rate | `error_rate` | 1.0 on failure, 0.0 on success |
Step 6 — Install Grafana
```bash
docker run -d -p 3000:3000 \
  --name grafana --network monitoring \
  -v grafana-data:/var/lib/grafana \
  grafana/grafana:latest
```
Open http://localhost:3000. Default credentials: admin / admin. You will be prompted to change the password.
Step 7 — Connect Grafana to InfluxDB
- Go to Connections → Data Sources → Add data source.
- Select InfluxDB.
- Set URL to `http://influxdb:8086` (use the container name, not localhost).
- Set Query Language to Flux.
- Enter Organization: `myorg`.
- Paste your API Token.
- Set Default Bucket: `ai_metrics`.
- Click Save & Test — you should see a green success banner.
Step 8 — Build Your Dashboard
Create a new dashboard and add panels with Flux queries. Here are the queries for the key metrics:
Model Accuracy Over Time
```flux
from(bucket: "ai_metrics")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "ai_model_metrics")
  |> filter(fn: (r) => r._field == "accuracy")
  |> aggregateWindow(every: 1m, fn: mean)
```
Inference Latency
```flux
from(bucket: "ai_metrics")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "ai_model_metrics")
  |> filter(fn: (r) => r._field == "latency_ms")
  |> aggregateWindow(every: 1m, fn: mean)
```
Memory Usage
```flux
from(bucket: "ai_metrics")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "ai_model_metrics")
  |> filter(fn: (r) => r._field == "memory_mb")
  |> aggregateWindow(every: 1m, fn: mean)
```
Add more panels for `batch_size` and `error_rate` by changing the field filter — for the error rate, see the example below. Arrange the panels in a grid, resize, and rename. Save the dashboard as AI Model Monitoring.
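For example, because `error_rate` is written as 1.0 on failure and 0.0 on success, averaging it per window turns it into the fraction of failed predictions in that window:

```flux
from(bucket: "ai_metrics")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "ai_model_metrics")
  |> filter(fn: (r) => r._field == "error_rate")
  |> aggregateWindow(every: 1m, fn: mean)
```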
Step 9 — Configure Alerts
Catch model degradation before it impacts users:
- Click Alerting → Alert rules → New alert rule.
- Set the condition: `accuracy` IS BELOW `0.80`.
- Set evaluation: every 1 minute for 5 minutes — the condition must persist for 5 consecutive minutes before firing, which prevents false alarms.
- Add a notification channel: Email, Slack, PagerDuty, Discord, or Webhooks.
- Click Save rule.
Your model is now protected. The moment accuracy drops below 80% for more than 5 minutes, you get notified automatically.
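Under the hood, a Grafana alert rule is a query plus a threshold evaluated on a schedule. Here is a sketch of a Flux query that reduces the last five minutes of accuracy to a single value for the rule to compare against 0.80 (bucket and field names match the dashboard queries above):

```flux
from(bucket: "ai_metrics")
  |> range(start: -5m)
  |> filter(fn: (r) => r._measurement == "ai_model_metrics")
  |> filter(fn: (r) => r._field == "accuracy")
  |> mean()
```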
Docker Commands Cheat Sheet
| Command | What It Does |
|---|---|
| `docker start influxdb telegraf grafana` | Start all containers |
| `docker stop influxdb telegraf grafana` | Stop all containers |
| `docker logs telegraf` | View Telegraf logs |
| `docker logs influxdb` | View InfluxDB logs |
| `docker network inspect monitoring` | Verify containers are on the same network |
| `docker volume rm influxdb-data influxdb-config grafana-data` | Full cleanup (removes all stored data) |
Troubleshooting Common Issues
“Connection refused” in Grafana or Telegraf
- Check containers are running: `docker ps`
- Verify the network: `docker network inspect monitoring`
- Use container names (`http://influxdb:8086`), not `localhost`, when connecting from within the Docker network
No data appearing in Grafana
- Run your Python script: `python monitor.py`
- Check InfluxDB Data Explorer at `http://localhost:8086` — verify data is arriving
- Check Telegraf logs: `docker logs telegraf`
- Ensure your Flux query uses the correct bucket name and measurement name
TOML syntax error in telegraf.conf
- Ensure the file is UTF-8 encoded without BOM
- Make sure the API token is on a single line with no extra quotes
- Verify indentation uses spaces, not tabs
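Since `telegraf.conf` is TOML, one quick way to catch syntax errors locally before restarting the container is Python's built-in parser (Python 3.11+; this checks TOML syntax only, not Telegraf's plugin options):

```python
import tomllib  # standard library in Python 3.11+

# raises tomllib.TOMLDecodeError with a line number on bad syntax
with open("telegraf.conf", "rb") as f:
    tomllib.load(f)
print("telegraf.conf parses as valid TOML")
```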
Resources
- Full source code and config files: github.com/shazforiot/ai-monitoring-with-TIG-stack
- InfluxDB Documentation: docs.influxdata.com/influxdb/v2
- Telegraf Plugins: docs.influxdata.com/telegraf/latest/plugins
- Grafana Dashboards: grafana.com/grafana/dashboards
- Flux Query Language: docs.influxdata.com/flux/latest
Frequently Asked Questions
What is the TIG stack and why use it for AI monitoring?
The TIG stack stands for Telegraf + InfluxDB + Grafana. Telegraf collects metrics from your AI models, InfluxDB stores them as time-series data with nanosecond precision, and Grafana visualizes the data in real-time dashboards with alerting. It is fully open-source, runs in Docker, and is battle-tested by thousands of engineering teams for production monitoring.
Can I monitor any AI model with this setup?
Yes. The Python instrumentation wrapper (monitor.py) works with any model that has a predict() method — scikit-learn, TensorFlow, PyTorch, XGBoost, HuggingFace transformers, and more. You simply replace model.predict(data) with predict(model, data) and metrics are automatically sent to InfluxDB.
What metrics should I track for AI model monitoring?
The five most critical metrics are: latency_ms (inference time per prediction), accuracy (model performance if ground truth is available), memory_mb (process memory to detect leaks), batch_size (throughput volume), and error_rate (failed prediction rate). The included monitor.py tracks all five out of the box.
Do I need Kubernetes to run this monitoring stack?
No. The entire stack runs in Docker containers on your local machine or any single server. You do not need Kubernetes, cloud services, or paid tools. Just Docker Desktop and Python 3.8+. The setup works identically on Windows, macOS, and Linux.
How do Grafana alerts detect model drift?
You configure an alert rule in Grafana that evaluates a condition like “accuracy IS BELOW 0.80 for 5 consecutive minutes.” When your model’s accuracy degrades below the threshold — a sign of data drift or concept drift — Grafana fires the alert and sends a notification to Slack, email, PagerDuty, or any configured channel. This catches silent degradation before it impacts users.
Video Chapters — Quick Navigation
- 00:00 — Intro & Demo
- 01:24 — Why AI Model Monitoring Matters
- 04:38 — Architecture Overview (TIG Stack)
- 06:20 — Prerequisites
- 08:09 — Install & Configure InfluxDB
- 13:05 — Docker Network & Telegraf Setup
- 14:54 — Configure & Run Telegraf Container
- 17:11 — Instrument Your AI Model (Python)
- 19:57 — Install & Connect Grafana
- 21:35 — Connect Grafana to InfluxDB
- 23:30 — Build the Dashboard & View Metrics
- 27:06 — Configure Alerts
- 28:20 — Final Demo & Recap