Dashboards

The dashboards at grafana.nufi.me show the operational side of NUFI: request rate, error rate, latency percentiles, and host health.

Where the trace viewer shows what the AI said, the dashboards show how the platform held up while it was saying it.

Sign in

Use the dashboard credentials set during install. In production, front the dashboards with your reverse proxy so they share the same SSO posture as the rest of NUFI.

Pre-loaded dashboards

NUFI ships with:

Gateway Overview — request rate, error rate, latency P50 / P95 / P99, top models by request count, top users by request count.
Database health — connection count, slow queries, replication lag.
Cache — hit rate, memory usage, slow operations.

All three load on first visit. Click the dashboard name in the left rail to switch.
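To make the P50 / P95 / P99 figures on Gateway Overview concrete, here is a minimal sketch of what a latency percentile means, using the nearest-rank definition on raw samples. The sample values are invented for illustration, and the metrics backend may well estimate percentiles differently (for example from histogram buckets), so treat this as a definition aid rather than a description of how NUFI computes them.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Rank is 1-based; clamp to 1 so p=0 returns the minimum.
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical request latencies in milliseconds, including one slow outlier.
latencies_ms = [120, 80, 95, 3000, 110, 105, 90, 100, 130, 115]

for p in (50, 95, 99):
    print(f"P{p}: {percentile(latencies_ms, p)} ms")
```

With this data the median stays near typical latency while P95 and P99 land on the outlier, which is why the high percentiles are the ones to watch for tail slowness.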

Add a panel

Dashboards are queries against a metrics database. To add a panel:

Pick a dashboard → Add → Visualisation.
Pick the metrics datasource.
Write a query, e.g. sum by (model) (rate(nufi_total_requests[5m])).
Save.
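The example query above combines two steps: a per-second rate for each counter series over a 5-minute window, then a sum of those rates grouped by the model label. The sketch below mimics that in plain Python on made-up data. It is a simplification under stated assumptions: it ignores counter resets and any extrapolation the metrics backend performs, and the `instance` label, model names, and sample values are hypothetical.

```python
from collections import defaultdict


def per_second_rate(samples, window_seconds):
    """Average per-second increase of a counter over the trailing window.

    samples: list of (timestamp_seconds, counter_value), oldest first.
    Simplified: no counter-reset handling, no extrapolation.
    """
    if not samples:
        return 0.0
    window_start = samples[-1][0] - window_seconds
    in_window = [s for s in samples if s[0] >= window_start]
    if len(in_window) < 2:
        return 0.0
    (t0, v0), (t1, v1) = in_window[0], in_window[-1]
    return (v1 - v0) / (t1 - t0)


def sum_rate_by_model(series, window_seconds=300):
    """Sum per-series rates, grouped by the 'model' label."""
    totals = defaultdict(float)
    for labels, samples in series:
        totals[labels["model"]] += per_second_rate(samples, window_seconds)
    return dict(totals)


# Hypothetical counter series: (labels, samples). Label values are invented.
series = [
    ({"model": "model-a", "instance": "gw-1"}, [(0, 100), (300, 400)]),
    ({"model": "model-a", "instance": "gw-2"}, [(0, 50), (300, 200)]),
    ({"model": "model-b", "instance": "gw-1"}, [(0, 10), (300, 70)]),
]

print(sum_rate_by_model(series))
```

Each resulting entry corresponds to one line or bar in the panel: requests per second for one model, summed across whatever other labels the series carry.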

Alerts

NUFI ships with three default alert rules:

Alert	Trigger	Severity
`GatewayDown`	Gateway not responding for 1 min	critical
`HighErrorRate`	Error rate > 5 % for 5 min	warning
`HighLatencyP95`	P95 latency > 10 s for 5 min	warning

Your operator can edit these in the alert rules file and reload without restarting the metrics service.
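The "for 5 min" part of a trigger means the condition has to hold continuously for that long, not just at a single evaluation. The sketch below illustrates that reading for a rule like `HighErrorRate`. It assumes evenly spaced samples and a simple "any dip resets the timer" rule; the real alerting engine may evaluate on its own schedule, and the function name and sample data are hypothetical.

```python
def alert_fires(samples, threshold, for_seconds):
    """Return True if the value has stayed above the threshold continuously
    for at least for_seconds, up to and including the latest sample.

    samples: list of (timestamp_seconds, value), oldest first.
    """
    breach_start = None
    for timestamp, value in samples:
        if value > threshold:
            # Remember when the current unbroken breach began.
            if breach_start is None:
                breach_start = timestamp
        else:
            # Any sample at or below the threshold resets the timer.
            breach_start = None
    if breach_start is None:
        return False
    return samples[-1][0] - breach_start >= for_seconds


# Hypothetical error-rate samples taken every 60 seconds (0.05 == 5 %).
sustained = [(0, 0.02), (60, 0.06), (120, 0.07), (180, 0.08),
             (240, 0.06), (300, 0.09), (360, 0.07)]
with_dip = [(0, 0.06), (60, 0.06), (120, 0.04), (180, 0.06),
            (240, 0.06), (300, 0.06)]

print(alert_fires(sustained, threshold=0.05, for_seconds=300))
print(alert_fires(with_dip, threshold=0.05, for_seconds=300))
```

The duration requirement is what keeps a single bad scrape or a brief spike from paging anyone.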

When to look here vs the trace viewer

Dashboards: is the platform up? Are we slow? Is there an error storm right now?
Trace viewer: what exactly did the AI see and produce for user X at time T?

You usually start in the dashboards (you noticed something), then jump to the trace viewer (to inspect a representative request).
