Complexity Levels

Level	Tools	Setup Time	Best For
Minimal	UptimeRobot, Healthchecks.io	15 min	Side projects, MVPs
Standard	Uptime Kuma, Sentry, basic Grafana	1-2 hours	Small teams, startups
Professional	Prometheus, Grafana, Loki, Alertmanager	1-2 days	Production systems
Enterprise	Datadog, New Relic, or full OSS stack	Ongoing	Large-scale operations

The Three Pillars

"I just want to know if it's down" → UptimeRobot (free) or Uptime Kuma (self-hosted). See simple.md.
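For a sense of what an uptime checker does under the hood, here is a minimal sketch using only the Python standard library. The function name `is_up` and the 5-second default timeout are illustrative assumptions, not part of UptimeRobot or Uptime Kuma; a hosted service adds scheduling, retries, and notifications on top of a probe like this.

```python
import http.client
import urllib.request


def is_up(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with a 2xx/3xx status within the timeout.

    Hypothetical helper for illustration only.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (OSError, ValueError, http.client.HTTPException):
        # OSError covers URLError/HTTPError (4xx and 5xx), refused
        # connections, DNS failures, and timeouts.
        # ValueError covers malformed URLs.
        return False
```

Running something like `is_up("https://example.com/health")` from a cron job every minute is roughly the "Minimal" level from the table above; the URL is a placeholder.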

"I need to debug production errors" → Sentry with your framework SDK. 5-minute setup. See apm.md.

"I want real observability" → Prometheus + Grafana + Loki. See prometheus.md.

"I need to centralize logs" → Loki for simple setups, ELK when you need complex queries. See logs.md.
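Whichever backend you pick, centralized logs are easier to query when each line is structured. The following is a rough sketch of JSON log output using Python's standard `logging` module; the class name `JsonFormatter`, the helper `build_logger`, and the field names are assumptions for illustration, not a format that Loki or ELK requires.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line (illustrative)."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def build_logger(name: str = "app") -> logging.Logger:
    """Hypothetical helper: a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

A log shipper can then forward these lines unchanged, and fields such as `level` become filterable without regex parsing.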

Do	Don't
Alert on symptoms (user impact)	Alert on causes (CPU high)
Include runbook link	Require investigation to understand
Set appropriate severity	Make everything P1
Require action	Alert on "interesting" metrics

Alert fatigue kills monitoring. If alerts are ignored, you have no monitoring.
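The Do column can be sketched in a few lines of Python. This is an assumption-laden illustration, not a real alerting API: the `Alert` class, the `check_error_rate` function, the 5% and 1% thresholds, and the runbook URL are all hypothetical. It alerts on a symptom (failed requests), attaches a runbook link, assigns a severity, and stays silent when no action is needed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Alert:
    severity: str      # "P1" pages someone now, "P3" can wait for work hours
    summary: str       # states the user-facing symptom
    runbook_url: str   # where the responder starts


def check_error_rate(
    errors: int,
    requests: int,
    page_at: float = 0.05,    # hypothetical threshold: 5% failing pages on-call
    ticket_at: float = 0.01,  # hypothetical threshold: 1% failing opens a ticket
    runbook_url: str = "https://example.com/runbooks/high-error-rate",
) -> Optional[Alert]:
    """Return an Alert only when someone needs to act, otherwise None."""
    if requests == 0:
        return None  # no traffic, so no measurable user impact
    rate = errors / requests
    if rate >= page_at:
        severity = "P1"
    elif rate >= ticket_at:
        severity = "P3"
    else:
        return None
    return Alert(severity, f"{rate:.1%} of requests are failing", runbook_url)
```

Note what is absent: there is no CPU or memory check. Those metrics belong on a dashboard for diagnosis once a symptom alert fires.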

For alert configuration, severities, and on-call setup, see alerting.md.