
Observability and Monitoring for Production Applications

Monitor applications effectively, from metrics and logging to tracing and alerting strategies.

Bootspring Team
Engineering
June 28, 2023
6 min read

Observability lets you understand system behavior from external outputs. Metrics, logs, and traces form the three pillars of observability.

The Three Pillars

Metrics:
- Numerical measurements over time
- CPU, memory, request counts, latencies
- Aggregated and sampled

Logs:
- Discrete events with context
- Errors, requests, business events
- Detailed but voluminous

Traces:
- Request flow across services
- Timing and dependencies
- End-to-end visibility

Metrics with Prometheus

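As a minimal sketch of what instrumentation produces, the Python below uses only the standard library to keep a labeled counter and a histogram and to render them in the Prometheus text exposition format. The metric names match the dashboard queries later in this article; the `Counter` and `Histogram` classes here are illustrative assumptions, not a client library API. In a real service you would normally use an official Prometheus client library instead.

```python
import threading
from collections import defaultdict


class Counter:
    """Monotonic counter with labels (illustrative, not a real client API)."""

    def __init__(self, name, help_text, label_names):
        self.name = name
        self.help_text = help_text
        self.label_names = tuple(label_names)
        self._values = defaultdict(float)
        self._lock = threading.Lock()

    def inc(self, amount=1.0, **labels):
        if amount < 0:
            raise ValueError("counters can only increase")
        key = tuple(labels[n] for n in self.label_names)
        with self._lock:
            self._values[key] += amount

    def render(self):
        lines = [f"# HELP {self.name} {self.help_text}",
                 f"# TYPE {self.name} counter"]
        with self._lock:
            for key, value in sorted(self._values.items()):
                labels = ",".join(
                    f'{n}="{v}"' for n, v in zip(self.label_names, key))
                lines.append(f"{self.name}{{{labels}}} {value}")
        return "\n".join(lines) + "\n"


class Histogram:
    """Histogram with cumulative buckets, as Prometheus expects."""

    def __init__(self, name, help_text, buckets):
        self.name = name
        self.help_text = help_text
        self.buckets = sorted(buckets)
        # One slot per finite bucket plus a final slot for +Inf.
        self._counts = [0] * (len(self.buckets) + 1)
        self._sum = 0.0
        self._lock = threading.Lock()

    def observe(self, value):
        with self._lock:
            self._sum += value
            for i, bound in enumerate(self.buckets):
                if value <= bound:
                    self._counts[i] += 1
                    break
            else:
                self._counts[-1] += 1

    def render(self):
        lines = [f"# HELP {self.name} {self.help_text}",
                 f"# TYPE {self.name} histogram"]
        cumulative = 0
        with self._lock:
            for bound, count in zip(self.buckets, self._counts):
                cumulative += count
                lines.append(f'{self.name}_bucket{{le="{bound}"}} {cumulative}')
            cumulative += self._counts[-1]
            lines.append(f'{self.name}_bucket{{le="+Inf"}} {cumulative}')
            lines.append(f"{self.name}_sum {self._sum}")
            lines.append(f"{self.name}_count {cumulative}")
        return "\n".join(lines) + "\n"
```

A request handler would call `inc(method=..., path=..., status=...)` and `observe(duration_seconds)` once per request, and a `/metrics` endpoint would return the concatenated `render()` output for Prometheus to scrape.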

Business Metrics

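Business metrics follow the same pattern as technical ones, but count domain events rather than requests. The sketch below is one possible shape, assuming hypothetical metric names (`orders_total`, `revenue_cents_total`) and a hypothetical `record_order` helper; none of these come from a specific library.

```python
from collections import defaultdict

# (metric name, sorted label pairs) -> running total
business_counters = defaultdict(int)


def track(metric, amount=1, **labels):
    """Increment a business counter identified by its name and labels."""
    key = (metric, tuple(sorted(labels.items())))
    business_counters[key] += amount


def record_order(plan, amount_cents):
    """Hypothetical hook called once per completed order."""
    track("orders_total", plan=plan)
    # Revenue is kept in integer cents to avoid floating-point drift.
    track("revenue_cents_total", amount_cents, plan=plan)


def render():
    """Render all counters as Prometheus-style sample lines."""
    lines = []
    for (metric, labels), value in sorted(business_counters.items()):
        label_text = ",".join(f'{k}="{v}"' for k, v in labels)
        lines.append(f"{metric}{{{label_text}}} {value}")
    return "\n".join(lines)
```

Keeping label values to a small, fixed set (plan names, not user IDs) keeps cardinality under control.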

Structured Logging

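One way to get structured logs with correlation IDs in Python is a JSON formatter on top of the standard `logging` module, with the request ID held in a `contextvars.ContextVar`. This is a sketch under those assumptions; the field names and the `context` key passed through `extra` are illustrative choices.

```python
import contextvars
import json
import logging
from datetime import datetime, timezone

# Set once per request (for example in middleware) so every log line carries it.
request_id_var = contextvars.ContextVar("request_id", default=None)


class JsonFormatter(logging.Formatter):
    """Format each record as a single JSON object per line."""

    def format(self, record):
        entry = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "request_id": request_id_var.get(),
        }
        # Extra structured fields passed as extra={"context": {...}}.
        entry.update(getattr(record, "context", {}))
        if record.exc_info:
            entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(entry, default=str)


def build_logger(name, stream):
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.propagate = False
    return logger
```

Usage might look like `log.info("order created", extra={"context": {"order_id": 42}})`, which emits one JSON object containing the message, level, request ID, and `order_id`.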

Distributed Tracing

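The core of distributed tracing is propagating a trace ID from one service to the next. The sketch below illustrates this with the W3C `traceparent` header format (`00-<trace id>-<span id>-<flags>`), using only the standard library. The `Span`, `start_span`, `inject`, and `extract` names are illustrative assumptions; a production system would typically rely on a tracing SDK such as OpenTelemetry rather than hand-rolled code.

```python
import secrets
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    name: str
    trace_id: str                       # 32 hex chars, shared by the whole trace
    span_id: str                        # 16 hex chars, unique per span
    parent_id: Optional[str] = None
    attributes: dict = field(default_factory=dict)
    start: float = field(default_factory=time.monotonic)
    duration: Optional[float] = None

    def end(self):
        self.duration = time.monotonic() - self.start


def start_span(name, parent=None, **attributes):
    """Start a child of `parent`, or a new root span if there is none."""
    trace_id = parent.trace_id if parent else secrets.token_hex(16)
    parent_id = parent.span_id if parent else None
    return Span(name, trace_id, secrets.token_hex(8), parent_id, dict(attributes))


def inject(span, headers):
    """Write the span's context into outgoing HTTP headers."""
    # Version 00, flags 01 (sampled).
    headers["traceparent"] = f"00-{span.trace_id}-{span.span_id}-01"
    return headers


def extract(headers):
    """Read a remote parent context from incoming headers, if valid."""
    value = headers.get("traceparent")
    if not value:
        return None
    parts = value.split("-")
    if len(parts) != 4 or len(parts[1]) != 32 or len(parts[2]) != 16:
        return None
    return Span("remote-parent", parts[1], parts[2])
```

The calling service runs `inject(span, headers)` before an outbound request; the receiving service runs `extract(headers)` and passes the result as `parent` so both spans share one trace ID.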

Error Tracking

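Error tracking tools generally capture the exception, its stack trace, and request context, then group similar errors together. The following is a minimal in-memory sketch of that idea; the `ErrorTracker` class and its fingerprinting rule (exception type plus innermost frame) are assumptions for illustration, not how any particular hosted service works.

```python
import hashlib
import traceback
from collections import defaultdict


class ErrorTracker:
    """Collect exceptions in memory, grouped by a stable fingerprint."""

    def __init__(self):
        self.groups = defaultdict(list)

    def capture(self, exc, **context):
        frames = traceback.extract_tb(exc.__traceback__)
        # Group by exception type and the function where it was raised,
        # so the same bug with different messages lands in one group.
        if frames:
            location = f"{frames[-1].filename}:{frames[-1].name}"
        else:
            location = "unknown"
        raw = f"{type(exc).__name__}|{location}"
        fingerprint = hashlib.sha256(raw.encode()).hexdigest()[:12]
        self.groups[fingerprint].append({
            "type": type(exc).__name__,
            "message": str(exc),
            "stack": "".join(traceback.format_exception(
                type(exc), exc, exc.__traceback__)),
            "context": context,
        })
        return fingerprint
```

Calling `tracker.capture(exc, user_id=..., request_id=...)` inside an `except` block records the error with enough context to reproduce it; avoid passing sensitive values as context.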

Health Checks

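A health endpoint usually runs a set of dependency checks and reports an overall status. The sketch below shows one way to structure that logic, assuming each check is a callable that raises on failure; the function name and response shape are illustrative, and the HTTP wiring is left to whichever framework you use.

```python
import time


def run_health_checks(checks):
    """Run each named check and return (http_status, response_body).

    `checks` maps a dependency name to a zero-argument callable that
    raises an exception when the dependency is unhealthy.
    """
    results = {}
    healthy = True
    for name, check in checks.items():
        started = time.monotonic()
        try:
            check()
            results[name] = {"status": "ok"}
        except Exception as exc:  # any failure marks the service unhealthy
            healthy = False
            results[name] = {"status": "fail", "error": str(exc)}
        elapsed_ms = (time.monotonic() - started) * 1000
        results[name]["duration_ms"] = round(elapsed_ms, 2)
    status_code = 200 if healthy else 503
    body = {"status": "ok" if healthy else "degraded", "checks": results}
    return status_code, body
```

Returning 503 when any check fails lets a load balancer or orchestrator take the instance out of rotation. A separate liveness endpoint that performs no dependency checks is often kept alongside this readiness-style check.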

Alerting Rules

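Prometheus alerting rules are normally written as YAML, with an expression, a threshold, and a `for` duration that the condition must hold before the alert fires. To illustrate that state machine, here is a small Python sketch; the `AlertRule` class, the threshold, and the runbook URL are hypothetical and only model the inactive, pending, and firing states.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlertRule:
    name: str
    threshold: float          # fire when the value exceeds this
    for_seconds: float        # how long the condition must hold
    runbook_url: str          # hypothetical link attached to notifications
    pending_since: Optional[float] = None

    def evaluate(self, value, now):
        """Return 'inactive', 'pending', or 'firing' for this evaluation."""
        if value <= self.threshold:
            # Condition cleared: reset the pending timer.
            self.pending_since = None
            return "inactive"
        if self.pending_since is None:
            self.pending_since = now
        if now - self.pending_since >= self.for_seconds:
            return "firing"
        return "pending"
```

For example, a rule with `threshold=0.05` and `for_seconds=300` fed with the 5xx error ratio only fires after the ratio has stayed above 5% for five minutes, which filters out short spikes and helps avoid alert fatigue.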

Dashboard Queries

# Request rate
sum(rate(http_requests_total[5m])) by (path)

# Error rate
sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m]))

# P50, P95, P99 latency
histogram_quantile(0.50, sum(rate(http_request_duration_seconds_bucket[5m])) by (le))
histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) by (le))
histogram_quantile(0.99, sum(rate(http_request_duration_seconds_bucket[5m])) by (le))

# Active connections
active_connections

# Memory usage
process_resident_memory_bytes
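The latency queries above rely on `histogram_quantile`, which estimates a quantile by linear interpolation inside the bucket that contains the target rank. The Python sketch below approximates that calculation to show why bucket boundaries matter. It is a simplified model that assumes non-negative bucket bounds and a final `+Inf` bucket, and it does not cover every edge case of the real function.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    `buckets` is a list of (upper_bound, cumulative_count) pairs sorted
    by bound, with math.inf as the last bound.
    """
    if len(buckets) < 2:
        return math.nan
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The quantile falls in the +Inf bucket, so the best
                # available answer is the highest finite bound.
                return buckets[-2][0]
            if count == prev_count:
                return bound
            # Interpolate linearly within this bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return math.nan
```

With buckets `[(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)]`, the 0.95 quantile lands halfway through the 0.5 to 1.0 bucket, giving an estimate of 0.75 seconds. The estimate can never be more precise than the bucket it falls in, which is why bucket boundaries should bracket your latency targets.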

Best Practices

Metrics:
✓ Use consistent naming conventions
✓ Add relevant labels
✓ Set appropriate buckets
✓ Monitor cardinality

Logging:
✓ Use structured logging
✓ Include correlation IDs
✓ Log at appropriate levels
✓ Don't log sensitive data

Tracing:
✓ Propagate context across services
✓ Add meaningful span names
✓ Include relevant attributes
✓ Sample appropriately

Alerting:
✓ Alert on symptoms, not causes
✓ Include runbooks
✓ Avoid alert fatigue
✓ Test alerts regularly

Conclusion

Observability requires metrics, logs, and traces working together. Start with basic health checks and metrics, add structured logging, then implement tracing for complex systems. Good observability reduces mean time to resolution and improves system reliability.
