Logging, Observability, Monitoring, DevOps

Logging and Observability Best Practices

Build observable systems with structured logging, distributed tracing, and metrics. Know what's happening in production at all times.

Bootspring Team
Engineering
August 18, 2025
6 min read

When something goes wrong in production, observability is the difference between quick resolution and hours of guessing. Good logging, tracing, and metrics let you understand system behavior, debug issues, and prevent problems before users notice.

The Three Pillars of Observability

Logs

Discrete events with context. "User 123 placed order 456 at 14:32:05"

Metrics

Aggregated measurements over time. "Orders per minute: 150"

Traces

Request flow across services. "Request X took 500ms: API (50ms) → DB (300ms) → Cache (150ms)"

Structured Logging

Why Structure Matters


Structured logs are:

  • Searchable: userId:123 AND level:error
  • Aggregatable: "Count errors by productId"
  • Parseable: Machines can process them
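The difference shows up when the same event is written both ways. This is a minimal sketch using only the Python standard library; the field names (`event`, `userId`, `productId`, `reason`) are illustrative assumptions, not a required schema.

```python
import json

# Unstructured: readable by a person, but a machine needs a regex to extract fields.
unstructured = "User 123 failed to add product 456 to cart: out of stock"

# Structured: the same event as key-value fields, one JSON object per line.
structured = json.dumps({
    "level": "error",
    "event": "add_to_cart_failed",  # hypothetical event name
    "userId": 123,
    "productId": 456,
    "reason": "out_of_stock",
})

# A consumer can filter on fields directly instead of pattern-matching text.
record = json.loads(structured)
is_match = record["userId"] == 123 and record["level"] == "error"

print(unstructured)
print(structured)
print(is_match)
```

The filter on the last lines is the in-code equivalent of the `userId:123 AND level:error` search shown in the list.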

Logger Configuration

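One way to configure a JSON logger is sketched below with Python's standard `logging` module. The `JsonFormatter` class, the `build_logger` helper, and the `fields` key passed through `extra` are assumptions made for this example; a production setup would more likely use an established structured-logging library.

```python
import io
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname.lower(),
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Fields passed via extra={"fields": {...}} are merged into the output.
        payload.update(getattr(record, "fields", {}))
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def build_logger(name: str, stream, level: int = logging.INFO) -> logging.Logger:
    """Create a logger that writes JSON lines to `stream`."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.propagate = False  # keep records out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()  # stands in for sys.stdout so the output can be inspected
log = build_logger("order-api", buffer)
log.info("order placed", extra={"fields": {"userId": 123, "orderId": 456}})
log.debug("not emitted at INFO level")
lines = buffer.getvalue().splitlines()
print(lines[0])
```

Writing to a stream rather than a file keeps the application unaware of where logs end up, which leaves routing to the deployment environment.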

Log Levels

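The sketch below shows one common convention for choosing a level, using Python's standard `logging` module. The messages and the `order-api` logger name are made up for illustration, and teams often adjust these boundaries to suit their own on-call practice.

```python
import logging

logging.basicConfig(format="%(levelname)s %(message)s")
log = logging.getLogger("order-api")  # hypothetical service name
log.setLevel(logging.INFO)            # typical production threshold

log.debug("cache lookup key=%s", "product:456")      # diagnostic detail; hidden at INFO
log.info("order placed orderId=%s", 456)             # normal, expected event
log.warning("payment retry attempt=%s", 2)           # unexpected, but handled
log.error("payment failed orderId=%s", 456)          # the operation failed
log.critical("database unreachable, shutting down")  # the service cannot continue

# The configured threshold decides which calls produce output.
enabled = {
    name: log.isEnabledFor(level)
    for name, level in [
        ("debug", logging.DEBUG),
        ("info", logging.INFO),
        ("warning", logging.WARNING),
        ("error", logging.ERROR),
        ("critical", logging.CRITICAL),
    ]
}
```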

Contextual Logging

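Contextual logging attaches request-scoped fields to every log line without passing them through each function call. The sketch below does this with `contextvars` and a `logging.Filter` from the standard library; the `request_context` variable, the `ContextFilter` class, and the field names are assumptions for this example.

```python
import contextvars
import io
import json
import logging

# Per-request context; contextvars keeps values separate across threads and async tasks.
request_context = contextvars.ContextVar("request_context", default=None)


class ContextFilter(logging.Filter):
    """Attach a copy of the current request context to every record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.context = dict(request_context.get() or {})
        return True


class JsonFormatter(logging.Formatter):
    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname.lower(),
            "message": record.getMessage(),
            **(getattr(record, "context", None) or {}),
        })


buffer = io.StringIO()  # stands in for stdout
handler = logging.StreamHandler(buffer)
handler.setFormatter(JsonFormatter())
handler.addFilter(ContextFilter())

log = logging.getLogger("checkout")  # hypothetical logger name
log.handlers.clear()
log.addHandler(handler)
log.setLevel(logging.INFO)
log.propagate = False


def handle_request(request_id: str, user_id: int) -> None:
    # Set context once at the edge; deeper calls log without passing IDs around.
    token = request_context.set({"requestId": request_id, "userId": user_id})
    try:
        log.info("validating cart")
        log.info("charging card")
    finally:
        request_context.reset(token)  # do not leak context into the next request


handle_request("req-abc123", 123)
records = [json.loads(line) for line in buffer.getvalue().splitlines()]
```

Both log lines carry `requestId` and `userId` even though neither `log.info` call mentions them.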

Distributed Tracing

OpenTelemetry Setup

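By default OpenTelemetry propagates trace context between services using the W3C Trace Context `traceparent` header. A real setup would install and configure the OpenTelemetry SDK. The sketch below is not that setup: it uses only the standard library to show what the header carries and how a service keeps the trace ID while issuing a new span ID. The helper names are hypothetical.

```python
import re
import secrets

# W3C Trace Context header: version-traceid-parentid-flags, all lowercase hex.
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_span_id() -> str:
    return secrets.token_hex(8)  # 8 bytes -> 16 hex characters


def start_trace() -> str:
    """Build a traceparent for a request that arrived without one."""
    trace_id = secrets.token_hex(16)  # 16 bytes -> 32 hex characters
    return f"00-{trace_id}-{new_span_id()}-01"  # flags 01 = sampled


def child_traceparent(incoming: str) -> str:
    """Keep the trace ID, replace the parent span ID for the outgoing call."""
    match = TRACEPARENT_RE.match(incoming)
    if match is None:
        return start_trace()  # malformed header: start a fresh trace
    trace_id, _parent_id, flags = match.groups()
    return f"00-{trace_id}-{new_span_id()}-{flags}"


incoming = start_trace()
outgoing = child_traceparent(incoming)
print(incoming)
print(outgoing)
```

Because every hop reuses the same trace ID, a backend can stitch the spans from separate services into one request timeline.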

Manual Spans


Metrics

Key Metrics Types

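Most metrics libraries expose three basic types: counters, gauges, and histograms. The classes below are a simplified standard-library sketch of how each behaves, not the API of any particular client library, and the bucket bounds are arbitrary example values.

```python
import bisect


class Counter:
    """Monotonically increasing total, e.g. requests served."""

    def __init__(self) -> None:
        self.value = 0.0

    def inc(self, amount: float = 1.0) -> None:
        if amount < 0:
            raise ValueError("counters only go up")
        self.value += amount


class Gauge:
    """Value that can rise and fall, e.g. active connections."""

    def __init__(self) -> None:
        self.value = 0.0

    def inc(self, amount: float = 1.0) -> None:
        self.value += amount

    def dec(self, amount: float = 1.0) -> None:
        self.value -= amount


class Histogram:
    """Counts observations into buckets, e.g. request latency."""

    def __init__(self, bounds) -> None:
        self.bounds = sorted(bounds)
        self.counts = [0] * (len(self.bounds) + 1)  # last bucket catches everything larger
        self.total = 0.0

    def observe(self, value: float) -> None:
        # bisect_left finds the first bound >= value, so upper bounds are inclusive.
        self.counts[bisect.bisect_left(self.bounds, value)] += 1
        self.total += value


requests = Counter()
requests.inc()
requests.inc()

active = Gauge()
active.inc()
active.inc()
active.dec()

latency = Histogram([50, 100, 500])  # bucket upper bounds in milliseconds
for ms in (12, 50, 87, 340, 1200):
    latency.observe(ms)

print(requests.value, active.value, latency.counts)
```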

Request Metrics Middleware

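Request metrics are usually collected in middleware so every endpoint is measured the same way. The sketch below wraps a WSGI application using only the standard library. The `MetricsMiddleware` class and the two in-memory dictionaries are assumptions for this example; a real service would record into a metrics client instead.

```python
import time
from collections import defaultdict

request_count = defaultdict(int)         # (method, path, status) -> number of requests
request_duration_ms = defaultdict(list)  # (method, path) -> observed durations


class MetricsMiddleware:
    """WSGI middleware that records a count and a duration for each request."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        method = environ.get("REQUEST_METHOD", "GET")
        # In real services, label by route template rather than raw path
        # to keep the number of distinct label values small.
        path = environ.get("PATH_INFO", "/")
        captured = {}

        def recording_start_response(status, headers, exc_info=None):
            captured["status"] = status.split(" ", 1)[0]  # "200 OK" -> "200"
            return start_response(status, headers, exc_info)

        started = time.perf_counter()
        try:
            # Measures the app call only, not the time spent streaming the body.
            return self.app(environ, recording_start_response)
        finally:
            elapsed_ms = (time.perf_counter() - started) * 1000
            # If the app raised before calling start_response, count it as a 500.
            status = captured.get("status", "500")
            request_count[(method, path, status)] += 1
            request_duration_ms[(method, path)].append(elapsed_ms)


def app(environ, start_response):
    """Tiny example application."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


def fake_start_response(status, headers, exc_info=None):
    return None  # a real server would begin writing the response here


wrapped = MetricsMiddleware(app)
body = wrapped({"REQUEST_METHOD": "GET", "PATH_INFO": "/products"}, fake_start_response)
print(dict(request_count))
```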

Business Metrics

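Business metrics track outcomes such as orders and revenue rather than technical signals. The sketch below keeps them in a standard-library `Counter`. The metric names, the two helper functions, and the sample amounts are made up for illustration.

```python
from collections import Counter

business = Counter()  # in-memory stand-in for a metrics client


def record_checkout_started() -> None:
    business["checkouts_started_total"] += 1


def record_order(total_cents: int, payment_method: str) -> None:
    business["orders_placed_total"] += 1
    business["revenue_cents_total"] += total_cents  # integer cents avoid float rounding
    business[f"orders_by_payment_method.{payment_method}"] += 1


for _ in range(4):
    record_checkout_started()

record_order(4999, "card")
record_order(1500, "paypal")
record_order(2501, "card")

# Derived metric: share of started checkouts that became orders.
conversion = business["orders_placed_total"] / business["checkouts_started_total"]
print(business["revenue_cents_total"], conversion)
```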

Alerting Strategy

Alert Definition

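An alert definition typically combines a threshold with a duration, so that a brief spike does not page anyone. The sketch below models that rule in plain Python. The `AlertRule` fields, the 5% threshold, and the sample values are assumptions chosen for the example, not recommended settings.

```python
from dataclasses import dataclass


@dataclass
class AlertRule:
    name: str
    threshold: float  # fire when the value exceeds this...
    for_samples: int  # ...for this many consecutive samples
    severity: str


def should_fire(rule: AlertRule, samples: list) -> bool:
    """Return True if the most recent `for_samples` values all exceed the threshold."""
    if len(samples) < rule.for_samples:
        return False
    return all(value > rule.threshold for value in samples[-rule.for_samples:])


high_error_rate = AlertRule(
    name="HighErrorRate", threshold=0.05, for_samples=3, severity="page"
)

spike = [0.01, 0.09, 0.01, 0.02]      # one bad sample, then recovery
sustained = [0.01, 0.06, 0.08, 0.07]  # three bad samples in a row

spike_fires = should_fire(high_error_rate, spike)
sustained_fires = should_fire(high_error_rate, sustained)
print(spike_fires, sustained_fires)
```

Requiring consecutive breaches trades a slightly slower alert for far fewer false pages.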

Log Aggregation

Shipping Logs

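Log shippers usually batch records before sending them, and flush early when something serious happens. The sketch below shows that behavior with `logging.handlers.MemoryHandler` from the standard library. The `ListShipper` class is a hypothetical stand-in for a handler that would send each line to an aggregation backend.

```python
import json
import logging
import logging.handlers


class ListShipper(logging.Handler):
    """Stand-in for a network shipper: collects JSON lines in memory."""

    def __init__(self) -> None:
        super().__init__()
        self.shipped = []

    def emit(self, record: logging.LogRecord) -> None:
        self.shipped.append(json.dumps({
            "level": record.levelname.lower(),
            "message": record.getMessage(),
        }))


shipper = ListShipper()

# Buffer up to 3 records; flush early on ERROR so failures are not delayed.
batching = logging.handlers.MemoryHandler(
    capacity=3, flushLevel=logging.ERROR, target=shipper
)

log = logging.getLogger("shipping-demo")  # hypothetical logger name
log.handlers.clear()
log.addHandler(batching)
log.setLevel(logging.INFO)
log.propagate = False

log.info("request started")
log.info("cache miss")
buffered_count = len(shipper.shipped)  # nothing shipped yet; both records are buffered

log.error("payment failed")            # reaches flushLevel, so the batch is shipped
shipped_count = len(shipper.shipped)
print(buffered_count, shipped_count)
```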

Query Patterns

# Find errors for a specific user
service:order-api AND level:error AND userId:123

# Find slow requests
service:order-api AND duration:>1000

# Trace a request across services
traceId:abc123

# Error patterns in last hour
service:* AND level:error | stats count by message | sort -count | head 10
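Exact query syntax depends on the log backend. To make the semantics of the queries above concrete, the sketch below runs equivalent filters over a few in-memory records in plain Python. The sample records are invented, and a real backend would also apply the time-range restriction.

```python
from collections import Counter

# Parsed structured log records (made-up sample data).
records = [
    {"service": "order-api", "level": "error", "userId": 123,
     "message": "payment timeout", "duration": 1500, "traceId": "abc123"},
    {"service": "order-api", "level": "info", "userId": 123,
     "message": "order placed", "duration": 120, "traceId": "def456"},
    {"service": "cart-api", "level": "error", "userId": 7,
     "message": "payment timeout", "duration": 90, "traceId": "abc123"},
    {"service": "order-api", "level": "error", "userId": 9,
     "message": "invalid email", "duration": 30, "traceId": "ghi789"},
]

# service:order-api AND level:error AND userId:123
user_errors = [
    r for r in records
    if r["service"] == "order-api" and r["level"] == "error" and r["userId"] == 123
]

# service:order-api AND duration:>1000
slow_requests = [
    r for r in records if r["service"] == "order-api" and r["duration"] > 1000
]

# traceId:abc123 (matches records from every service that handled the request)
trace = [r for r in records if r["traceId"] == "abc123"]

# service:* AND level:error | stats count by message | sort -count | head 10
top_errors = Counter(
    r["message"] for r in records if r["level"] == "error"
).most_common(10)

print(top_errors)
```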

Dashboard Design

Key Dashboards

Request to AI: Design observability dashboards for an e-commerce API:

Dashboards needed:
1. Overview (health at a glance)
2. Request performance
3. Business metrics
4. Infrastructure
5. Errors and debugging

For each dashboard:
- Key metrics to display
- Visualization types
- Time ranges
- Alert thresholds

Example Overview Dashboard

┌─────────────────────────────────────────────────────────────┐
│ Request Rate        Error Rate          P95 Latency         │
│ [ 1,234 req/s ]     [ 0.3% ]            [ 145ms ]           │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ Requests per Second (last 6 hours)                          │
│ ▁▂▃▄▅▆▇█▇▆▅▄▃▂▁▂▃▄▅▆▇█▇▆▅▄▃▂▁▂▃▄▅▆▇█▇▆▅▄▃▂▁                 │
└─────────────────────────────────────────────────────────────┘
┌──────────────────────────┬──────────────────────────────────┐
│ Top Endpoints            │ Recent Errors                    │
│ GET /products        45% │ • PaymentError: timeout          │
│ GET /users           25% │ • ValidationError: email         │
│ POST /orders         15% │ • NotFoundError: product         │
│ Other                15% │                                  │
└──────────────────────────┴──────────────────────────────────┘
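The three headline numbers on an overview dashboard can all be derived from raw request samples. The sketch below computes them with the nearest-rank percentile method, which is one of several percentile definitions in use. The sample data and the 10-second window are made up for illustration.

```python
def percentile(values, pct: int):
    """Nearest-rank percentile: the smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, -(-pct * len(ordered) // 100))  # integer ceiling of pct * n / 100
    return ordered[rank - 1]


# (status_code, duration_ms) for requests seen in one window (made-up sample data):
# nineteen successful requests taking 10..190 ms, plus one slow failure.
window_seconds = 10
requests = [(200, ms) for ms in range(10, 200, 10)] + [(500, 400)]

request_rate = len(requests) / window_seconds
error_rate = sum(1 for status, _ in requests if status >= 500) / len(requests)
p95_latency_ms = percentile([ms for _, ms in requests], 95)

print(f"{request_rate} req/s, {error_rate:.1%} errors, p95 {p95_latency_ms}ms")
```

The single 400 ms outlier does not move the P95 here, which is why percentiles are usually paired with an error-rate panel rather than used alone.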

Conclusion

Observability isn't a feature—it's a capability that enables everything else. Without visibility into system behavior, you're flying blind.

Start with structured logging. Add metrics for key operations. Implement tracing for distributed systems. Build dashboards that answer questions before they're asked. Set up alerts that catch problems before users do.

AI helps implement these patterns correctly, from logger configuration to alert thresholds. The investment in observability pays dividends every time you need to debug production issues—which is always sooner than you expect.
