Logging, Observability, Monitoring, DevOps

Logging and Observability Best Practices

Build observable systems with structured logging, distributed tracing, and metrics. Know what's happening in production at all times.

Bootspring Team
Engineering
August 18, 2025
6 min read

When something goes wrong in production, observability is the difference between quick resolution and hours of guessing. Good logging, tracing, and metrics let you understand system behavior, debug issues, and prevent problems before users notice.

The Three Pillars of Observability

Logs

Discrete events with context. "User 123 placed order 456 at 14:32:05"

Metrics

Aggregated measurements over time. "Orders per minute: 150"

Traces

Request flow across services. "Request X took 500ms: API (50ms) → DB (300ms) → Cache (150ms)"

Structured Logging

Why Structure Matters


Structured logs are:

  • Searchable: userId:123 AND level:error
  • Aggregatable: "Count errors by productId"
  • Parseable: Machines can process them
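To make the three properties above concrete, here is a minimal sketch in Python using only the standard library. The field names (`level`, `userId`, `productId`) and the sample events are illustrative assumptions, not output from a real system.

```python
import json
from collections import Counter

# Three hypothetical events, each serialized as one JSON object per line.
lines = [
    json.dumps({"level": "error", "message": "Add to cart failed", "userId": 123, "productId": 456}),
    json.dumps({"level": "info", "message": "Order placed", "userId": 123, "productId": 789}),
    json.dumps({"level": "error", "message": "Add to cart failed", "userId": 7, "productId": 456}),
]

# Parseable: every line round-trips into a dict, no regexes needed.
events = [json.loads(line) for line in lines]

# Searchable: the equivalent of `userId:123 AND level:error`.
matches = [e for e in events if e["userId"] == 123 and e["level"] == "error"]

# Aggregatable: count errors by productId.
errors_by_product = Counter(e["productId"] for e in events if e["level"] == "error")

print(len(matches))             # 1
print(dict(errors_by_product))  # {456: 2}
```

With free-text messages, each of these questions would need its own fragile pattern match.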

Logger Configuration

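One possible configuration, sketched with Python's standard `logging` module, is shown here. The `JsonFormatter` class, the `build_logger` helper, and the convention of passing fields through `extra={"context": {...}}` are assumptions made for this example; a dedicated structured-logging library would cover the same ground.

```python
import json
import logging
import sys
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each record as a single JSON line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname.lower(),
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Fields passed as extra={"context": {...}} are merged into the line.
        payload.update(getattr(record, "context", {}))
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload, default=str)


def build_logger(name: str, level: str = "INFO", stream=None) -> logging.Logger:
    """Hypothetical helper: one JSON handler, no propagation to the root logger."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.handlers.clear()  # avoid duplicate handlers if called twice
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.propagate = False
    return logger


logger = build_logger("order-api")
logger.info("Order placed", extra={"context": {"userId": 123, "orderId": 456}})
```

Writing to stdout keeps the application unaware of where logs end up; a collector can pick the lines up from there.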

Log Levels

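The sketch here follows one common convention for levels, using Python's standard `logging` module: debug for diagnostic detail, info for normal business events, warning for unexpected but handled situations, error for failed operations, and critical for an unusable service. The messages and the choice of INFO as the production threshold are assumptions for illustration.

```python
import io
import logging

buffer = io.StringIO()
log = logging.getLogger("levels-demo")
log.handlers.clear()
log.addHandler(logging.StreamHandler(buffer))
log.propagate = False
log.setLevel(logging.INFO)  # assumed production threshold: debug is dropped

log.debug("Cache lookup key=cart:123")      # diagnostic detail, filtered out here
log.info("Order 456 placed")                # normal business event
log.warning("Payment retry 2 of 3")         # unexpected, but handled
log.error("Payment failed for order 456")   # an operation failed
log.critical("Database unreachable")        # the service cannot do its job

emitted = buffer.getvalue().splitlines()
print(len(emitted))  # 4
```

Lowering the threshold to DEBUG for a single service during an incident gives more detail without changing any call sites.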

Contextual Logging

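One way to attach request context to every log line without repeating it at each call site is a context variable plus a logging filter. This is a sketch using Python's standard library; `request_context`, `ContextFilter`, and `handle_request` are hypothetical names chosen for the example.

```python
import contextvars
import io
import logging
import uuid

# Fields of the request currently being handled (None outside a request).
request_context = contextvars.ContextVar("request_context", default=None)


class ContextFilter(logging.Filter):
    """Copy the current request context onto every record."""

    def filter(self, record: logging.LogRecord) -> bool:
        ctx = request_context.get() or {}
        record.requestId = ctx.get("requestId", "-")
        record.userId = ctx.get("userId", "-")
        return True


def handle_request(user_id: int, log: logging.Logger) -> None:
    """Hypothetical handler: set the context once, then log normally."""
    token = request_context.set({"requestId": uuid.uuid4().hex, "userId": user_id})
    try:
        log.info("Request started")
        log.info("Order validated")
    finally:
        request_context.reset(token)  # do not leak context into the next request


buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(
    logging.Formatter("%(levelname)s requestId=%(requestId)s userId=%(userId)s %(message)s")
)
handler.addFilter(ContextFilter())

log = logging.getLogger("context-demo")
log.handlers.clear()
log.addHandler(handler)
log.setLevel(logging.INFO)
log.propagate = False

handle_request(123, log)
print(buffer.getvalue(), end="")
```

Both lines carry the same `requestId`, so a single search returns everything that happened during that request.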

Distributed Tracing

OpenTelemetry Setup

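A minimal setup fragment for the OpenTelemetry Python SDK might look like the following. It assumes the `opentelemetry-api` and `opentelemetry-sdk` packages are installed, uses `order-api` as the service name (borrowed from the query examples later in this article), and prints spans to the console; a real deployment would swap in an exporter that sends spans to a tracing backend.

```python
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

# Identify this service in every span it emits.
resource = Resource.create({"service.name": "order-api"})

# Batch finished spans and print them; the console exporter is for local experiments only.
provider = TracerProvider(resource=resource)
provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("order-api")
```

This runs once at process startup, before any request handling code asks for a tracer.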

Manual Spans


Metrics

Key Metrics Types

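The three metric types most monitoring systems offer are counters, gauges, and histograms. The classes in this sketch are simplified stand-ins written for the example, not the API of any particular metrics client, and the bucket bounds are arbitrary.

```python
import bisect


class Counter:
    """Total that only goes up, e.g. requests served."""

    def __init__(self):
        self.value = 0

    def inc(self, amount=1):
        if amount < 0:
            raise ValueError("counters only go up")
        self.value += amount


class Gauge:
    """Point-in-time value that can rise or fall, e.g. open connections."""

    def __init__(self):
        self.value = 0

    def set(self, value):
        self.value = value


class Histogram:
    """Distribution of observations, e.g. request duration, kept as bucket counts."""

    def __init__(self, bounds):
        self.bounds = sorted(bounds)
        self.bucket_counts = [0] * (len(self.bounds) + 1)  # last bucket is the overflow
        self.count = 0
        self.total = 0.0

    def observe(self, value):
        # A value equal to a bound falls into that bound's bucket.
        self.bucket_counts[bisect.bisect_left(self.bounds, value)] += 1
        self.count += 1
        self.total += value


requests_total = Counter()
active_connections = Gauge()
request_duration_ms = Histogram([50, 100, 500])

for duration in (30, 75, 120, 900):
    requests_total.inc()
    request_duration_ms.observe(duration)
active_connections.set(12)

print(requests_total.value)               # 4
print(request_duration_ms.bucket_counts)  # [1, 1, 1, 1]
```

Percentiles such as P95 latency are estimated from histogram buckets, which is why the bucket bounds should bracket the latencies you care about.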

Request Metrics Middleware

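As a sketch of request metrics middleware, the following wraps a WSGI application and records a request count and durations per method, path, and status code. `RequestMetricsMiddleware` and the in-memory dictionaries are assumptions for this example; a production setup would report to a metrics client instead.

```python
import time
from collections import defaultdict
from wsgiref.util import setup_testing_defaults


class RequestMetricsMiddleware:
    """Wrap a WSGI app and record count and duration for each request."""

    def __init__(self, app):
        self.app = app
        self.request_count = defaultdict(int)
        self.duration_seconds = defaultdict(list)

    def __call__(self, environ, start_response):
        started = time.perf_counter()
        captured = {}

        def recording_start_response(status, headers, exc_info=None):
            captured["status"] = status.split(" ", 1)[0]  # "201 Created" -> "201"
            return start_response(status, headers, exc_info)

        try:
            return self.app(environ, recording_start_response)
        finally:
            # Note: this times the handler call, not streaming of the response body.
            # Use a route template rather than the raw path to keep label cardinality low.
            key = (
                environ.get("REQUEST_METHOD", "-"),
                environ.get("PATH_INFO", "-"),
                captured.get("status", "500"),
            )
            self.request_count[key] += 1
            self.duration_seconds[key].append(time.perf_counter() - started)


def app(environ, start_response):
    """Hypothetical application used to exercise the middleware."""
    start_response("201 Created", [("Content-Type", "text/plain")])
    return [b"created"]


metered = RequestMetricsMiddleware(app)
environ = {"REQUEST_METHOD": "POST", "PATH_INFO": "/orders"}
setup_testing_defaults(environ)  # fills in the remaining required WSGI keys
body = metered(environ, lambda status, headers, exc_info=None: None)

print(dict(metered.request_count))  # {('POST', '/orders', '201'): 1}
```

Because the measurement lives in middleware, every endpoint is covered without touching individual handlers.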

Business Metrics

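Business metrics are recorded the same way as technical ones, at the point where the business event happens. A minimal sketch follows, with hypothetical metric names, order fields, and an in-memory store standing in for a metrics client.

```python
from collections import defaultdict

# (metric name, payment method) -> running total
business_metrics = defaultdict(int)


def record_order_placed(order):
    """Hypothetical hook called once an order has been committed."""
    method = order["paymentMethod"]
    business_metrics[("orders_placed_total", method)] += 1
    # Money is tracked in integer cents to avoid floating-point drift.
    business_metrics[("order_revenue_cents_total", method)] += order["totalCents"]


record_order_placed({"orderId": 456, "paymentMethod": "card", "totalCents": 4999})
record_order_placed({"orderId": 457, "paymentMethod": "card", "totalCents": 1500})
record_order_placed({"orderId": 458, "paymentMethod": "paypal", "totalCents": 2000})

print(business_metrics[("orders_placed_total", "card")])        # 2
print(business_metrics[("order_revenue_cents_total", "card")])  # 6499
```

A drop in orders per minute can reveal a broken checkout even when every technical metric looks healthy.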

Alerting Strategy

Alert Definition

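An alert definition usually combines a threshold with a duration, so that a single spike does not page anyone. The sketch here models that idea; `AlertRule`, `should_fire`, the 5% threshold, and the three-sample window are illustrative assumptions rather than recommended values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertRule:
    name: str
    threshold: float   # fire when the metric is above this value...
    for_samples: int   # ...for this many consecutive samples
    severity: str


def should_fire(rule, samples):
    """Return True only if the most recent samples all breach the threshold."""
    recent = samples[-rule.for_samples:]
    return len(recent) == rule.for_samples and all(v > rule.threshold for v in recent)


high_error_rate = AlertRule(name="HighErrorRate", threshold=0.05, for_samples=3, severity="page")

print(should_fire(high_error_rate, [0.01, 0.09, 0.02, 0.01]))  # False: a single spike
print(should_fire(high_error_rate, [0.01, 0.06, 0.08, 0.07]))  # True: sustained breach
```

Requiring a sustained breach trades a slightly later alert for far fewer false alarms.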

Log Aggregation

Shipping Logs

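Many setups write JSON lines to stdout and let a collector agent ship them. Where the application ships logs itself, batching keeps the overhead low. The following sketch shows the batching idea with Python's standard `logging` module; `BatchingHandler` and the `send_batch` callback are hypothetical, and the callback stands in for whatever transport your log aggregator expects.

```python
import json
import logging


class BatchingHandler(logging.Handler):
    """Buffer records and hand them to `send_batch` in groups."""

    def __init__(self, send_batch, batch_size=100):
        super().__init__()
        self.send_batch = send_batch
        self.batch_size = batch_size
        self.pending = []

    def emit(self, record):
        self.pending.append(json.dumps({
            "level": record.levelname.lower(),
            "service": record.name,
            "message": record.getMessage(),
        }))
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        # Swap the buffer under the handler lock, then ship outside of it.
        self.acquire()
        try:
            batch, self.pending = self.pending, []
        finally:
            self.release()
        if batch:
            self.send_batch(batch)


shipped = []  # stand-in for the aggregator: each entry is one shipped batch
handler = BatchingHandler(send_batch=shipped.append, batch_size=2)

log = logging.getLogger("shipping-demo")
log.handlers.clear()
log.addHandler(handler)
log.setLevel(logging.INFO)
log.propagate = False

log.info("Order placed")
log.warning("Payment retry")
log.error("Payment failed")
handler.flush()  # ship whatever is left, e.g. at shutdown

print([len(batch) for batch in shipped])  # [2, 1]
```

A real shipper would also flush on a timer and retry failed sends, which this sketch leaves out.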

Query Patterns

# Find errors for a specific user
service:order-api AND level:error AND userId:123

# Find slow requests
service:order-api AND duration:>1000

# Trace a request across services
traceId:abc123

# Error patterns in last hour
service:* AND level:error | stats count by message | sort -count | head 10

Dashboard Design

Key Dashboards

Request to AI: Design observability dashboards for an e-commerce API:

Dashboards needed:
1. Overview (health at a glance)
2. Request performance
3. Business metrics
4. Infrastructure
5. Errors and debugging

For each dashboard:
- Key metrics to display
- Visualization types
- Time ranges
- Alert thresholds

Example Overview Dashboard

┌─────────────────────────────────────────────────────────────┐
│  Request Rate          Error Rate           P95 Latency     │
│  [ 1,234 req/s ]       [ 0.3% ]             [ 145ms ]       │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│  Requests per Second (last 6 hours)                         │
│  ▁▂▃▄▅▆▇█▇▆▅▄▃▂▁▂▃▄▅▆▇█▇▆▅▄▃▂▁▂▃▄▅▆▇█▇▆▅▄▃▂▁                │
└─────────────────────────────────────────────────────────────┘
┌──────────────────────────┬──────────────────────────────────┐
│  Top Endpoints           │  Recent Errors                   │
│  GET /products      45%  │  • PaymentError: timeout         │
│  GET /users         25%  │  • ValidationError: email        │
│  POST /orders       15%  │  • NotFoundError: product        │
│  Other              15%  │                                  │
└──────────────────────────┴──────────────────────────────────┘

Conclusion

Observability isn't a feature—it's a capability that enables everything else. Without visibility into system behavior, you're flying blind.

Start with structured logging. Add metrics for key operations. Implement tracing for distributed systems. Build dashboards that answer questions before they're asked. Set up alerts that catch problems before users do.

AI can help draft these patterns, from logger configuration to alert thresholds. The investment in observability pays off every time you need to debug a production issue, which tends to happen sooner than you expect.
