Pillar 2: Metrics - Know Your System's Vital Signs

Prometheus tracks performance trends and enables threshold-based alerting.

[Figures: Prometheus Architecture Overview; Prometheus Pull Model]

What is Prometheus?

Prometheus is an open-source monitoring and alerting toolkit designed for reliability and scalability. It's the de facto standard for metrics in cloud-native environments.

Key Features
  • Time-Series Database: Stores metrics with timestamps
  • Pull-Based Model: Scrapes metrics from services
  • Powerful Query Language: PromQL for analysis
  • Alerting: Rules written in PromQL fire when a condition holds; Alertmanager routes the notifications

How It Works
  • Each service exposes its metrics at the /actuator/prometheus endpoint
  • Prometheus scrapes that endpoint every 15 seconds
  • Scraped samples are stored in the time-series database
  • PromQL queries read the stored data for analysis
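The pull model described above can be sketched with nothing but the JDK. This is a minimal illustration, not how Spring Boot or Prometheus implement it: the class name PullModelSketch, the in-memory counter, and the use of localhost are assumptions made for the example. The service side serves its current values as plain text, and a "scrape" is simply an HTTP GET against that endpoint.

```java
import com.sun.net.httpserver.HttpServer;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicLong;

final class PullModelSketch {
    // Hypothetical in-memory counter standing in for a real metrics registry.
    static final AtomicLong createOps = new AtomicLong();

    // Render the counter in the Prometheus text exposition format.
    static String render() {
        return "# TYPE person_operations_total counter\n"
             + "person_operations_total{operation=\"create\"} " + createOps.get() + "\n";
    }

    // Service side: expose the metrics page (port 0 asks the OS for any free port).
    static HttpServer startService(int port) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress("localhost", port), 0);
        server.createContext("/actuator/prometheus", exchange -> {
            byte[] body = render().getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().add("Content-Type", "text/plain; version=0.0.4");
            exchange.sendResponseHeaders(200, body.length);
            try (var out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    // Prometheus side: one scrape is a plain HTTP GET against the target.
    static String scrapeOnce(int port) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(
                URI.create("http://localhost:" + port + "/actuator/prometheus")).build();
        return HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString())
                .body();
    }

    public static void main(String[] args) throws Exception {
        HttpServer server = startService(0);
        try {
            createOps.addAndGet(3);  // the service does some work between scrapes
            System.out.print(scrapeOnce(server.getAddress().getPort()));
        } finally {
            server.stop(0);
        }
    }
}
```

The service never pushes anything. It only answers with its current values, and the scraper decides when to ask and attaches the timestamp.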

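The Three Metric Types

These are the three Micrometer meter types used in this project. Prometheus itself defines four metric types (counter, gauge, histogram, and summary); a Micrometer Timer is exported to Prometheus as a summary or a histogram.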

Counter

Always Goes Up

Cumulative value that only increases, and resets to zero only when the process restarts.

  • Total requests
  • Total errors
  • Total operations

Example: person_operations_total{operation="create"}

Gauge

Goes Up and Down

Current value that can increase or decrease.

  • Current memory usage
  • Active connections
  • Queue size

Example: person_repository_total

Timer

Measures Duration

Tracks duration and count of events.

  • Response time
  • Method execution time
  • Database query time

Example: person_service_execution_time (exposed to Prometheus as person_service_execution_time_seconds_count, _sum, and _max)
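To make the difference between the three types concrete, here is a minimal JDK-only sketch of their semantics. The classes below are illustrative stand-ins written for this example, not Micrometer's actual Counter, Gauge, and Timer classes, and the repository list is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.DoubleSupplier;

final class MetricTypesSketch {
    // Counter: cumulative and only ever incremented; a restart resets it to zero.
    static final class Counter {
        private final AtomicLong count = new AtomicLong();
        void increment() { count.incrementAndGet(); }
        long count() { return count.get(); }
    }

    // Gauge: keeps no history; it samples the current value each time it is read.
    static final class Gauge {
        private final DoubleSupplier source;
        Gauge(DoubleSupplier source) { this.source = source; }
        double value() { return source.getAsDouble(); }
    }

    // Timer: two cumulative values, the number of events and their total duration.
    static final class Timer {
        private final AtomicLong count = new AtomicLong();
        private final AtomicLong totalNanos = new AtomicLong();
        void record(Runnable task) {
            long start = System.nanoTime();
            try {
                task.run();
            } finally {
                totalNanos.addAndGet(System.nanoTime() - start);
                count.incrementAndGet();
            }
        }
        long count() { return count.get(); }
        double totalSeconds() { return totalNanos.get() / 1e9; }
    }

    public static void main(String[] args) {
        List<String> repository = new ArrayList<>();   // hypothetical data store
        Counter created = new Counter();
        Gauge repositorySize = new Gauge(() -> repository.size());
        Timer createTimer = new Timer();

        createTimer.record(() -> {
            repository.add("alice");
            created.increment();
        });
        repository.clear();  // the gauge drops back to 0, the counter stays at 1

        System.out.println("counter=" + created.count()
                + " gauge=" + repositorySize.value()
                + " timerCount=" + createTimer.count());
    }
}
```

A counter and a timer only accumulate, so they are meaningful mainly through their rate of change. A gauge is meaningful as read.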

Why Metrics Matter

📈 Identify Trends

Metrics show patterns over time. Is response time gradually increasing? Is the error rate spiking on weekends? Metrics reveal trends that individual log lines cannot.

🚨 Enable Alerting

Set thresholds: "Alert me if response time > 500ms for 5 minutes" or "Alert me if error rate > 5%". Alerting on early symptoms lets you respond before a problem becomes an outage.
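As a rough illustration, the two example thresholds could be written as Prometheus alerting rules along the following lines. This is a hypothetical rules file, not part of the setup described here: the group and alert names are invented, and the expressions assume the default Spring Boot HTTP metric http_server_requests_seconds with its status label.

```yaml
# Hypothetical rules file, referenced from prometheus.yml via rule_files.
groups:
  - name: example-thresholds
    rules:
      - alert: HighAverageResponseTime
        # Average request duration over the last 5 minutes, in seconds.
        expr: |
          sum(rate(http_server_requests_seconds_sum[5m]))
            / sum(rate(http_server_requests_seconds_count[5m])) > 0.5
        # The condition must hold for 5 minutes before the alert fires.
        for: 5m
        labels:
          severity: warning
      - alert: HighErrorRate
        # Share of requests answered with a 5xx status.
        expr: |
          sum(rate(http_server_requests_seconds_count{status=~"5.."}[5m]))
            / sum(rate(http_server_requests_seconds_count[5m])) > 0.05
        for: 5m
        labels:
          severity: critical
```

The for clause is what turns "response time > 500ms" into "response time > 500ms for 5 minutes", which filters out short spikes.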

📊 Capacity Planning

Understand resource usage patterns. How much memory do we need during peak hours? When will we need to scale? Metrics inform infrastructure decisions.

🎯 Business Insights

Track business operations, not just infrastructure. How many orders per hour? What's the conversion rate? Metrics bridge tech and business.

Our Prometheus Configuration

Scrape Configuration
  • Scrape Interval: 15 seconds
  • Targets:
    • Frontend: http://front:8080/actuator/prometheus
    • Backend 1: http://service-1:8081/actuator/prometheus
    • Backend 2: http://service-2:8082/actuator/prometheus
    • Backend 3: http://service-3:8083/actuator/prometheus
  • Retention: 15 days (configurable)
  • Port: 9090
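A prometheus.yml matching the settings above might look roughly like this. It is a sketch, not the project's actual file: the job names are assumptions, while the targets and interval are taken from the list above.

```yaml
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: frontend            # job names are illustrative
    metrics_path: /actuator/prometheus
    static_configs:
      - targets: ["front:8080"]
  - job_name: backends
    metrics_path: /actuator/prometheus
    static_configs:
      - targets: ["service-1:8081", "service-2:8082", "service-3:8083"]
```

Retention and the listen port are not set in this file. They are command-line flags on the Prometheus server, --storage.tsdb.retention.time=15d and --web.listen-address=:9090, and both values shown are the defaults.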

Spring Boot Integration

Micrometer: The Bridge

Spring Boot uses Micrometer as an abstraction layer for metrics. Micrometer provides a vendor-neutral interface that works with monitoring backends such as Prometheus, Datadog, and Graphite.

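  • Auto-Configuration: With micrometer-registry-prometheus on the classpath, Spring Boot configures a Prometheus registry automatically
  • Actuator Endpoint: /actuator/prometheus is served once it is included in the exposed Actuator web endpoints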
  • Default Metrics: JVM, HTTP, database connections, etc.
  • Custom Metrics: Easy to add via MeterRegistry
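In a typical Spring Boot setup, the endpoint is switched on with a fragment like the one below in application.yml. This assumes the spring-boot-starter-actuator and micrometer-registry-prometheus dependencies are present, and exposing health alongside prometheus is only an example choice.

```yaml
management:
  endpoints:
    web:
      exposure:
        # Actuator exposes only a minimal set of web endpoints by default,
        # so prometheus has to be listed explicitly.
        include: health,prometheus
```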

PromQL Query Examples

Total Operations
person_operations_total

Returns the current value of every series of this counter, one per label combination. Wrap it in sum() to get a single total.

Rate of Change
rate(person_operations_total[5m])

Average per-second rate of operations over the last 5 minutes
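The idea behind rate() can be sketched as follows. This is a deliberately simplified version: the real function also extrapolates to the edges of the window, which is omitted here, and the class and method names are invented for the example. The sketch sums the increases between consecutive samples, treats any drop as a counter reset, and divides by the elapsed time.

```java
final class RateSketch {
    // timestamps are in seconds; values are the counter readings at those times.
    static double perSecondRate(double[] timestamps, double[] values) {
        if (timestamps.length < 2 || timestamps.length != values.length) {
            return Double.NaN;  // at least two samples are needed for a rate
        }
        double increase = 0.0;
        for (int i = 1; i < values.length; i++) {
            double previous = values[i - 1];
            double current = values[i];
            // A drop means the process restarted and the counter began again at zero,
            // so everything counted since the reset is new.
            increase += (current >= previous) ? current - previous : current;
        }
        double elapsed = timestamps[timestamps.length - 1] - timestamps[0];
        return elapsed > 0 ? increase / elapsed : Double.NaN;
    }

    public static void main(String[] args) {
        // Five scrapes, 15 seconds apart, with a restart before the fourth one.
        double[] t = {0, 15, 30, 45, 60};
        double[] v = {100, 130, 160, 10, 40};
        System.out.println(perSecondRate(t, v));
    }
}
```

Because resets are absorbed this way, a counter is best queried through rate() rather than read raw.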

Filter by Label
person_operations_total{operation="create"}

Only create operations

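Average Response Time
sum(rate(person_service_execution_time_seconds_sum[5m])) / sum(rate(person_service_execution_time_seconds_count[5m]))

Average execution time in seconds across all services over the last 5 minutes. A timer is exposed as a _sum and a _count series, so the average is the total time divided by the number of calls.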

Key Takeaways

  • Prometheus scrapes metrics from all services via /actuator/prometheus
  • Three metric types: Counters (total), Gauges (current), Timers (duration)
  • Time-series data reveals trends and patterns
  • Foundation for alerting and capacity planning
  • Micrometer provides vendor-neutral abstraction in Spring Boot