Monitoring¶

This page covers health checks, structured logging, request ID tracing, and monitoring of Celery tasks, the database, and webhook delivery.

Health Check Endpoint¶

GET /api/v1/health

No authentication is required. The endpoint checks database and Redis connectivity and reports the result of each check:

{
  "status": "healthy",
  "checks": {
    "database": "ok",
    "redis": "ok"
  }
}

The endpoint returns 200 when every check passes and 503 when any check fails. A failing check reports its error message in place of "ok":

{
  "status": "unhealthy",
  "checks": {
    "database": "connection refused",
    "redis": "ok"
  }
}

This endpoint is used by Render.com for service health monitoring and by Docker for container health checks.
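The 200/503 behavior described above can be sketched in plain Python. This is a minimal illustration, not the project's view code: `run_health_checks` and `failing_database_check` are hypothetical names, and each check is assumed to be a callable that raises on failure.

```python
from typing import Callable, Dict, Tuple


def run_health_checks(checks: Dict[str, Callable[[], None]]) -> Tuple[int, dict]:
    """Run each check; a check passes if it returns without raising."""
    results = {}
    for name, check in checks.items():
        try:
            check()
            results[name] = "ok"
        except Exception as exc:
            # Report the failure reason, as in the 503 example above.
            results[name] = str(exc)

    healthy = all(value == "ok" for value in results.values())
    body = {"status": "healthy" if healthy else "unhealthy", "checks": results}
    return (200 if healthy else 503), body


def failing_database_check() -> None:
    # Hypothetical stand-in for a real database ping.
    raise ConnectionError("connection refused")


status_code, body = run_health_checks(
    {"database": failing_database_check, "redis": lambda: None}
)
```

With one failing check, `status_code` is 503 and `body` matches the unhealthy payload shown above.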

Logging¶

Development (Plain Text)¶

[req-uuid] DEBUG 2025-01-15 14:30:00 apps.loans.services Processing loan disbursement

Root level: DEBUG
Django: INFO
Apps: DEBUG

Production (JSON to stdout)¶

{
  "asctime": "2025-01-15T14:30:00",
  "levelname": "INFO",
  "name": "apps.loans.services",
  "message": "Processing loan disbursement",
  "module": "services",
  "request_id": "550e8400-e29b-41d4-a716-446655440000"
}

Root level: WARNING
Django: WARNING
Apps: INFO
Format: JSON via pythonjsonlogger.json.JsonFormatter

JSON logs are designed for log aggregation services (CloudWatch, Datadog, etc.).
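The two sets of log levels above could be expressed as a `dictConfig`-style mapping along these lines. This is a sketch under assumptions: `build_logging_config` is a hypothetical helper, the format strings are guesses based on the sample output, and the `apps` logger name is inferred from `apps.loans.services`. Request ID injection is left out to keep the example short.

```python
def build_logging_config(production: bool) -> dict:
    """Return a dictConfig-style mapping for the given environment."""
    if production:
        formatter = {
            # Formatter class named in the text; requires python-json-logger.
            "()": "pythonjsonlogger.json.JsonFormatter",
            # Assumed field list, mirroring the JSON example above.
            "format": "%(asctime)s %(levelname)s %(name)s %(message)s %(module)s",
        }
        root_level, django_level, apps_level = "WARNING", "WARNING", "INFO"
    else:
        formatter = {
            # Assumed plain-text layout; the request ID prefix is omitted
            # because it needs a filter to populate the field.
            "format": "%(levelname)s %(asctime)s %(name)s %(message)s",
        }
        root_level, django_level, apps_level = "DEBUG", "INFO", "DEBUG"

    return {
        "version": 1,
        "disable_existing_loggers": False,
        "formatters": {"default": formatter},
        "handlers": {
            "console": {
                "class": "logging.StreamHandler",
                "stream": "ext://sys.stdout",
                "formatter": "default",
            }
        },
        "root": {"handlers": ["console"], "level": root_level},
        "loggers": {
            "django": {"level": django_level},
            "apps": {"level": apps_level},
        },
    }
```

In a Django settings module, the result would typically be assigned to `LOGGING`, for example `LOGGING = build_logging_config(production=True)`.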

Request ID Tracing¶

RequestIDMiddleware assigns a request ID to each request:

Checks for an incoming X-Request-ID header and preserves it when an upstream proxy has set one
Generates a new UUID if the header is not present
Stores the ID in a ContextVar for the request lifetime
Sets the X-Request-ID response header
RequestIDFilter injects the ID into all log records

This enables end-to-end request tracing across logs:

# Find all logs for a specific request
grep "550e8400-e29b-41d4-a716" /var/log/app.log
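The middleware steps above can be sketched with the standard library alone. This is an illustration of the pattern, not the project's code: `resolve_request_id` and `ExampleRequestIDFilter` are hypothetical names, and the `"-"` default for log records emitted outside a request is an assumption.

```python
import contextvars
import logging
import uuid
from typing import Mapping

# Holds the current request's ID; "-" is an assumed placeholder outside a request.
_request_id = contextvars.ContextVar("request_id", default="-")


def resolve_request_id(headers: Mapping[str, str]) -> str:
    """Reuse an upstream X-Request-ID if present, otherwise generate a UUID4."""
    request_id = headers.get("X-Request-ID") or str(uuid.uuid4())
    _request_id.set(request_id)
    return request_id


class ExampleRequestIDFilter(logging.Filter):
    """Copy the current request ID onto every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.request_id = _request_id.get()
        return True  # never drop the record, only annotate it
```

A formatter can then reference `%(request_id)s`, and a middleware would echo the value returned by `resolve_request_id` in the X-Request-ID response header.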

Celery Task Monitoring¶

Task Tracking¶

CELERY_TASK_TRACK_STARTED = True

Tasks report STARTED status in addition to PENDING, SUCCESS, and FAILURE. Results are stored in Redis (DB 2).

Task Logging¶

Each task logs its execution with schema context:

logger.info("accrue_daily_interest [%s]: processing %d loans", schema_name, loan_count)

Beat Schedule Monitoring¶

The beat scheduler logs each task dispatch. Monitor for:

Missing daily task executions (compare expected vs actual)
Task failures in the result backend
Fan-out tasks that fail to dispatch to all tenants
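Two of these checks reduce to set comparisons. The sketch below assumes that execution dates and dispatched tenant names have already been collected from logs or the result backend; `missing_daily_runs` and `undispatched_tenants` are hypothetical helpers, and the tenant names are made up for illustration.

```python
from datetime import date, timedelta
from typing import Dict, Iterable, List, Set


def missing_daily_runs(
    expected_tasks: Iterable[str],
    executed: Dict[str, Set[date]],
    start: date,
    end: date,
) -> Dict[str, List[date]]:
    """Return, per task, the days in [start, end] with no recorded execution."""
    days = [start + timedelta(days=offset) for offset in range((end - start).days + 1)]
    missing = {}
    for task in expected_tasks:
        ran_on = executed.get(task, set())
        gaps = [day for day in days if day not in ran_on]
        if gaps:
            missing[task] = gaps
    return missing


def undispatched_tenants(all_tenants: Iterable[str], dispatched: Iterable[str]) -> List[str]:
    """Return tenants that a fan-out task did not reach, in sorted order."""
    return sorted(set(all_tenants) - set(dispatched))


gaps = missing_daily_runs(
    ["accrue_daily_interest"],
    {"accrue_daily_interest": {date(2025, 1, 14), date(2025, 1, 16)}},
    start=date(2025, 1, 14),
    end=date(2025, 1, 16),
)
skipped = undispatched_tenants(["acme", "globex", "initech"], ["acme", "initech"])
```

Here `gaps` reports 2025-01-15 as a missing run for `accrue_daily_interest`, and `skipped` contains the one tenant that was not dispatched to.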

Database Monitoring¶

Connection Health¶

The health check endpoint verifies database connectivity via connection.aensure_connection().

Schema Count¶

Monitor the number of active tenant schemas to track growth:

SELECT count(*) FROM tenants_tenant WHERE status = 'active';

Connection Cleanup¶

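The test infrastructure cleans up connections automatically to prevent connection leaks. In production, set CONN_MAX_AGE to suit your connection pool. This Django setting is the lifetime of a persistent database connection in seconds: 0 closes the connection at the end of each request, and None keeps it open indefinitely.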

Webhook Delivery Monitoring¶

Track webhook delivery health:

Delivery logs record status code, response body, and latency for each attempt
Dead-letter subscriptions (5 consecutive failures) should be investigated
Retry queue depth indicates webhook processing backlog
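The dead-letter rule above can be checked from a subscription's delivery log. This sketch assumes the log is available as a chronological list of status codes, that any 2xx code counts as success, and that `None` marks an attempt with no response (for example a timeout); the function names are hypothetical.

```python
from typing import Optional, Sequence

DEAD_LETTER_THRESHOLD = 5  # consecutive failures, per the text above


def trailing_failures(status_codes: Sequence[Optional[int]]) -> int:
    """Count failed attempts at the end of the delivery history."""
    count = 0
    for code in reversed(status_codes):
        if code is not None and 200 <= code < 300:
            break  # most recent success ends the failure streak
        count += 1
    return count


def is_dead_letter(status_codes: Sequence[Optional[int]]) -> bool:
    """True once the most recent attempts include a long enough failure streak."""
    return trailing_failures(status_codes) >= DEAD_LETTER_THRESHOLD


history = [200, 500, 500, 502, None, 503]  # one success, then five failures
flagged = is_dead_letter(history)
```

Counting only the trailing streak means a single successful delivery resets the count, which matches the "consecutive failures" wording.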

Middleware Stack¶

The full middleware stack provides multiple monitoring touchpoints:

Order	Middleware	Monitoring Aspect
1	`DomainRoutingMiddleware`	Tenant resolution
2	`TenantStatusMiddleware`	Suspended tenant blocks
3	`TenantLocaleMiddleware`	Locale activation
4	`RequestIDMiddleware`	Request tracing
5	Django security middleware	Security headers
6	`ContentRangeMiddleware`	Pagination headers
7	`SunsetMiddleware`	Deprecated endpoint headers

Security Logger¶

A dedicated lms.security logger captures security events:

Encryption operations and key rotation
Authentication failures
Permission denials
Suspicious request patterns
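Application code would write to this logger through the standard `logging` API. The sketch below shows one such call for an authentication failure; `log_auth_failure` and the message layout are assumptions, and only the logger name `lms.security` comes from the text above.

```python
import logging

# Logger name taken from the text; handlers and levels come from the logging config.
security_logger = logging.getLogger("lms.security")


def log_auth_failure(username: str, source_ip: str) -> None:
    """Record a failed login attempt (hypothetical message format)."""
    security_logger.warning(
        "authentication failure user=%s ip=%s", username, source_ip
    )
```

Because the logger has its own name, its records can be routed to a separate handler or filtered in a log aggregation service without touching application loggers.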