# Observability Demo Applications
Three workloads that drive the observability stack (logs, metrics, traces). Together they produce realistic telemetry so BRAC can see the whole pipeline end-to-end on Day 6.
## What we deploy
| Demo | Purpose | Why it matters |
|---|---|---|
| OpenTelemetry Demo (Astronomy Shop) | Canonical multi-language microservice reference with built-in traffic generator | Traces across Java/Go/.NET/Python/Ruby/JS services; rich enough to show real-world distributed-tracing patterns |
| Bookinfo (Istio sample) | 4-service canonical mesh app | Simplest demo for canary routing + service-mesh patterns (if we enable mesh later) |
| brac-poc-demo-app (custom) | Banking-flavored traffic + error generator | Shows BRAC-themed telemetry: loan-approval latencies, payment flows, "account not found" error spikes |
All three run on spoke-dc (and spoke-dr via ApplicationSet), deployed via GitOps.
## 1. OpenTelemetry Demo (Astronomy Shop)
Source: github.com/open-telemetry/opentelemetry-demo
What it is: a reference e-commerce app with ~15 services in 10+ languages. Simulates customers browsing and buying astronomy products. Has its own load generator, feature flags for injecting failures, and full OTel instrumentation.
Deployment:
- Pull the upstream Helm chart, vendor the rendered manifests into openshift-platform-gitops/components/workloads/otel-demo/
- Version-pin to a specific chart version (e.g. 0.36.x — verify current latest)
- Configure to send traces to our OTel Collector (hub-managed Tempo + spoke-local collector)
- Enable the built-in load generator + feature-flag service
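The collector wiring in the third bullet could look like the following OpenTelemetry Collector config sketch; the Tempo and SigNoz endpoint hostnames are placeholders for the real POC addresses, not confirmed values:

```yaml
# Sketch only — exporter endpoints below are assumed names, not the POC's real services.
receivers:
  otlp:
    protocols:
      grpc: {}
      http: {}

exporters:
  otlp/tempo:
    endpoint: tempo-gateway.hub-observability.svc:4317   # hub-managed Tempo (assumed)
    tls:
      insecure: true
  otlp/signoz:
    endpoint: signoz-otel-collector.vm-tier.example:4317 # SigNoz VM tier (assumed)
    tls:
      insecure: true

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp/tempo, otlp/signoz]
```

Fan-out to both exporters in one pipeline is what lets a single instrumented app feed both observability stacks without app-side changes.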
What to show on demo day:

- In SigNoz: end-to-end trace of the "purchase" flow spanning frontend → cart → payment → shipping
- In Tempo (OCP Observe console): the same trace in the Red Hat-native view
- In Loki: correlated logs for the same request ID
- In Prometheus / COO: RED metrics (Rate, Errors, Duration) per service
- Inject a latency failure via feature flag → watch alerts fire and traces highlight the slow service
Resource footprint: ~2 GB RAM + ~2 CPU across the 15 services. Fits easily on spoke-dc.
## 2. Bookinfo (Istio sample)
Source: github.com/istio/istio/tree/master/samples/bookinfo
What it is: 4 services — productpage, details, reviews (v1/v2/v3), ratings. The reviews service has 3 versions: v1 (no stars), v2 (black stars), v3 (red stars) — perfect for canary/A-B routing.
Deployment:
- Vendor the manifests into openshift-platform-gitops/components/workloads/bookinfo/
- Deploy productpage as a Deployment + Service + OCP Route
- Deploy reviews v1/v2/v3 with traffic-split (10% v2, 10% v3, 80% v1 via ingress or service-mesh rules)
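Without a mesh, one way to express the 80/10/10 split is an OpenShift Route with weighted `alternateBackends`. A sketch, assuming per-version `reviews-v1/v2/v3` Services exist (the upstream Bookinfo manifests ship a single `reviews` Service, so we would add these):

```yaml
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: reviews
spec:
  to:
    kind: Service
    name: reviews-v1      # 80% of traffic
    weight: 80
  alternateBackends:      # Routes support up to 3 alternate backends
    - kind: Service
      name: reviews-v2
      weight: 10
    - kind: Service
      name: reviews-v3
      weight: 10
  port:
    targetPort: 9080      # all Bookinfo services listen on 9080
```

Because the weights live in Git-managed YAML, shifting them is a commit + ArgoCD sync, which is exactly the GitOps canary story below.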
What to show:
- Hit productpage repeatedly — see distribution across v1/v2/v3 reviews
- In SigNoz/Tempo traces: which version served which request
- Shift traffic weights via Git commit (change YAML → ArgoCD syncs → new routing) — demonstrates GitOps-driven canary
Optional: enable OpenShift Service Mesh (Red Hat's Istio distribution) later; Bookinfo is the canonical mesh demo. This is out of POC scope unless specifically needed, since the mesh adds substantial complexity.
## 3. brac-poc-demo-app — custom traffic generator
Source: we write it. Lives in a new repo brac-poc-demo-app on GitLab CE.
Purpose: tell BRAC's story. A small app that mimics banking flows so the observability shows domain-relevant signal, not just "astronomy shop".
### Shape
- Language: Go (small, fast, easy OTel instrumentation, fits on modest compute)
- Single HTTP server + in-process workers
- ~4 "logical services" simulated via code (not separate pods):
  - customer-api — GET /customer/:id
  - loan-service — POST /loan/apply, GET /loan/:id
  - payment-service — POST /payment
  - reporting-service — GET /report
### What it emits
| Signal | Content |
|---|---|
| Logs (stdout JSON) | Structured: {ts, level, correlation_id, service, msg, amount, customer_id, ...} — ingested by Loki + forwarded to Splunk |
| Traces (OTLP) | One trace per incoming request, spans for internal "services"; includes simulated DB calls (fake pg_query spans) |
| Metrics (Prometheus /metrics) | requests_total{endpoint}, request_duration_seconds{endpoint}, loan_amount_sum, payment_failures_total |
| Synthetic errors | Configurable failure rate per endpoint (e.g., 2% payment-service → 5xx) — generates error rate visibility |
| Traffic shape | Configurable RPS, with diurnal pattern (higher during "business hours" simulated by minute-of-hour) |
### How it runs
- Single Go binary, ~500 LOC
- Deployed as a Deployment with 2-3 replicas on spoke-dc (mirrored to spoke-dr via ApplicationSet)
- Uses OTel SDK for traces + metrics; logs to stdout
- Built via GitLab CI pipeline → image pushed to Nexus → ArgoCD pulls from Nexus
- A separate "client" Deployment calls it continuously (basic curl loop or a hey/vegeta-style loadgen)
### Demo value
- BRAC sees banking-named metrics in Prometheus dashboards
- SigNoz traces show realistic flows: "Customer 1234 applied for a loan → credit check → approved → notification"
- Deliberate error spike demo: flip a feature flag, watch error rate climb in SigNoz, OCP Observe console, and Splunk Enterprise Free dashboard
- Logs in Loki + Splunk show correlated request IDs across services
### Repo layout
```text
brac-poc-demo-app/
├── main.go
├── handlers/
│   ├── customer.go
│   ├── loan.go
│   ├── payment.go
│   └── reporting.go
├── instrumentation/
│   └── otel.go            # one-time OTel setup
├── Dockerfile
├── .gitlab-ci.yml         # build + push image to Nexus
├── k8s/
│   ├── deployment.yaml
│   ├── service.yaml
│   └── servicemonitor.yaml
└── README.md
```
Pipeline: commit → CI builds → pushes to Nexus → webhook to ArgoCD → redeploy on spoke-dc/dr.
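The build leg of that pipeline could be a minimal `.gitlab-ci.yml` along these lines; the builder image, registry variables, and credentials are assumptions to be replaced with the real Nexus values:

```yaml
# Sketch only — NEXUS_REGISTRY / NEXUS_USER / NEXUS_PASS are assumed CI variables.
stages:
  - build

build-image:
  stage: build
  image: quay.io/buildah/stable   # assumed builder image
  script:
    - buildah bud -t "$NEXUS_REGISTRY/brac-poc-demo-app:$CI_COMMIT_SHORT_SHA" .
    - buildah push --creds "$NEXUS_USER:$NEXUS_PASS" "$NEXUS_REGISTRY/brac-poc-demo-app:$CI_COMMIT_SHORT_SHA"
```

Tagging with `$CI_COMMIT_SHORT_SHA` (a built-in GitLab CI variable) keeps every image traceable back to its commit, which is what makes the ArgoCD redeploy auditable.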
### Version

Self-versioned (we own it). Semver v0.1.0 for the Day-1 release. No channel; it's our code.
## Integration with the observability stacks
All three demos feed both observability stacks (per Decision #031):
| Signal | Red Hat native (on OCP) | SigNoz (VM tier) | Splunk (PCI forwarder) |
|---|---|---|---|
| Logs | Logging + Loki + OCP console Log viewer | Via OTel → Kafka → ClickHouse | Via OCP Logging ClusterLogForwarder → Splunk HEC |
| Traces | Tempo + OCP console Tracing viewer | Via OTel → ClickHouse | (Splunk doesn't handle traces) |
| Metrics | Prometheus + COO + Grafana (OCP built-in) | Via OTel → ClickHouse | (Splunk doesn't handle metrics) |
This is why we have both stacks: OCP admins use the Red Hat UIs (familiar, supported, integrated); developers use SigNoz (more feature-rich APM); security uses Splunk (compliance-mandated).
## Demo day script (~5 min for observability section)
- Hit a brac-poc-demo-app endpoint a few times from a browser → shows live traffic
- Open OCP Observe → Traces → drill into one trace → show the call chain
- Open SigNoz → Traces → find the same request via correlation_id → show the graph view
- Open Loki → filter by that correlation_id → show all logs
- Open the Splunk dashboard → show the same logs are also there for the PCI audit trail
- Flip a failure flag via n8n webhook → watch error rate spike in SigNoz in real-time
- Show the ACM Observability UI on hub-dc → consolidated metrics across all managed clusters
## Version pins (to confirm at install)
- OpenTelemetry Demo: Helm chart 0.36.x (verify latest at install)
- Bookinfo: release-1.27 tag from the istio repo (or current)
- brac-poc-demo-app: we build it; pin the base image to registry.redhat.io/ubi9/ubi-minimal:9.6, with the Go toolchain pulled via the Nexus mirror
Created: 2026-04-24 · Owner: Platform Lead · Decision: #033