Architecture Overview
How the BRAC POC's eleven components fit together: interactions, data flows, HA posture, and design intent.
High-level architecture
flowchart TB
subgraph OCP["OpenShift Container Platform"]
direction TB
subgraph Workloads["Application workloads"]
direction LR
WSO2["WSO2 APIM<br/>+ Identity Server"]
SAMPLE["Sample apps<br/>(OTel-instrumented)"]
JBOSS["JBoss<br/>domain mode"]
end
subgraph Middleware["Middleware + routing"]
MW["NGINX + Open Liberty<br/>TCP LB · canary routing · observability"]
end
subgraph Obs["Observability stack"]
direction LR
OTEL["OTel Collectors<br/>(DaemonSet)"]
SIGNOZ["SigNoz backend"]
CH["ClickHouse<br/>(2-day hot)"]
end
subgraph Sec["Security + compliance"]
direction LR
COMP["Compliance Operator<br/>PCI-DSS + CIS"]
ACS["Advanced Cluster<br/>Security (ACS)"]
TRIVY["Trivy<br/>SCA + SBOM"]
end
subgraph Git["GitOps"]
direction LR
ARGOCD["ArgoCD"]
REPO[("External<br/>Git repo")]
end
Workloads --> Middleware
Workloads --> OTEL
Middleware --> OTEL
end
subgraph External["External / supporting infrastructure"]
direction LR
KAFKA["Kafka KRaft<br/>3 brokers"]
REDIS["Redis Sentinel<br/>3 data + 3 sentinels"]
CICD["GitLab HA +<br/>Jenkins HA"]
NEXUS["Nexus<br/>artifact repo"]
end
OTEL --> KAFKA
KAFKA --> SIGNOZ
SIGNOZ --> CH
CICD --> REPO
REPO --> ARGOCD
ARGOCD --> Workloads
Workloads --> REDIS
Workloads --> KAFKA
classDef critical fill:#c62828,stroke:#b71c1c,color:#fff
classDef external fill:#455a64,stroke:#263238,color:#fff
class OCP critical
class External external
Component descriptions
1. OpenShift Container Platform
Role: container orchestration, cluster-wide policy enforcement.
- Installed via Assisted Installer — full-ISO, static networking, Tang disk encryption
- FIPS 140-3 mode enabled at install
- Storage: persistent volumes for stateful workloads
- Compliance Operator scanning PCI-DSS v4 + OCP4-CIS in parallel
- ACS blocks deployment of images with Critical CVEs
2. WSO2 API Management + Identity Server
Role: API gateway, rate limiting, SSO (SAML + OIDC).
- APIM distributed across multiple tiers (gateway, Traffic Manager, publisher, devportal, Key Manager)
- Identity Server issues tokens, federates with the platform IdP where needed
- Telemetry exports to OTel collector
3. Middleware: NGINX + Open Liberty
Role: application server + routing perimeter.
- Open Liberty hosts Java sample apps
- NGINX performs canary routing (10/90 split example)
- Both emit metrics to the OTel collector
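The 10/90 canary split could be modelled as below. This is a minimal Python sketch of weighted routing, not the NGINX configuration used in the POC; the function `pick_upstream`, the upstream names, and the choice to hash a request key are illustrative assumptions (NGINX provides a comparable mechanism through its `split_clients` directive).

```python
import hashlib

# Assumed share of traffic for the canary (v2); the remainder goes to stable (v1).
CANARY_PERCENT = 10


def pick_upstream(request_key: str, canary_percent: int = CANARY_PERCENT) -> str:
    """Deterministically map a request key (e.g. a client IP) to an upstream.

    Hashing the key keeps a given client on the same version across requests.
    Upstream names are hypothetical.
    """
    digest = hashlib.sha256(request_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # bucket in 0..99
    return "liberty-v2-canary" if bucket < canary_percent else "liberty-v1-stable"


# Rough check of the split over synthetic client keys.
keys = [f"client-{i}" for i in range(10_000)]
canary_hits = sum(pick_upstream(k) == "liberty-v2-canary" for k in keys)
print(f"canary share: {canary_hits / len(keys):.1%}")
```

Hashing rather than random selection is a design choice here: it gives sticky assignment without shared state, at the cost of an only approximately exact split.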
4. OpenTelemetry observability stack
Role: unified logs + metrics + traces.
- OTel Collectors run as a DaemonSet and receive OTLP, Prometheus scrapes, and logs
- Kafka buffers telemetry by signal type
- SigNoz backend with ClickHouse storage — 2-day hot retention + cold archive
- Dashboards: application, runtime, system, tracing views
5. Kafka (KRaft mode)
Role: distributed message broker.
- 3-broker cluster, no ZooKeeper
- Topics: `telemetry.logs`, `telemetry.metrics`, `telemetry.traces`, `telemetry.dlq`
- Schema Registry validates structured messages
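The per-signal topic layout can be illustrated with a small routing function. This is a sketch only: the topic names come from the list above, but the record shape (`signal` and `payload` keys) and the rule for what counts as unprocessable are assumptions, not the POC's actual Collector exporter configuration.

```python
# Topic names as listed in this section.
TOPIC_BY_SIGNAL = {
    "logs": "telemetry.logs",
    "metrics": "telemetry.metrics",
    "traces": "telemetry.traces",
}
DLQ_TOPIC = "telemetry.dlq"


def topic_for(record: dict) -> str:
    """Return the Kafka topic for a telemetry record.

    Records with an unknown signal type or no payload go to the dead letter
    queue (hypothetical rule, chosen for illustration).
    """
    signal = record.get("signal")
    if isinstance(signal, str) and signal in TOPIC_BY_SIGNAL and record.get("payload") is not None:
        return TOPIC_BY_SIGNAL[signal]
    return DLQ_TOPIC


print(topic_for({"signal": "traces", "payload": b"..."}))   # telemetry.traces
print(topic_for({"signal": "profiles", "payload": b"..."}))  # telemetry.dlq
```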
6. Redis (Sentinel HA)
Role: caching for API gateway and apps.
- 3 data nodes + 3 sentinels
- Automatic failover on primary loss
- Metrics exported to OTel
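How a 3-sentinel deployment decides on failover can be sketched as follows. Redis Sentinel needs `quorum` sentinels to agree that the primary is down, and a majority of all sentinels to authorize the failover. The function name and the `quorum=2` setting are assumptions; this document does not state the configured quorum.

```python
def failover_possible(total_sentinels: int, agreeing_sentinels: int, quorum: int) -> bool:
    """Return True if a Sentinel-driven failover can proceed.

    agreeing_sentinels: sentinels that can reach each other and see the primary as down.
    """
    majority = total_sentinels // 2 + 1
    marked_down = agreeing_sentinels >= quorum        # enough to flag the primary as objectively down
    leader_elected = agreeing_sentinels >= majority   # enough votes to authorize the failover
    return marked_down and leader_elected


# 3 sentinels as in the POC; quorum=2 is an assumed setting.
print(failover_possible(3, 2, 2))  # True: one sentinel lost, failover still possible
print(failover_possible(3, 1, 2))  # False: a lone sentinel cannot fail over
```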
7. CI/CD: GitLab HA + Jenkins HA
Role: build, test, package, deploy.
- GitLab HA for container registry + code hosting
- Jenkins HA for build orchestration (master + agents)
- Pipeline pattern: commit → GitLab CI → image → Nexus → ArgoCD → OCP
8. Nexus artifact repository
Role: centralised artifact storage (Docker, Maven, NPM).
- Integrated with CI/CD pipelines
- Mirrors upstream public artifacts — reduces external pull traffic + enforces trusted-source policy
9. ArgoCD GitOps
Role: declarative deployment, Git as source of truth.
- Pull-based sync from the repo
- Detects drift between live cluster state and Git and reconciles it automatically
- Auditable change log via Git commits
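The idea behind drift detection can be shown with a toy comparison of a desired manifest against live state. This is a simplified sketch, not ArgoCD's actual diffing logic (which also normalizes resources and honours ignore rules); `find_drift` and the sample manifests are hypothetical.

```python
def find_drift(desired: dict, live: dict, path: str = "") -> list[str]:
    """Return the paths where live state differs from the desired state.

    Fields present only in the live object (for example status) are ignored,
    since only fields declared in Git are treated as managed.
    """
    drift = []
    for key, want in desired.items():
        here = f"{path}.{key}" if path else key
        if key not in live:
            drift.append(f"{here}: missing in cluster")
        elif isinstance(want, dict) and isinstance(live[key], dict):
            drift.extend(find_drift(want, live[key], here))
        elif live[key] != want:
            drift.append(f"{here}: want {want!r}, have {live[key]!r}")
    return drift


desired = {"spec": {"replicas": 3, "image": "app:1.2"}}
live = {"spec": {"replicas": 5, "image": "app:1.2"}, "status": {"ready": 5}}
print(find_drift(desired, live))  # ['spec.replicas: want 3, have 5']
```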
10. Trivy (supply chain)
Role: SCA, SBOM generation, CVE reporting.
- Central dashboard for vulnerabilities across images
- Runs on image build + registry push
- SBOM generated per artifact
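A pipeline step that fails a build on Critical findings might look like the sketch below. It reads the JSON structure that `trivy image --format json` produces (`Results[].Vulnerabilities[].Severity`); the sample report, its placeholder CVE IDs, and the `gate` policy are invented for illustration. Trivy can also enforce this directly with `--severity CRITICAL --exit-code 1`.

```python
from collections import Counter


def severity_counts(report: dict) -> Counter:
    """Count vulnerabilities by severity in a Trivy JSON report."""
    counts = Counter()
    for result in report.get("Results") or []:
        # "Vulnerabilities" is absent or null when a target has no findings.
        for vuln in result.get("Vulnerabilities") or []:
            counts[vuln.get("Severity", "UNKNOWN")] += 1
    return counts


def gate(report: dict, blocked=("CRITICAL",)) -> bool:
    """Return True if the image may proceed, False if a blocked severity is present."""
    counts = severity_counts(report)
    return not any(counts[sev] for sev in blocked)


# Made-up report fragment; the IDs are placeholders, not real CVEs.
sample = {
    "Results": [
        {"Target": "app.jar", "Vulnerabilities": [
            {"VulnerabilityID": "CVE-0000-0001", "Severity": "CRITICAL"},
            {"VulnerabilityID": "CVE-0000-0002", "Severity": "LOW"},
        ]},
        {"Target": "os-packages", "Vulnerabilities": None},
    ]
}
print(dict(severity_counts(sample)))  # {'CRITICAL': 1, 'LOW': 1}
print(gate(sample))                   # False
```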
11. JBoss (domain mode)
Role: enterprise app server with centralised management.
- Domain controller + managed servers (server groups)
- Demonstrates enterprise clustering and config propagation
Data flows
Application telemetry flow
flowchart LR
APP["Sample app<br/>(OTel SDK)"] -->|OTLP| OTELC["OTel Collector<br/>(DaemonSet)"]
APP -.->|Prom scrape| OTELC
APP -.->|log files| OTELC
OTELC -->|logs topic| KAFKA["Kafka"]
OTELC -->|metrics topic| KAFKA
OTELC -->|traces topic| KAFKA
KAFKA --> SIGNOZ["SigNoz backend"]
SIGNOZ --> CH["ClickHouse<br/>2-day hot"]
CH -->|archive| S3["Object storage<br/>cold retention"]
CH --> DASH["Dashboards<br/>Application · Runtime · System · Tracing"]
API request flow
flowchart LR
Client -->|HTTPS| NGINX["NGINX<br/>canary 10/90"]
NGINX --> L1["Open Liberty<br/>v1 (stable)"]
NGINX --> L2["Open Liberty<br/>v2 (canary)"]
L1 --> WSO2["WSO2 APIM<br/>rate limit · auth"]
L2 --> WSO2
WSO2 --> Backend["Backend service"]
Backend --> REDIS[("Redis<br/>cache")]
Backend -.->|async| KAFKA[("Kafka<br/>events")]
Backend -->|trace context preserved| NGINX
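Preserving trace context across the hops in this flow usually means forwarding the W3C `traceparent` header: each hop keeps the trace ID and substitutes its own span ID. The sketch below shows that step in isolation; in practice the OTel SDK propagators do this, and `child_traceparent` is a hypothetical helper, not part of any library.

```python
import re
import secrets

# traceparent layout: version - trace-id (32 hex) - parent span-id (16 hex) - flags
_TRACEPARENT = re.compile(r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


def child_traceparent(incoming: str) -> str:
    """Build the header for the next hop: same trace-id, fresh span-id."""
    match = _TRACEPARENT.fullmatch(incoming)
    if match is None:
        raise ValueError(f"malformed traceparent: {incoming!r}")
    version, trace_id, parent_id, flags = match.groups()
    if trace_id == "0" * 32 or parent_id == "0" * 16:
        raise ValueError("all-zero trace-id or span-id is invalid")
    new_span_id = secrets.token_hex(8)  # 8 random bytes -> 16 hex characters
    return f"{version}-{trace_id}-{new_span_id}-{flags}"


incoming = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
print(child_traceparent(incoming))
```

Because the trace ID is unchanged, spans emitted by NGINX, Open Liberty, WSO2, and the backend can be stitched into one trace in the tracing dashboard.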
Deployment flow (GitOps)
flowchart LR
Dev[Developer] -->|commit to<br/>feature branch| GL["GitLab"]
GL -->|CI: build + test| Image["Container image"]
Image -->|push| Nexus["Nexus"]
GL -->|merge to develop| Manifests["Update<br/>deploy manifests"]
Manifests -->|push to Git| ArgoCD
ArgoCD -->|detects drift, syncs| OCP["OpenShift<br/>new pods deployed"]
High-availability posture
| Component | HA mode | Failover mechanism |
|---|---|---|
| OpenShift | Multi-node control plane | Automatic (Kubernetes) |
| WSO2 APIM | Distributed replicas per tier | Load balancer |
| Redis | Sentinel mode | Automatic via quorum |
| Kafka | KRaft 3-broker | Automatic consensus |
| GitLab | HA deployment | Built-in |
| Jenkins | Master + agents | Agent auto-reconnect |
| SigNoz | Replicated backend | Database persistence |
Security architecture
Network
- OpenShift `NetworkPolicy` / `AdminNetworkPolicy` default-deny + per-namespace opt-in
- NGINX terminates TLS at the perimeter; only TLS 1.2+ with approved cipher suites
- WSO2 enforces OAuth2/OIDC on every protected API
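A namespace-level default-deny policy of the kind described above could be generated like this. The sketch builds a standard `networking.k8s.io/v1` `NetworkPolicy` as a Python dict and prints it as JSON, which `oc apply -f -` accepts. The policy name, the namespace, and the decision to deny both ingress and egress are assumptions; the cluster-wide `AdminNetworkPolicy` side is not shown.

```python
import json


def default_deny_policy(namespace: str) -> dict:
    """Return a NetworkPolicy that selects every pod and allows no traffic."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            "podSelector": {},                     # empty selector matches all pods
            "policyTypes": ["Ingress", "Egress"],  # no rules listed, so all traffic is denied
        },
    }


# "sample-apps" is a hypothetical namespace name.
print(json.dumps(default_deny_policy("sample-apps"), indent=2))
```

Per-namespace opt-in then means adding further policies that allow only the specific flows each workload needs.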
Compliance
- Compliance Operator runs `pci-dss-4` + `ocp4-cis` profiles on a schedule
- Remediation CRs auto-apply where safe; manual review for privileged ones
- Audit logs forwarded via the OTel pipeline to the central observability stack
Image + supply chain
- `Image.spec.registrySources.allowedRegistries` restricts pulls to: `quay.io`, `registry.redhat.io`, `registry.connect.redhat.com`, `ghcr.io`, and the internal Nexus mirror
- ACS policies block deployment of images with Critical CVEs
- Trivy scans pre-push to the registry; SBOM produced per image
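The effect of the registry allowlist can be previewed with a small check, for example in a pre-merge lint of deployment manifests. This is a sketch under assumptions: the Nexus hostname `nexus.example.internal` is a placeholder (the real mirror address is not given here), and the parsing follows the common container-reference convention that a first path component containing a dot or colon is a registry host.

```python
# Registries named in this section, plus a placeholder for the internal Nexus mirror.
ALLOWED_REGISTRIES = {
    "quay.io",
    "registry.redhat.io",
    "registry.connect.redhat.com",
    "ghcr.io",
    "nexus.example.internal",  # hypothetical hostname
}


def registry_of(image: str) -> str:
    """Extract the registry host from an image reference."""
    first, sep, _rest = image.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first
    return "docker.io"  # references without a registry host default to Docker Hub


def pull_allowed(image: str, allowed=frozenset(ALLOWED_REGISTRIES)) -> bool:
    """Return True if the image's registry is on the allowlist."""
    return registry_of(image) in allowed


print(pull_allowed("quay.io/org/app:1.0"))  # True
print(pull_allowed("nginx:latest"))         # False
```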
Authentication
- Keycloak as the OIDC identity provider for OpenShift console + API (kubeadmin removed day-2)
- MFA enforced in Keycloak (TOTP)
- Password policy meets PCI-DSS 8.3.x
Scaling considerations
Horizontal
- OTel Collectors scale with node count (DaemonSet pattern)
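- Kafka: add brokers, then reassign existing partitions to them (for example with `kafka-reassign-partitions.sh` or Cruise Control); Kafka does not move existing partitions to new brokers on its own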
- Redis Sentinel: adding sentinels raises the majority needed to authorize a failover; the `quorum` value itself is set explicitly in the Sentinel configuration
- NGINX / Open Liberty: replica count configurable per manifest
Vertical
- OCP worker node sizing determines per-pod capacity ceiling
- ClickHouse storage expansion extends hot-retention window
- WSO2 heap settings tunable per deployment
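The link between ClickHouse storage and the hot-retention window is simple arithmetic, sketched below. Every number in the example (disk size, daily ingest, compression ratio, headroom) is made up for illustration; the POC's real volumes are not stated in this document.

```python
def hot_retention_days(
    usable_disk_gb: float,
    daily_ingest_gb: float,
    compression_ratio: float = 1.0,
    headroom: float = 0.2,
) -> float:
    """Estimate how many days of telemetry fit on the hot tier.

    compression_ratio: raw size divided by on-disk size (1.0 means no compression).
    headroom: fraction of disk kept free for merges and ingest bursts.
    """
    if daily_ingest_gb <= 0 or compression_ratio <= 0:
        raise ValueError("daily_ingest_gb and compression_ratio must be positive")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    stored_per_day_gb = daily_ingest_gb / compression_ratio
    return usable_disk_gb * (1 - headroom) / stored_per_day_gb


# Hypothetical sizing: 500 GB of disk at 200 GB/day gives roughly the 2-day window;
# doubling the disk roughly doubles it.
print(round(hot_retention_days(500, 200), 2))
print(round(hot_retention_days(1000, 200), 2))
```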
Resilience patterns
- Circuit breaker: NGINX enforces backend timeouts
- Bulkhead: Kafka partitions isolate traffic streams
- Retry: OTel Collector retries failed exports with backoff
- Dead letter queue: Kafka DLQ captures unprocessable messages
- Health probes: Kubernetes restarts containers that fail liveness probes and stops routing traffic to pods that fail readiness probes
- Auto-scaling: HPA rules for load-based scaling
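For the auto-scaling item, the replica count an HPA targets follows the formula in the Kubernetes documentation: `desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)`, with no change when the ratio is within a tolerance of 1.0 (0.1 by default). The sketch below reproduces that calculation; the metric values and the min/max bounds are illustrative assumptions, not the POC's HPA settings.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    tolerance: float = 0.1,
    min_replicas: int = 1,
    max_replicas: int = 10,
) -> int:
    """Compute the HPA target replica count for a single metric."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target: do not scale
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


# 4 pods averaging 90% CPU against a 60% target scale out to 6.
print(desired_replicas(4, 90, 60))  # 6
# 63% against a 60% target is within tolerance, so the count stays at 4.
print(desired_replicas(4, 63, 60))  # 4
```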
References
- OpenShift Documentation
- OpenTelemetry Architecture
- WSO2 APIM Documentation
- Kafka Architecture
- Redis Sentinel
- BRAC POC Requirements: `brac_poc_mail.pdf` in repo root (view on GitHub)
- ADRs: detailed decision rationale
Last Updated: 2026-04-24 · Version: 1.0