
Deployment Order

The full sequence of what gets installed, in what order, and how each step is version-controlled. Everything after Phase 0 lives in Git — OpenTofu modules, Ansible playbooks, OCP manifests, ACM Policies, ArgoCD ApplicationSets. All versioned, all auditable, all re-runnable.


Principle: GitOps for everything

| Layer | Source of truth | Applied by |
|---|---|---|
| VM provisioning | brac-poc-infrastructure repo (OpenTofu) | Terrakube |
| VM configuration | brac-poc-ansible repo (Ansible playbooks) | AWX |
| OpenShift cluster config + operators + workloads | openshift-platform-gitops repo | OpenShift GitOps (ArgoCD) via RHACM pull-mode |
| ACM policies (compliance baseline) | openshift-platform-gitops/policies/ | ACM Policy controller |
| Observability config, service-mesh rules, alerting | openshift-platform-gitops/components/ | ArgoCD ApplicationSet |

Exception: Phase 0 bootstrap, the minimum required to get OpenTofu + Ansible + Git running. Once Phase 0 is done, every subsequent change goes through an MR (merge request).
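As a rough illustration of the table above, the mapping from a changed file to the tool that applies it can be written as a small lookup. This is only a sketch: the function name `applied_by` and the two dictionaries are hypothetical helpers, not part of any of the repos, and the path prefixes are taken directly from the table.

```python
# Hypothetical lookup built from the "Layer / Source of truth / Applied by" table.
APPLIERS = {
    "brac-poc-infrastructure": "Terrakube",
    "brac-poc-ansible": "AWX",
    "openshift-platform-gitops": "OpenShift GitOps (ArgoCD) via RHACM pull-mode",
}

# Path prefixes inside openshift-platform-gitops that have a more specific applier.
PATH_OVERRIDES = {
    "policies/": "ACM Policy controller",
    "components/": "ArgoCD ApplicationSet",
}


def applied_by(repo: str, path: str = "") -> str:
    """Return the tool that applies a change to `path` in `repo`."""
    if repo not in APPLIERS:
        raise KeyError(f"{repo!r} is not a known source-of-truth repo")
    if repo == "openshift-platform-gitops":
        for prefix, tool in PATH_OVERRIDES.items():
            if path.startswith(prefix):
                return tool
    return APPLIERS[repo]


print(applied_by("brac-poc-ansible"))
print(applied_by("openshift-platform-gitops", "policies/audit.yaml"))
```

Anything that does not resolve to one of these repos has no applier, which is the point of the GitOps principle.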


Phased deployment sequence

flowchart TB
    P0["Phase 0 — Manual bootstrap<br/>(only manual phase)"]:::phase0
    P1["Phase 1 — IaC substrate<br/>OpenSSL root CA, Git repos, Vault, MinIO"]:::substrate
    P2["Phase 2 — Automation plane<br/>AWX + Terrakube via their own VMs"]:::automation
    P3["Phase 3 — Platform tools tier<br/>GitLab, Jenkins, Nexus"]:::platform
    P4["Phase 4 — Identity + workflow tier<br/>Keycloak, WSO2, Temporal, n8n"]:::apps
    P5["Phase 5 — Observability tier<br/>Splunk, SigNoz, ClickHouse, Redis"]:::obs
    P6["Phase 6 — OpenShift hub clusters<br/>hub-dc SNO → RHACM → hub-dr"]:::ocp
    P7["Phase 7 — OpenShift spoke clusters<br/>spoke-dc → spoke-dr via ACM/ZTP"]:::ocp
    P8["Phase 8 — OCP operators + workloads<br/>via GitOps ApplicationSets"]:::workload
    P9["Phase 9 — Demo apps<br/>OTel Demo + Bookinfo + custom<br/>traffic generator"]:::demo
    P10["Phase 10 — DR drill rehearsals"]:::dr

    P0 --> P1 --> P2 --> P3 --> P4 --> P5 --> P6 --> P7 --> P8 --> P9 --> P10

    classDef phase0 fill:#424242,stroke:#212121,color:#fff
    classDef substrate fill:#1b5e20,stroke:#0d3812,color:#fff
    classDef automation fill:#004d40,stroke:#00251a,color:#fff
    classDef platform fill:#01579b,stroke:#002f6c,color:#fff
    classDef apps fill:#4a148c,stroke:#12005e,color:#fff
    classDef obs fill:#006064,stroke:#003333,color:#fff
    classDef ocp fill:#b71c1c,stroke:#7f0000,color:#fff
    classDef workload fill:#6a1b9a,stroke:#38006b,color:#fff
    classDef demo fill:#bf360c,stroke:#870000,color:#fff
    classDef dr fill:#ef6c00,stroke:#b53d00,color:#fff

Phase 0 — Manual bootstrap (one-time, ~2 hours)

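The only phase run entirely by hand. After this, everything is Git-driven; the remaining direct commands are the tofu CLI runs that precede Terrakube (Phases 1-2) and the two oc apply steps in Phase 6.3-6.4.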

| Step | On | What | Why |
|---|---|---|---|
| 0.1 | Mac | Generate OpenSSL root CA (4096-bit RSA, 10-year) | Trust anchor for Vault intermediate CA |
| 0.2 | Mac | Create 2 empty Git repos: brac-poc-infrastructure, brac-poc-ansible on the existing staxv GitLab (temporarily) | Source of truth starting point |
| 0.3 | Mac | Provision brac-poc-ops-runner-vm1-dc via kubectl apply of a KubeVirt VM CR (one-time hand-written manifest) | We need a jump host before anything else |
| 0.4 | ops-runner | Install OpenTofu, Ansible, oc, kubectl, helm, vault CLI, mc, openssl, gh, git | Tool belt |
| 0.5 | ops-runner | Import OpenSSL root CA material (transferred from Mac) | CA lives here now, per your direction |
| 0.6 | ops-runner | Clone the 2 Git repos, scaffold OpenTofu module skeletons | Ready for Phase 1 |

Total manual commands: ~30-40 lines of shell/YAML. Captured in Phase0-Bootstrap.md (runbook).
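Step 0.1 could look roughly like the following. This is a sketch under assumptions: the file names and the common name are placeholders, and "10-year" is approximated as 3650 days. The helper only builds the `openssl req` command line so it can be reviewed before being run; the actual runbook may use a config file and extensions instead.

```python
import shlex


def root_ca_command(key_path: str, cert_path: str, common_name: str) -> list[str]:
    """Build an `openssl req` invocation for a self-signed 4096-bit RSA root CA."""
    return [
        "openssl", "req", "-x509",
        "-newkey", "rsa:4096",        # 4096-bit RSA key, per step 0.1
        "-sha256",
        "-days", str(10 * 365),       # roughly 10 years
        "-keyout", key_path,          # placeholder path
        "-out", cert_path,            # placeholder path
        "-subj", f"/CN={common_name}",
    ]


cmd = root_ca_command("root-ca.key", "root-ca.crt", "brac-poc Root CA")
print(shlex.join(cmd))
```

Because no unencrypted-key flag is passed, openssl prompts for a passphrase to protect the private key, which is usually what you want for a root CA.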


Phase 1 — IaC substrate (~3 hours)

Everything from here on: edit Git → MR → CI → merge → tool applies.

| Step | Git repo | Target | Purpose |
|---|---|---|---|
| 1.1 | brac-poc-infrastructure | Terrakube (not yet; still using tofu CLI) | ops-runner-vm1-dr (DR twin of ops-runner) |
| 1.2 | brac-poc-infrastructure | vault-vm1/2/3-dc + vault-vm1/2/3-dr | 6 Vault VMs |
| 1.3 | brac-poc-ansible | Vault cluster init | Ansible role vault: install binary, configure Raft, initialise, generate 5 unseal shares (hand out to custodians), unseal, enable PKI secrets engines |
| 1.4 | brac-poc-ansible | Vault intermediate CA | Generate CSR from Vault → sign with OpenSSL root CA on ops-runner → upload signed cert to Vault |
| 1.5 | brac-poc-infrastructure | minio-vm1/2/3-dc/dr | 6 MinIO VMs |
| 1.6 | brac-poc-ansible | MinIO cluster | Install MinIO, configure distributed setup, enable site-replication DC ↔ DR |
| 1.7 | brac-poc-ansible | Cross-cutting buckets | Create buckets: vault-snapshots, acm-hub-backup, velero, gitlab-backup, jenkins-backup, clickhouse-archive, nexus-blobs, splunk-frozen |

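Why this order: Vault comes before MinIO even though Vault's Raft snapshot backups are written to MinIO (the snapshot cron is set up with the buckets in step 1.7). The chicken-and-egg is broken by accepting a gap of about 30 minutes, covered by a manual snapshot, until MinIO is available.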
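The VM shorthand used in these tables (for example vault-vm1/2/3-dc or minio-vm1/2/3-dc/dr) expands to one VM per index and site. A minimal sketch of that expansion follows; the `expand` helper is hypothetical, and the `brac-poc-` prefix is an assumption carried over from the ops-runner VM name in step 0.3.

```python
import re

# name, then "-vm" plus slash-separated indexes, then one or both sites
_SHORTHAND = re.compile(
    r"(?P<name>[a-z0-9-]+)-vm(?P<idx>\d+(?:/\d+)*)-(?P<sites>dc/dr|dc|dr)"
)


def expand(shorthand: str, prefix: str = "brac-poc-") -> list[str]:
    """Expand e.g. 'vault-vm1/2/3-dc/dr' into one VM name per index and site."""
    m = _SHORTHAND.fullmatch(shorthand)
    if m is None:
        raise ValueError(f"not a VM shorthand: {shorthand!r}")
    indexes = m["idx"].split("/")
    sites = m["sites"].split("/")
    return [f"{prefix}{m['name']}-vm{i}-{s}" for s in sites for i in indexes]


for vm in expand("minio-vm1/2/3-dc/dr"):
    print(vm)
```

A list like this could feed an OpenTofu variable or an AWX inventory, so the six Vault and six MinIO VMs are never typed out by hand.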


Phase 2 — Automation plane (~2 hours)

| Step | Action | Why |
|---|---|---|
| 2.1 | Provision awx-vm1-dc/dr + awx-pg-vm1-dc/dr via OpenTofu CLI | AWX + its PG |
| 2.2 | Install AWX (Ansible role from brac-poc-ansible) | GUI for Ansible runs |
| 2.3 | Provision terrakube-vm1-dc/dr + terrakube-pg-vm1-dc/dr | Terrakube + its PG |
| 2.4 | Install Terrakube (Ansible role) | GUI for OpenTofu runs |
| 2.5 | Configure AWX: connect to brac-poc-ansible repo, create inventory for DC+DR VMs, create job templates per role | AWX ready to drive playbooks |
| 2.6 | Configure Terrakube: connect to brac-poc-infrastructure repo, create workspaces dc, dr, state stored in MinIO | Terrakube ready to drive VM provisioning |

Switch-over: once step 2.6 is complete, no more CLI tofu apply on ops-runner. Every future infra change goes through the Terrakube UI and every config change through the AWX UI. oc apply is reserved for the OCP GitOps bootstrap only.


Phase 3 — Platform tools tier (~4 hours)

The VM steps run in parallel (Terrakube workspace runs execute concurrently):

| Step | VMs | What |
|---|---|---|
| 3.1 | gitlab-vm1-dc/dr + gitlab-pg-vm1-dc/dr | GitLab CE + its PG |
| 3.2 | jenkins-vm1-dc/dr | Jenkins LTS (filesystem state) |
| 3.3 | nexus-vm1-dc/dr | Nexus OSS, S3 blob store pointed at MinIO |
| 3.4 | Git repos migration | Move brac-poc-infrastructure + brac-poc-ansible + create openshift-platform-gitops on our own GitLab CE (no longer on staxv GitLab) |

After 3.4, we're fully self-hosted.


Phase 4 — Identity + workflow tier (~4 hours)

| Step | VMs | What |
|---|---|---|
| 4.1 | Keycloak (no VM; on hub clusters via GitOps) | Keycloak is deployed on the hubs (per Decision #017), not on VMs, via the hub-platform ApplicationSet later. Keycloak is the only exception to the VM-tier rule. |
| 4.2 | wso2-is-vm1/2-dc/dr + wso2-is-pg-vm1-dc/dr | WSO2 IS cluster + PG, federated with Keycloak (see IDENTITY-STRATEGY.md) |
| 4.3 | wso2-apim-<profile>-vm1-dc/dr (5 profiles × 2 sites = 10 VMs) + shared wso2-apim-pg-vm1-dc/dr | Distributed APIM |
| 4.4 | temporal-vm1-dc/dr + PG | Workflow engine |
| 4.5 | n8n-vm1-dc/dr + PG | No-code automation |

Keycloak deployment order: the hubs must exist first (Phase 6), then Keycloak is installed via ArgoCD, and only then can WSO2 IS federate with it. In practice, steps 4.2 onwards therefore run after Phases 6-7; the runbook uses this reordered sequence.
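The reordering can be made explicit by treating the phases as a dependency graph and letting a topological sort produce the execution order. The sketch below uses Python's standard `graphlib`. The node names are shorthand invented here, and the edge from Phase 3 to Phase 5 is an assumption (the text only says that steps 4.2 onwards move after Phases 6-7).

```python
from graphlib import TopologicalSorter

# node -> set of nodes that must complete first
DEPENDS_ON = {
    "P1-substrate": {"P0-bootstrap"},
    "P2-automation": {"P1-substrate"},
    "P3-platform-tools": {"P2-automation"},
    "P5-observability": {"P3-platform-tools"},   # assumption: Phase 5 follows Phase 3
    "P6-hubs": {"P5-observability"},
    "4.1-keycloak-on-hubs": {"P6-hubs"},         # Keycloak needs the hubs
    "P7-spokes": {"P6-hubs"},
    "4.2+-wso2-temporal-n8n": {"4.1-keycloak-on-hubs", "P7-spokes"},
}

order = list(TopologicalSorter(DEPENDS_ON).static_order())
for step in order:
    print(step)
```

`static_order()` also raises `graphlib.CycleError` if someone introduces a circular dependency, which is a cheap sanity check for the runbook.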


Phase 5 — Observability + cache tier (~3 hours)

| Step | VMs |
|---|---|
| 5.1 | redis-vm1/2/3-dc/dr (Redis + Sentinel combined mode) |
| 5.2 | clickhouse-vm1-dc/dr + ClickHouse Keeper sidecar |
| 5.3 | signoz-vm1-dc/dr (stateless UI → ClickHouse) |
| 5.4 | splunk-vm1-dc/dr (Free edition, 500 MB/day) |

Phase 6 — OpenShift hub clusters (~6 hours)

| Step | Cluster | Action |
|---|---|---|
| 6.1 | hub-dc (SNO) | Assisted Installer: 1-node SNO install, FIPS on, Tang encryption (Tang server from hub-dc itself or ops-runner for bootstrap), static networking |
| 6.2 | hub-dc | Day-2: apply install manifests via POST /v2/clusters/{id}/manifests (see OCP-COMPLIANCE-CONSIDERATIONS.md) |
| 6.3 | hub-dc | Manual oc apply: OpenShift GitOps Subscription (first of the only two manual oc apply steps after Phase 0) |
| 6.4 | hub-dc | Manual oc apply: root ArgoCD Application pointing at openshift-platform-gitops/bootstrap/ |
| 6.5 | hub-dc | ArgoCD auto-syncs: RHACM + MCE + ACS Central + Compliance + COO + RHBK Keycloak + Logging + Loki + Tempo + OTel + External Secrets + cert-manager |
| 6.6 | hub-dr | Provision hub-dr via RHACM + Assisted Installer (as a ManagedCluster) |
| 6.7 | hub-dr | Klusterlet pulls its own config from hub-dc's ArgoCD via pull-mode ApplicationSet |

Only Steps 6.3 + 6.4 use oc apply. Everything else is Git.
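For orientation, the two manually applied objects in steps 6.3 and 6.4 might look like the following. This is a sketch, not the committed manifests: the repository URL is a placeholder, and the channel, namespaces, application name, and sync policy are assumptions that the real bootstrap files (with pinned versions) would override. The objects are emitted as JSON, which `oc apply -f -` accepts just like YAML.

```python
import json


def gitops_subscription(channel: str = "latest") -> dict:
    """OLM Subscription for the OpenShift GitOps operator (step 6.3)."""
    return {
        "apiVersion": "operators.coreos.com/v1alpha1",
        "kind": "Subscription",
        "metadata": {
            "name": "openshift-gitops-operator",
            "namespace": "openshift-operators",   # assumed install namespace
        },
        "spec": {
            "channel": channel,                   # assumption; pin per OPERATOR-CATALOG.md
            "name": "openshift-gitops-operator",
            "source": "redhat-operators",
            "sourceNamespace": "openshift-marketplace",
        },
    }


def root_application(repo_url: str, revision: str = "main") -> dict:
    """Root ArgoCD Application pointing at bootstrap/ (step 6.4)."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {"name": "bootstrap", "namespace": "openshift-gitops"},
        "spec": {
            "project": "default",
            "source": {"repoURL": repo_url, "path": "bootstrap", "targetRevision": revision},
            "destination": {
                "server": "https://kubernetes.default.svc",
                "namespace": "openshift-gitops",
            },
            "syncPolicy": {"automated": {"prune": True, "selfHeal": True}},  # assumed
        },
    }


# Placeholder URL; the real one is the self-hosted GitLab CE from Phase 3.
repo = "https://gitlab.example.invalid/platform/openshift-platform-gitops.git"
print(json.dumps(gitops_subscription(), indent=2))
print(json.dumps(root_application(repo), indent=2))
```

Everything the root Application references lives under bootstrap/ in Git, which is why no further manual apply is needed after step 6.4.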


Phase 7 — OpenShift spoke clusters (~6 hours)

| Step | Cluster | Action |
|---|---|---|
| 7.1 | spoke-dc | ACM-provisioned via ZTP (zero-touch provisioning) or Assisted Installer registered to hub-dc |
| 7.2 | spoke-dc | RHACM klusterlet auto-installed |
| 7.3 | spoke-dc | GitOps add-on OR manual OpenShift GitOps Subscription (per pull-mode setup) |
| 7.4 | spoke-dc | ArgoCD pull-mode receives ApplicationSets targeting spoke role |
| 7.5 | spoke-dr | Same sequence as 7.1-7.4 |

Phase 8 — OCP operators + workloads (via GitOps, ~3 hours to settle)

All triggered automatically by merges to openshift-platform-gitops/main:

| Layer | ApplicationSet | Deploys |
|---|---|---|
| All clusters | all-clusters-baseline | Cert-manager Issuers, OperatorHub source config, PSA defaults, sysctl, auditd |
| Hub only | hub-platform | RHACM MultiClusterHub, ACS Central, RHBK Keycloak realm brac-poc, COO, Logging, Loki, Tempo |
| Spoke only | spoke-platform | Compliance Operator ScanSettingBindings, ACS SecuredCluster, External Secrets Operator, AMQ Streams (Kafka) operator install |
| Spoke workloads | spoke-workloads | Kafka cluster CR, schema registry, sample-app namespaces, OTel collector, service-mesh if we enable it |
| ACM policies | (Policy + PlacementBinding) | Audit profile, allowed registries, default-deny NetworkPolicy, file integrity rules |
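The layer-to-cluster targeting in the table can be summarised as a role lookup. In the real setup the selection would be done by ACM placement and cluster labels; the sketch below only mirrors the table, and the `appsets_for` helper and role names are illustrative assumptions.

```python
# ApplicationSet -> cluster roles it targets, as listed in the table above.
APPSETS = {
    "all-clusters-baseline": {"hub", "spoke"},
    "hub-platform": {"hub"},
    "spoke-platform": {"spoke"},
    "spoke-workloads": {"spoke"},
}

# Cluster -> role, using the four cluster names from Phases 6 and 7.
CLUSTER_ROLES = {
    "hub-dc": "hub",
    "hub-dr": "hub",
    "spoke-dc": "spoke",
    "spoke-dr": "spoke",
}


def appsets_for(cluster: str) -> list[str]:
    """Return the ApplicationSets that should land on `cluster`, sorted by name."""
    role = CLUSTER_ROLES[cluster]
    return sorted(name for name, roles in APPSETS.items() if role in roles)


for cluster in CLUSTER_ROLES:
    print(cluster, appsets_for(cluster))
```

Because DC and DR clusters share a role, the DR side receives the same ApplicationSets as its DC twin without any extra configuration.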

Phase 9 — Demo applications (~2 hours)

Demo workloads deployed on spoke-dc (and replicated to spoke-dr via ApplicationSet).

| Demo | Deploys | Purpose |
|---|---|---|
| OpenTelemetry Demo (Astronomy Shop) | Helm chart opentelemetry-demo from open-telemetry repo, committed as manifests in GitOps | Multi-language microservices (Java, Go, .NET, Python, Ruby, JS, etc.) with built-in traffic generator. Traces + metrics + logs visible in SigNoz + Tempo + Loki. |
| Bookinfo (Istio sample) | bookinfo.yaml from istio/samples, committed as kustomize base | 4-service canonical demo (productpage, details, reviews, ratings) for canary/routing/service-mesh demos |
| Custom traffic generator (brac-poc-demo-app) | Small Go service we build (its own doc is still needed) | Banking-flavored workflow: simulated loan-approval requests, payment-settlement flows, realistic log output, Prometheus metrics, OTel traces |

Details in OBSERVABILITY-DEMOS.md.


Phase 10 — DR drill rehearsals (~2 days)

See DR-DRILL-PLAYBOOK.md. Execute drills 1 through 7. Record video of full-site drill for demo day.


Master checklist

  • Phase 0 bootstrap scripts committed
  • Phase 0 runbook: every manual command documented
  • Phase 1-5 OpenTofu modules committed to brac-poc-infrastructure
  • Phase 1-5 Ansible roles committed to brac-poc-ansible
  • Phase 6-9 manifests committed to openshift-platform-gitops
  • Every version pinned (per OPERATOR-CATALOG.md + VM-TIER-ARCHITECTURE.md)
  • GitLab CI enforces: tofu-fmt, tofu-validate, tflint, ansible-lint, yamllint, kubeconform, gitleaks
  • Terrakube workspaces + AWX job templates pointed at the repos
  • Nothing deployed without a corresponding Git commit

Single rule

If it's not in Git, it didn't happen. oc apply outside Phase 0 / Phase 6.3-6.4 = policy violation. Break-glass exceptions logged as GitLab issues with resolution within 24h.
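The rule is simple enough to encode as a check, for example in an audit script that reviews recorded manual commands. This is a minimal sketch: the `manual_apply_allowed` helper and the "phase.step" string format are assumptions, and break-glass exceptions are deliberately left out because they are handled through GitLab issues rather than code.

```python
# Steps outside Phase 0 where a manual oc apply is permitted.
ALLOWED_MANUAL_STEPS = {"6.3", "6.4"}


def manual_apply_allowed(step: str) -> bool:
    """Return True if a manual `oc apply` is allowed at the given step ID."""
    phase = step.split(".", 1)[0]
    return phase == "0" or step in ALLOWED_MANUAL_STEPS


for step in ("0.3", "6.3", "6.5", "8.1"):
    verdict = "allowed" if manual_apply_allowed(step) else "policy violation"
    print(step, verdict)
```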


Created: 2026-04-24 · Owner: Project Lead + DevOps Lead · Status: ready for execution