# OpenShift Assisted Installer Deployment Plan

Status: Ready to implement (in an execution session, on feature branch `infra/ocp-assisted-installer`)
Target: OCP 4.21.9 via full ISO + static networking + Tang disk encryption
Runs on: dedicated ops VM (`brac-ops-runner`), invoked from the engineer workstation via SSH
## Architecture recap

```
brac-ops-runner VM (internal 10.x.x.x)
│
├─ talks to Red Hat Assisted Installer API (api.openshift.com)
│  via offline-token → access-token exchange
│
├─ generates install-config values + NMState YAML per host
├─ downloads discovery ISO, uploads to the hosting platform as bootable image
│
└─ creates OCP VMs (bootstrap + 3 masters + 3 workers), each with:
     stable MAC address
     bridge interface on the internal VM subnet (static IP)
     Tang-client disk encryption
     CDROM = discovery ISO

Tang server (Deployment in openshift-brac ns on the hosting platform)
  → internal LoadBalancer IP
  → serves keys to OCP nodes at boot for LUKS unlock

haproxy-brac (dedicated TCP LB for the OCP cluster)
  → fronts api:6443, api-int:6443, 22623, 443, 80
  → TCP-LBs to masters/workers

PowerDNS brac-poc.comptech-lab.com zone
  → api / api-int / *.apps point to haproxy-brac
```
## Script inventory

All scripts live in `scripts/ocp/` on the feature branch. They are designed to run on `brac-ops-runner`, which has `curl`, `jq`, `kubectl`, `openshift-install`, and `oc` pre-installed, plus any hosting-platform CLIs needed to create VMs.
| # | Script | Purpose |
|---|--------|---------|
| 01 | `token.sh` | Exchange offline token → access token; cache for ~10 min. Sourced by every other script. |
| 02 | `tang-deploy.sh` | Deploy Tang server as Deployment + Service with an IP from lan-pool-26; extract the thumbprint. |
| 03 | `cluster-create.sh` | `POST /v2/clusters` — register the cluster definition with disk_encryption, VIPs, pull secret, SSH keys. Writes `cluster_id.txt`. |
| 04 | `infra-env-create.sh` | `POST /v2/infra-envs` — build the `static_network_config` array (one entry per host with NMState + MAC), `image_type: full-iso`. Writes `infra_env_id.txt`. |
| 05 | `iso-download.sh` | `GET /v2/infra-envs/{id}/downloads/image` → save as `discovery.iso` on the runner, then upload as a bootable image to the hosting platform. |
| 06 | `vms-create.sh` | Apply VM manifests on the hosting platform for bootstrap + 3 masters + 3 workers. Each manifest pins the MAC, attaches a bridge interface with the static IP for its role, attaches the ISO as CDROM, sizes the root disk, and sets the boot order. |
| 07 | `hosts-configure.sh` | Poll `GET /v2/infra-envs/{id}/hosts` until all 7 hosts are registered, then `PATCH` each host with hostname, role (master/worker/bootstrap), and `installation_disk_id`. |
| 08 | `install.sh` | Wait for cluster status `ready`, then `POST /v2/clusters/{id}/actions/install`. Poll cluster + host progress; print a summary every 30 s. |
| 09 | `credentials.sh` | Download kubeconfig + kubeadmin password to `./creds/`. Print the console URL and login string. |
| 10 | `teardown.sh` | Delete VMs, DataVolumes, and the Tang deployment; `DELETE /v2/clusters/{id}` + `/v2/infra-envs/{id}`. Idempotent. |
| — | `deploy-all.sh` | Orchestrator: runs 01 through 09 with timing + pause-on-error. Acceptance test: `oc get nodes` returns 6 `Ready` nodes at the end (the bootstrap VM is gone by then). |
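The orchestrator row above can be sketched as a small phase runner. This is a hypothetical skeleton under stated assumptions: the numbered script names (`01-token.sh`, …) and the `run_phase` helper are illustrative, not the final implementation.

```shell
#!/usr/bin/env bash
# Hypothetical skeleton for deploy-all.sh: run phases 01-09 in order,
# time each one, and pause for the operator when a phase fails.
set -uo pipefail

run_phase() {
  local script="$1" start rc
  start=${SECONDS}
  echo "=== ${script} ==="
  bash "${script}"
  rc=$?
  echo "--- ${script}: rc=${rc}, $((SECONDS - start))s"
  return "${rc}"
}

main() {
  local s
  for s in 0[1-9]-*.sh; do   # e.g. 01-token.sh ... 09-credentials.sh (names assumed)
    until run_phase "${s}"; do
      read -rp "Phase ${s} failed. Enter=retry, Ctrl-C=abort " _
    done
  done
}
# main   # invoke once the phase scripts exist
```

The `until` loop gives the pause-on-error behavior: a failed phase blocks on the prompt instead of cascading into later phases.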
## Inputs (one config file)

`scripts/ocp/config.env` (gitignored, not committed; kept on the runner):
```bash
# --- cluster ---
export CLUSTER_NAME="brac-poc"
export OCP_VERSION="4.21.9"
export BASE_DOMAIN="comptech-lab.com"

# --- networking (external VIPs via haproxy-brac) ---
export API_VIP="59.153.29.101"
export INGRESS_VIP="59.153.29.101"

# --- disk encryption ---
export ENCRYPTION_MODE="tang"   # or "tpmv2"
export TANG_URL="http://26.26.199.110:7500"
export TANG_THUMBPRINT=""       # populated by tang-deploy.sh

# --- paths ---
export TOKEN_FILE="/Users/ze/Documents/Brac-POC/redhat-offline-token"
export PULL_SECRET="/Users/ze/Documents/Brac-POC/redhat-pull-secret"
export SSH_KEYS_FILE="./ssh-keys.txt"  # concatenated pubkeys (engineer Mac + hosting platform admin + brac-ops-runner)

# --- host plan (referenced by infra-env-create.sh to build NMState + MAC list) ---
# Format: name:mac:ip:role (prefix-len is /16, gateway 26.26.0.1, DNS 26.26.199.100)
export HOSTS=(
  "bootstrap:52:54:00:00:01:10:26.26.200.10:auto-assign"
  "master-1:52:54:00:00:01:11:26.26.200.11:master"
  "master-2:52:54:00:00:01:12:26.26.200.12:master"
  "master-3:52:54:00:00:01:13:26.26.200.13:master"
  "worker-1:52:54:00:00:01:21:26.26.200.21:worker"
  "worker-2:52:54:00:00:01:22:26.26.200.22:worker"
  "worker-3:52:54:00:00:01:23:26.26.200.23:worker"
)
```
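Because the MAC itself contains colons, a `name:mac:ip:role` entry cannot be split naively on `:`. One way `infra-env-create.sh` might decompose an entry positionally (the `parse_host` helper and the `HOST_*` variable names are illustrative):

```shell
# Sketch (hypothetical helper): split one HOSTS entry of the form
#   name:aa:bb:cc:dd:ee:ff:ip:role
# Name is everything before the first colon, role after the last, IP after
# the next-to-last; the six MAC fields are what remains in the middle.
parse_host() {
  local entry="$1" rest
  HOST_NAME="${entry%%:*}"   # up to the first colon
  rest="${entry#*:}"         # drop "name:"
  HOST_ROLE="${rest##*:}"    # after the last colon
  rest="${rest%:*}"          # drop ":role"
  HOST_IP="${rest##*:}"      # after the (new) last colon
  HOST_MAC="${rest%:*}"      # the remainder is the MAC
}
```

Usage: `parse_host "master-1:52:54:00:00:01:11:26.26.200.11:master"` yields `HOST_NAME=master-1`, `HOST_MAC=52:54:00:00:01:11`, `HOST_IP=26.26.200.11`, `HOST_ROLE=master`.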
## Dependencies (must exist before deploy-all.sh runs)

- `brac-ops-runner` VM created, SSH reachable via ProxyJump
- Tools on runner: `curl`, `jq`, `kubectl`, `virtctl`, `openshift-install` (optional fallback), `oc`
- Kubeconfig for the hosting platform copied to `brac-ops-runner:~/.kube/config` (so the runner can create VM manifests)
- PowerDNS zone `brac-poc.comptech-lab.com` created with records:
  - `api.brac-poc` A `59.153.29.101`
  - `api-int.brac-poc` A `59.153.29.101`
  - `*.apps.brac-poc` A `59.153.29.101`
- Per-host A records: `ocp-master-1.brac-poc` → `26.26.200.11`, etc.
- Cloudflare NS delegation: `brac-poc.comptech-lab.com` NS → PowerDNS public IP (or we put the records directly in Cloudflare; see DNS architecture memory)
- `haproxy-brac` deployed and health-checking the VMs' IPs (backends will be empty/DOWN until the VMs boot; acceptable)
- VM network attachments created on the hosting platform: one for the internal `br26` bridge (static IPAM) and one for the public `br-real` bridge (for `haproxy-brac`'s dual-homing)
## Key API decisions + gotchas

### Authentication

- `client_id=cloud-services` (matches `azp` in our offline JWT).
- Token endpoint: `https://sso.redhat.com/auth/realms/redhat-external/protocol/openid-connect/token`.
- The access token expires after ~15 min; cache it with an expiry-file check and refresh before each long phase.
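Putting those bullets together, `token.sh` might look roughly like this. A hedged sketch, not the final script: the cache path and the 10-minute reuse window are assumptions, and `TOKEN_FILE` is expected to come from `config.env`.

```shell
# Sketch of token.sh (sourced by the other scripts). Cache path and reuse
# window are assumptions; TOKEN_FILE is defined in config.env.
CACHE="${CACHE:-/tmp/assisted-access-token}"
SSO_URL="https://sso.redhat.com/auth/realms/redhat-external/protocol/openid-connect/token"

get_access_token() {
  local now
  now=$(date +%s)
  # Reuse a cached token younger than ~10 min (the token itself lives ~15 min).
  if [ -s "${CACHE}" ] && [ $((now - $(stat -c %Y "${CACHE}"))) -lt 600 ]; then
    cat "${CACHE}"
    return 0
  fi
  # refresh_token grant against Red Hat SSO, as described above
  curl -sf "${SSO_URL}" \
    -d grant_type=refresh_token \
    -d client_id=cloud-services \
    -d refresh_token="$(cat "${TOKEN_FILE}")" \
    | jq -r .access_token | tee "${CACHE}"
}
```

Using the cache file's mtime as the expiry check keeps the script stateless; an alternative is storing the decoded `exp` claim next to the token.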
### Image type: full-iso

- ~1 GB per infra-env; downloaded once to the runner, then uploaded as a DataVolume shared by all 7 VMs.
- Trade-off vs `minimal-iso`: the full ISO boots offline, while the minimal ISO fetches RHCOS from the internet at boot. Full avoids network-at-first-boot failure modes.
### user_managed_networking vs. VIPs

- With `user_managed_networking: false`, Assisted expects the VIPs to live in the same subnet as the nodes. Our nodes are on `26.26.200.0/24` but the VIP is `59.153.29.101` on the external subnet.
- This likely needs `user_managed_networking: true` (pure external-LB model: Assisted skips in-cluster VIP management and relies on our HAProxy). Confirm during execution by trying `false` first; if cluster validation fails on the VIP subnet mismatch, flip to `true`.
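If the flip to `true` is needed, the PATCH is small. A sketch under assumptions: the field names below follow the v2 API as understood here (newer releases use the plural `api_vips`/`ingress_vips` arrays, older ones singular strings), so verify against the live endpoint; `CLUSTER_ID` and `TOKEN` are assumed from earlier scripts.

```shell
# Sketch: body for PATCH /v2/clusters/{id} that hands networking to our
# external HAProxy. Field names should be verified against the live API.
umn_patch_body() {
  jq -n '{user_managed_networking: true, api_vips: [], ingress_vips: []}'
}

# Illustrative call (CLUSTER_ID and TOKEN assumed):
# curl -sf -X PATCH \
#   "https://api.openshift.com/api/assisted-install/v2/clusters/${CLUSTER_ID}" \
#   -H "Authorization: Bearer ${TOKEN}" -H "Content-Type: application/json" \
#   -d "$(umn_patch_body)"
```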
### Disk encryption (Tang)

- The Tang server's thumbprint is fetched only AFTER Tang is deployed (Tang generates its keys on first start). `tang-deploy.sh` waits for readiness, curls `http://tang:7500/adv`, and computes the thumbprint via `jose` or `step-cli`.
- `tang_servers` in the API is a JSON string containing an array (double-encoded), not an array: `"[{\"url\":\"...\",\"thumbprint\":\"...\"}]"`.
- `enable_on: all` encrypts master + worker root disks.
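The double-encoding is easy to get wrong. One way to produce it with `jq`, sketched with a hypothetical `tang_servers_field` helper:

```shell
# Sketch: serialize the tang_servers value. The API wants a JSON *string*
# whose content is an array, so build the array and pipe it through tojson.
tang_servers_field() {
  local url="$1" thumbprint="$2"
  jq -cn --arg u "${url}" --arg t "${thumbprint}" \
    '[{url: $u, thumbprint: $t}] | tojson'
}

# The surrounding disk_encryption object then carries this string, e.g.:
#   { "enable_on": "all", "mode": "tang", "tang_servers": <output above> }
```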
### NMState static config

- `static_network_config` is an array with one entry per host; the MAC in `mac_interface_map` must exactly match the NIC MAC the host will have on first boot.
- `network_yaml` is the NMState YAML as a string (with `\n` escapes, or a multi-line literal passed into JSON via `jq`).
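A sketch of building one `static_network_config` entry. Assumptions beyond the plan: the logical NIC name `enp1s0` is illustrative; prefix length, gateway, and DNS server are taken from the host-plan comment in `config.env`.

```shell
# Sketch: build one static_network_config entry for a host. network_yaml is
# the NMState document as a plain string; mac_interface_map ties the logical
# interface name to the exact MAC the VM will boot with.
static_net_entry() {
  local mac="$1" ip="$2" yaml
  yaml=$(cat <<EOF
interfaces:
  - name: enp1s0
    type: ethernet
    state: up
    ipv4:
      enabled: true
      dhcp: false
      address:
        - ip: ${ip}
          prefix-length: 16
routes:
  config:
    - destination: 0.0.0.0/0
      next-hop-address: 26.26.0.1
      next-hop-interface: enp1s0
dns-resolver:
  config:
    server:
      - 26.26.199.100
EOF
)
  jq -n --arg y "${yaml}" --arg m "${mac}" \
    '{network_yaml: $y,
      mac_interface_map: [{mac_address: $m, logical_nic_name: "enp1s0"}]}'
}
```

`infra-env-create.sh` would map this over the `HOSTS` array and collect the results into the array posted to `/v2/infra-envs`.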
### Host registration & disk

- After the ISO boots, the discovery agent POSTs to the Assisted Installer and hosts show up in `GET /v2/infra-envs/{id}/hosts`.
- `inventory.disks[*]` lists the available disks; use the first NVMe/virtio-block disk (>100 GiB) as `installation_disk_id`.
- Set `role` and `hostname` via `PATCH /v2/infra-envs/{id}/hosts/{host_id}` before install.
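A sketch of the disk selection. Note that `inventory` arrives as a JSON string on the host object, so it needs its own `fromjson`; the 100 GiB floor mirrors the bullet above, and the helper name is illustrative:

```shell
# Sketch: read one host object on stdin (from GET /v2/infra-envs/{id}/hosts)
# and print the id of the first disk larger than 100 GiB. The inventory
# field is itself JSON-in-a-string.
pick_install_disk() {
  jq -r '.inventory | fromjson | .disks
         | map(select(.size_bytes > 100 * 1024 * 1024 * 1024))
         | .[0].id'
}

# hosts-configure.sh would then PATCH the host with hostname, role, and the
# chosen disk (the plan names the field installation_disk_id; verify).
```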
### Install validation gates

- The cluster transitions `insufficient` → `ready` only when all hosts pass validations (network, DNS, disk size, API reachability). The `validations_info` field lists each failing check; the script should print these when the cluster is stuck.
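A sketch of how `install.sh` might surface those failures. Like `inventory`, `validations_info` is a JSON string that needs its own parse; the helper name and the exact key layout are assumptions to verify against the live API.

```shell
# Sketch: read one cluster object on stdin and print "id: message" for every
# failing validation, across all categories (network, hosts-data, etc.).
failing_validations() {
  jq -r '.validations_info | fromjson | to_entries[]
         | .value[]
         | select(.status == "failure")
         | "\(.id): \(.message)"'
}
```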
## Execution order for the execution session

- Precondition check (first step of `deploy-all.sh`):
  - Tang deployed + reachable at `$TANG_URL/adv`
  - PowerDNS resolves `api.brac-poc.comptech-lab.com` and `*.apps.brac-poc` to `59.153.29.101`
  - HAProxy reachable at `59.153.29.101:6443` (TCP connect succeeds; backends may be DOWN)
  - Hosting platform healthy; storage provisioner has ≥500 GiB free
  - NADs `br26-static` and `br-real-brac` exist in `openshift-brac`
- Run scripts 01–09 in order; run 10 (`teardown.sh`) only on failure or for a deliberate reset.
- Acceptance: `oc --kubeconfig ./creds/kubeconfig get nodes` shows 6 nodes `Ready` (3 masters + 3 workers; bootstrap already gone).
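The acceptance step reduces to counting `Ready` nodes; a testable sketch (the `count_ready` helper is illustrative):

```shell
# Sketch: count lines whose STATUS column is exactly "Ready" in
# `oc get nodes --no-headers` output read from stdin.
count_ready() {
  awk '$2 == "Ready"' | wc -l
}

# Usage against the real cluster:
# n=$(oc --kubeconfig ./creds/kubeconfig get nodes --no-headers | count_ready)
# [ "${n}" -eq 6 ] && echo "acceptance passed"
```

Matching column 2 exactly deliberately excludes `NotReady` and `Ready,SchedulingDisabled` nodes from the count.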
## What this plan intentionally does NOT include
- OCP operators/workloads (Compliance, ACS, OTel, WSO2, etc.) — those belong to Phase 2 issues, deployed after cluster is up.
- ODF storage on OCP — separate issue; runs on workers after install.
- CI/CD integration (GitLab/Jenkins HA) — separate issue; not dependent on OCP install.
## Related memory (what to read for more detail)

- `memory/reference_assisted_installer_api.md` — REST endpoint details, payload shapes
- `memory/project_vm_network_plan.md` — MAC + IP + VM table
- `memory/project_dns_lb_architecture.md` — DNS + HAProxy roles
- `memory/feedback_vm_ssh_keys.md` — pubkeys to inject into every VM
Next action: open an execution session on branch `infra/ocp-assisted-installer`, read this plan plus the referenced memory, and implement scripts 01–10. Estimated: 4–6 hours to a working install, including the Tang server.