Team & Execution Guide

Team structure, expertise requirements, coordination patterns, and knowledge transfer for a smooth 6-day execution.


Team structure

Role | Responsibilities | Skills required | Time commitment
Infrastructure Lead | OpenShift provisioning, IaC | Terraform, OpenShift, networking | Days 1-3 full · Days 4-6 support
Platform Lead | OTel stack, SigNoz, Kafka, Redis | Kubernetes, observability, databases | Days 2-5 full
DevOps Lead | GitLab, Jenkins, ArgoCD, Nexus | CI/CD, container registries, Git | Days 1-2 full · Days 5-6 full
Integration Lead | WSO2 APIM + IS, SSO | WSO2, auth protocols, API gateways | Days 2-4
Security Lead | Compliance scanning, ACS, Trivy | K8s security, compliance, scanning | Days 2-3 · Day 5
Project Lead | Coordination, BRAC comms, risk mgmt | Leadership, communication | All 6 days

If you're solo

Working alone is high risk. Strategy:

Phase | Action
Day 0 | Pre-stage everything possible
Days 1-2 | Critical path only: OpenShift + Kafka + OTel
Days 3-4 | WSO2 + middleware; start Phase 3 prep
Days 5-6 | Finish Phase 3, demo, report

Solo risks

  • Single point of failure — you get sick, POC stops
  • Context switching between components is slow
  • No peer review / sanity checks
  • Knowledge transfer is near-impossible (only you know how it works)

Solo mitigations: document heavily as you build · record yourself explaining components · automate everything (no manual steps) · line up an on-call backup (friend/colleague).


Expertise requirements

Must-have skills for everyone

  • Git (branching, commits, PRs) — ~70% of collaboration
  • Bash scripting for deployments
  • Kubernetes basics (kubectl)
  • Reading error logs

Per-component skill matrix

Component | Required skill | Nice-to-have
OpenShift | openshift-install, oc CLI, VM provisioning | IaaS automation
Kafka | Kubernetes, distributed-systems concepts | KRaft mode, Kafka ops
Redis | Kubernetes, caching patterns | Sentinel, failover concepts
GitLab/Jenkins | CI/CD pipeline design, container registries | Git workflows, plugin config
OTel/SigNoz | Observability concepts, time-series DBs | APM, tracing, metrics
WSO2 | API gateway concepts, SAML/OIDC | OAuth2, SSO configuration
NGINX/Liberty | Load balancing, app servers | Canary deployment, routing
Trivy | Container security scanning | Supply-chain security, SBOM
JBoss | App server concepts | Domain mode, JNDI, datasources

If the team lacks expertise

  1. Pre-POC: 2-3h demo session per tool
  2. During: pair an experienced person with a learner
  3. Always: whoever deploys also documents for the team
  4. Recorded walkthroughs: screen-record yourself, share with the team

Example — nobody knows WSO2:

  • Day 0: watch a 30-min WSO2 tutorial
  • Day 2: pair-program, using WSO2 community resources for help
  • Day 2 evening: record a walkthrough of what you learned
  • Team review: confirm everyone understands the setup

Team communication

Daily standup (15 min, 09:00)

Three questions per person:

  1. What did I complete yesterday?
  2. What am I working on today?
  3. What's blocking me?

Template:

Infrastructure Lead
Today: OpenShift provisioning (8h, 60% done)
Tomorrow: Finish OpenShift (4h), validate storage (1h)
Blocker: None

Platform Lead
Today: Kafka KRaft setup (2h), started OTel collector (1h)
Tomorrow: OTel collector (3h), SigNoz (2h)
Blocker: Need Docker images for OTel, downloading overnight

Rule: if you report a blocker, propose a solution too; don't just state the problem.

Escalation protocol (stuck > 30 min)

Step | Action | Time budget
1 | Slack/Chat: post the problem + what you've tried | 5 min
2 | Buddy review: another team member takes a look | 15 min
3 | Pair program (screen share) | 30 min
4 | Skip the component, document as "blocked on X", move on |

Never block the whole team

Keep forward momentum. A blocked component gets parked; others keep shipping.
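
The escalation ladder can be written down as a small lookup, which is handy if you want a chat bot or timer to nudge people. This is only a sketch: it assumes the time budgets are consumed one after another once the 30-minute "stuck" threshold has passed, and the names `ESCALATION_STEPS` and `escalation_step` are made up for illustration.

```python
# Steps 1-3 of the escalation protocol with their time budgets in minutes.
ESCALATION_STEPS = [
    ("Post the problem and what you've tried in chat", 5),
    ("Buddy review by another team member", 15),
    ("Pair program over screen share", 30),
]
FINAL_STEP = 'Skip the component, document as "blocked on X", move on'
STUCK_THRESHOLD_MIN = 30  # escalation starts once someone is stuck longer than this


def escalation_step(minutes_stuck: int) -> str:
    """Return the action that applies after `minutes_stuck` minutes on one problem.

    Assumption: budgets are sequential, so step 2 begins when step 1's
    5 minutes are used up, and so on.
    """
    if minutes_stuck <= STUCK_THRESHOLD_MIN:
        return "Keep working; not yet past the 30-minute threshold"
    elapsed = minutes_stuck - STUCK_THRESHOLD_MIN
    for action, budget in ESCALATION_STEPS:
        if elapsed <= budget:
            return action
        elapsed -= budget
    return FINAL_STEP


if __name__ == "__main__":
    for minutes in (20, 33, 45, 60, 90):
        print(f"{minutes:>3} min stuck -> {escalation_step(minutes)}")
```

Under these assumptions a problem is parked at most 80 minutes after it first appeared (30 + 5 + 15 + 30), which keeps one blocked component from stalling the day.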


Knowledge transfer

During execution (continuous)

Every build session should produce documentation. Examples:

Day 1 — Infrastructure Lead creates docs/OPENSHIFT-SETUP.md:

  • What the installer does
  • How to debug provisioning
  • How to access the cluster
  • What worked, what didn't

Day 2 — Platform Lead creates docs/OBSERVABILITY-SETUP.md:

  • How OTel Collector sends to Kafka
  • How SigNoz queries ClickHouse
  • Troubleshooting: traces missing? Check these 5 things
  • How to scale OTel (replicas)

For every component: whoever deploys it also documents it, as they go.

Post-POC knowledge transfer (Day 6 evening + after)

Recorded walkthroughs (~10-15 min each):

  • Infrastructure Lead: "How the OpenShift cluster works"
  • Platform Lead: "How the OTel pipeline flows"
  • DevOps Lead: "How to deploy new apps via ArgoCD"

Wiki entry per component:

  • How to access it (URL / credentials)
  • How to deploy it (commands)
  • How to troubleshoot it (5-10 common issues)
  • How to scale it (replicas? memory?)
  • How to extend it (custom dashboards, new policies)

One-page runbooks:

  • How to restart a component
  • How to restore from backup
  • How to handle a component failure
  • How to monitor for issues

Coordination & dependencies

Phase 1 — fully parallelizable

All 4 issues run concurrently on Day 1:

Issue | Owner | Duration
OpenShift | Infrastructure Lead | 8h
GitLab | DevOps Lead | 3h
Kafka | Platform Lead | 2h
Redis | Platform Lead | 1.5h

No waiting. Action: everyone starts Day 1 morning.

Phase 2 — partially sequential

OpenShift must be ready before Phase 2 kicks off. Then:

Dependency | Duration | Notes
OpenShift ready (Day 1 EOD) | | Prerequisite
Compliance scan | 1h | Runs in background
OTel + SigNoz + ClickHouse | 4-5h | 🔴 Critical path
WSO2 APIM + IS | 2-3h | Start when OTel > 50% to avoid cluster overload

Infrastructure Lead watches cluster health during Phase 2. If resource exhaustion looms, Platform Lead reduces OTel replicas.
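
As a rough check on the Phase 2 timeline, the durations in the table can be combined into an elapsed-time estimate. The sketch below assumes that all work starts once OpenShift is ready, that OTel progress is roughly linear in time (so "OTel > 50%" is reached halfway through its duration), and that the compliance scan runs fully in parallel. The function name and parameters are illustrative, not part of any tool.

```python
def phase2_elapsed_hours(otel_h: float, wso2_h: float,
                         scan_h: float = 1.0, otel_gate: float = 0.5) -> float:
    """Estimate Phase 2 wall-clock hours from the dependency table.

    Assumption: WSO2 starts when OTel is `otel_gate` complete, and OTel
    progress is linear, so that point is reached at otel_gate * otel_h.
    """
    wso2_start = otel_gate * otel_h
    # Phase 2 ends when the slowest of the three parallel tracks finishes.
    return float(max(otel_h, wso2_start + wso2_h, scan_h))


best = phase2_elapsed_hours(otel_h=4, wso2_h=2)   # 4.0
worst = phase2_elapsed_hours(otel_h=5, wso2_h=3)  # 5.5

if __name__ == "__main__":
    print(f"best case: {best}h, worst case: {worst}h")
```

With the upper-bound durations, WSO2 finishes half an hour after OTel, so under these assumptions the delayed WSO2 start can extend the critical path slightly beyond the OTel work itself.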

Phase 3 — parallel again

All Phase 3 components are independent. All start Day 5 morning.

Issue | Owner | Duration
Trivy | Security Lead | 1-2h
ArgoCD | DevOps Lead | 1.5-2h
Nexus | DevOps Lead | 1-1.5h
JBoss | Middleware Lead | 1-2h

Common execution mistakes (and fixes)

# | Mistake | Reality | Fix
1 | "I'll document after" | Never happens: team forgets, post-POC becomes painful | Document as you build. 5 min/day = 30 min total.
2 | "Just do it my way, I'll explain later" | Inconsistency, confusion, rework | 2-min huddle on approach before starting a component
3 | Working in silos | Multiple people hit the same problem separately | Daily standup, pair-program for blockers, shared chat
4 | Skipping tests to save time | Broken deployment → 4-hour debug → lost timeline | Always validate before moving on (each issue has DoD)
5 | "We'll fix it later" | "Later" never comes; tech debt ships | Fix immediately · or document as limitation
6 | Ignoring resource constraints | Cluster OOM → evictions → cascading failures | Daily kubectl top nodes; escalate at > 80% memory
7 | Solo-debugging complicated issues | 2 hours wasted solo vs. 20 min with two brains | If stuck > 30 min, call for pair programming
8 | Feature creep | "Let me add one more thing": 1h each, 6h lost | Scope lock enforced; new ideas → Phase 2 list
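
The daily memory check from mistake 6 is easy to script. The sketch below parses the text that `kubectl top nodes` prints and lists nodes above the 80% threshold. It assumes the usual column header containing `MEMORY%`; the node names and figures in `SAMPLE` are invented for illustration, and in practice you would feed in real command output instead.

```python
def nodes_over_memory_threshold(top_output: str,
                                threshold_pct: int = 80) -> list[tuple[str, int]]:
    """Return (node, memory %) pairs above the threshold.

    Expects the plain-text table printed by `kubectl top nodes`, with a
    header row that includes a MEMORY% column.
    """
    lines = [ln for ln in top_output.strip().splitlines() if ln.strip()]
    if not lines:
        return []
    mem_col = lines[0].split().index("MEMORY%")  # locate the column by name
    flagged = []
    for line in lines[1:]:
        fields = line.split()
        if len(fields) <= mem_col:
            continue  # malformed or truncated row
        value = fields[mem_col].rstrip("%")
        if not value.isdigit():
            continue  # metrics not available for this node
        pct = int(value)
        if pct > threshold_pct:
            flagged.append((fields[0], pct))
    return flagged


# Invented sample output, for illustration only.
SAMPLE = """\
NAME       CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
master-0   812m         20%    9830Mi          62%
worker-0   1530m        38%    13312Mi         84%
worker-1   640m         16%    12900Mi         81%
"""

if __name__ == "__main__":
    for node, pct in nodes_over_memory_threshold(SAMPLE):
        print(f"ESCALATE: {node} at {pct}% memory")
```

Looking the column up by header name rather than by fixed position keeps the parser working if extra columns appear in the output.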

Phase handoffs

Phase 1 → Phase 2 handoff

Who: Infrastructure Lead → Platform Lead
When: End of Day 1, 17:00
Duration: 30 min

Infrastructure Lead presents:

  • How the cluster was set up
  • Where the kubeconfig is
  • How to access the console
  • Known issues / quirks
  • How to scale nodes if needed

Platform Lead asks:

  • Can I deploy 20 pods without issues?
  • How much free memory do we have?
  • Any known problems?
  • How do I get to the cluster if something breaks?

Deliverables: cluster info document · kubeconfig file · access credentials · infrastructure code with docs.

Phase 2 → Phase 3 handoff

Who: Platform Lead + DevOps Lead → Phase 3 team
When: End of Day 4, 17:00
Duration: 30 min

Platform Lead presents:

  • OTel pipeline setup
  • Kafka topology
  • ClickHouse queries
  • Common bottlenecks

DevOps Lead presents:

  • GitLab / Jenkins setup
  • Container registry access
  • How to push images
  • How to trigger CI/CD

Morale & burnout prevention

6 days is intense. Actively prevent burnout:

Daily:

  • 15-min standup (keeps everyone aligned, breaks isolation)
  • Pair-program for blockers (less frustrating than solo debugging)
  • Celebrate small wins — "OTel traces working!" 🎉

Days 3-4:

  • Take a real break — 2h lunch, walk outside
  • If ahead of schedule: ease off, don't over-invest
  • Socialize: team lunch or dinner

Day 6:

  • Wrap up calmly, don't panic
  • Demo is not your grade — you did your best
  • Celebrate completion: team dinner after POC

If someone burns out

  • Redistribute work immediately
  • No shame
  • Say: "We can shift your tasks to others. Rest — you're valuable."

Escalation for team issues

Team member overwhelmed

  1. Person tells Project Lead
  2. Project Lead asks: "What's overloading you?"
  3. Immediate action: shift tasks to others or reduce scope
  4. Don't wait until they break

Two members disagree on approach

  1. Quick 15-min discussion
  2. If deadlocked: Project Lead decides (not by vote — by timeline/risk)
  3. Decision made: everyone commits (no undermining)
  4. Learn from it post-POC

Team member not delivering

  1. Private conversation: "Here's what I'm seeing. What's going on?"
  2. Understand the blocker (stuck? unclear? task too hard?)
  3. Adjust: more support, different task, or escalate if needed
  4. 24h to improve

Per-role Definition of Done

Infrastructure Lead

  • OpenShift cluster: 3-node, all operators healthy
  • Infrastructure code clean, documented, in Git
  • Cluster monitoring set up
  • Any resource issues escalated to team
  • Setup documented in docs/

Platform Lead

  • OTel pipeline flowing data
  • SigNoz accessible with traces visible
  • Kafka topics created and validated
  • Redis HA tested with failover
  • Documented in docs/

DevOps Lead

  • GitLab + Jenkins running
  • Sample pipeline triggered and working
  • ArgoCD syncing apps
  • Nexus repos accessible
  • Documented in docs/

Integration Lead

  • WSO2 APIM deployed
  • At least one SSO method (SAML or OIDC) working
  • Rate-limiting policy applied
  • Documented in docs/

Security Lead

  • Compliance scan completed and report generated
  • Trivy dashboard running
  • SBOM generated
  • Documented in docs/

Project Lead

  • All issues tracked and progressing
  • Daily standups happening
  • BRAC updates sent daily
  • Blockers escalated within 2 hours
  • Demo scheduled and prepared
  • All documentation complete

Created: 2026-04-24 · Owner: Project Lead · Next step: recruit team, assign roles, kickoff training