
Phase 1: Staging Proof of Concept - High-Level Plan

Table of Contents

  1. Current Architecture Analysis
  2. Target Helm Architecture
  3. Phase 1.1: Infrastructure Foundation
  4. Phase 1.2: Application Migration
  5. Phase 1.3: GitOps Integration
  6. Phase 1.4: Validation & Testing
  7. Technical Specifications
  8. Testing Strategy
  9. Success Criteria

Current Architecture Analysis

Service Dependencies Overview

Infrastructure Services (Persistent Data):

  • app-database: MySQL for primary application data.
  • audit-database: MongoDB for audit logs and caching.
  • rabbitmq: Message broker for asynchronous tasks.
  • mailhog: Development-only email capture service.

Application Services (Frequently Updated):

  • app: The main PHP/Apache web application.
  • scheduler: A Node.js service for processing background jobs.
  • websocket: A Node.js server for real-time client communication.

Key Dependencies Identified

  • Database Connections: The app, scheduler, and websocket services connect to the MySQL and MongoDB databases.
  • Message Queue: The app service produces tasks consumed by the scheduler and websocket services via RabbitMQ.
  • Service Communication: The app service communicates directly with the scheduler and websocket services via internal APIs.
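
The dependencies listed above can be written down as a small graph to check that a safe start-up order exists. The sketch below is illustrative only: it assumes that "communicates directly with" means the app service needs scheduler and websocket to be reachable first, and it leaves out mailhog because the overview does not say which services use it. It needs Python 3.9 or later for graphlib.

```python
from graphlib import TopologicalSorter

# Each service maps to the set of services it depends on.
# Edge directions are an interpretation of the dependency list above.
DEPENDENCIES = {
    "app-database": set(),
    "audit-database": set(),
    "rabbitmq": set(),
    "scheduler": {"app-database", "audit-database", "rabbitmq"},
    "websocket": {"app-database", "audit-database", "rabbitmq"},
    "app": {"app-database", "audit-database", "rabbitmq", "scheduler", "websocket"},
}


def startup_order(dependencies):
    """Return services ordered so each one comes after everything it depends on."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    print(" -> ".join(startup_order(DEPENDENCIES)))
```

Any order produced this way starts the three infrastructure services before the application services, which matches the split into two umbrella charts.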

Target Helm Architecture

Chart Structure

All Helm charts and Kubernetes manifests will be developed within the .mg-build-system submodule established in Phase 0.

.mg-build-system/
└── charts/
    ├── infrastructure/          # Umbrella chart for persistent services
    │   ├── mysql/
    │   ├── mongodb/
    │   └── rabbitmq/
    └── applications/           # Umbrella chart for application services
        ├── app/
        ├── scheduler/
        └── websocket/

Deployment Strategy

  1. Infrastructure First: Deploy persistent services using the infrastructure chart to establish a stable data layer.
  2. Application Layer: Deploy application services that connect to the infrastructure endpoints.
  3. Service Discovery: Utilise Kubernetes internal DNS for all service-to-service communication.
  4. Configuration: Leverage environment-specific values.yaml files for review, staging, and production environments, managed as secrets within the CI/CD system.
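
As an illustration of the service discovery step, the sketch below builds the in-cluster DNS names that application services would use to reach the infrastructure services. Kubernetes resolves a Service at `<service>.<namespace>.svc.<cluster-domain>`, with `cluster.local` as the usual default domain. The Service names (taken from the chart directory names), the `staging` namespace, and the use of each product's default port are assumptions; the real names depend on how the charts are templated and released.

```python
# Default ports for each product; a chart may expose different ones.
DEFAULT_PORTS = {"mysql": 3306, "mongodb": 27017, "rabbitmq": 5672}


def service_dns(service, namespace, cluster_domain="cluster.local"):
    """Return the fully qualified in-cluster DNS name of a Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def infrastructure_endpoints(namespace):
    """Return a host:port endpoint for each (hypothetical) infrastructure Service."""
    return {
        name: f"{service_dns(name, namespace)}:{port}"
        for name, port in DEFAULT_PORTS.items()
    }


if __name__ == "__main__":
    for name, endpoint in infrastructure_endpoints("staging").items():
        print(f"{name}: {endpoint}")
```

Within the same namespace the short name (for example `rabbitmq`) also resolves, so the fully qualified form matters mainly when services live in different namespaces.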

Phase 1.1: Infrastructure Foundation

This phase focuses on creating robust, version-controlled Helm charts for the stateful services that form the application's backbone.

  • Key Tasks:
      • Assess the target Kubernetes cluster and install the required cluster add-ons (e.g., the ingress-nginx controller and cert-manager).
      • Develop Helm charts for MySQL, MongoDB, and RabbitMQ supporting persistence, configuration, and monitoring.
      • Integrate New Relic monitoring for all infrastructure components.
  • Deliverables:
      • A version-controlled infrastructure umbrella chart within the .mg-build-system submodule.
      • Documented procedures for backup and restore.
      • A stable data layer deployed to the staging environment.

Phase 1.2: Application Migration

This phase focuses on containerising the core applications and creating their corresponding Helm charts to run on the new infrastructure.

  • Key Tasks:
      • Create optimised, multi-stage Dockerfiles for the app, scheduler, and websocket services.
      • Develop Helm charts for each application, including support for health checks, ingress, and environment configuration.
      • Configure inter-service communication using Kubernetes service names.
  • Deliverables:
      • Production-ready container images for each application, stored in a central registry.
      • A version-controlled applications umbrella chart within the .mg-build-system submodule.
      • A fully assembled application stack running in the staging environment.

Phase 1.3: GitOps Integration

This phase implements the automated deployment workflow defined in Phase 0, connecting Git events to deployments in the Kubernetes cluster.

  • Key Tasks:
      • Configure a container registry and integrate it with the CI/CD pipeline.
      • Create CI/CD workflows to build and tag container images based on semantic versioning.
      • Develop deployment scripts for DeployHQ that execute helm upgrade --install commands.
      • Automate deployments to review and staging environments based on branch pushes.
      • Implement a manual approval gate for tagged production releases.
  • Deliverables:
      • A fully automated build-and-push pipeline for all services.
      • A GitOps-driven deployment process capable of managing multiple concurrent environments.
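
The mapping from Git events to deployments could be sketched roughly as follows. Everything specific here is an assumption made for illustration: that the `main` branch deploys to staging, that every other branch gets its own `review-<branch>` namespace, that semantic-version tags such as `v1.4.0` target production behind a manual approval, that values files are named `values-<environment>.yaml`, and that the charts read the image tag from an `image.tag` value. Only the `helm upgrade --install` flags themselves are standard Helm options.

```python
import re
import shlex

SEMVER_TAG = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def slugify(branch):
    """Turn a branch name into a string that is safe inside a namespace name."""
    slug = re.sub(r"[^a-z0-9]+", "-", branch.lower()).strip("-")
    return slug[:50].rstrip("-")  # keeps "review-<slug>" under the 63-character limit


def resolve_target(ref):
    """Map a Git ref to an environment, a namespace, and an approval requirement."""
    if ref.startswith("refs/tags/"):
        tag = ref[len("refs/tags/"):]
        if not SEMVER_TAG.match(tag):
            raise ValueError(f"not a semantic-version tag: {tag}")
        return {"environment": "production", "namespace": "production", "needs_approval": True}
    if ref.startswith("refs/heads/"):
        branch = ref[len("refs/heads/"):]
        if branch == "main":  # hypothetical staging branch
            return {"environment": "staging", "namespace": "staging", "needs_approval": False}
        return {"environment": "review", "namespace": f"review-{slugify(branch)}", "needs_approval": False}
    raise ValueError(f"unsupported ref: {ref}")


def helm_command(release, chart, target, image_tag):
    """Build the helm upgrade --install command line for one release."""
    args = [
        "helm", "upgrade", "--install", release, chart,
        "--namespace", target["namespace"], "--create-namespace",
        "--values", f"values-{target['environment']}.yaml",
        "--set", f"image.tag={image_tag}",  # value key depends on the chart
        "--wait",
    ]
    return shlex.join(args)


if __name__ == "__main__":
    target = resolve_target("refs/heads/feature/login-form")
    print(helm_command("applications", "charts/applications", target, "1.4.0-rc.1"))
```

Giving each review branch its own namespace is one way to keep several concurrent environments from colliding; a release-name prefix in a shared namespace would be an alternative.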

Phase 1.4: Validation & Testing

This phase ensures the new Helm-based deployment is stable, performant, and secure before being considered production-ready.

  • Key Tasks:
      • Conduct end-to-end functional testing on the new staging environment.
      • Perform load testing to establish performance benchmarks and identify bottlenecks.
      • Execute security hardening, including network policy implementation and container scanning.
      • Finalise monitoring dashboards and alerting in New Relic.
  • Deliverables:
      • A comprehensive test report confirming functional parity.
      • Performance benchmarks and resource optimisation recommendations.
      • A security compliance report.
      • A fully documented, production-ready staging environment.

Technical Specifications

Container Images

  • Base Images: Use official Alpine-based images for smaller footprint
  • Multi-stage Builds: Separate build and runtime environments
  • Security: Regular vulnerability scanning and updates
  • Size Optimisation: Remove development dependencies in production images

Persistent Storage

  • Storage Classes: Use SSD-backed storage for databases
  • Backup Strategy: Daily automated backups with retention policy
  • Volume Expansion: Support for storage growth
  • Cross-AZ Replication: For high availability
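
The retention side of the backup strategy can be illustrated with a small pruning rule. The plan does not state a retention period, so the 14 days used below is purely a placeholder, and the function only decides which backups are old enough to delete; it does not touch any storage.

```python
from datetime import datetime, timedelta, timezone


def backups_to_delete(backup_times, now, retention_days):
    """Return the backup timestamps that are older than the retention window."""
    cutoff = now - timedelta(days=retention_days)
    return sorted(t for t in backup_times if t < cutoff)


if __name__ == "__main__":
    now = datetime(2024, 1, 21, tzinfo=timezone.utc)
    # Twenty daily backups, the newest taken "now".
    daily = [now - timedelta(days=i) for i in range(20)]
    for stamp in backups_to_delete(daily, now, retention_days=14):  # placeholder period
        print("expired:", stamp.date())
```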

Service Communication

  • Internal DNS: Use Kubernetes service names for discovery
  • Health Checks: Implement liveness and readiness probes
  • Circuit Breakers: For resilient service communication
  • TLS: Encrypt inter-service communication where sensitive
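
To make the circuit breaker item concrete, here is a minimal sketch of the pattern: after a number of consecutive failures the breaker "opens" and rejects calls immediately, then allows a single trial call once a timeout has passed. The class name, thresholds, and timeout are illustrative and not taken from the plan; in practice this behaviour might come from a client library or a service mesh rather than hand-written code.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; call skipped")
            # Timeout elapsed: fall through and allow one trial call (half-open).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed trial call, or too many consecutive failures, (re)opens the circuit.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Any success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

A caller would wrap each outbound request, for example `breaker.call(fetch_status)`, where `fetch_status` is whatever function performs the request, and treat `CircuitOpenError` as "dependency unavailable, fail fast".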

Environment Configuration

  • ConfigMaps: Non-sensitive configuration data
  • Secrets: Database passwords, API keys, certificates
  • Environment Variables: Service endpoints and feature flags
  • Values Files: Environment-specific Helm value overrides
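
The way environment-specific value overrides layer on top of chart defaults can be modelled with a recursive merge: nested maps are merged key by key, while scalars and lists in the override replace the default outright. The sketch below mimics that behaviour for illustration; it is not Helm's own code, it ignores Helm's special handling of null values, and the keys shown are invented examples.

```python
import copy


def merge_values(base, override):
    """Return base with override layered on top, merging nested maps recursively."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


if __name__ == "__main__":
    # Hypothetical chart defaults and staging overrides.
    defaults = {"replicaCount": 1, "image": {"repository": "registry.example/app", "tag": "latest"}}
    staging = {"replicaCount": 2, "image": {"tag": "1.4.0"}}
    print(merge_values(defaults, staging))
```

With these inputs the staging result keeps the default image repository but takes the replica count and image tag from the override.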

Testing Strategy

Unit Testing

  • Maintain existing test suites
  • Add container-specific tests
  • Test Helm template rendering
  • Validate configuration generation

Integration Testing

  • Database connectivity tests
  • Message queue communication
  • Service discovery validation
  • API endpoint testing

Performance Testing

  • Load testing with ApacheBench (ab) or Apache JMeter
  • Database performance under concurrent load
  • WebSocket connection scaling
  • Resource usage monitoring

Security Testing

  • Container image vulnerability scanning
  • Network policy validation
  • RBAC permission testing
  • Secrets management verification

Success Criteria

Technical Success Metrics

  • Deployment Time: Application deployments complete in < 2 minutes
  • Uptime: 99.9% availability during testing period
  • Performance: Response times within 10% of Docker Compose baseline
  • Resource Efficiency: 30% reduction in overall resource usage
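
Two of these targets translate directly into arithmetic. A 99.9% availability target leaves 0.1% of the testing period as allowable downtime, which is about 43.2 minutes over 30 days; the length of the testing period is not stated in the plan, so it is a parameter below. The response-time check reads "within 10%" as "no more than 10% slower than the baseline", which is an interpretation rather than something the plan spells out.

```python
def allowed_downtime_minutes(period_days, availability=0.999):
    """Return the downtime budget, in minutes, for a period at a given availability."""
    return period_days * 24 * 60 * (1 - availability)


def within_baseline(measured_ms, baseline_ms, tolerance=0.10):
    """Return True if the measured response time is at most tolerance slower than baseline."""
    return measured_ms <= baseline_ms * (1 + tolerance)


if __name__ == "__main__":
    print(f"30-day downtime budget: {allowed_downtime_minutes(30):.1f} minutes")
    print("109 ms vs 100 ms baseline:", within_baseline(109, 100))
    print("125 ms vs 100 ms baseline:", within_baseline(125, 100))
```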

Operational Success Metrics

  • Recovery Time: < 15 minutes MTTR for common issues
  • Deployment Success: > 95% of deployments complete successfully
  • Multi-Branch: 3+ concurrent feature branch environments
  • Monitoring: Full New Relic observability of all services

Business Success Metrics

  • Developer Velocity: Reduced time from code to staging
  • Feature Testing: Faster validation of new features
  • Environment Parity: Staging accurately represents production
  • Risk Reduction: Proven deployment process for production

Ready for Phase 2 Criteria

  • [ ] All services running stably
  • [ ] Successful load testing at production scale
  • [ ] Complete New Relic monitoring and alerting coverage
  • [ ] Documentation and runbooks completed
  • [ ] DeployHQ integration fully functional
  • [ ] Multi-branch testing proven
  • [ ] Security audit passed
  • [ ] Performance benchmarks met
  • [ ] Team training completed
  • [ ] Go-live approval obtained

Next Steps

Upon successful completion of Phase 1:

  1. Production Planning: Begin Phase 2 detailed planning
  2. Regional Architecture: Design multi-region deployment strategy
  3. Security Review: Conduct comprehensive security audit
  4. Performance Optimisation: Fine-tune resource allocation
  5. Documentation: Complete operational handbooks
  6. Training: Prepare team for production deployment

Key Milestone: Production-ready staging environment with proven Helm charts and CI/CD integration
