
Senior Cloud Engineer
Who we are
Moniepoint is an all-in-one financial services platform for emerging markets and the second-fastest-growing company in Africa. Since 2019, Moniepoint’s technology has powered over 3 million people, offering personal and business banking, payment, credit, and business management tools to help them succeed. Moniepoint processed $182 billion in 2023 and currently processes the majority of the POS transactions in Nigeria.
About the role
Engineering at Moniepoint is an inspired, customer-focused community dedicated to crafting solutions that redefine our industry. Our infrastructure runs on some of the cool tools that excite infrastructure engineers, such as Kubernetes and Docker.
We also make business decisions based on the large stream of data we receive daily, so we work daily with big data, perform data analytics and build models to make sense of the noise and give our customers the best experience.
Curious about what makes Moniepoint an incredible place to work?
Check out posts on how we cultivate a culture of innovation, teamwork, and growth.
Position Overview
We are seeking an experienced Cloud Engineer to design, implement, and manage our multi-cloud infrastructure. The ideal candidate will have deep expertise in cloud platforms, container orchestration, infrastructure automation, CI/CD pipelines, and observability solutions, ensuring scalable, reliable, and cost-effective cloud operations across multiple cloud providers.
Principal Duties and Responsibilities
Cloud Infrastructure Management
- Design, deploy, and manage multi-cloud infrastructure across Google Cloud Platform (GCP), Amazon Web Services (AWS), Azure, and Oracle Cloud Infrastructure (OCI)
- Architect and implement highly available, fault-tolerant, and scalable cloud solutions
- Manage cloud resources including compute instances and networking components
- Design and implement disaster recovery and business continuity plans for cloud workloads
- Migrate on-premises applications and services to cloud environments with minimal disruption
- Optimize cloud resource utilization and implement auto-scaling policies
- Maintain comprehensive documentation of cloud architectures, configurations, and runbooks
Kubernetes & Container Orchestration
- Design, deploy, and manage production-grade Kubernetes clusters across multiple cloud providers
- Implement and maintain container orchestration strategies for microservices architectures
- Configure and manage Kubernetes resource objects
- Manage Kubernetes cluster upgrades, scaling, and performance optimization
- Troubleshoot complex container and orchestration issues in production environments
- Implement multi-cluster and multi-region Kubernetes deployments for high availability
Service Mesh & Advanced Networking
- Design, deploy, and manage Istio service mesh for microservices communication and observability
- Configure Istio traffic management, including virtual services, destination rules, and gateways
- Implement advanced traffic routing (canary deployments, A/B testing, traffic splitting) using Istio
- Deploy and manage Istio observability components (telemetry, distributed tracing, service graphs)
- Implement circuit breaking, retries, timeouts, and fault injection for resilience testing
- Configure Istio ingress and egress gateways for external traffic management
- Monitor and optimize service mesh performance and resource utilization
- Implement multi-cluster service mesh architectures across different cloud providers
Reverse Proxy & Load Balancing
- Deploy, configure, and manage HAProxy for high-performance load balancing and reverse proxying
- Implement HAProxy ACLs, backend routing, health checks, and session persistence
- Design and implement Nginx as a reverse proxy for web applications and API gateways
- Configure Nginx for rate limiting and request filtering
- Implement Nginx load balancing algorithms and upstream health monitoring
- Manage Nginx Plus features for advanced traffic management and monitoring
- Optimize HAProxy and Nginx performance for high-throughput environments
Infrastructure as Code & Configuration Management
- Develop and maintain infrastructure as code using Terraform
- Create reusable, modular Terraform configurations for various cloud resources
- Implement Terraform state management and remote backends
- Design and implement configuration management solutions using Ansible
- Develop Ansible playbooks and roles for automated server provisioning and configuration
- Integrate Terraform and Ansible workflows for end-to-end infrastructure automation
- Implement infrastructure version control, code review processes, and GitOps practices
- Manage infrastructure drift detection and remediation
- Create and maintain infrastructure documentation and architecture diagrams
- Implement policy-as-code using tools like OPA (Open Policy Agent) or Sentinel
CI/CD Pipeline Management
- Design, implement, and maintain continuous integration pipelines using Jenkins and Harness
- Optimize build times and pipeline efficiency
- Integrate security scanning (SAST, DAST, container scanning) into CI/CD pipelines
- Configure Jenkins jobs, pipelines, and shared libraries for automated builds
- Configure build agents, runners, and execution environments
- Implement Harness deployment pipelines for cloud-native applications
- Integrate CI/CD pipelines with version control systems (Git, GitHub, GitLab)
- Implement continuous deployment workflows using ArgoCD for Kubernetes-based applications
- Design and implement GitOps workflows with ArgoCD for declarative application delivery
- Manage ArgoCD application definitions, sync policies, and multi-cluster deployments
- Implement progressive delivery strategies (blue-green deployments, canary releases) using ArgoCD
Message Streaming & Event-Driven Architecture
- Deploy and manage Apache Kafka clusters for real-time data streaming and event-driven architectures
- Configure Kafka topics, partitions, replication factors, and retention policies
- Implement Kafka Connect for data integration with various sources and sinks
- Monitor Kafka cluster health, performance metrics, and consumer lag
- Optimize Kafka performance for high-throughput and low-latency use cases
- Troubleshoot Kafka producer and consumer issues
Database & Proxy Management
- Deploy, configure, and manage ProxySQL for MySQL load balancing and high availability
- Implement query routing, caching, and connection pooling strategies using ProxySQL
- Optimize database performance through ProxySQL query analysis and optimization
- Implement database failover and disaster recovery using ProxySQL
- Monitor ProxySQL metrics and troubleshoot connection and performance issues
- Integrate ProxySQL with database clusters and replication topologies
- Implement database access security and audit logging through ProxySQL
Cloud Networking
- Design and implement cloud networking architectures, including VPCs, subnets, and network segmentation
- Configure and manage cloud load balancers (Application Load Balancers, Network Load Balancers, Cloud Load Balancing)
- Implement VPN connections, Direct Connect/Interconnect, and hybrid cloud networking solutions
- Implement network security controls, including security groups, network ACLs, and firewall rules
- Implement network monitoring and traffic analysis
- Troubleshoot complex networking issues across multi-cloud environments
- Design and implement private connectivity between cloud providers
Secrets Management & Security
- Configure and manage HashiCorp Vault for centralized secrets management across multi-cloud environments
- Configure Vault secret engines (KV, database, PKI, AWS, GCP, Azure dynamic secrets)
- Manage Vault high availability clusters and disaster recovery procedures
- Implement dynamic database credentials and secret rotation strategies
- Manage Vault encryption as a service for application-level encryption
- Implement Vault agent and sidecar injectors for Kubernetes workloads
- Migrate secrets from legacy systems to Vault
Qualifications, Competency & Skills Required
Education & Experience
- Bachelor's degree or diploma in Computer Science, Information Technology, Engineering, or a related field
- Minimum of 5 years of proven experience in cloud engineering, DevOps, or platform engineering roles
- Hands-on experience managing production workloads across multiple cloud platforms
- Relevant cloud and technology certifications are highly desirable
Technical Skills
Cloud Platforms (Required)
- Google Cloud Platform (GCP): Deep expertise in Compute Engine, GKE, Cloud Storage, Cloud SQL, VPC, Cloud Functions, Cloud Run, IAM
- Amazon Web Services (AWS): Proficiency in EC2, EKS, S3, RDS, VPC, Lambda, ECS, CloudFormation, IAM
- Microsoft Azure: Experience with Virtual Machines, AKS, Blob Storage, Azure SQL, Virtual Networks, Azure Functions, ARM templates
- Oracle Cloud Infrastructure (OCI): Familiarity with Compute, OKE, Object Storage, networking, and OCI-specific services
- Multi-cloud architecture design and implementation experience
- Cloud migration strategies and execution (lift-and-shift, re-platforming, re-architecting)
Container & Orchestration (Required)
- Expert-level Kubernetes knowledge, including cluster architecture, networking, storage, and security
- Hands-on experience with managed Kubernetes services (GKE, EKS, AKS)
- Proficiency in Docker containerization, image optimization, and registry management
- Experience with Helm charts for application packaging and deployment
- Knowledge of container runtime environments (containerd, CRI-O)
Service Mesh & Microservices (Required)
- Istio: Deep expertise in Istio architecture, deployment, and operations
- Istio traffic management (virtual services, destination rules, gateways, service entries)
- Istio security features (mTLS, authorization policies, peer authentication, request authentication)
- Istio observability and telemetry configuration
- Multi-cluster and multi-mesh deployments
- Service mesh troubleshooting and performance optimization
- Understanding of sidecar proxy patterns and Envoy proxy
- Experience with other service mesh solutions (Linkerd, Consul Connect) is a plus
Reverse Proxy & Load Balancing (Required)
- HAProxy: Advanced configuration and management for load balancing and high availability
- Nginx: Expert-level configuration as a reverse proxy and API gateway
- Nginx rate limiting and performance tuning
- Nginx load balancing algorithms and upstream configurations
- Experience with Nginx modules and custom configurations
- High availability configurations using keepalived, VRRP, or similar
- Integration with Kubernetes ingress controllers (Nginx Ingress, Istio Ingress)
Infrastructure as Code (Required)
- Advanced Terraform skills for multi-cloud infrastructure provisioning
- Terraform module development, state management, and workspace strategies
- Proficiency in Ansible for configuration management and automation
- Ansible playbook development, roles, and inventory management
- Experience with version control systems (Git) and GitOps workflows
- Infrastructure testing frameworks (Terratest, Kitchen-Terraform)
CI/CD Tools (Required)
- Jenkins: Pipeline development (declarative and scripted), shared libraries, plugin management
- Harness: Deployment pipeline configuration, workflow creation, approval gates
- ArgoCD: GitOps workflows, application synchronization, multi-cluster management
- Integration of CI/CD tools with Kubernetes and cloud platforms
- Automated testing and deployment strategies
- Artifact repository management (Nexus, Artifactory, cloud-native registries)
Messaging & Streaming (Required)
- Apache Kafka architecture, cluster management, and operations
- Kafka topic design, partitioning strategies, and performance tuning
- Kafka Connect experience
- Experience with Kafka management tools (Kafka Manager, Cruise Control)
- Understanding of event-driven architectures and patterns
Database & Proxy Technologies (Required)
- ProxySQL configuration, management, and optimization
- MySQL database administration basics
- Understanding of database replication and clustering
Observability & Monitoring (Required)
- Prometheus metrics collection, PromQL, and alerting rules
- Grafana dashboard design and visualization techniques
- Log aggregation and analysis
Networking (Required)
- Deep understanding of TCP/IP, DNS, HTTP/HTTPS, and network protocols
- Cloud networking concepts (VPC, subnets, routing tables, NAT, VPN)
- Load balancing strategies and implementations
- Service discovery and DNS-based routing
- Network security and firewall configuration
- Software-defined networking (SDN) concepts
Scripting & Programming
- Proficient in scripting languages: Python, Bash
- Go or Python programming basics for tooling development
- YAML and JSON for configuration management
- Understanding of software development best practices
Secrets Management (Required)
- HashiCorp Vault: Advanced knowledge of Vault architecture, deployment, and operations
- Vault authentication methods and integration with cloud providers and Kubernetes
- Vault secret engines (KV v1/v2, database, transit, cloud dynamic...
What we can offer you
- Culture - We put our people first and prioritize the well-being of every team member. We’ve built a company where all opinions carry weight and where all voices are heard. We value and respect each other and always look out for one another. Above all, we are human.
- Learning - We have a learning and development-focused environment with an emphasis on knowledge sharing, training, and regular internal technical talks.
- Compensation - You’ll receive an attractive salary, pension, health insurance, annual bonus, plus other benefits.
What to expect in the hiring process
- A technical interview with the Hiring Manager
- A behavioural and technical interview with a member of the Executive team