
Sr Software Engineer (Kubernetes and Distributed Systems) Radian Arc (EMEA)
Location & work modality: EMEA / Remote
Start: Aug 2026
Type of Contract: Contractor or FTE
About Radian Arc
Radian Arc provides an infrastructure-as-a-service (IaaS) platform for running cloud gaming, artificial intelligence and machine learning applications inside telecommunication carrier networks. Our teams across the USA, Australia, Central Europe, Malaysia, Singapore and Japan offer telecom operators a GPU-based edge computing platform without the need for capital expenditure, facilitating low latency and improved economics for value-added services and the monetization of 5G investments.
What impact you will have
Design and develop the platform control plane responsible for managing GPU cloud infrastructure and distributed AI workloads.
This role focuses on building the core orchestration services that provision, manage, and coordinate compute, networking, and storage resources across global GPU clusters.
The Senior Software Engineer will design and implement distributed systems that power the platform’s control plane, enabling reliable orchestration of Kubernetes clusters, GPU workloads, and multi-tenant infrastructure. You will build APIs, services, and Kubernetes-native operators that automate infrastructure lifecycle management and provide the primitives required to run large-scale AI workloads across multiple regions.
This role works closely with platform, networking, storage, and infrastructure teams to ensure the control plane integrates seamlessly with the underlying GPU infrastructure, networking fabrics, and disaggregated storage systems.
The emphasis is on independently delivering major control-plane components, solving difficult distributed-systems and orchestration problems, and improving platform reliability and operability within the broader platform direction.
What you’ll do
Platform Control Plane Development
- Design and develop the platform control plane services responsible for managing GPU cloud infrastructure.
- Implement APIs and services that orchestrate compute, networking, and storage resources.
- Build distributed services responsible for cluster lifecycle management and infrastructure orchestration.
- Implement reliable state management systems for distributed infrastructure components.
Kubernetes Platform Integration
- Develop Kubernetes operators and controllers that automate platform infrastructure.
- Implement cluster lifecycle APIs responsible for:
○ Cluster provisioning,
○ Cluster upgrades,
○ Node lifecycle management,
○ Infrastructure automation.
- Integrate platform services with Kubernetes control planes running on bare-metal infrastructure.
AI Infrastructure Orchestration
- Develop orchestration frameworks that manage GPU workloads across distributed clusters.
- Implement platform services that optimize resource scheduling and utilization for AI workloads.
- Integrate the platform control plane with components such as:
○ NVIDIA GPU Operator,
○ KServe,
○ Argo Workflows,
○ SLURM,
○ KubeVirt virtualization.
Distributed Systems Engineering
- Build distributed systems that coordinate workloads across multi-region GPU clusters.
- Implement services capable of handling high-throughput infrastructure orchestration workloads.
- Design scalable mechanisms for distributed state management and coordination.
- Contribute practical design input for platform components.
Reliability & Operations
- Engineer systems for high availability and fault tolerance.
- Implement observability, monitoring, and alerting for platform services.
- Participate in incident response and on-call rotations for platform systems.
- Perform root cause analysis and implement systemic improvements to platform reliability.
Engineering Excellence
- Drive technical design decisions for platform components.
- Maintain high standards for testing, CI/CD, and operational safety.
- Participate in architecture discussions, code reviews, and system design.
- Contribute to repeatable patterns, implementation quality, and operational maturity within the platform software domain.
Technical Stack
Platform Development
- Go.
- Kubernetes controllers / operators.
- Distributed systems architecture.
- REST / gRPC APIs.
Platform Infrastructure
- Kubernetes.
- Helm.
- GitOps workflows.
AI Platform Components
- NVIDIA GPU Operator.
- KServe.
- Argo Workflows.
- SLURM integration.
- KubeVirt.
Storage Integration
- Weka distributed storage.
- VAST disaggregated storage.
- StorPool HCI.
- CSI drivers.
Observability
- Prometheus.
- Grafana.
- OpenTelemetry.
What you'll need
Core Experience
- 5+ years of experience building distributed systems or infrastructure platforms.
- Strong programming experience in Go.
- Experience developing Kubernetes operators and controllers.
Kubernetes Platform Engineering
- Strong understanding of Kubernetes internals and control plane architecture.
- Experience building infrastructure automation around Kubernetes.
- Familiarity with multi-tenant Kubernetes environments.
Distributed Systems
- Experience designing and operating distributed systems at scale.
- Understanding of distributed state management and service coordination.
- Experience building reliable, highly available infrastructure services.
Infrastructure & Systems Knowledge
- Strong Linux systems knowledge.
- Experience troubleshooting complex production systems.
- Understanding of networking and storage infrastructure used by distributed systems.
Operational Excellence
- Experience operating high-availability production systems.
- Familiarity with observability tooling such as Prometheus and Grafana.
- Experience participating in on-call rotations and incident response.
Personal Attributes
- Strong analytical and problem-solving skills.
- Excellent communication and collaboration abilities.
- Passion for building reliable infrastructure systems at scale.
What we offer
- Attractive compensation package reflecting your expertise and experience.
- A great work environment characterised by friendliness, international diversity, flexibility, and a hybrid-friendly approach.
- A place in a fast-growing scale-up with a mission to make a positive impact, and room for your career to grow.
Our job titles may span more than one job level. The actual base pay is dependent on a number of factors, such as transferable skills, work experience, business needs and market demands.
Our inclusive responsibility
Radian Arc is committed to creating a diverse and inclusive environment and is proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity or expression, sexual orientation, national origin, genetics, disability, age, veteran status, or any other protected category under applicable law.