HPC Systems Engineer (Network)

United Kingdom

Join Nscale as an HPC Systems Engineer

 

Are you passionate about data centre builds and large-scale GPU infrastructure projects? Do you thrive in a fast-paced, high-growth environment where your work has a direct impact on business outcomes? If so, this could be the role for you!

 

Nscale is the GPU cloud engineered for AI. We provide cost-effective, high-performance infrastructure for AI start-ups and large enterprise customers. Nscale enables AI-focused companies to achieve superior results by reducing the complexity of AI development. Our GPU cloud bolsters technical capabilities and directly supports strategic business outcomes, including cost management, rapid innovation, and environmental responsibility.

 

At Nscale, our Engineering team plays a critical role in driving the delivery of our GPU infrastructure. If you're passionate about data centre architecture and thrive in high-performance computing environments, please apply.

 

Why Nscale?

We thrive on a culture of relentless innovation, ownership, and accountability, where every team member takes pride in their work and drives it with excellence and urgency. As an Nscaler, you’ll build trust through openness and transparency, where everyone is inspired to do their best work. If you join our team, you’ll be contributing to building the technology that powers the future.

 

About the Role

Network engineers at Nscale are responsible for the design, deployment, and ongoing operation of all networking services that underpin both the internal management platform and the customer-facing cloud infrastructure. This includes internet transit, WAN connectivity, and DC networking. You will act as a 3rd/4th-line escalation point for the support organisation.

 

What You’ll be Doing

  • Designing, deploying, and operating large-scale HPC clusters and GPU-based compute environments

  • Creating and maintaining hardware architectures, including BOMs, rack elevations, and reference designs

  • Implementing and maintaining HPC scheduling and workload management systems (e.g., Slurm)

  • Designing and optimising InfiniBand and Ethernet network topologies (Fat Tree, Dragonfly, rail-optimised configurations)

  • Working with deployment teams to ensure cluster builds align with architectural specifications

  • Automating provisioning, configuration, and operations of multi-vendor HPC hardware and software stacks

  • Troubleshooting and tuning cluster performance across compute, storage, and interconnect layers

  • Collaborating with software, infrastructure, and data centre teams to ensure seamless integration of HPC environments

 

About You

  • Proven experience in designing, deploying, and operating HPC or large-scale compute clusters

  • Strong knowledge of Slurm or similar workload management systems (e.g., PBS, LSF)

  • Proven experience in InfiniBand networking design and operations, including subnet management, QoS, RDMA, and performance tuning

  • Experience with high-speed Ethernet networks and associated protocols (e.g., VLAN, LACP, BGP, OSPF, EVPN, VXLAN)

  • Familiarity with HPC network topologies such as Fat Tree or Dragonfly

  • Experience creating hardware BOMs, rack layouts, and reference architectures for compute deployments

  • Strong scripting skills in Python and/or Bash for automation and orchestration

  • Solid understanding of optics, cabling, and physical layer design considerations for HPC and GPU cluster environments

  • Strong analytical, troubleshooting, and documentation skills

  • A collaborative mindset and passion for building high-performance, scalable infrastructure

 

Personal Attributes

  • Proactive and self-motivated, with a strong sense of ownership.
  • Thrives in a fast-paced, dynamic, and high-growth environment.
  • Collaborative team player with a passion for delivering outstanding candidate and stakeholder experiences.
  • Strong attention to detail and documentation skills.
  • Excellent communication skills, both written and verbal.
  • A self-starter mindset with a “see a problem, fix a problem” mentality.

 

Please Note: This role will require 20-30% travel to our European sites.

 

What We Can Offer You

 

At Nscale, you'll find a collaborative, supportive, and innovative environment where your contributions spark real impact. We're building something extraordinary, and we want you at the core.

  • Highly competitive package (base + equity) with reviews every 12 months. 🚀
  • Join the fastest-growing tech startup: your chance to push boundaries, collaborate with brilliant minds, and make your mark on cutting-edge AI. ✨
  • Expect a dynamic progression plan tailored to your ambitions. Grow by trying new things, leading, challenging the status quo, and owning your impact, always with our full support. 
  • Human-First Flexibility: We treat you as humans first. 🫶🏽 Our flexible workplace trusts Nscalers to deliver, giving you the autonomy to shape your day around life's moments.

Join our thriving remote-first team. Geography is no barrier to impact or connection. We build seamless virtual collaboration, empowering you, wherever you work.

 

Equal Opportunities Statement

We strongly encourage applications from people of colour, the LGBTQ+ community, people with disabilities, neurodivergent people, parents, carers, and people from lower socio-economic backgrounds.

If there’s anything we can do to accommodate your specific situation, please let us know.

The responsibilities outlined in this job description are not exhaustive and are intended to provide a general overview of the position. The employee may be required to perform additional duties, tasks, and responsibilities as assigned by management, consistent with the skills and qualifications required for the role.
