Infrastructure Support Engineer - APAC (GPUs)
About Nscale
Nscale is the GPU cloud engineered for AI. We provide cost-effective, high-performance infrastructure for AI start-ups and large enterprise customers. Nscale enables AI-focused companies to achieve superior results by reducing the complexity of AI development. Our GPU cloud bolsters technical capabilities and directly supports strategic business outcomes, including cost management, rapid innovation, and environmental responsibility.
At Nscale, our Support and Operations team plays a critical role in maintaining service availability, driving service reliability, and responding rapidly to customer tickets.
We thrive on a culture of relentless innovation, ownership, and accountability, where every team member takes pride in their work and drives it with excellence and urgency. As an Nscaler, you’ll build trust through openness and transparency, where everyone is inspired to do their best work. If you join our team, you’ll be contributing to building the technology that powers the future.
Infrastructure Support Engineer - APAC
What You'll Be Doing
- Join the Support duty rotation and handle day‑to‑day tickets and alerts, escalating early and appropriately. Collaborate with Engineering, under guidance, when incidents or changes require it
- Accurately record, update, manage, and resolve tickets in the ticketing system, keeping all parties informed of each ticket's progress
- Follow established runbooks to resolve common issues. Propose improvements and contribute incremental fixes with review
- Keep tickets up to date with clear notes, next steps, and customer communications via the agreed channels
- Learn the Platform fundamentals so you can help customers get value from our services, asking for support when deeper expertise is needed
- Participate in monitoring, troubleshooting, and triage. Capture logs and facts to enable efficient handover
- Deliver assigned tasks and project work to agreed quality and timelines. Flag blockers early and seek help when needed
- Share knowledge by documenting steps you’ve validated and by contributing to training materials. Shadow seniors during complex work to build capability
- Take part in incident reviews as a contributor and help track preventative follow‑ups in your scope
- Identify areas where automation can optimize processes
- Constantly endeavour to learn and upskill
- Collaborate with cross-functional teams for service improvements. Be the escalation point for onsite operations staff
- Participate in on‑call or out‑of‑hours work when scheduled and after onboarding
- Be available to travel to Nscale or customer locations to assist with deployments, troubleshooting, and operational tasks, and to attend supplier-related training courses
About You
- 2-4 years of experience in a support, operations, or infrastructure engineering role, ideally within a cloud, Data Centre, or managed services environment
- Growth mindset. Curious, dependable, and collaborative. You seek feedback, ask questions, and invest in learning to progress toward Senior
- Platform and DC fundamentals. Awareness of servers, networks, storage, and virtualization concepts, ideally from a support or operations background
- Linux fundamentals. Comfortable with the CLI, services via systemd, filesystems, permissions, and basic networking tools. Able to troubleshoot common issues and know when to escalate
- Networking basics. Solid grasp of IP addressing, subnets, VLANs, routing at a high level, DNS, and firewalls. Advanced topics like BGP or VXLAN are a plus, not required
- Kubernetes exposure. Understand core concepts like nodes, pods, services, and logs. Can perform basic troubleshooting and follow runbooks. Cluster‑level administration experience is a nice-to-have
- GPU awareness. Familiar with basic diagnostics such as nvidia‑smi
- Observability foundations. Able to use dashboards and alerts to identify symptoms, gather evidence, and follow runbooks. Comfortable proposing simple alert or dashboard tweaks with review
- Scripting and automation basics. Comfortable reading and writing simple Bash or Python snippets and using Git for version control. Experience with Ansible or Terraform is beneficial but not required
- Cloud and virtualization basics. Familiarity with common hypervisor or cloud troubleshooting flows. OpenStack experience is a plus, not a requirement
Nice to Have
- Hands‑on exposure to Kubernetes administration, operators, and storage or networking add‑ons
- Deeper GPU/HPC concepts such as RDMA/InfiniBand, the basics of performant distributed workloads, or job schedulers. Awareness of, and experience using, NCCL for performance troubleshooting
- Infrastructure as Code and config management tools like Ansible or Terraform
- GitOps and CI/CD participation. Contributing to pipelines and modernizing scripts using GitHub Actions or similar
- Experience with access and security tooling used at Nscale, such as Teleport or Vault
- Progress toward relevant certifications over time (e.g., Linux, Kubernetes, cloud, or security)
In All We Do, Our Core Values Guide Us
Relentless Innovation
At Nscale, we constantly push the boundaries of innovation, embracing creative risks to shape the future. Our aim is to deliver products that not only meet but exceed today’s expectations, setting new standards for tomorrow.
Ownership and Accountability
Every Nscaler is fully accountable for their work, driving it with excellence and urgency. We set high standards, ensuring that our contributions are not just good but exceptional.
Openness and Transparency
We believe trust and transparency are key to our success. We maintain open communication within our teams and with stakeholders, sharing both successes and challenges. Our open-source approach allows customers to explore our technology, building trust and ensuring our solutions are innovative, secure, and reliable.
Customer-Centric Focus
Our customers are central to our mission, and we are committed to delivering impactful solutions that drive real-world success. We focus on deeply understanding their needs and challenges, striving to exceed expectations in both product quality and service.
Sustainability
We are dedicated to considering the long-term environmental and societal impacts of our technologies. By integrating sustainability into our operations and product development, we ensure that our innovations are both effective and responsible, contributing positively to the world around us.
Full-Speed Collaboration
Collaboration at Nscale is fast, efficient, and respectful. We work together seamlessly, with clear communication and mutual respect, ensuring our shared goals are met with high standards and impactful outcomes.
Equal Opportunities Statement
We strongly encourage applications from people of colour, the LGBTQ+ community, people with disabilities, neurodivergent people, parents, carers, and people from lower socio-economic backgrounds.
If there’s anything we can do to accommodate your specific situation, please let us know.
The responsibilities outlined in this job description are not exhaustive and are intended to provide a general overview of the position. The employee may be required to perform additional duties, tasks, and responsibilities as assigned by management, consistent with the skills and qualifications required for the role.
For information on how Nscale handles candidate personal data, please see our Employee & Candidate Privacy Notice: Here.
