Incident & Change Champion
About Nscale
Nscale is the GPU cloud engineered for AI. We provide cost-effective, high-performance infrastructure for AI start-ups and large enterprise customers. Nscale enables AI-focused companies to achieve superior results by reducing the complexity of AI development. Our GPU cloud bolsters technical capabilities and directly supports strategic business outcomes, including cost management, rapid innovation, and environmental responsibility.
We thrive on a culture of relentless innovation, ownership, and accountability, where every team member takes pride in their work and drives it with excellence and urgency. As an Nscaler, you’ll build trust through openness and transparency, where everyone is inspired to do their best work. If you join our team, you’ll be contributing to building the technology that powers the future.
Role Summary
Nscale's Incident Management and Change Management processes lack a single owner driving them as an operational discipline. As the company onboards large-scale workloads, we need a dedicated process champion sitting inside Support (the team that owns both functions) to own the processes, implement them in tooling, train the organization, advocate across teams, run the daily operational rhythm, and report on program health to leadership.
This is a hands-on, operational role. You will be the person on the bridge when MI/SEV-1s fire, the chair of the Change Advisory Board, the author of postmortem templates, the trainer of new Incident Commanders, and the analyst presenting monthly process metrics to the SLT. The work spans process design, tooling configuration, and culture change.
What You Will Own
Own the processes. Take the in-flight Incident Management and Change Management process documents to a v1.0 state. Close the gaps that are already known: severity declaration authority, IC/scribe/comms-lead role separation, SLA/SLO tables for ack and resolution, customer communication ladder, war-room scaling beyond -red/-blue, change risk classification, emergency change path, change freeze policy, postmortem template, RCA SLA/SLO.
Implement in tooling. Drive the Jira Service Management implementation for incident and change workflows as part of the active Servicely-to-Jira migration. Define required fields, ticket hygiene standards, escalation routing, automation, and integrations. Ensure the service catalogue is accurate, current, and properly referenced by both incident and change tickets so impact analysis is reliable.
Run the operational rhythm. Act as Incident Commander or Major Incident Manager for SEV-1 and complex SEV-2 events. Chair the Change Advisory Board on a defined cadence. Facilitate postmortems and drive action items to closure with measurable SLAs. Manage the change calendar including freeze windows around customer-critical periods. Coordinate communications during incidents — internal updates, customer notifications, executive escalation, regulatory notification where sovereign workloads require it.
Train and advocate. Build and certify a pool of Incident Commanders across Support, SRE, and adjacent engineering teams. Run tabletop exercises and game days on a quarterly cadence (immediate priority: three tabletops in May, June, and July leading into the first production customer go-live). Onboard engineers to both processes as they join. Be the visible champion for blameless postmortem culture, mitigate-first response, and disciplined change practice.
Report on program health. Define the metrics that matter (mean time to acknowledge, mean time to mitigate, mean time to resolve, postmortem closure rate, recurrence rate, change success rate, change-caused incident rate, action-item ageing) and publish a monthly program report to the SLT. Identify systemic issues from trend analysis and feed them back into runbooks, training, and process revisions.
Required Experience
- 5+ years in ITSM / Service Management roles with direct ownership of Incident Management and Change Management processes
- Hands-on experience facilitating major incidents end-to-end as Incident Commander or Major Incident Manager in a 24/7 production environment
- Demonstrable experience running a Change Advisory Board or equivalent change-review forum
- Proven track record configuring Jira Service Management, ServiceNow, or equivalent ITSM tooling for both incident and change workflows
- Strong writing skills — process documents, postmortems, executive incident reports, training material
- Comfort holding the room under pressure with senior stakeholders, engineers, and customers concurrently on the bridge
Strongly Preferred
- Experience in cloud, hyperscaler, AI infrastructure, or HPC environments
- Familiarity with SRE concepts — SLOs, error budgets, blameless postmortems, runbook discipline
- Experience designing and running tabletop exercises and game days
- Experience operating processes for regulated or sovereign customer workloads where notification timing has regulatory weight
- Familiarity with Jira's automation, JSM portals, and integration ecosystem (the migration is in flight)
- Comfortable working across time zones and cultures — Nscale spans Norway, UK, Finland, Portugal, and the US
What We Can Offer You
At Nscale, you'll find a collaborative, supportive, and innovative environment where your contributions spark real impact. We're building something extraordinary, and we want you at the core.
- Highly competitive package (base + equity) with reviews every 12 months. 🚀
- Join the fastest-growing tech startup: this is your chance to push boundaries, collaborate with brilliant minds, and make your mark on cutting-edge AI. ✨
- Expect a dynamic progression plan tailored to your ambitions. Grow by trying new things, leading, challenging the status quo, and owning your impact, always with our full support.
- Human-First Flexibility: We treat you as humans first. 🫶🏽 Our flexible workplace trusts Nscalers to deliver, giving you the autonomy to shape your day around life's moments.
Join our thriving remote-first team. Geography is no barrier to impact or connection. We build seamless virtual collaboration, empowering you, wherever you work.
Equal Opportunities Statement
We strongly encourage applications from people of colour, the LGBTQ+ community, people with disabilities, neurodivergent people, parents, carers, and people from lower socio-economic backgrounds.
If there’s anything we can do to accommodate your specific situation, please let us know.
The responsibilities outlined in this job description are not exhaustive and are intended to provide a general overview of the position. The employee may be required to perform additional duties, tasks, and responsibilities as assigned by management, consistent with the skills and qualifications required for the role.
For information on how Nscale handles candidate personal data, please see our Employee & Candidate Privacy Notice: Here.
