
SRE Team Lead (Dev tools team)

The company

Nebius AI is an AI-centric public cloud platform built specifically for training AI models and serving them for inference.

Our mission is to help ML practitioners concentrate on their core jobs, while DevOps, MLOps, and infrastructure-related tasks are handled by us. The idea is to build an ML-specific cloud platform covering the entire ML lifecycle from A to Z: from data preparation and labeling to ML training and inference.

We recognize the potential of ML and AI technologies and aim to provide our future users with the perfect environment to train and fine-tune their models. We are committed to delivering the best user experience and excellent customer support.

Four development hubs: Nebius is headquartered in the Netherlands, with hubs in Finland, Serbia, and Israel.

Data center in Europe: Our own data center in Finland features server racks designed in-house for ML-specific high load, with power-efficient solutions, including a free-cooling system.

500+ professionals: Our mature team of engineers has a proven track record in developing sophisticated cloud and ML solutions and designing cutting-edge hardware.


In this position, your responsibility will be to:

  • Build and grow an SRE team.
  • Develop fault-tolerant and stable systems, often using open-source technology on top of our own infrastructure cloud.
  • Disseminate engineering practices within and outside the team and help colleagues build fault-tolerant services.
  • Automate work with infrastructure.
  • Ensure security best practices are integrated into the infrastructure and operational processes.

We expect you to have:

  • Solid experience with programming languages (like Go, Python, or C++);
  • Solid understanding of classic algorithms and data structures;
  • Commercial experience with and deep understanding of Unix systems and network technology;
  • Experience with containerization and configuration management systems (Ansible, Salt, Terraform, Docker, K8s, Helm);
  • 3 years of experience as a team leader.

It would be a bonus if you had:

  • A desire to be involved in backend development;
  • Experience designing, developing, and running high-load distributed systems;
  • Commercial experience with a variety of cloud platforms.

Does all that sound like your kind of challenge? Then join us!

 
