SRE/Platform Engineer
EU +/- 2hrs
This is a REMOTE position. We only accept candidates located in the EU, Norway, the UK, and Switzerland. Both full-time and contract-based work are possible.
Principal SRE / Platform Engineer — AI Production
The Role
You will build and own the production harness that keeps EggAI's agentic workloads safe, observable, and compliant across banking, finance, insurance, and public sector clients.
This is a hands-on Principal/Staff role. The work starts from a client's need, but it also applies to our internal Platform solution for an EU sovereign cloud.
You will design the system, then build it — not hand specs to others.
You will work daily with an AI engineer and a project manager. You will solve hard problems alongside EggAI's infrastructure lead and head of engineering.
What You'll Build
- Production harness for agentic workloads on Kubernetes and [ARK](https://mckinsey.github.io/agents-at-scale-ark/): deployment, lifecycle management, rollback
- Guardrails: policy enforcement, output validation, circuit breakers for both long-running agents and event-driven workflows
- Agentic monitoring: drift detection, cost runaway prevention, latency SLOs, safety and compliance alerting
- Audit trails meeting financial-grade and health data compliance requirements (not optional — these are regulated environments)
What We're Looking For
- Principal/Staff SRE or platform engineer — strong fundamentals in reliability, observability, and incident ownership
- Has owned reliability for systems where "it passed tests" wasn't enough
- Strong Kubernetes experience
- Comfortable operating in regulated industries (finance, banking, health data) — audit trails and compliance constraints are part of the job
- You think about blast radius before you ship
Nice to have: prior experience with AI/ML or agentic workloads in production
About EggAI
EggAI is an enterprise-focused generative AI company on a mission to help large organizations move AI solutions from prototyping into production. We specialize in building safe, reliable, and scalable AI systems that deliver real business impact.
We work at the cutting edge of AI technology, implementing agentic systems, RAG (Retrieval-Augmented Generation) architectures, and autonomous AI agents that scale from task automation to workforce automation. Our proprietary frameworks—including the EggAI Meta Framework for agentic systems and EggAI Quality Flow for governance—power AI transformations at enterprise scale.
Based in Munich, Germany, we work with enterprise clients to build AI capability, deliver production-ready systems, and establish quality-controlled AI operations.