Cloud Infrastructure Engineer (Kineto)
Kineto is a next-generation platform that enables creators, educators, and small businesses to generate, deploy, and operate fully functional AI-powered web applications – instantly and at scale. It combines LLM-driven code generation, multi-tenant Postgres (Neon), dynamic hosting (GKE and Knative), automated deployments (Flux), analytics, billing, and a seamless chat-based UX to make software creation accessible to everyone. Our team is growing rapidly, and we’re now seeking an experienced Infrastructure Engineer who can design, build, and maintain our cloud-native platform, with a focus on scalability, reliability, and automated operations.
What you’ll do:
Cloud and platform engineering (DevOps):
- Design, implement, and manage the core infrastructure powering Kineto's platform on Google Cloud Platform (GCP), including networking, security, and identity management.
- Build and operate resilient, highly available distributed systems using Kubernetes (GKE), Knative, Istio, and related cloud-native technologies.
- Automate the entire infrastructure life cycle with infrastructure as code (IaC) using Terraform and Terragrunt, ensuring secure, reproducible, and auditable environments.
- Implement and maintain CI/CD pipelines (e.g. GitHub Actions and TeamCity) and deployment tools like Flux and Helm for GitOps-driven application delivery.
- Optimize and manage the multi-tenant data layer on Postgres and Neon, focusing on robust tenant isolation, performance, backups, and safe schema management.
Operational excellence and reliability:
- Drive site reliability engineering (SRE) practices, including monitoring, alerting (Prometheus, Grafana), logging (Loki), and incident response.
- Solve complex operational challenges, such as optimizing scale-to-zero for cost efficiency, minimizing cold starts, improving autoscaling behavior, and managing queue backpressure.
- Implement platform-wide performance tuning (e.g. container resource limits, distributed locks, caching strategies, and GC configurations).
- Ensure platform security and compliance by implementing best practices for secrets management, network segmentation, and vulnerability scanning.
Technical leadership:
- Own major infrastructure roadmap items, including multi-region deployments, disaster recovery planning, advanced tenancy separation, and ephemeral preview environments.
- Champion DevOps and SRE principles across the engineering team, mentoring engineers on cloud-native best practices, operational readiness, and debugging complex distributed systems.
- Collaborate with product and engineering teams to define the long-term vision for the platform's architecture and operational model.
We’d be glad to have you on our team if you:
- Have five or more years of experience building and operating large-scale, commercial cloud-native infrastructure, with a strong focus on DevOps/SRE practices.
- Possess deep, hands-on expertise with GCP (or AWS/Azure) and Kubernetes administration and operations (GKE experience is a strong plus).
- Are proficient with infrastructure-as-code (IaC) tools, particularly Terraform, for managing complex environments.
- Have a solid understanding of Linux internals, networking (CNI and service mesh), security, and distributed system design.
- Are familiar with CI/CD tools, GitOps (e.g. Flux), monitoring stacks (Prometheus/Grafana), and logging systems.
- Thrive in cross-functional teams and excel at communicating complex infrastructure ideas clearly.
We process the data provided in your job application in accordance with the Recruitment Privacy Policy.