
Senior Database Reliability Engineer (Platform)
What Cognite is: Relentless to achieve
Cognite operates at the forefront of industrial digitalization, building AI and data solutions that solve the world’s hardest, highest-impact problems. With unmatched industrial heritage and a comprehensive suite of AI capabilities, including low-code AI agents, Cognite accelerates the digital transformation to drive operational improvements.
We thrive on challenges. We challenge assumptions. We execute with speed and ownership. If you view obstacles as signals to step forward, not backwards, you’ll feel right at home here.
Our Moonshot is bold: Unlock $100B in customer value by 2035, and redefine how global industry works. Join us in this venture where AI and data meet ingenuity, and together, we will forge the path to a smarter, more connected industrial future.
How you’ll demonstrate Ownership
We need a DBRE who views "Cloud" as just another API. You will spend your time writing Go code to bridge the gaps between different cloud providers, ensuring our stateful services (Postgres, Kafka, Elasticsearch) behave consistently regardless of the underlying infrastructure.
What you will do:
- Multi-Cloud Automation: Develop Go-based tools and Kubernetes Operators to manage the lifecycle of 1,000+ clusters across AWS, GCP, and Azure.
- Cloud-Agnostic Resilience: Design and test failover strategies that account for cloud-specific nuances (e.g., storage latency variances). Design database architectures around RTO, RPO, high-availability, and disaster-recovery requirements.
- Unified Observability: Build a "single pane of glass" for database health (SLIs/SLOs) that aggregates telemetry from diverse cloud environments and on-prem deployments.
- Hybrid-Cloud Strategy: Evaluate when to use a Cloud Provider’s native DBaaS vs. when to run our own self-managed clusters on Kubernetes to avoid vendor lock-in.
- Advanced Engineering: Solve high-complexity challenges including Customer Managed Encryption Keys (CMEK), automatic database resizing/splitting, and mitigating "noisy neighbor" problems.
The Impact you bring to Cognite
- Software Engineering: Professional-grade Go (Golang) experience. You prefer building controllers and automation over manual cloud-console clicking.
- Experience: The ideal candidate will have 6-10 years of experience.
- Database Internals: Deep knowledge of Postgres, Kafka, or Elasticsearch. You understand replication lag, WAL shipping, and partition rebalancing in distributed environments.
- Kubernetes Mastery: Experience with StatefulSets, CSI drivers, and the Operator pattern. You know how to make Kubernetes behave when the "state" is high-stakes.
- Multi-Cloud Fluency: Experience managing stateful workloads across at least two major clouds (GCP/Azure/AWS).
- Operational Grit: Experience in a 24/7 high-availability environment. You understand that "Reliability" is a feature that must be engineered, not just monitored.
- We’re globally recognized domain experts with an international presence that spans Phoenix, Houston, Oslo, Tokyo, Bengaluru, and Abu Dhabi.
