Middle SRE Engineer

Europe
Growe welcomes those who are excited to:
  • Ensure availability, performance, and scalability of infrastructure and services through monitoring, automation, and operational best practices;

  • Lead incident response, perform root cause analysis, and implement recovery and long-term fixes;

  • Manage infrastructure using Terraform, Terragrunt, and automation tools for consistency and repeatability;

  • Implement and maintain metrics, logs, and tracing solutions (Prometheus, Grafana, Loki, VictoriaMetrics, CloudWatch) to ensure system visibility;

  • Identify bottlenecks, tune systems, and improve infrastructure performance;

  • Monitor resources, forecast growth, and implement scaling strategies;

  • Integrate security best practices into IaC, CI/CD pipelines, and deployments;

  • Participate in 24/7 rotations for the timely resolution of critical incidents;

  • Work with DevOps, PRE, development, and security teams to improve reliability and design resilient systems;

  • Maintain operational runbooks, incident reports, and system documentation.

We need your professional experience:
  • Bachelor’s degree in Computer Science, Information Technology, or a related field;

  • 3+ years in a DevOps, SRE, or related role;

  • Strong hands-on experience with AWS services including EC2, ECS, EKS, RDS, DocumentDB, ElastiCache, Keyspaces, S3, EBS, VPC, Route53, KMS, ACM, and CloudWatch;

  • Proficiency with Terraform, Terragrunt, and Atlantis for reproducible and version-controlled infrastructure;

  • Experience with GitLab CI, FluxCD, Argo Rollouts, and automation tools (Ansible, Python, Bash);

  • Solid experience with Docker, Kubernetes (AWS EKS), and Helm (including custom templates, ChartMuseum);

  • Familiarity with cluster add-ons such as KEDA, VPA, Karpenter, External-DNS, ingress-nginx, aws-alb-controller, and ebs-csi-driver;

  • Hands-on experience with Grafana, VictoriaMetrics stack, Tempo, metrics exporters, Pingdom, AWS CloudWatch, and alerting systems like PagerDuty, VMAlert, and Alertmanager;

  • Proficiency with Grafana Loki, OpenSearch, and Vector Agent for centralized logging;

  • Strong understanding of networking concepts, AWS networking (VPC, Network Firewall, Transit Gateway, Site-to-Site VPN), identity and access management, certificate management (ACM, Vault, SOPS), and application security best practices;

  • Familiarity with Cloudflare services including caching, DNS, and Workers;

  • Exposure to AWS Cost Explorer, KubeCost, and custom cost export tools.

We appreciate these skills and personal qualities:
  • Analytical thinking – skilled at interpreting metrics, logs, and system behavior to drive informed decisions;

  • Attention to detail – ensures precision in infrastructure changes, configurations, and deployment processes;

  • Adaptability – quick to learn new tools and technologies, with a strong ability to adjust to changing environments.

We are seeking those who align with our core values:
  • GROWE TOGETHER: Our team is our main asset. We work together and support each other to achieve our common goals;

  • DRIVE RESULT OVER PROCESS: We set ambitious, clear, measurable goals that align with our strategy and drive Growe to success;

  • BE READY FOR CHANGE: We see challenges as opportunities to grow and evolve. We adapt today to win tomorrow.
