Senior Site Reliability Engineer (SRE) (m/f/x)
At HelloFresh, our mission is to change the way people eat – forever. From our 2011 founding in Europe’s vibrant tech hub Berlin, we’ve become the global market leader in the meal kit sector and inspire millions of energised home cooks across the globe every week.
We offer our meal kit boxes, full of exciting recipes and thoughtfully sourced, fresh ingredients, in more than 12 countries, operating from offices in Berlin, New York City, Sydney, Toronto, London, Amsterdam and Copenhagen. In 2019, we shipped more than 250 million meals.
Our more than 5,000 employees are the heart and soul of our highly international, fast-paced, and dynamic environment, where innovation and smart, fast action are encouraged.
We want you to join us and help take HelloFresh to the next level. With the company in its growth phase, this is a great time to join. Career and development opportunities are endless.
We will encourage you to make an immediate impact in your area of work as well as empower you to grow your career with us.
Our Engineering, Data, Product and Security teams are located in Berlin and New York and are critical to what we do. From procurement tools to conversion rate optimization, live pricing tools, payment services and add-on upselling features, we work on challenging problems and frequently build and release features and engines that make our business thrive and deliver real financial impact.
You can get a taste of what we’ve been working on by checking out our tech blog (https://engineering.hellofresh.com/).
About the job
You will be joining the Platform Tribe at HelloTech. Platform provides a stable and fresh environment in which our talented teams of developers can thrive. As well as building a great foundation, Platform is responsible for spreading its knowledge throughout the other tribes. It makes sure everyone is taking advantage of the easy-to-use infrastructure and applying best practices when it comes to Reliability, Observability, Monitoring, Containerisation, Performance, Security, etc.
- Build infrastructure automation at scale
- Own the solution-wide alerting strategy in the tech organisation
- Optimise the incident management systems, policies and procedures
- Ensure the engineering organisation has self-service observability tools, and advocate for observability best practices
- Drive positive change in MTTD, MTTR and MTBF metrics
- Guide and educate the engineering organisation about operations and reliability
- Remove toil in infrastructure through automation
- Pair with both your squad members and engineers in the greater organisation to spread SRE knowledge and best practices
- Undertake measured, methodical troubleshooting of complicated systems
Who we are looking for
- You have knowledge of scalable production architectures (config management, monitoring, infrastructure as code, load balancing, CDNs, distributed systems)
- You have experience with cloud infrastructure (e.g. AWS, Azure), Kubernetes and most of the following technologies: Helm, Docker, Terraform, Graylog, Prometheus, Jaeger, Kafka, Concourse CI
- You have a good understanding of SLI, SLO, and SLA concepts
- You are a software or systems engineer with a passion for system reliability and observability, and you will build reliability as a feature into our core infrastructure and applications
- You have experience using data to diagnose and troubleshoot complex distributed systems
- You have experience as a software developer (Python or Go)
- You work anywhere in the stack, from right beside the OS and up
- Solid Linux background
- Familiarity with operations: metrics/statistics, incident management, post mortems etc.
- You know what to monitor and what to alert on when things go awry
- You LOVE to automate things
- You are passionate about mentoring and sharing knowledge
It would be beneficial if you also have the following:
- Experience in negotiating SLOs and SLIs with product owners
- Experience building highly available and observable systems at scale
What we offer
- Relocation assistance to move to Berlin and visa application support
- Competitive compensation
- Significant reduction on our meal kits
- Annual learning and development budget to attend conferences or purchase educational resources
- Sabbatical policy
- Work in our office located in the heart of Berlin
- A diverse and vibrant international environment
- A range of perks (free in-house crash course in German, compensation for advanced German classes, in-house lecture series and knowledge sharing programme, discounts for our neighbouring gym and Urban Sports Club, free weekly yoga classes, summer and winter parties, discount on our HelloFresh GO vending machines)
- The chance to have a significant impact on one of the fastest-growing technology companies in Europe in an exciting growth phase
Are you up for a challenge?
Please submit your complete application below, including your salary expectations and earliest starting date.