SRE: Observability (m/f/d)

We are an innovative technology company specializing in the development and marketing of scalable and high-performance community platforms in the online dating sector—an industry that remains stable and continues to grow even in times of crisis.

As a Site Reliability Engineer with a focus on Observability, you will ensure full transparency and deep insights into our IT systems. You can expect exciting challenges, cutting-edge technologies, and a strong team that embraces automation, efficiency, and continuous improvement.

This is what you do in our team

Development and Enhancement of the Observability Strategy:
- Design and operation of a comprehensive observability strategy for our platforms and systems.
- Ensuring optimal monitoring of performance, availability, and security.
Monitoring, Logging, and Tracing Solutions:
- Implementation and optimization of monitoring, logging, and tracing technologies.
- Evaluation and integration of new observability tools and best practices.
Dashboards, Alerting, and Analytics:
- Development and maintenance of dashboards for comprehensive system monitoring.
- Setup of alerting and analytics mechanisms for early error detection and resolution.
Collaboration and Integration:
- Close collaboration with developers, DevOps teams, and other stakeholders.
- Integration of observability practices into the software development lifecycle.
Performance Analysis & Capacity Planning:
- Conducting performance analyses to optimize system efficiency.
- Capacity planning to ensure a stable and scalable operation.
Incident Response & Root Cause Analysis:
- Supporting the automation of operational processes and advancing SRE practices.
- Conducting post-mortems and root cause analyses after incidents to drive continuous improvement.

This is what you are good at

At least 5 years of professional experience in Site Reliability Engineering, DevOps, or Software Development with a focus on Automation & Configuration Management.
Language skills: Very good proficiency in German (fluent in spoken and written form) and/or English (fluent in spoken and written form).
Strong expertise in observability tools such as Prometheus and Grafana.
Experience with logging technologies like ELK Stack or Splunk.
Knowledge of cloud technologies (AWS, Azure) and container orchestration (Kubernetes, Docker).
Good programming and scripting skills (e.g., Python).
Experience with Infrastructure-as-Code (Terraform, Ansible) is a plus.
Strong analytical skills and a solution-oriented mindset.
Team player with excellent communication skills and the ability to explain complex technical concepts in a clear and understandable manner.

This is what we have for you

Strong team spirit
Permanent employment contract
Flexible working hours and mobile working
Individual training and development opportunities
Subsidy for an Urban Sports Club membership (M or L)
Three weekly fitness sessions with a professional trainer
Regular team events to foster a strong team spirit
Use our JobRad leasing model and ride your dream bike for both work and personal use
Option to work remotely from abroad for a limited time in coordination with your team
Subsidy for the 58-euro ticket ("Deutschlandticket")

Are you interested?

Then we would like to get to know you and look forward to receiving your detailed application documents, including your availability and salary expectations. Just click "Apply now" to upload your application documents directly.
Your contact person is Moritz.

iVentureGroup GmbH
Human Resources
Wendenstraße 21b
20097 Hamburg

If you have any questions, please feel free to email us at jobs@iventuregroup.com

We look forward to hearing from you!
