Observability Engineer (SRE)

Job Title: Observability Engineer (SRE)

Job Category: Technical

Job Type: Full Time

City: Hyderabad

Bond: No

Experience: 2+ years

Qualification: Any degree

Salary: Depends on Skill

About the Role:

We’re looking for a passionate and detail-oriented Observability Engineer (SRE) to build and manage monitoring, alerting and telemetry systems that keep our platforms reliable, fast and scalable.

You’ll work at the intersection of software engineering and systems operations, ensuring uptime, improving performance and giving teams visibility into production systems.

Key Responsibilities:

  • Design and implement end-to-end observability solutions (metrics, logs, traces)
  • Build dashboards, alerts and insights using tools like Prometheus, Grafana, ELK, Datadog or New Relic
  • Collaborate with SREs and developers to improve system reliability and reduce MTTR
  • Automate monitoring and alerting for distributed services and cloud infrastructure
  • Implement SLOs/SLIs and improve incident response workflows
  • Optimize logging and tracing pipelines for performance and cost
  • Lead root cause analysis and drive post-incident reviews

Requirements:

  • 2+ years of experience in SRE, DevOps or Infrastructure Engineering
  • Strong experience with observability tools such as Prometheus, Grafana, ELK, Datadog and New Relic
  • Knowledge of distributed systems and cloud platforms (AWS/GCP/Azure)
  • Proficiency in scripting (Python, Bash, etc.) and infrastructure as code (Terraform, Ansible)
  • Familiarity with CI/CD pipelines and container orchestration (Docker, Kubernetes)
  • Strong understanding of system performance, monitoring and reliability principles

Job Category: Cloud Engineer