Site Reliability Engineer III

Aumni

IT, Software Engineering
Hyderabad, Telangana, India
Posted on Sep 4, 2025

Job Information

  • Job Identification: 210663092
  • Job Category: Software Engineering
  • Business Unit: Corporate Sector
  • Posting Date: 09/03/2025, 01:20 PM
  • Locations: MAGMA, UNIT-1, PHASE-IV, SY NO.83/1, PLOT NO 2, GR Floor TO 2 Floor and 5 Floor TO 16 Floor, Basement 1, 2, Hyderabad, IN-TG, 500081, IN
  • Apply Before: 10/03/2025, 06:00 AM
  • Job Schedule: Full time

Job Description

As a Site Reliability Engineer III at JPMorgan Chase within the Chief Technology Office, you will collaborate with engineering, support, and operations teams to maintain and improve the reliability of mission-critical applications. You’ll participate in incident management, troubleshooting, and continuous improvement, and help implement automation and monitoring solutions. On-call rotation is part of the role, requiring effective action during production incidents and a commitment to operational excellence. You’ll share knowledge, follow best practices, and contribute to a culture of learning and innovation. We value team players who communicate clearly, solve problems proactively, and focus on customer needs.

Job responsibilities

  • Design, develop, and operate solutions for application reliability, monitoring, and automation.
  • Execute incident response, troubleshooting, and root cause analysis to resolve production issues and improve system stability.
  • Build and maintain CI/CD pipelines using Jenkins (including global libraries), and implement infrastructure as code with Terraform.
  • Develop and support containerized applications using Docker and Kubernetes, ensuring robust deployments and scalability.
  • Implement and maintain observability solutions using tools such as Grafana, Prometheus, Splunk, and OpenTelemetry.
  • Collaborate with engineering and support teams to drive continuous improvement and operational excellence.
  • Participate in on-call rotation, responding to production incidents and ensuring timely resolution.

Required qualifications, capabilities, and skills

  • Formal training or certification in Site Reliability Engineering concepts and 3+ years of applied experience.
  • Experience in SRE, DevOps, or application support roles, with knowledge of SLIs/SLOs, incident response, and troubleshooting.
  • Familiarity with monitoring and observability tools (e.g., Grafana, Prometheus, Splunk, OpenTelemetry).
  • Hands-on experience with CI/CD pipelines (Jenkins, including global libraries), infrastructure as code (Terraform), version control (Git), containerization (Docker), and orchestration (Kubernetes).
  • Exposure to cloud platforms (AWS, GCP, or Azure) and automating infrastructure and deployments.
  • Willingness to participate in on-call rotation and respond to production incidents.
  • Ability to break down issues, document solutions, and communicate effectively with team members and customers.

Preferred qualifications, capabilities, and skills

  • Experience in banking, fintech, or other regulated environments.
  • Participation in game days or chaos engineering.
  • Interest in sharing knowledge and best practices with peers.
