Site Reliability Engineer III

Aumni

Software Engineering, IT
Hyderabad, Telangana, India
Posted on Dec 11, 2025

There’s nothing more exciting than being at the center of a rapidly growing field in technology and applying your skillsets to drive innovation and modernize the world's most complex and mission-critical systems.

As a Site Reliability Engineer III at JPMorgan Chase within the Chief Technology Office, you will solve complex and broad business problems with simple and straightforward solutions. We are seeking a Site Reliability Engineer (SRE) to help drive reliable, scalable, and intelligent platform operations in a global financial environment. This role combines technical support, DevOps practices, and SRE principles—including on-call incident response, automation, and a customer-first mindset. You will work with modern tools to ensure our applications and services remain robust and available.

Job Responsibilities

  • Collaborate with engineering, support, and operations teams to maintain and improve the reliability of mission-critical applications.
  • Participate in incident management, troubleshooting, and continuous improvement initiatives.
  • Implement automation and monitoring solutions to enhance system reliability.
  • Join an on-call rotation and respond effectively to production incidents.
  • Share knowledge and follow best practices to foster a culture of learning and innovation.
  • Communicate clearly with stakeholders and proactively solve problems.
  • Focus on customer needs and deliver high-quality support.
  • Document solutions and incident responses for future reference.
  • Analyze system performance and recommend improvements.
  • Contribute to post-incident reviews and drive process enhancements.
  • Support the integration of new tools and technologies to improve operational efficiency.

Required Qualifications, Capabilities, and Skills

  • Formal training or certification in SRE and application support concepts, and 3+ years of applied experience.
  • Demonstrate experience in SRE, DevOps, or application support roles, including knowledge of SLIs, SLOs, incident response, and troubleshooting.
  • Utilize monitoring and observability tools such as Grafana, Prometheus, Splunk, and OpenTelemetry.
  • Apply hands-on experience with CI/CD pipelines (Jenkins, including global libraries), infrastructure as code (Terraform), version control (Git), containerization (Docker), and orchestration (Kubernetes).
  • Work with cloud platforms such as AWS, GCP, or Azure, and automate infrastructure and deployments.
  • Participate in on-call rotation and respond to production incidents.
  • Break down complex issues, document solutions, and communicate effectively with team members and customers.
  • Implement automation and monitoring solutions to support operational goals.
  • Collaborate with cross-functional teams to resolve incidents and improve reliability.
  • Contribute to continuous improvement of support processes and system performance.

Preferred Qualifications, Capabilities, and Skills

  • Demonstrate experience in banking, fintech, or regulated environments.
  • Participate in resilience engineering activities such as game days or chaos engineering.
  • Mentor peers by sharing knowledge and best practices.
  • Contribute to the adoption of innovative tools and approaches in support operations.


