JOB DESCRIPTION
There’s nothing more exciting than being at the center of a rapidly growing field in technology and applying your skillsets to drive innovation and modernize the world's most complex and mission-critical systems.
As a Site Reliability Engineer III at JPMorgan Chase within the Chief Technology Office, you will solve complex and broad business problems with simple and straightforward solutions. We are seeking a Site Reliability Engineer (SRE) to help drive reliable, scalable, and intelligent platform operations in a global financial environment. This role combines technical support, DevOps practices, and SRE principles—including on-call incident response, automation, and a customer-first mindset. You will work with modern tools to ensure our applications and services remain robust and available.
Job Responsibilities
- Collaborate with engineering, support, and operations teams to maintain and improve the reliability of mission-critical applications.
- Participate in incident management, troubleshooting, and continuous improvement initiatives.
- Implement automation and monitoring solutions to enhance system reliability.
- Join an on-call rotation and respond effectively to production incidents.
- Share knowledge and follow best practices to foster a culture of learning and innovation.
- Communicate clearly with stakeholders and proactively solve problems.
- Focus on customer needs and deliver high-quality support.
- Document solutions and incident responses for future reference.
- Analyze system performance and recommend improvements.
- Contribute to post-incident reviews and drive process enhancements.
- Support the integration of new tools and technologies to improve operational efficiency.
Required Qualifications, Capabilities, and Skills
- Formal training or certification in SRE and application support concepts, plus 3+ years of applied experience.
- Demonstrate experience in SRE, DevOps, or application support roles, including knowledge of SLIs, SLOs, incident response, and troubleshooting.
- Utilize monitoring and observability tools such as Grafana, Prometheus, Splunk, and OpenTelemetry.
- Apply hands-on experience with CI/CD pipelines (Jenkins, including global libraries), infrastructure as code (Terraform), version control (Git), containerization (Docker), and orchestration (Kubernetes).
- Work with cloud platforms such as AWS, GCP, or Azure, and automate infrastructure and deployments.
- Participate in on-call rotation and respond to production incidents.
- Break down complex issues, document solutions, and communicate effectively with team members and customers.
- Implement automation and monitoring solutions to support operational goals.
- Collaborate with cross-functional teams to resolve incidents and improve reliability.
- Contribute to continuous improvement of support processes and system performance.
Preferred Qualifications, Capabilities, and Skills
- Demonstrate experience in banking, fintech, or regulated environments.
- Participate in resilience engineering activities such as game days or chaos engineering.
- Mentor peers by sharing knowledge and best practices.
- Contribute to the adoption of innovative tools and approaches in support operations.
ABOUT US
hackajob is partnering with JPMorganChase to fill this position. Create a profile to be automatically considered for this role—and others that match your experience.