Senior Lead SRE - Databricks

Hyderabad, Telangana, India

Operations Engineer, Site Reliability Engineer, Platform Engineer, DevOps Engineer, Cloud Engineer

Actively hiring

JPMorganChase

hackajob is partnering with JPMorganChase to fill this position. Create a profile to be automatically considered for this role—and others that match your experience.

Elevate your engineering prowess to unprecedented levels by joining a team of exceptionally gifted professionals and position yourself among the top echelon in site reliability.

As a Principal Site Reliability Engineer at JPMorganChase within the AI/ML & Data platform team, you will work with stakeholders to define non-functional requirements (NFRs) and availability targets for the services in your application and product lines. You will ensure that those NFRs are accounted for in your products’ design and test phases, that your service level indicators effectively measure customer experience, and that service level objectives are defined with stakeholders and implemented in production.

Job responsibilities

Demonstrate expertise in application development and support across multiple technologies, such as Databricks, Snowflake, AWS, and Kubernetes.
Coordinate incident management coverage to ensure effective resolution of application issues.
Collaborate with cross-functional teams to perform root cause analysis and implement production changes.
Mentor and guide team members to foster innovation and strategic change.
Develop and support AI/ML solutions for troubleshooting and incident resolution.

Required qualifications, capabilities, and skills

Formal training or certification in SRE concepts and 5+ years of applied experience.
Proficiency in site reliability culture and principles, and familiarity with implementing site reliability within an application or platform.
Proficiency in running production incident calls and managing incident resolution.
Experience in observability, including white-box and black-box monitoring, service level objective alerting, and telemetry collection, using tools such as Grafana, Dynatrace, Prometheus, Datadog, and Splunk.
Strong understanding of SLIs, SLOs, SLAs, and error budgets.
Proficiency in Python or PySpark for AI/ML modeling.
Ability to reduce toil by building new tools to automate repeated tasks.
Hands-on experience in system design, resiliency, testing, operational stability, and disaster recovery.
Understanding of network topologies, load balancing, and content delivery networks.
Awareness of risk controls and compliance with departmental and company-wide standards.
Ability to work collaboratively in teams and build meaningful relationships to achieve common goals.

Preferred qualifications, capabilities, and skills

Experience in an SRE or production support role with AWS Cloud, Databricks, Snowflake, or similar technologies.
AWS and Databricks certifications.
