Site Reliability Champion, Senior Specialist

Dallas, TX, USA
Up to $170,000/year
Site Reliability Engineer, Operations Engineer, DevOps Engineer

Vanguard

hackajob is partnering with Vanguard to fill this position. Create a profile to be automatically considered for this role—and others that match your experience.

 

The Site Reliability Engineer (SRE) for Global Technology Operations (GTO) is a strategic technical leader responsible for ensuring the resiliency and stability of the critical applications our crew and clients rely on every day. This role combines deep hands‑on engineering expertise with enterprise‑level influence. You will help define what resiliency means at Vanguard and partner across teams to design, test, and strengthen some of our most critical systems. In addition, you will automate incident response capabilities and pioneer AI‑enhanced diagnostics and analysis to improve detection, response, and recovery. You will work alongside a collaborative, technically focused team where your innovations in resiliency engineering directly shape Vanguard’s next generation of reliable, client‑centric experiences.

Core Responsibilities:

  • Evaluate applications, platforms, and vendors to assess resiliency, reliability, and operational risk.

  • Design and implement processes that enforce enterprise resiliency and reliability standards.

  • Lead blameless post‑incident reviews for high‑severity incidents or incidents spanning multiple complex product families.

  • Partner with product and platform teams to proactively identify and remediate reliability risks before they impact clients.

  • Develop, communicate, and evangelize new standards, tools, and frameworks across subdivisions, ensuring consistent adoption.

  • Troubleshoot complex production issues and implement durable solutions that prevent recurrence.

  • Participate in a periodic on‑call rotation to support production stability.

  • Evaluate and onboard resiliency and reliability tooling.

  • Actively participate in reliability engineering and resilience communities of practice, contributing to shared learning and enterprise consistency.

  • Contribute to strategic initiatives that advance Vanguard’s operational maturity and resiliency posture.

Qualifications | Technical Skills:

  • Observability Platforms: Experience with modern observability and monitoring tools, such as Splunk, Honeycomb, CloudWatch, Dynatrace, or AppDynamics.

  • Reliability Metrics: Strong understanding of SLIs, SLOs, and SLAs, including dashboarding and reporting practices.

  • Monitoring & Alerting: Experience with alert design, anomaly detection, predictive alerting, and synthetic monitoring using structured methodologies.

  • Automation & Resilience Engineering: Experience with automation and resilience practices such as Python-based automation, RPA platforms (e.g., Blue Prism, UiPath), chaos engineering, and failure analysis techniques (e.g., FMEA).

Special Factors

Sponsorship

Vanguard is not offering visa sponsorship for this position.

About Vanguard

At Vanguard, we don't just have a mission—we're on a mission.

To work for the long-term financial wellbeing of our clients. To lead through products and services that transform our clients' lives. To learn and develop our skills as individuals and as a team. From Malvern to Melbourne, our mission drives us forward and inspires us to be our best.

How We Work

Vanguard has implemented a hybrid working model for the majority of our crew members, designed to capture the benefits of enhanced flexibility while enabling in-person learning, collaboration, and connection. We believe our mission-driven and highly collaborative culture is a critical enabler to support long-term client outcomes and enrich the employee experience.
