Manager, Site Reliability Engineering

Remote
Site Reliability Engineer, DevOps Engineer, Database Administrator, Engineering Manager, DevOps Leader
NMI
Actively hiring

NMI is seeking a Manager to lead a distributed team of highly skilled DevOps Engineers and Database Administrators within our Site Reliability Engineering (SRE) organization. The team is spread across various locations in the United States. The ideal candidate will have a strong technical background, exceptional leadership skills, and experience managing remote teams in a fast-paced, high-availability environment.

The SRE team is responsible for the operation of all hardware and software within the production and SDLC environments. This consists of a global network connecting numerous colocation sites, which must be highly available 24x7 with a minimum target of 99.99% availability. A successful leader in this position must be an avid player-coach, with both the deep technical knowledge to vet changes and direct their team and the ability to grow, mentor, and amplify the skilled staff on that team.
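
To put the 99.99% availability target in concrete terms, the short sketch below converts an availability fraction into an allowed downtime budget. It is an illustration only, not part of NMI's tooling: the function name `downtime_budget_minutes` and the choice of a 365-day year and a 30-day month are assumptions made for the example.

```python
# Illustrative sketch: the helper name and the period lengths are assumptions.

MINUTES_PER_DAY = 24 * 60


def downtime_budget_minutes(availability: float, period_minutes: float) -> float:
    """Return the minutes of downtime allowed in a period at a given availability."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be a fraction between 0 and 1")
    return (1.0 - availability) * period_minutes


target = 0.9999  # the 99.99% target stated in the posting

yearly = downtime_budget_minutes(target, 365 * MINUTES_PER_DAY)
monthly = downtime_budget_minutes(target, 30 * MINUTES_PER_DAY)

print(f"Yearly budget:  {yearly:.2f} minutes")   # about 52.56 minutes
print(f"Monthly budget: {monthly:.2f} minutes")  # about 4.32 minutes
```

At this target, the whole year's allowance is under an hour of downtime, which is why the role emphasizes automation, observability, and fast incident response.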

The Ideal Candidate

  • Is a strategic thinker with a strong technical background and proven leadership experience.
  • Is passionate about building and leading high-performing, distributed teams.
  • Is committed to operational excellence and continuous improvement in a fast-paced environment.
  • Is cost- and risk-conscious in all things, but able to make strong, rapid decisions when required.

Key Duties

Leadership and Team Management:

  • Manage and mentor a distributed team of DevOps Engineers and Database Administrators.
  • Foster a collaborative and high-performing team culture, ensuring team members are motivated, engaged, and working effectively towards common goals.
  • Conduct regular performance reviews, provide feedback, and support the professional development of team members.

Strategic Planning and Execution:

  • Develop and implement strategic plans to enhance the reliability, performance, and scalability of our infrastructure.
  • Coordinate with cross-functional teams to align engineering efforts with business objectives and customer needs.
  • Drive continuous improvement initiatives and promote best practices in DevOps and database management.

Project Management:

  • Oversee the delivery of complex projects involving hardware, virtualization, observability, and cloud technologies.
  • Help team members prioritize competing tasks so that high-ROI and strategically important work comes first.
  • Ensure timely and efficient execution of projects, balancing priorities and managing resources effectively.
  • Collaborate with product, software, and security teams to meet project requirements and deadlines.

Operational Excellence:

  • Ensure the reliability and availability of services, maintaining a minimum target of 99.99% uptime.
  • Implement and oversee automation strategies to reduce operational toil and increase efficiency.
  • Develop and maintain observability tools to monitor system health, performance, and deployment.

Incident Management:

  • Lead incident response efforts, ensuring timely resolution and minimal customer impact.
  • Conduct blameless post-mortems and document lessons learned to prevent recurrence of issues.
  • Maintain a rotating on-call schedule to ensure 24x7 availability of services.

This is a fully remote role (work anywhere in the US); however, if you live within a reasonable commuting distance, we’d love to see you in the office from time to time! Periodic travel to company colocation facilities (typically 1-4 times a year) will be required, at company expense.

Requirements:

  • 7+ years of experience in Site Reliability Engineering, DevOps, System Administration, or similar roles.
  • Strong background in hardware and data center operations, including server and storage installation, troubleshooting, and decommissioning.
  • Proven track record of managing high-availability production environments with stringent uptime requirements.
  • Excellent problem-solving abilities and a proactive approach to incident management and infrastructure improvement.
  • Strong communication skills, with the ability to engage both technically and strategically with team members and stakeholders.

Preferred Qualifications/Experience

  • Previous experience with agile methodologies such as Kanban or Scrum is a plus.
  • Experience reporting team velocity, capacity, and throughput using Jira as the source of record will be extremely valuable.

We Offer:

  • A remote first culture!
  • Competitive compensation and benefits
  • Personal growth and advancement opportunities
  • Flex PTO & dedicated sick time
  • 13 Paid Holidays
  • Gym membership discount
  • Company volunteer days

Do you feel like you have a slightly out of the ordinary career path or history? We are open to all walks of life and very willing to hear your story. Please don’t feel like this should be a barrier to securing a great career at NMI! We appreciate success can come in all shapes and sizes. Fill in the ‘Additional Info’ box on our application to tell us more about your path.

What we do!

NMI enables our partners with choice, and challenges the one-size-fits-all approach to payments. You've probably used NMI in the last 24 hours without even realizing it. We’re the platform that powers success for innovative tech created by SMBs, entrepreneurs and fintech startups. We’re creative problem solvers who help visionaries smash through boundaries and think beyond what’s possible so they can think about what’s next. But we’re not just built for the tech savvy. We democratize the latest payments technology so that everyone can realize the benefits of easy payments across the full spectrum of commerce. We’re all about enabling more payments in more ways and more places.

We believe that having a diverse group of employees strengthens both our work and our workplace. We’re focused on making NMI more diverse and welcoming with initiatives like having a dedicated Diversity, Equity & Inclusion action group, diversity goals for hiring, anonymized resume screening, affinity groups such as our Women's network and LGBTQ+ Network, open forums for discussions on diversity and social justice, and measuring inclusion and belonging as part of our regular employee engagement surveys.

Equal Opportunity

NMI is committed to providing equal employment opportunity for all persons regardless of race, color, religion, sex, age, marital status, national origin, sexual orientation or sexual identity, genetic information, citizen status (except those that do not have the legal right to be employed in the United States), disability, military service, service member, veteran status, or any other basis protected by applicable law.

Please be aware that all offers of employment are made subject to receipt of satisfactory background and financial checks.

Please be aware that NMI does not operate a license for the sponsorship of those who are not already eligible to work within the US. Unfortunately, we therefore cannot process applications from individuals unable to provide documentary evidence of their eligibility to commence work in the US.


Salary range, depending on experience:
$120,000 to $155,000 USD
