Senior Specialist, AI Infrastructure Engineer

Pune, MH, India
Platform Engineer, DevOps Engineer, Site Reliability Engineer, Operations Engineer

BNY Mellon

hackajob is partnering with BNY Mellon to fill this position. Create a profile to be automatically considered for this role—and others that match your experience.

 

At BNY, our culture allows us to run our company better and enables employees’ growth and success. As a leading global financial services company at the heart of the global financial system, we influence nearly 20% of the world’s investible assets. Every day, our teams harness cutting-edge AI and breakthrough technologies to collaborate with clients, driving transformative solutions that redefine industries and uplift communities worldwide.

Recognized as a top destination for innovators and champions of inclusion, BNY is where bold ideas meet advanced technology and exceptional talent. Together, we power the future of finance – and this is what #LifeAtBNY is all about. Join us and be part of something extraordinary.

We’re seeking a future team member for the role of Senior Specialist, AI Infrastructure Engineer to join our Engineering team. This role is located in Pune, MH, India.

In this role, you’ll make an impact in the following ways: 

  • Execute routine platform operations including access requests, environment checks, job monitoring, and platform health verification across AI infrastructure environments.
  • Perform Linux system administration tasks such as service checks, log analysis, process inspection, configuration updates, and environment hygiene.
  • Support containerized workloads running on Kubernetes and Docker, including validating pod and service health, retrieving logs, and assisting with first-line troubleshooting.
  • Triage operational incidents using ticketing systems and follow runbooks to resolve issues or escalate appropriately to senior engineers.
  • Assist with troubleshooting networking issues affecting application connectivity, including problems related to TCP/IP, DNS, routing, and firewall configurations.
  • Support enterprise AI/ML platforms used by data science teams, collaborating with platform engineers and analysts to maintain reliable environments.
  • Maintain operational documentation including runbooks, incident records, troubleshooting notes, and recurring issue tracking.
  • Participate in scheduled maintenance windows, environment updates, and controlled production changes following disciplined operational processes.
  • Assist during incidents by collecting diagnostics, validating symptoms, coordinating handoffs between teams, and clearly communicating status updates.
  • Work in environments that include enterprise compute infrastructure and NVIDIA GPU-enabled platforms, with awareness of GPU workloads and scheduling concepts.

To be successful in this role, we’re seeking the following: 

  • Bachelor’s degree in Computer Science, Information Systems, Engineering, or a related technical field, or equivalent practical experience.
  • 5+ years of hands-on experience supporting production infrastructure or platform environments in roles such as operations, infrastructure support, junior SRE, or platform support.
  • Solid Linux fundamentals with experience using command-line tools, troubleshooting services, analyzing logs, and understanding system behavior in production environments.
  • Understanding of networking fundamentals including TCP/IP, DNS, routing, firewall concepts, and service connectivity troubleshooting.
  • Foundational knowledge of Kubernetes and containerized workloads, including pods, namespaces, services, and log retrieval.
  • Practical experience using Python and/or JavaScript for operational scripting, log parsing, lightweight tooling, or automation tasks.
  • Experience working in security-conscious or regulated environments where uptime, auditability, and disciplined operational execution are critical.
  • Exposure to Docker and Kubernetes in real production or lab environments.
  • Familiarity with automation or configuration management tools such as Ansible or Puppet.
  • Awareness of GPU-enabled compute environments and AI platform concepts.
  • Experience collaborating with data scientists or analytics teams supporting shared AI platforms.
  • Exposure to operational support for distributed systems or containerized application platforms.
  • On-Prem Enterprise server-based compute environments supporting production workloads
  • Exposure to NVIDIA/AMD TPU/GPU-based systems
  • Awareness of GPU-enabled workloads and scheduling concepts
  • Linux-based production environments – RHEL/Ubuntu
  • Service management, process inspection, configuration updates, and log analysis
  • Understanding of TCP/IP, DNS, routing, and firewall fundamentals
  • Ability to troubleshoot connectivity issues affecting distributed services
  • Docker containerization concepts
  • Advanced Kubernetes knowledge (pods, namespaces, services, health checks)
  • Operational tooling for incident triage, runbooks, and documentation – ServiceNow, JIRA
  • AI/ML platforms supported by the environment
  • Exposure to infrastructure automation tools such as Ansible or Puppet


At BNY, our culture speaks for itself. Check out the latest BNY news at:

BNY Newsroom

BNY LinkedIn 

Here are a few of our recent awards:

  • America’s Most Innovative Companies, Fortune, 2025
  • World’s Most Admired Companies, Fortune, 2025
  • Most Just Companies, Just Capital and CNBC, 2025


Our Benefits and Rewards:

BNY offers highly competitive compensation, benefits, and wellbeing programs rooted in a strong culture of excellence and our pay-for-performance philosophy. We provide access to flexible global resources and tools for your life’s journey. Focus on your health, foster your personal resilience, and reach your financial goals as a valued member of our team. We also offer generous paid leave, including paid volunteer time, to support you and your family through moments that matter.

BNY is an Equal Employment Opportunity/Affirmative Action Employer - Underrepresented racial and ethnic groups/Females/Individuals with Disabilities/Protected Veterans.
