Machine Learning Platform Engineer

Remote

hackajob on-demand

Actively hiring

hackajob is partnering with hackajob on-demand to fill this position. Create a profile to be automatically considered for this role and others that match your experience.

hackajob on-demand is currently partnering with an AI startup to help it hire the best talent. At on-demand, we match and speak with exceptional talent like you, and we provide insights into the problem the company is looking to solve and into its interview process.

Role: Machine Learning Platform Engineer

Opportunity: Permanent or contract

Based: London or New York (remote possible but ideally onsite in either city)

About us

We are a stealth-mode startup developing cutting-edge AI and machine learning tools for the financial sector. Our mission is to revolutionize how hedge funds leverage advanced technologies for data analysis and decision-making. We're building a diverse team of experts from various fields to create innovative solutions that push the boundaries of what's possible in financial technology.

The role

We're seeking an ML Platform Engineer to join our founding team. You'll work directly with our AI Research team to build and optimize our on-premises ML infrastructure. This is a unique opportunity to shape the foundation of our ML platform from the ground up, with a focus on high-performance, secure computing environments.

What you’ll do:

Design and implement scalable, on-premises infrastructure for training and deploying ML models across GPU clusters
Build and maintain high-performance computing environments optimized for ML workloads
Develop secure, robust data pipelines that can handle high-throughput, real-time processing requirements
Create comprehensive monitoring and observability solutions for our distributed ML systems
Implement testing frameworks and development workflows that accelerate our research team's productivity
Collaborate closely with research scientists to translate innovative ideas into production-ready systems
Make critical architectural decisions that will shape our technical infrastructure
Design and implement security measures to protect proprietary systems and data

Requirements

5+ years of software engineering experience, with 3+ years focused on ML infrastructure
Strong programming skills in Python and experience with ML frameworks (PyTorch, TensorFlow)
Experience building and maintaining on-premises ML infrastructure and GPU clusters
Proven track record of optimizing distributed computing systems
Deep understanding of MLOps, including experiment tracking, model versioning, and deployment
Expertise in designing and implementing monitoring and observability solutions
Strong background in software engineering best practices, including testing and CI/CD

Preferred Qualifications

Experience with high-performance computing infrastructure and GPU optimization
Knowledge of Linux system administration and networking
Background in security best practices for ML systems and data protection
Experience with containerization and orchestration (Docker, Kubernetes)
Track record of building developer tools and improving engineering productivity
Experience collaborating with research scientists and PhD-level practitioners
Familiarity with low-latency systems design
