ML Engineer, Network Intelligence

Remote
Up to $180,000/year
Data Scientist, MLOps Engineer, Machine Learning Engineer

Colt Technology Services

hackajob is partnering with Colt Technology Services to fill this position. Create a profile to be automatically considered for this role—and others that match your experience.

 

Colt provides network, voice and data centre services to thousands of businesses around the world, allowing them to focus on delivering their business goals instead of the underlying infrastructure.

Summary

Job title: ML Engineer, Network Intelligence

Reports to: VP of AI Technology and Architecture

Job location: Must be able to work onsite at the Boulder, CO office 2-3 days a week

About Colt and the AI Practice

Colt Technology Services is a global digital infrastructure company, operating one of the world's largest fiber networks across Europe and Asia. We are building a Boulder-based AI team as the engineering center of our AI Practice. The team is responsible for how AI is adopted, governed, and scaled across Colt.

The AI Practice operates across several parallel workstreams, including use case delivery, AI WAN, and private AI. The Boulder team is where we build, test, and validate before we scale.

AI WAN is Colt's flagship AI product - a software-defined network built to carry AI traffic reliably and at scale, as model inference, agent-to-agent communication, and enterprise AI workloads drive a new wave of network demand. On top of that, we use AI itself to optimize how the network operates: predicting congestion, automating configuration, and ultimately enabling closed-loop autonomous network management. This is not a generic enterprise AI role. The problems we work on are grounded in real network infrastructure, and the Boulder AI Hub is where the intelligence layer gets built.

Why we need this role

As ML Engineer, Network Intelligence, you own the data and modeling layer that turns Colt’s network telemetry into production ML systems. Your primary work is cleaning and structuring real-world network data, building classical ML models for anomaly detection, predictive maintenance, and traffic forecasting, and getting those models into production.

You will work directly with network telemetry data in GCP BigQuery, build the data pipeline and feature engineering layer that makes ML possible on network data, and develop classical ML models for anomaly detection, predictive maintenance, traffic forecasting, and capacity planning. Longer term, you will be central to the AI WAN closed-loop architecture, defining what network state the model consumes and what control plane actions it is safe to initiate.

You do not need to be a deep learning researcher or a network engineer. You need enough network domain knowledge to make sense of the data, and strong ML fundamentals to build models that are trustworthy in a production environment. MLOps tooling experience is a plus but not a day-one requirement. You will have the opportunity to grow into ownership of the model lifecycle layer as the team matures.

Network AI is one of the highest-value applications of ML in enterprise technology, and one of the least staffed with people who understand both sides. Most ML engineers don’t understand networks. Most network engineers don’t know ML. If you sit at that intersection, this role gives you direct access to one of the world’s largest fiber networks, a small team where your work has outsized impact, and a product (AI WAN) that is Colt’s most significant long-term revenue opportunity.

What you will do

Classical ML for Network Operations

  • Build and operationalize ML models for anomaly detection on network time series data
  • Own root cause analysis (RCA) model development, identifying contributing factors and failure chains in network events, in addition to detecting that something is wrong
  • Develop predictive maintenance models to forecast hardware failures and network degradation before customer impact
  • Build traffic forecasting and capacity planning models to support proactive network management
  • Design model evaluation frameworks appropriate for network operations - precision/recall tradeoffs, false positive costs, operational trust-building
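
To make the evaluation point concrete, here is a minimal sketch of choosing an alert threshold by operational cost rather than raw accuracy. It is an illustration only, not Colt's method: the function name `pick_threshold`, the cost weights, and the sample scores are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ThresholdResult:
    threshold: float
    precision: float
    recall: float
    cost: float


def pick_threshold(scores, labels, fp_cost=1.0, fn_cost=10.0):
    """Return the alert threshold with the lowest total cost.

    scores: anomaly scores from any model (higher means more anomalous).
    labels: 1 for a confirmed incident, 0 for normal behavior.
    fp_cost / fn_cost: hypothetical costs of a false alarm and a missed incident.
    """
    best = None
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        cost = fp * fp_cost + fn * fn_cost
        # Keep the first threshold that reaches the lowest cost seen so far.
        if best is None or cost < best.cost:
            best = ThresholdResult(t, precision, recall, cost)
    return best


# Made-up scores and labels, purely for illustration.
scores = [0.1, 0.2, 0.35, 0.4, 0.8, 0.9]
labels = [0, 0, 1, 0, 1, 1]
result = pick_threshold(scores, labels)
```

With missed incidents weighted ten times heavier than false alarms, the sketch settles on a low threshold and accepts one false alarm. Flipping the weights (for example `fp_cost=20, fn_cost=1`) pushes the threshold up, which is the kind of tradeoff the operations team would need to agree on.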

Network Data & Feature Engineering

  • Assess, clean, and structure network telemetry data in GCP BigQuery - the foundational step before any ML is possible
  • Build data pipelines that transform raw network telemetry into ML-ready features
  • Work with Colt's NaaS and network operations teams to understand data semantics, quality gaps, and labeling challenges
  • Define the data access and enrichment roadmap for network AI use cases
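
As a rough illustration of what "ML-ready features" can mean for telemetry, the sketch below turns a raw metric series into trailing-window features. The function name `rolling_features`, the window size, and the sample values are assumptions for the example; a real pipeline would more likely be expressed in BigQuery SQL over the actual telemetry schema.

```python
from collections import deque
from statistics import mean, pstdev


def rolling_features(samples, window=3):
    """Build trailing-window features for each sample.

    Only values that arrived before the current sample are used, so the
    features do not leak future information into training data.
    """
    history = deque(maxlen=window)
    rows = []
    for value in samples:
        if len(history) == window:
            mu = mean(history)
            sigma = pstdev(history)
            # A z-score of 0.0 is used when the window is perfectly flat.
            z = (value - mu) / sigma if sigma > 0 else 0.0
            rows.append({"value": value, "trailing_mean": mu, "z_score": z})
        history.append(value)
    return rows


# Hypothetical link-utilization readings with one spike at the end.
rows = rolling_features([10, 12, 14, 40], window=3)
```

The first `window` samples produce no row because there is not yet enough history, which is one of the small data-quality decisions this kind of work involves.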

MLOps & Model Lifecycle

  • Own the full lifecycle of network ML models: experiment tracking, model versioning, retraining pipelines, and production drift monitoring
  • Define retraining triggers and model health thresholds appropriate for network operations, where a degraded model can have real service impact
  • Partner with the AI Platform Engineer, who owns the underlying infrastructure; you own the ML layer on top. The boundary is model serving (yours) versus Kubernetes, GPU infrastructure, and everything below (theirs)
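
One common way to express a retraining trigger is a drift score compared against a threshold. The sketch below uses the Population Stability Index (PSI) as an example. The choice of PSI, the 0.2 threshold (a widely used rule of thumb), and the function names are assumptions for illustration, not a description of Colt's tooling.

```python
import math


def psi(reference, current, bins=10, eps=1e-6):
    """Population Stability Index between a reference and a current sample.

    Bin edges come from the reference data; current values outside that
    range are counted in the first or last bin.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if reference is constant

    def shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # eps keeps empty bins from causing log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    ref, cur = shares(reference), shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


def needs_retraining(reference, current, threshold=0.2):
    """Flag the model for retraining when feature drift exceeds the threshold."""
    return psi(reference, current) > threshold


# Synthetic data: a uniform reference and a copy shifted upward by 0.5.
reference = [i / 100 for i in range(100)]
shifted = [v + 0.5 for v in reference]
```

In practice the threshold would be set per feature and per use case, since a degraded model on a critical path warrants a more sensitive trigger than one feeding a dashboard.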

AI WAN Closed-Loop Architecture

  • Work with Cisco and Colt's NaaS team to understand what network state data are available and what control plane APIs exist for programmatic network actions
  • Define the closed-loop architecture: what inputs feed the model, what decisions it can make autonomously, what requires human confirmation
  • Build the initial recommendation layer (human-in-the-loop) before progressing to autonomous closed-loop actions
  • Design guardrails, rollback mechanisms, and confidence thresholds appropriate for production network control
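
The progression from recommendation to autonomous action can be pictured as a routing decision on each model output. The sketch below is a simplified illustration under stated assumptions: the action names, the allowlist, and both thresholds are hypothetical and do not reflect any real Colt or Cisco API.

```python
from enum import Enum


class Route(Enum):
    AUTO_APPLY = "auto_apply"
    HUMAN_REVIEW = "human_review"
    LOG_ONLY = "log_only"


# Hypothetical allowlist of actions considered safe to apply without review.
SAFE_ACTIONS = {"reroute_traffic"}


def route(action, confidence, auto_threshold=0.95, review_threshold=0.6):
    """Decide how a model recommendation is handled.

    Low-confidence outputs are only logged. Everything else goes to a human
    unless the action is allowlisted and the confidence is very high.
    """
    if confidence < review_threshold:
        return Route.LOG_ONLY
    if action in SAFE_ACTIONS and confidence >= auto_threshold:
        return Route.AUTO_APPLY
    return Route.HUMAN_REVIEW
```

Starting with an empty allowlist gives a pure human-in-the-loop system; actions can then be added one at a time as operational trust builds.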

Cross-Team Collaboration

  • Partner with the Staff AI Engineer to connect ML model outputs to agent orchestration and recommendation systems
  • Work with the AI Platform Engineer on the handoff boundary: they own Kubernetes, GPU infrastructure, and model serving setup; you own what runs on top, including experiment tracking, retraining pipelines, and production model health
  • Engage directly with NaaS team and network operations stakeholders to ground use cases in real operational problems

What we're looking for

Classical ML

  • Time series analysis, anomaly detection, supervised/unsupervised learning; scikit-learn, XGBoost, PyTorch or equivalent; model evaluation and production deployment experience

Data Engineering

  • SQL and BigQuery; data pipeline construction; feature engineering from raw telemetry; experience with real-world network data

Cloud & Tooling

  • GCP (BigQuery, Vertex AI, Cloud Storage); Python; MLOps lifecycle tooling (MLflow, Weights & Biases, Vertex AI Pipelines or equivalent) is a growth expectation: prior experience is a plus, and ownership is where you are headed within six months

Mindset

  • Comfortable working with ambiguous, incomplete data; understands that network operations require high trust thresholds before autonomous action; can translate between network engineering and ML concepts

Nice to Have

  • Experience with Cisco platforms, NSO, Itential, or similar network orchestration tools; streaming telemetry (Kafka, Pub/Sub); OpenTelemetry
  • Familiarity with network operations or network telemetry data is a plus; SDN experience is a significant advantage for the AI WAN closed-loop work but is not required for Phase 1 delivery

What we offer you:

Looking to make a mark?

At Colt, you’ll make a difference. Because around here, we empower people. We don’t tell you what to do.

Instead, we employ people we trust, who come together across the globe to create intelligent solutions.

Our global teams are full of ambitious, driven people, all working together towards one shared purpose: to put the power of the digital universe in the hands of our customers wherever, whenever and however they want.

We give our people the opportunity to inspire and lead teams, and work on projects that connect people, cities, businesses, and ideas. We want you to help us change the world, for the better.

Diversity and inclusion

  • Inclusion and valuing diversity of thought and experience are at the heart of our culture here at Colt. From day one, you’ll be encouraged to be yourself because we believe that’s what helps our people to thrive. We welcome people with diverse backgrounds and experiences, regardless of their gender identity or expression, sexual orientation, race, religion, disability, neurodiversity, age, marital status, pregnancy status, or place of birth.

Most recently we have:

  • Signed the UN Women's Empowerment Principles, which guide our Gender Action Plan
  • Trained 60 (and growing) Colties to be Mental Health First Aiders

Please speak with a member of our recruitment team if you require adjustments to our recruitment process to support you. For more information about our Inclusion and Diversity agenda, visit our DEI pages.

Benefits

Our benefits support you through all parts of life, for both physical and mental health.

  • Flexible working hours and the option to work from home.
  • Extensive induction program with experienced mentors and buddies.
  • Opportunities for further development and education.
  • Global Family Leave Policy.
  • Employee Assistance Program.
  • Internal inclusion & diversity employee networks.

A global network

  • When you join Colt you become part of our global network. We are proud of our colleagues and the stories and experience they bring. Take a look at our ‘Our People’ site, including our Empowered Women in Tech.
