Company Description
We're ASOS, the online retailer for fashion lovers all around the world.
We exist to give our customers the confidence to be whoever they want to be, and that goes for our people too. At ASOS, you're free to be your true self without judgement, and channel your creativity into a platform used by millions.
Everyone needs some help showing up as their best self. We're Disability Confident Committed - let our Talent team know if you need any reasonable adjustments throughout the recruitment process.
Job Description
As a Senior AI Engineer, you will be part of the AI Platform team, helping to build and scale the shared foundations that enable AI capabilities across ASOS. The primary focus of this role will be contributing to the Agentic AI Platform initiative, alongside other core AI platform capabilities as the platform evolves.
This role is focused on the platform layer, rather than individual business use cases. You will design and implement shared standards, templates and reference implementations for agentic AI on Azure, enabling application teams to safely design, deploy and operate AI agents at enterprise scale. Working closely with Product teams, Cloud Infrastructure, Security and partners, you will help ensure AI capabilities are secure, observable, reusable and governed by default. You will also contribute to the production foundations needed to operate AI capabilities reliably, including LLMOps, model access patterns, prompt and agent lifecycle practices, observability and secure enterprise integration.
What you’ll be doing
- Designing and building AI platform capabilities on Azure, with a strong focus on agentic AI patterns such as agent runtimes, orchestration and tool integration
- Contributing to the Agentic AI Platform initiative, helping define how agents are built, integrated and operated across the organisation
- Designing and maintaining standardised templates and reference implementations for LLM and Generative AI workflows, enabling teams to adopt consistent patterns for prompt design, tool calling, multi‑step agent flows, retries and failure handling
- Implementing secure, governed access patterns for LLMs and enterprise tools using APIM, platform gateways, Entra ID, RBAC and managed identities
- Contributing to LLMOps and model runtime patterns, including standard approaches for model access, routing, caching, token optimisation and cost‑aware usage controls
- Supporting lifecycle and evaluation practices for agent configurations, prompts and AI workflows, including testing, controlled change and release readiness
- Designing secure tool‑access patterns for agents, including MCP/tool abstraction, credential management and enterprise API integration
- Contributing to AgentOps and GenAIOps capabilities, including telemetry, run history, task outcomes, error analysis and feedback loops
- Contributing to reliability patterns for production AI systems, including latency monitoring, alerting, scaling considerations and operational readiness
- Applying CI/CD and software engineering best practices to AI platform and agentic components
- Embedding observability by default, ensuring AI systems are measurable, debuggable and auditable through logs, metrics and traces
- Partnering with Cloud Infrastructure and Security teams to design secure, scalable and cost‑effective Azure environments
Qualifications
- Significant experience as an AI Engineer, AI Platform Engineer or similar, delivering production‑grade AI systems
- Hands‑on experience with LLMs, Generative AI and agent‑based systems in real‑world environments
- Strong understanding of the end‑to‑end AI lifecycle, from experimentation through deployment and operation
- Practical understanding of production LLM or GenAI runtime concerns, such as model access, routing, caching, token usage, cost optimisation and reliability
- High proficiency in Python, with experience building APIs and service‑oriented systems
- Experience working with CI/CD pipelines, automated testing and versioned deployments for AI or platform components
- Practical experience with observability tooling (logging, metrics, tracing and alerting) and using telemetry to improve reliability and performance
- Comfortable working in cloud environments, preferably Azure
- Experience with Azure AI Foundry is preferred, but we are equally open to candidates with hands‑on experience using comparable GenAI or agent platforms, and a strong understanding of how to apply those patterns within Azure
- Experience with Azure API Management (APIM) is preferred, especially as a governance or integration boundary
- Familiarity with AgentOps, MLOps or GenAIOps concepts, including monitoring, evaluation and feedback loops
- Strong collaboration skills, with the ability to influence platform standards and enable other engineering teams
- A pragmatic, engineering‑led approach to responsible and ethical AI, with a focus on safety, reliability and trust
Additional Information
BeneFITS
- Employee discount (hello ASOS discount!)
- Employee sample sales
- 25 days paid annual leave + an extra celebration day for a special moment
- Discretionary bonus scheme
- Private medical care scheme
- Flexible benefits allowance - which you can choose to take as extra cash, or use towards other benefits
- Opportunity for personalised learning and in-the-moment experiences that enable you to thrive and excel in your role
hackajob is partnering with ASOS.com to fill this position. Create a profile to be automatically considered for this role, and for others that match your experience.