Machine Learning Evaluator

Remote

Up to $400,000/year

Artificial Intelligence Engineer, Python Developer, Machine Learning Engineer, Full Stack Python Developer

Actively hiring

Moody's Corporation

hackajob is partnering with Moody's Corporation to fill this position. Create a profile to be automatically considered for this role—and others that match your experience.

At Moody's, we unite the brightest minds to turn today’s risks into tomorrow’s opportunities. We do this by striving to create an inclusive environment where everyone feels welcome to be who they are—with the freedom to exchange ideas, think innovatively, and listen to each other and customers in meaningful ways. Moody’s is transforming how the world sees risk. As a global leader in ratings and integrated risk assessment, we’re advancing AI to move from insight to action—enabling intelligence that not only understands complexity but responds to it. We decode risk to unlock opportunity, helping our clients navigate uncertainty with clarity, speed, and confidence.

If you are excited about this opportunity but do not meet every single requirement, please apply! You still may be a great fit for this role or other open roles. We are seeking candidates who model our values: invest in every relationship, lead with curiosity, champion diverse perspectives, turn inputs into actions, and uphold trust through integrity.

Skills and Competencies

Ph.D. in Computer Science, Machine Learning, Natural Language Processing, Statistics, or a related quantitative field; or Master’s degree with 2-3 years of experience in machine learning evaluation or a related area
Strong foundations in statistical methods, experimental design, and hypothesis testing
Experience evaluating machine learning or NLP models, including designing experiments and interpreting results
Familiarity with LLM evaluation benchmarks and methodologies
Strong programming skills in Python or R
Excellent communication skills in English (both written and verbal)

Preferred:

Experience evaluating LLMs or generative AI systems
Experience with production machine learning systems
Exposure to cloud platforms such as AWS, GCP, or Azure
Publications or demonstrated work in model evaluation, benchmarking, or related areas

Responsibilities

Evaluate and validate large language models for production-grade analytical and decision-support systems
Design and implement evaluation frameworks for assessing LLM performance in credit analytics and decision-support contexts
Develop metrics and benchmarks to measure model robustness, reliability, consistency, and output quality
Analyze model behavior across diverse inputs, identifying failure modes, edge cases, and areas for improvement
Collaborate with model development and deployment teams to integrate validation processes into the model lifecycle
Conduct systematic assessments of model stability over time and across updates
Evaluate model outputs for bias, fairness, and economic relevance to credit risk applications
Develop and maintain documentation for evaluation methodologies, findings, and recommendations
Contribute to the advancement of best practices for LLM evaluation within the Credit Center of Excellence (COE)

About the Team

Our Credit Center of Excellence (COE) team is responsible for maintaining and enhancing our industry-leading credit analytics and predictive modelling capabilities. We work closely with various teams including product management, commercial strategy, and go-to-market leaders to ensure the delivery of high-quality credit risk assessments and solutions. By joining our team, you will be part of exciting work in credit analytics with a global team spread across all US time zones, GMT, and GMT+1.
