hackajob is partnering with Mercor to fill this position. Create a profile to be automatically considered for this role—and others that match your experience.
We are building a benchmark dataset to evaluate AI models on professional document understanding and instruction following within the Technology domain. Tasks consist of complex, multi-step requests grounded in real-world workspace files (technical specs, architecture docs, API references, codebases), web search, and code execution, each paired with a clearly defined ground truth output and an objective evaluation rubric.

You will be responsible for authoring tasks that test an AI's ability to reason over technical documentation, follow precise instructions, and produce accurate, well-structured outputs. We expect a minimum commitment of 15–20 hours per week.

Ideal candidates have 3+ years of hands-on experience in one or more of the following sub-domains:
- Software engineering
- Data science & analytics