hackajob is partnering with Mercor to fill this position. Create a profile to be automatically considered for this role—and others that match your experience.
# About the Opportunity

A leading AI research organization is seeking advanced LLM power users to evaluate how well AI systems handle personalized, real-world life tasks. This role is for people who use AI tools heavily in their personal lives and can clearly judge whether an AI response is useful, personalized, realistic, and successful.

# What You’ll Do

- Create written responses and explanations for complex personal tasks
- Judge whether outputs are practical, well-reasoned, and appropriately personalized
- Identify where the AI succeeds, fails, overreaches, or misses important context
- Apply structured rubrics and quality criteria to assess model performance
- Use your own LLM experience to evaluate real-world usefulness

# Who We’re Looking For

Strong candidates have:

- Heavy personal usage of LLM products
- Experience using AI for multi-step tasks, planning, research, decision-making, or personal workflows
- Familiarity with tools such as ChatGPT, Claude, Gemini, Perplexity, Cursor, Windsurf, Codex, or other AI agents
- Ability to explain what makes an AI output good, bad, incomplete, unsafe, or unrealistic
- Strong written judgment and attention to detail
- Experience writing and evaluating against rubrics is extremely critical
- Extensive rubric experience, including 100+ hours on prior rubric projects involving rubric design, evaluation, and quality assessment

# Expertise Domains

We are especially interested in people with experience using AI or personal judgment in one or more of the following domains:

## Personalized Food Recommendations

Finding restaurants, booking tables, choosing delivery options, comparing menus, and accounting for dietary preferences, budget, location, occasion, timing, and group needs.

## Personalized Health

Reasoning through personal health questions, reviewing Apple Health or wearable data, identifying trends in sleep, activity, heart rate, or blood pressure, and knowing when medical guidance is needed.
## Personal Productivity

Managing life admin, personal projects, errands, reminders, household tasks, calendars, follow-ups, and personal CRM systems.

## Personalized Career Advice

Finding and applying to jobs, editing resumes or LinkedIn profiles, preparing outreach, networking, using LinkedIn for advancement, and planning career moves.

## Personalized Learning and Development

Creating study plans, learning roadmaps, accountability systems, skill-building plans, and personalized development goals.

# Why This Work Matters

LLMs are quickly becoming personal assistants for everyday decisions, but truly useful AI needs to do more than produce generic advice. It needs to understand context, preferences, constraints, tradeoffs, and what success looks like in real life.

Your evaluations will help improve how AI systems support people with practical, high-context tasks across food, health, productivity, careers, and learning. This work directly contributes to making AI assistants more personalized, trustworthy, and useful for real-world personal workflows.

# Engagement Details

- Expected commitment: 15–40 hours/week
- Work Trial: Participants will be asked to complete a paid work trial as part of the onboarding process for this project. The work trial pays **$30 after completion** and is designed to help assess each participant’s fit for the role based on LLM experience and domain expertise. We recommend putting thoughtful effort into the assessment, as it will play a significant role in the selection process.
- Please note that the project is still in its early stages, so expect an initial delay before onboarding and tasking begin.
- Participants will write clear, realistic prompts for complex personal-life tasks, execute those tasks while recording their screens, and create detailed evaluation rubrics to assess how well AI systems perform. This work requires both strong ChatGPT power-user experience and firsthand expertise completing challenging, research-intensive personal workflows end-to-end.
- You should be able to turn around tasks within 24 hours
- A **desktop or laptop computer** is required (Chromebooks are not supported)