You work with models and data already. AI companies need people who can review how a system reasons about ML problems and judge where it goes wrong.
This isn’t advisory work and it isn’t labelling. You’re reviewing real ML and data work and explaining your judgement clearly enough that an AI can learn from it — paid for your expertise, by the hour.
Applied Clinical Judgement connects qualified people to vetted platforms, and Sean Key personally vouches for those he refers. We’re paid a referral fee by the platform on a successful placement — never by you. The roles below are live today.
22 live AI, ML & Data roles · updated daily
Software / AI / IT / Data Evaluator
Mercor seeks experienced Software, AI, IT and data professionals at £60–90 per hour to evaluate AI-generated work products including documents, spreadsheets and presentations. You'll assess outputs against domain-specific quality rubrics, identifying factual, aesthetic and presentation errors, then provide structured written feedback. Requires 5+ years' relevant experience, native or professional English fluency, and strong proficiency in Microsoft Office and Google Workspace.
Member of Technical Staff, Frontier AI
micro1 seeks a Member of Technical Staff for Frontier AI at $100–$130 per hour, offering full-time remote work. This hands-on role bridges research, data, and deployed systems, requiring ownership of evaluation initiatives, ML dataset design, and failure analysis. You'll translate real-world system behaviour into structured research frameworks, work across teams to raise signal quality, and ensure research claims are defensible and production-ready. Suits those with experience in applied research, RL systems, or agentic AI.
Quality Analyst-Multimodality
Turing seeks a Quality Analyst for multimodality work fine-tuning large language models. You'll analyse content, answer complex questions, create training scenarios, validate claims through research, and provide detailed feedback to improve AI systems. The role suits analytical thinkers with strong English skills, research capability, and 2+ years in quality assurance, data annotation, or similar work. Self-motivation and independent working are essential.
AI Quality Analyst (Personalization) - Dutch
Turing seeks a Dutch-fluent AI Quality Analyst to evaluate personalization features in Gemini. You'll design multi-turn conversational prompts using your own Google account data and assess how well the model personalizes responses based on your Gmail, search, and YouTube activity. The role demands analytical rigour, creative prompt design, and meticulous attention to response quality across dimensions like grounding and integration. Suitable for candidates with degrees in policy, law, ethics, linguistics, or computer science, ideally with prior AI evaluation or content moderation experience.
AI Quality Analyst (Personalization) - Vietnamese
Turing seeks a Vietnamese-fluent AI Quality Analyst to evaluate personalization features in Gemini. You'll design prompts using your personal Google data, assess model responses for grounding and naturalness, and provide detailed comparative rationales. The role suits analytical professionals with experience in AI evaluation or content moderation. Contractor position at $15/hour, minimum 20 hours weekly with 4-hour PST overlap required.
LLM Trainer - Agent Function call
Turing seeks LLM trainers to create high-quality multi-turn conversations simulating real user-assistant interactions with function-calling tools. You'll design dialogues across calendar, email, maps and drive applications, demonstrating natural reasoning and contextual understanding. The role suits technically minded professionals with 3+ years' experience in technical or analytical fields who can simulate realistic assistant behaviour and follow detailed formatting guidelines.
Prompt & Verifier
Turing seeks a Prompt & Verifier with 3–6 years in AI operations, QA, or policy roles. You'll design and evaluate prompts for AI systems interacting with APIs and SaaS tools like Slack and PayPal, ensuring compliance and safety across multiple languages. The work demands API/JSON and SQL literacy, strong policy expertise, and critical judgment. Candidates must be based in an eligible country and work a minimum of 20 hours weekly with a 4-hour PST overlap. This is a three-month contractor role.
Business Analyst (Bahasa Indonesian Language)
Turing seeks analytical professionals fluent in English and Bahasa Indonesia to help train large language models. You'll analyse content, solve reasoning puzzles, and provide detailed feedback to improve AI systems. The role suits independent thinkers with strong analytical and research capabilities. This is a twelve-month contract with flexible hours, requiring 40 hours weekly and overlap with US Pacific time. No specialist domain background is needed.
Business Analyst (Russian Language)
Turing seeks bilingual Business Analysts (English and Russian) to help train large language models through analytical problem-solving. You'll analyse scenarios, validate claims via research, and provide detailed feedback to improve AI reasoning. The role suits analytically-minded individuals comfortable working independently with flexible remote hours, requiring 40 hours weekly with Pacific timezone overlap.
Business Analyst (Thai Language)
Turing seeks analytical professionals fluent in English and Thai to help refine large language models through research and feedback work. The role involves reading complex content, breaking it into logical components, validating claims online, and creating training scenarios for AI systems. Candidates need strong analytical and communication skills, creative thinking, and the ability to work independently. The contract requires 40 hours weekly with overlap with US Pacific hours.
Business Analyst (Malay Language)
Turing seeks a Business Analyst fluent in English and Malay to help refine large language models through analytical problem-solving. You'll answer questions, create training scenarios, and provide detailed feedback to improve AI reasoning. The role suits analytically minded individuals with strong research skills, creative thinking, and the ability to work independently. Requires 40 hours weekly with US timezone overlap.
Business Analyst (Hebrew Language)
Turing seeks a Business Analyst fluent in English and Hebrew to help train large language models through analytical problem-solving. You'll analyse scenarios, validate claims, and provide detailed feedback to improve AI reasoning. The role suits independent thinkers with strong research and communication skills. Requires 40 hours weekly with US timezone overlap; no prior domain expertise needed, though professional writing or analytical background is valued.
Business Analyst (Korean Language)
Turing seeks bilingual analysts (English and Korean) to help train large language models by answering analytical questions and creating training scenarios. The role suits self-motivated professionals with strong research and critical thinking skills, capable of breaking down complex content, validating information, and providing detailed feedback. A bachelor's degree is preferred but not essential if you have relevant writing or analytical experience.
Business Analyst (Arabic Language)
Turing seeks a bilingual Business Analyst (English and Arabic) to help refine large language models through analytical problem-solving. You'll answer analytical questions, create training scenarios, and provide detailed feedback to improve AI system performance. The work involves research, content analysis, and logical reasoning. Candidates should demonstrate strong analytical capabilities and independent working habits, with some US timezone overlap required.
Business Analyst (German Language)
Turing seeks a German-English bilingual Business Analyst to help train large language models by answering analytical questions, creating training scenarios, and validating claims through research. The role suits detail-oriented problem-solvers with strong research and communication skills who can work independently. You'll spend 40 hours weekly identifying where models fail and providing corrections to improve their reasoning across logic puzzles, data analysis, and scenario work.
Business Analyst (French Language)
Turing seeks bilingual (English and French) business analysts to help train AI language models through detailed evaluation and annotation work. You'll analyse scenarios, provide constructive feedback, and validate information to improve model performance. The role suits self-motivated individuals with strong analytical skills, research capabilities, and creative problem-solving abilities. Requires 40 hours weekly with US time overlap and reliable internet connectivity.
Member of Technical Staff, Finance Research
micro1 seeks a Member of Technical Staff in Finance Research, offering a $180,000–$230,000 base salary plus equity and performance bonuses. This full-time remote role suits researchers with advanced finance qualifications and deep domain expertise in capital markets, risk management, or related areas. You'll design AI evaluation frameworks, conduct original research on financial reasoning, develop benchmark datasets, and collaborate across research and engineering teams to advance enterprise financial AI systems.
Business Analyst
Turing seeks analytical professionals to evaluate and improve large language models through content analysis, fact verification, and reasoning task creation. You'll review AI outputs, identify logical gaps, and provide detailed feedback to enhance model performance. The role suits detail-oriented researchers comfortable working independently with strong critical thinking skills and excellent written English. Remote contractor position requiring 30 hours weekly with some UTC-8 overlap.
AI Trainer & Evaluator
micro1 seeks AI trainers to evaluate and annotate AI-generated responses across business, finance, healthcare, legal and marketing domains. Paying $20–$40 per hour, this remote contractor role suits graduates with strong critical reading skills and attention to detail. You'll score AI outputs against rubrics, provide feedback to improve model performance, and document findings. Prior experience in content evaluation or AI training is valued.
Video Data Entry Specialist (LATAM)
micro1 pays $6 per hour for video data entry work capturing motion data via smartphone sensors and head-mounted cameras. This contractor role suits those in Latin America and the Caribbean with technical aptitude and meticulous attention to detail. You'll record physical activities following strict protocols, delivering 10+ hours of approved video weekly. No AI experience required, though backgrounds in robotics, kinesiology, or sensor-based work are advantageous.
Knowledge Graph Curator (Knowledge Graph, Data Labeling, LLMs)
Turing seeks a Knowledge Graph Curator to maintain and expand a large-scale knowledge graph for an advanced content understanding platform. The role involves entity onboarding, validating AI-generated data, managing stakeholder requests, data labeling, and quality assurance. You'll work with LLMs, taxonomy management, and semantic modelling, investigating data issues and collaborating with technical and non-technical teams. Full-time, remote with required UK working hours overlap.
Member of Technical Staff, Frontier AI
micro1 seeks a Member of Technical Staff for Frontier AI at $100–$130 per hour, offering full-time remote work. This hands-on role bridges research, data, and deployed systems, requiring ownership of evaluation initiatives, ML dataset design, and failure analysis. You'll translate real-world system behaviour into structured research frameworks, work across teams to raise signal quality, and ensure research claims are defensible and production-ready. Suits those with experience in applied research, RL systems, or agentic AI.
What the work looks like
Evaluating model output
Judge the quality of what a model produces and explain why.
Building reference work
Produce the correct ML solution or analysis the model learns from.
Writing hard problems
Design the evaluation tasks that separate a strong model from a weak one.
Common questions
How much does it pay?
Hourly and contractor-based, varying with seniority and role. Every role card shows its pay band. You invoice as an independent contractor and choose your hours.
Can I do this alongside my current job?
Yes — the work is flexible and part-time by design. Check your employer's policy on outside work first; ACJ can't advise on that.
Who is ACJ, and what's your part in this?
Applied Clinical Judgement is run by Sean Key. We connect qualified people to vetted AI-training platforms (Mercor, micro1, Turing), and Sean personally vouches for the people he refers. We're paid a referral fee by the platform on a successful placement — never by you.
How do I get started?
Find a role below that fits, and apply through the link — it carries Sean's referral. If you'd like him to vouch for you or talk it through first, book a short call.
Sean Key vouches for the people he refers
I’m Sean Key, editor of Applied Clinical Judgement. After 29 years in the NHS I help qualified professionals find legitimate, well-paid AI-training work — and I’ll personally vouch for you when you apply.
Applied Clinical Judgement is a referral intermediary, not an employer or recruiter. We refer candidates to third-party platforms (Mercor, micro1, Turing) and may earn a referral fee on a successful placement. We never charge candidates. Pay rates are set by the platforms and may change. PRAG-DEL-SOL-ONE LTD · Co. 07204925 · VAT 987-3626-64 · ICO ZC086000.
