Live, remote AI-training roles relevant to Model Evaluation Comparison, updated daily. Applied Clinical Judgement is a UK-based referral intermediary: we point you to genuine openings on the major training platforms and are paid only when a referral succeeds. Pay rates are shown on each role; we never display our referral fee.
212 live Model Evaluation Comparison roles · updated daily
Personal finance / consumer planning Evaluator
Mercor is recruiting personal finance and consumer planning evaluators at $80–$120 per hour on a remote, hourly basis. You'll assess AI-generated documents, spreadsheets and presentations for accuracy and quality, providing structured feedback against domain rubrics. The role requires five years' relevant professional experience and fluency in Microsoft Office and Google Workspace.
Process improvement / SOPs Evaluator
Mercor is seeking Process Improvement and SOPs evaluators at $80–$120 per hour to assess AI-generated business documents, spreadsheets, and presentations for accuracy and quality. The role suits professionals with 5+ years' experience in process improvement or standard operating procedures who possess native-level English and strong proficiency in Microsoft Office and Google Workspace. You will apply domain expertise to grade outputs against rubrics and deliver structured written feedback on factual, aesthetic, and presentation quality.
Public-sector procurement / RFI response Evaluator
Mercor seeks experienced procurement professionals at $80–$120 per hour to evaluate AI-generated documents, spreadsheets and presentations for accuracy and quality. You'll assess work against domain-specific rubrics, identify errors across factual, aesthetic and presentation dimensions, and provide structured feedback. Requires 5+ years in public-sector procurement or RFI response, native English fluency, and strong proficiency in Microsoft Office and Google Workspace.
Finance operations / audit support Evaluator
Mercor seeks experienced finance operations and audit professionals at $80–$120 per hour to evaluate AI-generated work products for accuracy and quality. You'll assess documents, spreadsheets, and presentations against domain rubrics, identify errors, and provide structured feedback. Requires 5+ years' relevant experience, native English fluency, and proficiency in Microsoft Office and Google Workspace.
Document/deck production QA Evaluator
Mercor seeks evaluators at $80–$120 hourly to assess AI-generated documents, spreadsheets and presentation decks for accuracy and quality. You'll apply 5+ years of professional QA expertise to grade outputs against domain rubrics, identifying factual and presentation errors, then delivering structured feedback. Requires native or professional English fluency and proficiency in Microsoft Office and Google Workspace.
BI dashboards / performance reporting Evaluator
Mercor is recruiting BI dashboards and performance reporting evaluators at $80–$120 per hour. You'll assess AI-generated work products including documents, spreadsheets, and presentations for accuracy and quality. The role suits professionals with 5+ years' relevant expertise who can provide structured feedback against domain-specific criteria. Remote, hourly contract work.
Investment analysis / valuation / credit Evaluator
Mercor is recruiting evaluators at $80–$120 per hour to assess AI-generated investment analysis, valuation, and credit documents for accuracy and quality. You'll review spreadsheets, presentations, and reports against domain rubrics, flagging errors and providing structured feedback. The role suits experienced professionals with 5+ years in investment or credit work and strong Microsoft Office and Google Workspace skills.
IP / trademark / copyright law Evaluator
Mercor is seeking IP, trademark and copyright law specialists at $80–$120 per hour to evaluate AI-generated legal documents, spreadsheets and presentations. You will assess outputs for accuracy, rigour and quality against domain rubrics, identifying factual and presentation errors and providing structured feedback. Requires 5+ years' professional experience and fluency in Microsoft Office and Google Workspace.
Data quality / CRM operations Evaluator
Mercor is hiring Data quality and CRM operations evaluators at $80–$120 per hour for remote, flexible work. You'll assess AI-generated documents, spreadsheets, and presentations against domain-specific quality standards, identifying errors and providing structured feedback. Requires 5+ years of relevant professional experience, fluent English, and strong proficiency in Microsoft Office and Google Workspace.
Procurement / vendor management Evaluator
Mercor seeks procurement and vendor management specialists to evaluate AI-generated work for accuracy and quality. Earning $80–$120 hourly, you'll review documents, spreadsheets, and presentations against domain-specific standards, identifying errors and providing structured feedback. The role requires five years' relevant experience, native English fluency, and proficiency in Microsoft Office and Google Workspace. Ideal for subject-matter experts seeking flexible remote evaluation work.
Cybersecurity / IT GRC Evaluator
Mercor is recruiting cybersecurity and IT GRC specialists at $80–$120 hourly to evaluate AI-generated work products for accuracy and quality. You'll assess documents, spreadsheets, and presentations against domain rubrics, identifying errors and providing structured feedback. This remote role requires 5+ years' professional experience, native or fluent English, and proficiency in Microsoft Office and Google Workspace.
Operations / inventory / capacity planning Evaluator
Mercor seeks operations, inventory and capacity planning evaluators at $80–$120 per hour to assess AI-generated work products for accuracy and domain quality. You will review documents, spreadsheets and presentations against specialist rubrics, identifying factual and presentation errors and providing structured feedback. Requires 5+ years' relevant professional experience and fluency in Microsoft Office and Google Workspace.
Data analysis / quantitative readouts Evaluator
Mercor seeks experienced data analysis evaluators at $80–$120 per hour to assess AI-generated documents, spreadsheets, and presentations for accuracy and rigour. You'll review outputs against domain rubrics, identify errors, and provide structured feedback. Requires five years' relevant professional experience, native or professional English fluency, and proficiency in Microsoft Office and Google Workspace.
FP&A / corporate finance Evaluator
Mercor is recruiting FP&A and corporate finance evaluators at $80–$120 per hour to assess AI-generated financial documents, spreadsheets, and presentations. You'll apply five or more years of professional expertise to grade outputs against quality rubrics, identifying errors and providing structured feedback. The role suits experienced finance professionals seeking remote, flexible contract work.
General finance / accounting Evaluator
Mercor seeks experienced finance and accounting professionals to evaluate AI-generated documents, spreadsheets and presentations for accuracy and quality. This remote, hourly role (£62–£93/hour) suits those with 5+ years' domain experience and strong Microsoft Office and Google Workspace proficiency. You'll apply structured rubrics to assess outputs and deliver detailed feedback, working flexibly as needed.
Program management / implementation planning Evaluator
Mercor is recruiting Program Management and Implementation Planning Evaluators at $80–$120 per hour. You'll assess AI-generated documents, spreadsheets and presentations for accuracy, rigour and quality, providing structured feedback against domain rubrics. Requires 5+ years' relevant professional experience, native or professional English fluency, and strong proficiency with Microsoft Office and Google Workspace, particularly presentation software.
General business strategy / management Evaluator
$80–$120 per hour. Mercor seeks business strategy and management experts to evaluate AI-generated work products for accuracy and quality. You'll assess documents, spreadsheets and presentations against domain-specific rubrics, identifying errors and providing structured feedback. Requires 5+ years' professional experience, fluency in English, and proficiency with Microsoft Office and Google Workspace.
Software / AI / IT / data Evaluator
Mercor seeks experienced Software, AI, IT and data professionals at £60–£90 per hour to evaluate AI-generated work products including documents, spreadsheets and presentations. You'll assess outputs against domain-specific quality rubrics, identifying factual, aesthetic and presentation errors, then provide structured written feedback. Requires 5+ years' relevant experience, native or professional English fluency, and strong proficiency in Microsoft Office and Google Workspace.
Product management / roadmap / PRD Evaluator
Mercor is recruiting experienced product managers to evaluate AI-generated work across documents, spreadsheets and presentations. You'll assess outputs for accuracy, rigour and quality using domain expertise, identifying errors and providing structured feedback. The role requires 5+ years in product management, fluency in English, and proficiency with Microsoft Office and Google Workspace. Paid $80–$120 hourly.
Spreadsheet QA / workbook maintenance Evaluator
Mercor is seeking spreadsheet QA and workbook maintenance evaluators at $80–$120 hourly. You'll assess AI-generated documents, spreadsheets, and presentations for accuracy and quality, applying deep subject-matter expertise against structured rubrics. Requires 5+ years' relevant professional experience, native or professional English fluency, and advanced proficiency in Microsoft Office and Google Workspace. Remote hourly engagement.
Education / school Evaluator
Mercor seeks experienced education professionals to evaluate AI-generated educational materials including documents, spreadsheets and presentations. This remote hourly role, paying £65–£97 per hour, requires five years' relevant experience and native or professional English fluency. You'll assess outputs against domain-specific quality rubrics, identify errors across factual, aesthetic and presentation dimensions, and deliver structured written feedback. Ideal for subject-matter experts comfortable with detailed quality assurance work.
Healthcare operations Evaluator
Mercor seeks experienced healthcare operations professionals to evaluate AI-generated work products, assessing documents, spreadsheets and presentations for accuracy and quality. The role pays $80–$120 hourly and suits evaluators with 5+ years in healthcare operations who can provide structured feedback against domain rubrics. Requires native or professional English fluency and strong Microsoft Office and Google Workspace skills.
Market research / competitive intelligence Evaluator
Mercor is seeking market research and competitive intelligence specialists at $80–$120 hourly to evaluate AI-generated documents, spreadsheets, and presentations for accuracy and quality. You'll apply five years' professional expertise to assess outputs against domain rubrics, identifying factual and presentation errors, then deliver structured feedback. Native or professional English fluency and advanced Microsoft Office and Google Workspace skills are essential.
People ops / recruiting Evaluator
Mercor seeks experienced People Ops and Recruiting evaluators at $80–$120 per hour to assess AI-generated work products including documents, spreadsheets and presentations. You will apply deep subject-matter expertise to grade outputs for accuracy, rigour and domain quality, identifying factual, aesthetic and presentation errors whilst providing structured written feedback. Requires 5+ years' relevant professional experience and proficiency in Microsoft Office and Google Workspace.
Product launch / experiment readiness Evaluator
Mercor seeks experienced evaluators at $80–$120 per hour to assess AI-generated work products including documents, spreadsheets, and presentations. You will apply five or more years of product launch and experiment readiness expertise to grade outputs for accuracy, rigour, and quality against domain-specific rubrics. The role suits subject-matter experts with native English fluency and strong proficiency in Microsoft Office and Google Workspace, particularly presentation software.
Public health communications Evaluator
Mercor is recruiting public health communications evaluators at $80–$120 per hour. You'll assess AI-generated documents, spreadsheets and presentations for accuracy and quality, applying domain expertise to provide structured feedback. The role suits professionals with 5+ years in public health communications, fluent English, and proficiency in Microsoft Office and Google Workspace.
User/customer research and feedback synthesis Evaluator
Mercor is recruiting experienced evaluators at $80–$120 per hour to assess AI-generated research and feedback documents, spreadsheets, and presentations. You'll apply five years' professional expertise in user and customer research to grade outputs for accuracy, rigour, and quality against domain rubrics, identifying errors and providing structured written feedback. Native or fluent English required; advanced proficiency in Microsoft Office and Google Workspace essential.
Investor materials / fundraising / pitchbook Evaluator
$80–$120 per hour. Mercor seeks evaluators with 5+ years' investor materials or fundraising experience to assess AI-generated documents, spreadsheets and presentations. You'll grade outputs against domain-specific rubrics, identifying factual and presentation errors, then deliver structured written feedback. Requires native or professional English fluency and proficiency in Microsoft Office and Google Workspace.
Special education / IEP Evaluator
Mercor seeks special education and IEP evaluators at £61–£92 hourly to assess AI-generated documents, spreadsheets and presentations for accuracy and quality. You'll apply domain expertise to grade outputs against rubrics, identify errors across factual, aesthetic and presentation dimensions, and provide structured feedback. Requires 5+ years' relevant professional experience and fluency in Microsoft Office and Google Workspace.
Privacy / regulatory compliance Evaluator
Mercor seeks privacy and regulatory compliance specialists at $80–$120 hourly to evaluate AI-generated documents, spreadsheets and presentations. You'll assess outputs against domain rubrics, identifying factual and presentation errors, then deliver structured feedback. Requires 5+ years' relevant experience and fluency in Microsoft Office and Google Workspace.
Customer success / support operations Evaluator
$80–$120 hourly. Mercor seeks evaluators with 5+ years in customer success or support operations to assess AI-generated documents, spreadsheets and presentations. You'll grade outputs for accuracy, rigour and quality against domain rubrics, identifying errors and providing structured written feedback. Requires fluency in English and advanced proficiency in Microsoft Office and Google Workspace.
Legal contracts / diligence / redlines Evaluator
Mercor seeks experienced legal evaluators at $80–$120 per hour to assess AI-generated contract documents, due diligence materials, and presentation decks. You'll apply five or more years of professional expertise to grade outputs for accuracy and quality, identifying errors and providing structured feedback. Requires native or professional English fluency and proficiency in Microsoft Office and Google Workspace.
General Sales / GTM Evaluator
Mercor is recruiting Sales and Go-To-Market evaluators at $80–$120 per hour to assess AI-generated business documents, spreadsheets, and presentations for accuracy and professional quality. You'll apply five-plus years of sales or GTM expertise to grade outputs against domain rubrics, identifying errors and providing structured feedback. Requires native English fluency and strong Microsoft Office and Google Workspace proficiency.
Compliance / regulatory response with financial-services AI Evaluator
Mercor seeks compliance and regulatory specialists to evaluate AI-generated financial-services documents and presentations. Paying $80–$120 hourly, this remote role suits professionals with 5+ years' experience in financial compliance or regulatory work. You'll assess AI outputs against quality rubrics, identify errors, and provide structured feedback. Fluency in English and proficiency with Microsoft Office and Google Workspace required.
Humanities / arts / culture Evaluator
Mercor seeks humanities, arts, and culture specialists to evaluate AI-generated documents, spreadsheets, and presentations at $80–$120 per hour. You'll assess outputs for accuracy, rigour, and domain quality, identifying factual and aesthetic errors whilst providing structured feedback. Requires 5+ years of relevant professional experience, native or professional English fluency, and proficiency in Microsoft Office and Google Workspace.
Training / onboarding / L&D Evaluator
Mercor is seeking experienced Training and L&D professionals to evaluate AI-generated work products including documents, spreadsheets, and presentations. Paying $80–$120 hourly, this remote role requires five years' relevant experience, native English fluency, and advanced proficiency in Microsoft Office and Google Workspace. You will assess outputs against quality rubrics, identify errors, and provide structured feedback.
Brand / creative direction / marketing collateral Evaluator
Mercor seeks experienced brand and creative direction specialists at $80–$120 hourly to evaluate AI-generated marketing materials including documents, spreadsheets and presentations. You will assess outputs against quality rubrics, identify errors in factual content and aesthetics, and provide structured feedback. Requires 5+ years' professional experience, native English fluency, and proficiency in Microsoft Office and Google Workspace.
Incident management / reliability / SRE Evaluator
Mercor is recruiting SRE and incident management specialists at $80–$120 per hour to evaluate AI-generated technical work. You'll assess documents, spreadsheets and presentations against quality rubrics, identifying errors and providing structured feedback. The role suits experienced reliability engineers and incident commanders who can apply deep domain expertise to grade AI outputs. Remote, hourly contract work.
Media / journalism / communications Evaluator
$80–$120 per hour. Mercor seeks experienced media and journalism professionals to evaluate AI-generated documents, spreadsheets and presentations for accuracy and quality. You'll assess outputs against domain rubrics, identify factual and presentation errors, and provide structured feedback. Requires 5+ years in media, journalism or communications, native English fluency, and proficiency with Microsoft Office and Google Workspace.
Nonprofit / philanthropy / community programs Evaluator
Mercor is seeking experienced nonprofit and philanthropy evaluators at £60–£90 hourly to assess AI-generated documents, spreadsheets and presentations for accuracy and quality. You'll apply five years' sector expertise to review outputs against domain rubrics, identifying factual and presentation errors with structured written feedback. Requires fluent English and proficiency in Microsoft Office and Google Workspace.
Government / public administration Evaluator
Mercor is recruiting Government / public administration Evaluators at $80–$120 hourly to assess AI-generated documents, spreadsheets and presentations for accuracy and quality. You'll apply five years' professional domain expertise to grade outputs against rubrics, spotting factual and presentation errors, then provide structured feedback. Requires native English fluency and strong Microsoft Office and Google Workspace skills.
Clinical / biomedical / pharma Evaluator
Mercor is recruiting clinical, biomedical, and pharmaceutical evaluators at $80–$120 per hour to assess AI-generated documents, spreadsheets, and presentations. You will review outputs for accuracy, rigour, and domain quality using structured rubrics, identifying factual and presentation errors. This role requires five years' relevant professional experience, native or professional English fluency, and proficiency in Microsoft Office and Google Workspace.
Real estate / hospitality / events Evaluator
Mercor seeks evaluators with 5+ years' experience in real estate, hospitality, or events to assess AI-generated documents, spreadsheets, and presentations for accuracy and quality. You'll review outputs against domain-specific rubrics, identify errors, and provide structured feedback. Hourly remote engagement paying $80–$120/hour. Requires native or professional English fluency and proficiency in Microsoft Office and Google Workspace.
Biology / environmental science Evaluator
Mercor is recruiting biology and environmental science evaluators at $80–$120 per hour to assess AI-generated documents, spreadsheets and presentations for accuracy and quality. The role suits professionals with 5+ years' domain experience and fluency in Microsoft Office and Google Workspace. You will apply expert judgment to review outputs against rubrics, identify errors, and provide structured feedback.
Legal / compliance Evaluator
Mercor seeks legal and compliance experts at $80–$120 hourly to evaluate AI-generated documents, spreadsheets and presentations for accuracy and quality. You'll apply five years' professional experience to grade outputs against domain rubrics, identify errors, and provide structured feedback. Requires native English fluency and proficiency in Microsoft Office and Google Workspace.
Healthcare / clinical Evaluator
Mercor seeks healthcare and clinical evaluators at $80–$120 hourly to assess AI-generated documents, spreadsheets and presentations for accuracy and quality. You'll need 5+ years of relevant professional experience, native English fluency, and proficiency in Microsoft Office and Google Workspace. The role involves applying domain expertise to identify factual and presentation errors, then providing structured feedback using detailed rubrics.
Pricing / ROI / revenue economics Evaluator
Mercor seeks pricing and revenue economics specialists at $80–$120 hourly to evaluate AI-generated business documents, spreadsheets and presentations. You'll assess accuracy, rigour and domain quality against structured rubrics, identifying factual and presentation errors. Requires five years' relevant experience, native or professional English fluency, and proficiency in Microsoft Office and Google Workspace. Remote hourly contract work.
Generalist Expert
Mercor seeks analytical evaluators at $70/hour to assess AI-generated responses and provide structured feedback. The work suits native English speakers with strong critical reading and writing skills who can identify nuance, spot reasoning gaps, and deliver evidence-based judgments. You'll work independently, applying detailed evaluation guidelines without AI writing assistance.
Japanese Audio Generalist Evaluator Expert
Mercor offers £40 hourly on a short-term remote engagement for Japanese native speakers with professional English. You'll transcribe and evaluate audio content, develop evaluation standards for Japanese language models, and test AI outputs across consumer contexts. Suited to linguists, recent graduates, and those with transcription or localisation experience who understand dialects, keigo, and contemporary Japanese usage.
AI Safety Experts — English & Urdu
Mercor seeks fluent English and Urdu speakers for remote red team roles at $20–$22/hour. You'll probe conversational AI models for vulnerabilities—jailbreaks, prompt injections, bias exploitation, and manipulation tactics—then document reproducible findings. Prior red teaming, cybersecurity, or adversarial ML experience valued. The work is text-based; exposure to sensitive content is optional and supported.
Telecommunications Expert
Mercor seeks experienced telecommunications professionals to evaluate and improve AI systems' understanding of network architecture, telecom operations, and communications technology. This project-based role pays $1,150–$1,450 per completed task. You'll review AI outputs, create realistic telecom scenarios, annotate data, and provide structured feedback on standards, regulatory frameworks, and industry terminology. Suited to network engineers, telecom operations managers, RF specialists, and regulatory experts with 3+ years' industry experience.
Generalist - English & Assamese
Mercor is hiring Assamese-English generalists at $15–$20 per hour for contract work. You'll evaluate AI responses in Assamese, identifying factual errors, reasoning flaws, and communication gaps, then document findings in English. The role suits native Assamese speakers with strong English writing ability, a bachelor's degree, hands-on LLM experience, and an analytical background, particularly those with prior RLHF or model evaluation work.
Generalist - English & Odia
Mercor seeks bilingual evaluators fluent in Odia and English to assess AI-generated responses in Odia. You'll conduct fact-checking, identify response strengths and weaknesses, and evaluate reasoning quality to create training data. Ideal candidates hold a bachelor's degree, are native Odia speakers with strong English writing skills, and have experience with large language models. Prior RLHF or model evaluation work is advantageous.
Generalist - English & Gujarati
Mercor is seeking a bilingual generalist at $15–$20 per hour to evaluate AI-generated Gujarati responses. You'll assess reasoning, factuality, tone and clarity, producing structured feedback to improve model outputs. Suited to bachelor's-qualified native Gujarati speakers with strong English writing skills, LLM experience and analytical backgrounds in research, policy, linguistics or engineering. Contract role, globally remote.
Generalist - English & Malayalam
$15–$20 hourly. Mercor seeks bilingual evaluators fluent in Malayalam and English to assess AI-generated responses. You'll identify strengths and weaknesses in model outputs, conduct fact-checking, and generate structured feedback on reasoning, clarity, and accuracy. Suited to bachelor's-qualified native Malayalam speakers with strong English writing ability, LLM experience, and analytical background (research, policy, linguistics, or engineering). Contract work, global.
Generalist - English & Kannada
Mercor seeks bilingual evaluators fluent in Kannada and English to assess AI-generated responses in Kannada. This contract role, paying $15–$20 hourly, suits graduates with strong LLM experience and analytical backgrounds. You'll identify factual errors, reasoning gaps, and communication issues, producing structured evaluation data to improve model quality. Prior RLHF or annotation work is advantageous.
Generalist - English & Punjabi
Mercor is seeking a Punjabi–English bilingual evaluator at $15–$20 per hour for contract work worldwide. You'll assess AI-generated responses in Punjabi, identifying strengths, weaknesses, and factual errors to improve model output. The role suits bachelor's-educated native Punjabi speakers with strong English writing ability, LLM experience, and analytical skills in fields like research or linguistics. Work involves detailed quality assessment and structured feedback generation.
Generalist - English & Marathi
Mercor offers $15–$20 hourly for bilingual Marathi–English contractors to evaluate AI-generated responses. You'll assess model outputs for factual accuracy, reasoning quality, and conversational alignment, producing structured feedback in English. Suited to native Marathi speakers with a bachelor's degree, strong LLM familiarity, and analytical expertise in fields such as research, policy, or linguistics.
Generalist - English & Tamil
Mercor seeks a Tamil-English bilingual generalist at $15–$20 hourly to evaluate AI-generated responses in Tamil. You'll assess factual accuracy, reasoning quality, and alignment with guidelines, producing structured feedback that improves model outputs. Ideal candidates hold a bachelor's degree, are native Tamil speakers with strong English writing, understand LLMs deeply, and bring analytical expertise from research, policy, or linguistics backgrounds.
Generalist - English & Telugu
Mercor seeks a Telugu native speaker with a bachelor's degree to evaluate AI-generated Telugu responses, identifying strengths and weaknesses for model improvement. You'll conduct fact-checking, assess reasoning quality and tone, and produce evaluation data in English. Ideal candidates have LLM experience, excellent English writing, strong attention to detail, and a background in analytical fields like research, policy or linguistics. Prior RLHF or evaluation work preferred. Hourly contract role, $15–$20 per hour.
Generalist - English & Urdu
Mercor seeks a bilingual generalist at $15–$20 hourly to evaluate Urdu AI responses. You'll assess factual accuracy, reasoning quality, and communication standards, producing structured feedback that improves model outputs. Ideal for bachelor's-qualified native Urdu speakers with strong English writing, LLM familiarity, and an analytical background, especially those with prior RLHF or evaluation experience.
Generalist - English & Bengali
Mercor seeks a Bengali native speaker with strong English writing ability to evaluate AI-generated responses in Bengali. Earning $15–$20 hourly, you'll assess model outputs for factual accuracy, reasoning quality, and alignment with guidelines, producing structured feedback to improve AI performance. Suited to those with a bachelor's degree, LLM experience, and analytical background in research, policy, or linguistics.
Legal Expert — Specialist (Real Estate, Tax, Bankruptcy, Estates)
Mercor is recruiting specialist lawyers at $100–$150 per hour to train AI models on legal reasoning across real estate, tax, bankruptcy, and estates practice areas. Candidates must hold a U.S. J.D. or equivalent and maintain active or inactive licensure. The work involves designing realistic scenarios, writing reference responses, evaluating AI outputs against rubrics, and providing feedback to improve model behaviour. Suits experienced practitioners from firms, in-house roles, or government backgrounds willing to commit 15+ hours weekly.
Investment Banking Expert
Mercor is hiring investment banking experts to help train AI systems on financial reasoning. Contributors with 3+ years' IB experience evaluate AI outputs on deal modelling, valuations, and capital markets work. Compensation is $1,750–$2,150 per completed task. The role suits professionals from bulge bracket, boutique, or regional banks with expertise in M&A, ECM/DCM, leveraged finance, or restructuring.
Building Code & Permitting Specialists (ONLY Cal & FL)
Mercor is recruiting Building Code & Permitting Specialists at $65–$90 per hour for California and Florida locations. The role involves annotating construction documents, validating AI-generated permit reviews, and defining annotation standards to train regulatory compliance systems. Suitable for permit expeditors, architects, government plan examiners, or those with deep expertise in specified jurisdictions including Chula Vista, Tampa, Palm Beach County, Pinellas County, Sunny Isles, Jacksonville, and Surfside.
Compliance & Risk Specialist Talent Network
Mercor seeks compliance and risk specialists to join an expert network supporting AI research. Earning $60–$80 per hour, you'll evaluate AI models, assess real-world compliance scenarios, and provide domain-specific feedback. Work 15–30 hours weekly, remote and flexible. Suited to professionals with experience in regulatory frameworks, risk mitigation, and internal audit.
Management Consultant Talent Network
Mercor's Management Consultant Expert Network offers £45–£75 per hour for experienced strategists to evaluate and train AI models on real-world consulting scenarios. Work comprises 15–30 hours weekly on flexible contract assignments. The role suits management consultants with strong analytical and communication skills who can work independently and remotely, contributing domain expertise to frontier AI research projects.
Physical Scientist Talent Network
Mercor's Physical Scientist Talent Network offers £48–£64 hourly for experienced researchers to contribute to AI development. Work comprises model training and evaluation, task creation, and domain feedback on a flexible, rolling-project basis. Suits independent scientists with experimental research and data analysis expertise seeking remote contract work matched to their interests.
Marketing Specialist Talent Network
$60–$80 per hour. Mercor seeks marketing professionals to join its expert network, evaluating and training AI models on real-world marketing scenarios. You'll contribute domain feedback, create training tasks, and help advance frontier AI research. Ideal for experienced digital marketers, strategists, and analysts comfortable working independently on flexible, rolling projects (typically 15–30 hours weekly).
HR & Administration Specialist Talent Network
$60–$80 per hour. Mercor's talent network connects HR and administration specialists with AI research opportunities. You'll train and evaluate AI models, create realistic HR scenarios, and provide domain expertise to advance frontier research. Suited to professionals with a background in recruitment, onboarding, payroll, benefits, or compliance. Remote contract work, typically 15–30 hours weekly.
Financial Analyst Talent Network
Mercor's Financial Analyst Talent Network pays $60–$180 hourly for contract work training and evaluating AI models in finance. The role suits experienced financial professionals with expertise in modelling, forecasting, and statement analysis. Work involves creating realistic financial tasks, providing domain feedback, and helping advance AI research. Projects typically require 15–30 hours weekly, entirely remote on a flexible schedule.
Chemist Talent Network
Mercor's Chemist Talent Network offers $60–$80 hourly for chemistry professionals to support AI research projects. Suitable for those with analytical chemistry and laboratory experience, the role involves training and evaluating AI models, creating chemistry-based tasks, and providing domain feedback. Work 15–30 hours weekly from home on a flexible, contract basis.
Biologist Talent Network
$60–$80 hourly. Mercor's Biologist Expert Network connects qualified life scientists with AI research projects. Contribute to training and evaluating AI models whilst providing domain expertise on real-world biology scenarios. Roles suit professionals with molecular and cellular biology experience, strong communication skills, and the ability to work independently. Typical projects require 15–30 hours weekly.
Mathematician Talent Network
$60–$80 per hour. Mercor's Mathematician Expert Network connects qualified professionals with AI labs and research companies on a project basis. Roles involve training and evaluating AI models, creating mathematical tasks, and providing domain expertise to advance frontier research. Requires professional experience in statistical analysis, mathematical modelling, proof, theory, or computational mathematics, plus strong communication and independent working ability.
Lawyer Talent Network
Mercor's Lawyer Talent Network offers contract work at $60–$150 per hour for qualified legal professionals. You'll train and evaluate AI models, create legal tasks based on real-world scenarios, and provide domain expertise to frontier AI research. Roles suit lawyers with experience in legal research, writing, contract work, and litigation. Typical projects demand 15–30 hours weekly; you work remotely on a flexible schedule.
Nursing Talent Network
Mercor's Nursing Talent Network offers contract roles at $60–$120 per hour for nursing professionals. Contribute to AI advancement by training and evaluating models, creating nursing-focused tasks, and providing domain expertise feedback. Requires patient care experience, strong communication, and remote work capability. Projects typically demand 15–30 hours weekly, with flexible scheduling.
Physician Talent Network
Mercor's Physician Talent Network offers contract work at $110–$250 per hour for doctors with clinical experience. Physicians contribute to AI model training and evaluation, creating medical scenarios and providing domain expertise to advance AI research. Work is remote and flexible, typically 15–30 hours weekly. Suited to clinicians seeking independent project-based engagement outside traditional employment.
Business Intelligence Analyst Talent Network
$70–$120 per hour. Mercor's Business Intelligence Analyst network connects experienced data professionals with AI research projects. You'll train and evaluate AI models, create realistic scenarios, and provide domain feedback to advance frontier research. Requires SQL expertise, data visualisation proficiency (Tableau/Power BI), and data modelling knowledge. Work flexibly on a rolling basis, typically 15–30 hours weekly.
Frontend Engineer Talent Network
Mercor seeks experienced frontend engineers for its expert network, paying $70–$150 per hour on a contract basis. You'll train and evaluate AI models, create real-world engineering tasks, and provide feedback to advance frontier research. The work is remote and flexible at 15–30 hours weekly, and suits engineers with strong JavaScript framework experience (React, Vue, Angular), responsive design skills, and independent work habits.
Full-Stack Engineer Talent Network
Mercor's Full-Stack Engineer Expert Network offers $70–$150 per hour for remote contract work on AI research projects. Suited to experienced full-stack developers comfortable with React, Angular, Vue, Node.js, Django, Spring, and both relational and NoSQL databases. Work involves training and evaluating AI models, creating real-world engineering tasks, and providing specialist feedback. Typical commitments range from 15 to 30 hours weekly, with flexible scheduling.
SOC Investigation Specialist Talent Network
Mercor is recruiting SOC Investigation Specialists at $70–$95 per hour to evaluate and construct high-quality security investigations for AI-driven SOC automation platforms. The role suits experienced Tier 2+ SOC analysts with strong Splunk expertise and proven investigative judgment. You'll review alerts, distinguish true from false positives, perform end-to-end investigations across SIEM and cloud environments, and mentor junior annotators. Hands-on experience with log analysis, entity pivoting, and evidence correlation is essential.
Family Medicine / Primary Care Physician/MD (San Francisco based, Talent Network)
Mercor seeks family medicine or internal medicine MDs (board-certified or eligible) with 2+ years' clinical experience for San Francisco-based AI training work. Compensated at $170–$190 hourly. You will annotate clinical text and EHR data, validate AI-generated medical outputs, and contribute to safe, explainable AI system development. Academic hospital background preferred. Part of a talent network for ongoing projects.
Customer Support Email Analyst
Micro1 seeks a Customer Support Email Analyst at $10–$20 per hour on a contractor basis. You'll evaluate customer support communications—both human and automated—for clarity, tone, and effectiveness, identifying areas for improvement. The role suits those with proven customer support or customer experience backgrounds and strong written English. Prior AI training experience is preferred but not required.
Marketing Domain Expert
micro1 offers $10–40 per hour for a Marketing Domain Expert contractor role. You'll develop AI training data by creating realistic marketing content, writing authentic prompts, and evaluating AI responses based on enterprise marketing workflows. The work suits marketing professionals with business writing expertise and Microsoft 365 proficiency who can work independently in a remote setting.
Member of Technical Staff, Coding Research
micro1 seeks a Member of Technical Staff for coding research on a full-time remote basis with a base salary of $140,000–$180,000 USD. You will design evaluation frameworks and benchmarks for frontier coding agents, develop datasets and assessment protocols, and analyse model behaviour to improve performance. The role suits software engineers with 3+ years' experience in ML, AI research, or evaluation, with strong Python or C++ skills and familiarity with LLMs and coding systems.
Private Equity Expert
Micro1 seeks a private equity specialist to train AI systems on deal-making workflows, paying $30–65/hour. You'll review investment memos, analyse case studies, and assess AI outputs for accuracy and commercial relevance. The role suits experienced practitioners from investment banking or PE with strong analytical skills and domain expertise; no prior AI knowledge required. Work independently from home.
AI Evaluation Specialist
$20–$35 per hour. This contractor role on micro1 suits experienced professionals from administrative, research, or quality assurance backgrounds who excel at precise written communication. You'll design AI evaluation tasks with detailed rubrics, observe agent behaviours systematically, and document findings in high-quality English. The work demands meticulous attention to pattern recognition and cross-functional collaboration across diverse domains.
Computational Engineering Expert
Micro1 seeks a computational engineering expert at $20–60/hr to evaluate AI systems' performance in realistic scenarios. You'll apply expertise in CFD, FEA, robotics, and simulation tools (ANSYS, Abaqus, MATLAB, OpenFOAM) to assess models, enhance datasets, and develop benchmarks. The role suits PhD-qualified engineers or equivalent researchers with strong domain knowledge and technical communication skills.
Computational Biology Expert
Micro1 seeks a computational biology expert at $20–60 per hour on a contractor basis. You'll evaluate and annotate AI outputs across genomics, transcriptomics, and systems biology workflows, assessing accuracy and performance against real-world scenarios. The role suits PhD-level scientists with hands-on experience in bioinformatics pipelines, scripting, and domain expertise who can provide rigorous scientific feedback to interdisciplinary teams.
English Voice Coach
Micro1 seeks an experienced voice coach at $30–$65 per hour for part-time remote work training AI voice models. You'll assess AI-generated performances, guide accent and dialect development, and provide real-time feedback to developers. The role demands proven expertise in voice acting, directing, and linguistic nuance, with a background in audio or media projects preferred. No AI experience required.
Finance Expert
Micro1 seeks finance specialists at $30–$65 per hour to train AI systems on Private Equity and Investment Banking content. You'll evaluate financial models, reports, and AI-generated outputs, ensuring accuracy and industry alignment. The role suits experienced finance professionals with deep PE/IB expertise who can communicate complex concepts clearly and work independently in a remote setting.
Customer Service Operator
Turing seeks customer service professionals to support AI benchmarking by recreating real-world support workflows, validating LLM outputs, and generating structured training data. The role involves simulating ticket handling, chat/email scenarios, and escalations whilst ensuring compliance and quality standards. Requires hands-on experience with support platforms like Zendesk or Salesforce, strong attention to detail, and fluent written English. Based in specified countries, minimum 30 hours weekly with 4-hour PST overlap.
Subject Matter Expert — Enterprise Finance & Operations
Turing seeks experienced US-based finance and accounting professionals for senior reviewer roles on a longitudinal finance and operations dataset supporting AI development. Successful candidates bring 5+ years in investment banking, Big 4, FP&A, controllership, or e-commerce operations, with strong knowledge of US GAAP, tax structures, and employment practices. Work involves reviewing complex financial scenarios, leading reviewer consensus, developing evaluation frameworks, and advising client teams. Anticipated 20 hours weekly, fully remote and asynchronous.
AI Quality Analyst (Personalization) - Korean
Turing seeks a Korean-fluent AI Quality Analyst to evaluate personalization features in Gemini. You will design multi-turn prompts using your personal Google account data, then assess how well the model personalizes responses across dimensions including grounding, integration, and helpfulness. The role suits analytically rigorous individuals with prompt engineering experience and strong written communication skills. Contractor position, $15/hour, minimum 4 hours daily with PST timezone overlap.
Subject Matter Expert — Enterprise Finance & Operations
Turing seeks US-based finance and accounting professionals with 5+ years' experience to serve as senior reviewers on AI evaluation projects. The role involves leading complex financial case reviews, developing evaluation frameworks, and advising client teams assessing AI model outputs across enterprise finance and operations. Suited to investment banking, Big 4, FP&A, or e-commerce backgrounds with strong US GAAP knowledge. Fully remote, approximately 20 hours weekly.
AI Analyst – Google Wallet Evaluation (US)
Turing seeks US-based AI Analysts to evaluate Google Wallet functionality via Gemini models over 10 weeks full-time. You'll submit prompts, assess model responses, document findings in Google Sheets, and provide feedback on user experience. Requires active Google Wallet use with linked payment methods, stored passes, and a Plaid account. No technical background needed—attention to detail and analytical thinking matter most.
Senior Dermatologist – Clinical Reviewer
Turing seeks US-based dermatology faculty at Assistant Professor level or above for senior clinical review work supporting AI model development. You'll lead complex case reviews, develop clinical evaluation frameworks, and advise client teams—ideal for academics wanting to apply rigorous expertise to AI evaluation infrastructure. Requires board certification, active practice, subspecialty fellowship training, and publication record. Fully remote, approximately 15 hours weekly.
Small business owners (AI response evaluation)
Turing seeks small business owners or those with strong operational expertise to evaluate AI chatbot responses across realistic business scenarios. Work involves creating prompts, interacting with multiple AI tools, and assessing quality on clarity, usefulness and accuracy. This 10-week project-based role suits individuals with analytical skills and genuine small business knowledge who can provide structured comparative feedback on AI-generated content.
Small business owners (AI response evaluation)
Turing seeks small business owners or those with strong operational expertise to evaluate AI chatbot responses across realistic business scenarios. The 10-week project involves creating prompts, interacting with multiple AI tools, assessing response quality, and providing structured comparative feedback. Fully remote, flexible freelance work evaluating AI performance in accounting and finance contexts.
AI Quality Analyst (Personalization) - Dutch
Turing seeks a Dutch-fluent AI Quality Analyst to evaluate personalization features in Gemini. You'll design multi-turn conversational prompts using your own Google account data and assess how well the model personalizes responses based on your Gmail, search, and YouTube activity. The role demands analytical rigour, creative prompt design, and meticulous attention to response quality across dimensions like grounding and integration. Suitable for candidates with degrees in policy, law, ethics, linguistics, or computer science, ideally with prior AI evaluation or content moderation experience.
AI Quality Analyst (Personalization) - Polish
Turing seeks a Polish-fluent AI Quality Analyst to evaluate personalization features in Gemini. You will design conversational prompts using your personal Google data, then assess how well the model personalizes responses across dimensions like grounding and integration. The role suits analytical thinkers with evaluation experience who can work full-time in their local timezone with PST overlap. Contractor position, one month engagement.
AI Quality Analyst (Personalization) - Russian
Turing seeks a Russian-fluent AI Quality Analyst to evaluate personalization features in Gemini. You will design multi-turn conversational prompts using your personal Google account data, then assess how well the model integrates personalization across grounding, naturalness, and helpfulness dimensions. The role suits analytical professionals with evaluation experience who can work full-time in their local time zone with 4-hour PST overlap. Three-month contractor position at $15/hour.
AI Quality Analyst (Personalization) - Vietnamese
Turing seeks a Vietnamese-fluent AI Quality Analyst to evaluate personalization features in Gemini. You'll design prompts using your personal Google data, assess model responses for grounding and naturalness, and provide detailed comparative rationales. The role suits analytical professionals with experience in AI evaluation or content moderation. Contractor position at $15/hour, minimum 20 hours weekly with 4-hour PST overlap required.
AI Quality Analyst (Personalization) - Thai
Turing seeks a Thai-language AI Quality Analyst to evaluate personalization features in Gemini. You'll design multi-turn prompts using your personal Google data, then assess whether the model grounds responses appropriately, integrates personal information naturally, and provides helpful outputs. The role suits analytical thinkers with Thai fluency, experience in AI evaluation or annotation, and willingness to use a personal account. Contractor basis, $15/hour, minimum 4 hours daily with PST overlap required.
AI Quality Analyst (Personalization) - Japanese
Turing seeks a Japanese-fluent AI Quality Analyst to evaluate personalization features in Gemini. You will design creative multi-turn prompts using your personal Google account data, then assess how well the model personalizes responses across dimensions like grounding and integration. The role suits analytical thinkers with strong Japanese proficiency and experience in AI evaluation or content moderation. Contractor position, $15/hour, minimum 4 hours weekly with PST overlap.
AI Quality Analyst (Personalization) - Korean
Turing seeks Korean-speaking AI Quality Analysts to evaluate personalization features in Gemini. You'll design multi-turn prompts using your own Google account data, then assess how well the model personalizes responses by analysing grounding, integration, and helpfulness. Requires fluent Korean, analytical rigour, creative prompt design, and meticulous attention to response quality. Contractor role, minimum 20 hours weekly with 4-hour PST overlap.
AI Quality Analyst - English
Turing seeks an AI Quality Analyst to evaluate personalized features in Gemini. You will design creative prompts using your own Google account data, then assess how well the model personalizes responses across dimensions like grounding and helpfulness. The role suits analytical individuals with strong writing skills and experience in AI evaluation or content moderation. Full-time availability with 4-hour daily PST overlap required; three-month contract.
MLE Bench – Data Analyst
Turing seeks experienced Data Analysts for benchmark-driven ML evaluation projects. You'll analyse production datasets, define performance metrics, investigate model outputs and failure modes, and write Python/SQL code to support real-world AI system evaluation. The role suits analysts comfortable at the intersection of data work and machine learning, with strong statistical reasoning and ability to produce documented, reproducible workflows. Three-month contractor assignment, minimum 20 hours weekly with PST overlap.
Senior Software Engineer – LLM Evaluation (US/Canada/WEU based)
Turing seeks senior software engineers to evaluate and improve large language models through code curation, review, and refinement across multiple languages. You'll assess AI-generated code for production readiness, design verification systems, and collaborate with research teams on frontier AI projects. Requires 3+ years' engineering experience and expertise in full-stack development.
Finance Expert (US based)
Turing seeks US-based finance experts with 2+ years' experience across capital markets, trading, investment banking, private equity, or related specialisms. You'll evaluate large language models in finance domains, develop assessment rubrics, and collaborate with AI researchers to improve model performance. No prior AI experience required. Flexible remote work, 10–30 hours per week.
LLM Go Developer
Turing seeks Go developers with 3+ years' experience to review and validate AI-generated code for next-generation dialogue agents. You'll work cross-functionally on feature design and delivery, lead code quality initiatives, and contribute to public repositories. This contract role suits engineers with strong Go expertise, leadership capability, and interest in LLM systems. Fast-paced environment focused on education, entertainment, and general question-answering applications.
Senior Software Engineer – LLM Evaluation
Turing seeks experienced software engineers to evaluate and refine AI-generated code across multiple languages for LLM training datasets. You'll curate code examples, assess model outputs for efficiency and reliability, and design verification mechanisms for software engineering tasks. Requires 2+ years' full-time experience at top-tier product companies and deep expertise in full-stack development, architecture, and code quality assessment. Flexible contractor role, 10–40 hours weekly with partial PST overlap.
Mathematics Expert (Master’s/Ph.D.)
Turing seeks mathematics graduates and PhD candidates to design, solve, and evaluate mathematical problems for large language model improvement. Work involves creating rigorous multi-step problems, providing detailed solutions, reviewing AI-generated answers, and developing Python and Lean-based computational tasks. Requires strong analytical skills, clear communication, and ability to work independently in a remote, contractor-based role with flexible weekly hour commitments.
Business Analyst (Finance)
Turing seeks finance professionals with 2+ years' experience in investment banking, private equity, FP&A, accounting or consulting to evaluate AI model outputs and develop assessment rubrics. Work flexibly (10–30 hours weekly) from India, collaborating with AI researchers to improve financial AI systems. CFA, CA, CPA or MBA preferred; strong finance knowledge and English proficiency essential. No prior AI experience required.
LLM Java Developer
Turing seeks a Java developer to work on large language model training and optimization. You'll design and maintain backend systems, evaluate model performance, conduct supervised fine-tuning, and collaborate on RLHF initiatives. The role involves creating datasets, ranking model responses, and developing evaluation strategies across diverse domains. Requires a bachelor's degree in engineering or computer science, Java proficiency, and web application development experience. Minimum 20 hours weekly commitment with 4-hour PST overlap.
JavaScript / TypeScript Full-Stack Developer
Turing seeks JavaScript/TypeScript full-stack developers for remote contractor work with US-based AI companies. You'll design and maintain code for AI model training and optimization, conduct model evaluations, rank responses, develop datasets for supervised fine-tuning, and collaborate on RLHF initiatives. Requires a bachelor's degree in engineering or computer science (or equivalent), demonstrable web application development experience, proficiency in JavaScript ES6, and experience with Node.js, React, Angular, Vue, or Nest.js. Minimum 20 hours weekly; flexible commitment options available.
Python + Full-Stack (JS) Developer
Turing seeks Python and full-stack JavaScript developers to build AI training solutions for US-based companies. The contractor role involves designing code for AI model optimisation, conducting model evaluations, creating datasets for supervised fine-tuning, and collaborating on RLHF processes. Minimum 20 hours weekly with 4-hour PST overlap required. Bachelor's degree in engineering or computer science (or equivalent) and Docker proficiency mandatory.
Data Scientist/Analyst
Turing seeks Python-proficient data scientists and analysts to support frontier AI development for US-based companies. The role combines model evaluation, supervised fine-tuning, and RLHF work with strong analytical and communication skills. Contractors work flexibly (20–40 hours weekly) on cutting-edge AI projects, writing clean code, benchmarking performance, and collaborating with researchers to refine AI systems.
Member of Technical Staff, Medical Research
micro1 seeks a Member of Technical Staff with an advanced healthcare qualification (MD, PhD, MPH, PharmD or equivalent) to design evaluation frameworks for AI systems in clinical and biomedical contexts. The role involves developing benchmarks, conducting medical research, analysing AI performance in healthcare scenarios, and shaping industry standards. Requires deep clinical or healthcare research expertise and proven experience in multidisciplinary research environments.
Member of Technical Staff, Legal Research
micro1 seeks a Member of Technical Staff for legal AI research. The listing states an hourly rate of $7–$8, but the role is full-time with a base salary of $200,000–$250,000 plus equity and performance bonuses. You'll design evaluation frameworks for AI legal agents, conduct original research on legal reasoning and workflow automation, and develop benchmarks for complex legal tasks. Requires a JD, LL.M., SJD, or equivalent with deep expertise in corporate law, contracts, litigation, compliance, IP, employment law, or policy. Best suited to researchers with interdisciplinary experience and strong analytical writing skills.
Member of Technical Staff, Finance Research
micro1 seeks a Member of Technical Staff for Finance Research to build evaluation frameworks for AI agents in financial domains. This full-time remote role suits PhD-holders or advanced finance professionals with deep expertise in capital markets, risk management, or quantitative finance. You'll conduct original research, develop benchmarks, curate datasets, and collaborate across research and engineering teams to advance enterprise financial AI capabilities.
Chemistry Expert (PhD) — AI Safety
Mercor seeks PhD-level chemists at $65–70 per hour to strengthen AI model safety through chemistry expertise. You'll write expert prompts, evaluate model responses for scientific accuracy, and classify conversations using structured guidelines. No AI/ML background required; training provided. Part-time roles (15–25 hours weekly, flexible to 40) suit researchers with deep familiarity with modern laboratory or computational techniques and sound judgment on chemical safety and dual-use information handling.
Transactional Attorney
micro1 seeks experienced transactional attorneys at $80–$105 per hour for remote part-time contract work. You'll review AI-generated contract analyses, assess model performance on redlining tasks, and create evaluation frameworks to improve legal AI systems. Suited to in-house counsel with three years' tech transaction experience and active US bar admission.
Business Document Expert (Portuguese Speaker)
Paying $20–$70 per hour, this remote contract role suits business professionals fluent in Portuguese and English. You'll evaluate AI-generated business documents across finance, strategy, marketing and operations on the micro1 platform, assessing them against Fortune 500 standards. Requires a bachelor's degree, three years' professional business experience, and expert proficiency in Excel, PowerPoint and Word.
Business Document Expert (French Speaker)
Earning £16–£56 per hour, this contract role on micro1 invites business professionals fluent in French to evaluate AI-generated documents for quality and professional standards. You'll assess outputs across finance, strategy, marketing and operations, providing structured feedback to improve AI systems. Ideal for those with 3+ years in business functions and mastery of Office Suite tools.
Business Document Expert (German Speaker)
Earning $20–$70 per hour on micro1, this contract role suits German-speaking business professionals with three years' experience in strategy, finance, marketing or operations. You'll evaluate AI-generated business documents using Excel, PowerPoint and Word, testing them against corporate standards, providing feedback, and collaborating with cross-functional teams to improve AI outputs. Based ideally in the US.
Business Document Expert (Chinese Speaker)
Micro1 seeks a Business Document Expert fluent in Chinese for remote contract work at $20–$70 per hour. You'll evaluate AI-generated business documents across finance, strategy, marketing and operations, ensuring they meet Fortune 500 standards. The role requires a bachelor's degree, 3+ years in business functions, and mastery of Excel, PowerPoint and Word. You'll provide structured feedback, design realistic business scenarios, and help improve AI systems through rigorous quality assessment.
Business Document Expert (Japanese Speaker)
Paying $30–$70 per hour, this remote contract role on micro1 suits bilingual business professionals with 3+ years' experience and fluency in Japanese. You'll evaluate AI-generated business documents across finance, strategy, marketing and operations, ensuring they meet Fortune 500 standards. The work involves Excel, PowerPoint and Word expertise, critical feedback on AI outputs, and designing realistic professional scenarios.
Business Document Expert (Korean Speaker)
Micro1 seeks a Korean-fluent Business Document Expert at $30–70/hr to evaluate AI-generated deliverables across finance, strategy, marketing, and operations. You'll assess outputs against Fortune 500 standards, provide structured feedback, and design realistic business scenarios using Excel, PowerPoint, and Word. Requires a bachelor's degree and 3+ years in business functions. Contract, remote, US-based preferred.
Spanish (Spain) Audio Generalist Evaluator Expert
Mercor is recruiting a Spanish audio evaluator at $50/hour to support AI research through transcription and model evaluation tasks. The role suits fluent Spanish (Spain) speakers with strong writing skills and analytical backgrounds who can work 10–20 hours weekly on short-term projects. You'll transcribe audio, develop evaluation standards, grade model outputs, and document quality benchmarks for advanced language model training.
Business Analyst
Turing seeks analytical professionals to evaluate and improve large language models through content analysis, fact verification, and reasoning task creation. You'll review AI outputs, identify logical gaps, and provide detailed feedback to enhance model performance. The role suits detail-oriented researchers comfortable working independently with strong critical thinking skills and excellent written English. Remote contractor position requiring 30 hours weekly with some UTC-8 overlap.
LLM Expert - Chemistry
Turing seeks a chemistry expert with a master's or doctorate to evaluate and improve large language models for scientific applications. You'll develop datasets, benchmarks, and assessment frameworks; evaluate AI-generated responses for accuracy; and build Python-based evaluation pipelines. The role suits chemists or chemical engineers with LLM familiarity seeking full-time remote work across 24 weeks, requiring 4-hour Pacific time overlap daily.
Software Engineer – AI Code Evaluation & Benchmarking (SWE-Bench)
Turing seeks experienced software engineers to evaluate and benchmark AI-generated code for large language models. You'll assess coding solutions, identify correctness issues, debug implementations, and build evaluation datasets. The role suits engineers with strong code review experience and deep software engineering expertise. Minimum 20 hours weekly with 4-hour PST overlap; one-month contractor assignment.
Board Game Reasoning Expert (AI Training & Evaluation)
Turing seeks Board Game Reasoning Experts to develop and evaluate AI training datasets. You'll analyse game scenarios, assess AI reasoning quality, and create evaluation rubrics using expertise in board games, game mechanics, logic, and strategic systems. Requires a bachelor's degree in an analytical discipline and 2+ years' experience in game design, playtesting, or strategy communities. Fully remote contractor role, 20 hours weekly minimum with PST overlap.
AI Quality Analyst (Personalization) - German
Turing seeks a German-fluent AI Quality Analyst to evaluate personalization features in Gemini. You'll design multi-turn prompts using your personal Google account data, then assess how well the model personalizes responses across dimensions including grounding, integration, and helpfulness. The role requires full-time availability with 4-hour PST overlap, analytical rigour, and meticulous attention to nuance in AI output evaluation.
AI Quality Analyst (Personalization) - Hindi
Turing seeks Hindi-fluent AI Quality Analysts to evaluate personalization features in Gemini. You'll design conversational prompts using your personal Google data, assess how well the model personalizes responses, and rank model outputs side-by-side. The role requires analytical rigour, creative prompt design, and meticulous attention to detail. Contractor position, minimum 20 hours weekly with PST overlap required.
AI Quality Analyst (Personalization) - Indonesian
Turing seeks an AI Quality Analyst to evaluate personalization features in Gemini, assessing how the model integrates personal data from Gmail, Search, and YouTube to deliver relevant responses. The role suits Indonesian speakers with analytical backgrounds in policy, law, ethics, linguistics, journalism, or computer science. Work involves designing creative prompts, evaluating model responses for grounding and integration quality, and writing detailed comparative assessments. Contractor position at $15 hourly; 30–40 hours weekly with 4-hour PST overlap required.
AI Quality Analyst (Personalization) - Turkish
Turing seeks Turkish-fluent AI Quality Analysts to evaluate a personalization feature for Gemini. You'll design multi-turn prompts based on your personal Google account data, assess model responses for grounding and integration quality, and rank side-by-side comparisons with detailed written rationales. Requires bachelor's degree or equivalent, data annotation or AI evaluation experience preferred, full-time availability with 4-hour PST overlap. Contractor role, 3-month engagement.
AI Quality Analyst (Gemini) - Chinese
Turing seeks a Chinese-speaking AI Quality Analyst to evaluate Gemini's personalisation feature. You'll design multi-turn conversational prompts using personal context, analyse model responses for grounding and helpfulness, and rank side-by-side comparisons with detailed rationales. Requires fluent Chinese, strong analytical skills, and experience in AI evaluation or data annotation. Three-month contractor role at $15/hour, minimum 4 hours daily with PST overlap.
AI Quality Analyst (Personalization) - Spanish
Turing seeks Spanish-fluent contractors to evaluate personalized AI responses for Gemini at $15 per hour. This three-month role suits those with analytical and creative abilities who can design conversational prompts, assess model quality across grounding and integration dimensions, and provide detailed feedback. Work involves 4–40 weekly hours with 4-hour PST overlap, requiring full-time availability in your timezone and use of your primary Google account.
AI Quality Analyst (Personalization) - Arabic
Turing seeks Arabic-fluent AI Quality Analysts to evaluate personalization features in Gemini. The role involves designing multi-turn prompts using personal Google account data, assessing how well the model personalizes responses, and writing detailed comparative analyses. Candidates need strong analytical skills, prompt engineering experience, and willingness to work in a global 24-hour operations team. Contractor basis, 3 months.
AI Quality Analyst (Personalization) - Portuguese
Turing seeks a Portuguese-fluent AI Quality Analyst to evaluate personalization features in Gemini. You'll design multi-turn conversational prompts using your own Google account data, then assess how well the model leverages personal information from Gmail, Search, and YouTube to generate helpful, grounded responses. The role demands analytical rigour, creative prompt design, and meticulous attention to subtle quality differences. Suitable for candidates with a background in policy, law, ethics, linguistics, or computer science, preferably with prior AI evaluation or annotation experience.
AI Trainer & Evaluator
Micro1 seeks AI trainers to evaluate and annotate AI-generated responses across business, finance, healthcare, legal and marketing domains. Paying $20–$40 per hour, this remote contractor role suits graduates with strong critical reading skills and attention to detail. You'll score AI outputs against rubrics, provide feedback to improve model performance, and document findings. Prior experience in content evaluation or AI training is valued.
.NET Engineer
Micro1 seeks .NET engineers to train AI systems on technical problem-solving at $30–$90 per hour. This remote contracting role suits developers with substantial production experience in .NET, C#, AWS, and Azure. You'll evaluate model outputs, review code quality, and share engineering expertise to improve AI reasoning. No AI background required.
Python Developer
Earning $50–$100 per hour, this remote contract role on micro1 invites experienced backend developers to evaluate AI coding tools by testing models in real-world workflows. You'll design and maintain REST and GraphQL APIs, optimise databases, and provide detailed feedback through incident reports and surveys during intensive testing cycles. Requires 3+ years' Python experience and familiarity with Cursor.
Go Developer
Micro1 seeks experienced Go developers (5+ years) for part-time remote contract work evaluating alpha-stage AI coding tools, paying $30–$90 per hour. You'll build REST and GraphQL endpoints, conduct rigorous testing of AI models within Cursor, identify bugs and edge cases, and provide detailed feedback to research teams. Suits developers with strong Go fundamentals, familiarity with AI-powered tools, and genuine enthusiasm for technical exploration.
Rust Developer
Contract role paying $30–$90 per hour on micro1. Experienced Rust backend developers will design and optimise REST and GraphQL APIs whilst testing AI-powered developer tools like Cursor. The work involves intensive 4-day testing bursts, detailed bug reporting, and collaboration with a research team. Requires 5+ years' backend experience, proficiency with databases and security practices, and strong communication skills.
English (New Zealand) Audio Generalist Evaluator Expert
Mercor is seeking a New Zealand English audio evaluator at $50/hour for a short-term AI research project. You'll transcribe and analyse audio content, develop evaluation standards, test language models, and support benchmarking work. The role suits native or near-native NZ English speakers with strong writing skills, ideally from linguistics, humanities, journalism or technical backgrounds, committed to 10–20 hours weekly.
AI Quality Analyst - Portuguese (Portugal)
Turing seeks a Portuguese-based AI Quality Analyst to evaluate personalization features in Gemini at $15 hourly. You will design conversational prompts, assess how the model uses personal data from Gmail, Search and YouTube, and rank model responses on grounding, integration and helpfulness. Requires bachelor's degree, Portuguese fluency, willingness to use your personal Google account, and availability for full-time hours with PST overlap. Three-month contractor role.
Trade Surveillance Expert (Market Abuse & Conduct Monitoring)
Mercor is recruiting Trade Surveillance Experts at $100/hour to help develop and evaluate AI systems for market abuse detection and capital markets compliance. The role suits professionals with 5+ years' experience in trade surveillance, market monitoring, or regulatory oversight who have used platforms like Nasdaq SMARTS or NICE Actimize. You will analyse trading scenarios, assess AI-generated alerts, develop evaluation rubrics, and provide expert feedback to improve compliance monitoring models.
Human Baseliner for Open-Ended ML Research Tasks
Mercor seeks experienced ML engineers and researchers as human baseliners, paying $75–$90 hourly. You'll complete open-ended ML research tasks in sandboxed environments, establishing performance benchmarks against frontier AI agents. Requires 3+ years' ML experience (including PhD time), top-100 university or FAANG background, expertise in PyTorch/JAX/TensorFlow, and deep hands-on knowledge in pretraining, reinforcement learning, post-training, dataset curation, or model architecture. Minimum 20 hours weekly commitment.
LLM Expert - Chemistry
Turing seeks a Chemistry LLM Expert to develop datasets, benchmarks, and evaluation frameworks for language models in chemistry and materials science. You'll create reference answers and grading rubrics, assess AI-generated responses for scientific accuracy, and build Python-based evaluation pipelines. The 24-week contract requires a Master's or Ph.D. in Chemistry, Chemical Engineering, or Materials Science, plus Python proficiency and familiarity with LLM evaluation and prompt engineering.
AI Quality Analyst (Gemini) - Chinese
Turing seeks a Chinese-fluent AI Quality Analyst to evaluate Gemini's personalization features. The role involves designing multi-turn conversational prompts, assessing model responses for grounding and helpfulness, and writing detailed evaluation rationales. Suited to those with analytical backgrounds and experience in AI quality evaluation, data annotation, or content moderation.
Funds Attorney
$80–$105 per hour. micro1 seeks qualified funds attorneys for part-time contract work evaluating and improving AI systems trained on legal contract analysis. You'll review AI responses to redlining scenarios, create evaluation frameworks, and collaborate with product teams to refine contract review solutions. Requires J.D. from ABA-accredited school, active US bar admission, and minimum two years' funds department experience at a corporate law firm.
Psychiatry Expert
$130–$180 per hour. Mercor seeks psychiatry physicians—attendings, final-year residents, or fellows—to design clinical scenarios, write reference responses, and grade AI model outputs on mental health reasoning. Work covers diagnostic formulation, medication management, risk assessment, and capacity evaluation. Entirely asynchronous, 20 hours weekly.
Internal Medicine Expert
$130–$180 per hour. Mercor seeks Internal Medicine physicians to train frontier AI models on healthcare reasoning. You'll design clinical scenarios, write reference responses, and grade AI outputs against evidence-based standards. The role suits IM attendings, hospitalists, subspecialists, final-year residents, and fellows. Work is fully remote and asynchronous at 20 hours weekly.
Family Medicine / Primary Care Physician/MD (San Francisco based, Talent Network)
Mercor seeks family medicine or internal medicine MDs (board-certified or eligible) with 2+ years' clinical experience for San Francisco-based AI training work. Pay is $170–$190 per hour. You will annotate clinical text and EHR data, validate AI-generated medical outputs, and contribute to safe, explainable AI system development. Academic hospital background preferred. Part of a talent network for ongoing projects.
M&A Attorney
Micro1 seeks experienced M&A attorneys at $80–$105 per hour for part-time contract work developing AI systems that evaluate and redline legal agreements. You'll review AI responses to contract scenarios, create evaluation frameworks, and advise product teams on improving contract analysis tools. Requires J.D., active U.S. bar admission, and minimum two years' M&A experience in a corporate law firm. Ideal for lawyers with tech sector background or legal tech interest.
Virtual Assistant Expert (AI Operations & Executive Support)
Mercor is recruiting Virtual Assistant Experts at $60/hour to support a leading AI research laboratory. The role centres on executive support, operations coordination, and AI evaluation work including calendar management, research synthesis, document organisation, and meeting coordination across distributed teams. Suited to detail-oriented professionals with 2+ years in administrative or operations roles and strong written communication. Remote, fully independent position requiring proficiency with standard workplace tools.
Medical Expert
Mercor offers $130–$180 hourly for US-based physicians (attendings, final-year residents, or board-certified/eligible fellows) to train frontier AI models on healthcare reasoning. You'll design clinical scenarios, write high-quality reference responses, grade model outputs, and provide structured feedback. Work is fully remote and asynchronous, with 20 hours per week as standard.
AI Analyst – Google Wallet Evaluation (US)
Turing seeks detail-oriented AI Analysts in the US to evaluate Gemini model responses for a Google Wallet project. This ten-week full-time contract suits recent graduates and professionals with everyday experience of Google Wallet and willingness to learn AI tools. Work involves submitting prompts, assessing output quality, documenting findings in Google Sheets, and providing structured feedback on user experiences.
AI Analyst – Google Wallet Evaluation (US)
Turing seeks US-based AI Analysts to evaluate Google Wallet functionality via Gemini models over 10 weeks full-time. You'll submit prompts, assess model responses, document findings in Google Sheets, and provide feedback on user experience. Requires active Google Wallet use with linked payment methods, stored passes, and a Plaid account. No technical background needed—attention to detail and analytical thinking matter most.
Legal Expert
$100–$150 per hour. Mercor seeks US-licensed legal practitioners with 2+ years' employment or labour law experience to evaluate AI-generated responses on workplace disputes and employment matters. The role involves assessing model outputs for accuracy and real-world applicability, providing written feedback, and participating in calibration sessions. Flexible, part-time commitment of 6–15 hours weekly.
Document Review Expert
Mercor seeks detail-oriented document review experts at $35–$40 hourly to evaluate and improve AI-generated responses for research projects. You'll apply structured evaluation guidelines, identify nuances and errors, and provide honest critical assessment. Ideal candidates have document-heavy professional experience in paralegal, executive assistant, consulting, banking or legal roles. Native English fluency essential; US-based only.
Utilisation Management / Case Management leader (RN/Physician-advisor)
$100–$150 hourly. Mercor seeks experienced RN UM directors, case management managers, and physician advisors to evaluate AI tools for clinical review and medical-necessity determination. You will assess AI outputs against InterQual, MCG, and Milliman criteria, manage physician advisor programmes, oversee UM operations, and provide structured feedback to improve AI training datasets. Requires 5+ years' UM/case management experience with 2+ years in leadership, active clinical licensure, and expertise in CMS regulations and payer processes.
Revenue-cycle analytics / decision-support / RCM reporting leader
Mercor seeks experienced RCM analytics and decision-support leaders at $100/hour to evaluate AI-generated revenue cycle intelligence outputs. The role suits healthcare finance professionals with 5+ years' analytics experience and 2+ years in leadership, with proven expertise in KPI dashboards, financial modelling, and healthcare BI tools. You'll develop reporting frameworks, conduct root cause analyses, and annotate AI outputs to strengthen AI training datasets across patient access, coding, billing, and collections.
Revenue-cycle Executive (VP/Sr. Director Revenue Cycle, or RCM-focused Finance Leader)
Mercor is recruiting experienced revenue cycle executives at $162/hour to evaluate AI tools aimed at transforming healthcare financial performance. The role suits VP-level and senior director candidates with 10+ years' leadership experience in RCM operations, payer relations, and financial transformation. You will assess AI-generated analyses, oversee end-to-end revenue cycle strategy, and provide structured feedback to improve AI training datasets. The role is fully remote.
Payment-posting & Reconciliation Manager
$85/hour. Mercor seeks experienced Payment-posting & Reconciliation Managers to evaluate AI tools automating healthcare cash posting and payment workflows. You'll oversee ERA processing, EOB posting, and reconciliation operations while assessing AI-generated outputs for accuracy. The role requires 5+ years in cash posting or revenue cycle work, with 2+ years' management experience and deep knowledge of remittance processing across multiple payers.
Patient Financial Services Leader
Mercor is recruiting a Patient Financial Services Leader at $92/hour to evaluate AI systems designed for self-pay revenue recovery. You'll lead collections operations, assess AI-generated patient communications for compliance and accuracy, and provide structured feedback for AI training. The role suits experienced revenue cycle leaders with deep knowledge of FDCPA compliance, payment plan administration, and patient financial engagement. You'll work directly with a leading AI research lab on frontier healthcare applications.
Underpayment & Managed-care Contract Specialist
Mercor seeks a US-based Underpayment & Managed Care Contract Specialist at $85/hour to evaluate AI systems designed for payment variance detection and contract compliance. You'll analyse underpayments across commercial and managed care contracts, interpret payer terms, identify systematic payment gaps, and provide feedback to train AI models. This role suits healthcare revenue professionals with 5+ years' experience in contract interpretation and recovery operations seeking exposure to frontier AI applications in healthcare finance.
A/R Follow-up Manager
Mercor is recruiting experienced A/R Follow-up Managers at $75/hour to evaluate AI systems designed to automate accounts receivable and payer collections workflows. You'll assess AI-generated follow-up recommendations, manage claim resolution across multiple payers, and provide structured feedback to improve AI training datasets. This suits revenue cycle professionals with 5+ years' A/R experience and at least 2 years in management roles.
Denials Management & Appeals Manager
Mercor is recruiting denials management and appeals managers at $93/hour to evaluate AI tools for denial prevention and appeal automation. The role suits experienced healthcare revenue cycle professionals with 5+ years' background and management experience. You'll assess AI-generated appeals, analyse denial trends, coordinate prevention strategies, and provide structured feedback to train AI systems used across payer operations.
Medical Billing Manager
$80/hour. Mercor seeks experienced billing managers to evaluate AI systems designed for medical claims and revenue cycle workflows. You'll assess AI-generated billing outputs for accuracy and compliance, oversee claims submission operations, and contribute to training datasets on healthcare billing rules. The role suits professionals with 5+ years' medical billing experience, including 2+ years managing teams, combined with expertise in payer requirements, EDI standards, and billing platforms.
Medical Revenue Manager
Mercor seeks a Medical Revenue Manager at $88/hour to evaluate AI tools for healthcare revenue integrity. You'll assess AI-generated charge reviews, conduct charge audits, and maintain CDM systems whilst providing feedback to improve AI training datasets. Suited to charge capture and revenue integrity professionals with 5+ years' experience and CMS/OIG compliance expertise.
Risk-adjustment / HCC coding leader
Mercor is recruiting experienced risk-adjustment and HCC coding leaders at $110/hour to evaluate AI tools that improve risk score accuracy in Medicare Advantage, Medicaid, and ACA programmes. You'll review AI-generated coding assignments against clinical records, oversee RADV audit processes, monitor KPIs, and provide structured feedback to train AI systems. The role demands 5+ years' coding experience with at least 2 years in leadership, plus expertise in CMS-HCC methodologies and regulatory compliance. Ideal candidates hold relevant coding credentials and understand value-based healthcare settings.
Clinical Documentation Integrity (CDI) Leader
Mercor is recruiting experienced CDI leaders at $84/hour for a US-based AI research lab. You'll evaluate AI tools designed to enhance clinical documentation accuracy, coding integrity, and revenue compliance. The role suits clinical documentation specialists with 5+ years' experience and at least 2 years in management, combining physician query oversight, DRG optimisation knowledge, and the ability to assess AI-generated clinical recommendations against compliance standards. The work is fully remote and guides AI system improvements.
Patient Financial Clearance Leader
Mercor is recruiting a Patient Financial Clearance Leader at $135/hour to evaluate AI tools for financial clearance and patient assistance workflows. The role requires five years' experience in patient financial counselling with at least two years in leadership, plus expertise in charity care, Medicaid screening, and revenue cycle management. Responsibilities include assessing AI-generated recommendations, ensuring regulatory compliance, and annotating outputs for AI training. Candidates should possess strong knowledge of 501(r) regulations and proficiency with EHR platforms.
Pharmacy Prior Authorization & Specialty-Medication Access Specialist
Mercor is recruiting pharmacy prior authorization specialists at $75/hour to evaluate AI systems designed for pharmacy benefit management and specialty drug workflows. The role suits professionals with 5+ years' experience in pharmacy PA, specialty medication access, or PBM, ideally holding CPhT or CPAP credentials. You'll assess AI-generated recommendations for accuracy and payer compliance, manage prior authorization requests across commercial and government payers, handle appeals, and provide structured feedback to train AI models.
Prior Authorisation Manager
Mercor is recruiting Prior Authorisation Managers at $150/hour to train AI systems for healthcare workflows. You'll evaluate AI-generated authorisation recommendations, manage end-to-end payer workflows, and annotate clinical outputs to improve AI capabilities. Requires 5+ years' prior authorisation experience, 2+ years in management, clinical licensure or relevant certification, and expertise with commercial, Medicare Advantage, and Medicaid requirements. Ideal for experienced clinical reviewers ready to shape AI development in healthcare revenue cycle.
Insurance Verification & Benefit Manager
Mercor is recruiting insurance verification and benefits managers at $105/hour to evaluate AI systems designed to automate front-end revenue cycle operations. You'll assess AI-generated eligibility outputs, oversee verification workflows across multiple payer types, and provide training feedback to improve AI accuracy. The role suits experienced professionals with 5+ years in insurance verification or benefits management, plus expertise in EDI transactions, payer systems, and EHR platforms.
Patient Access Leader
Mercor seeks an experienced Patient Access Leader at $80/hour to evaluate AI tools for healthcare revenue cycle workflows. You'll assess AI-generated outputs, develop registration policies, monitor KPIs, and ensure HIPAA and CMS compliance. Requires 5+ years in patient access with 2+ years' management experience, proficiency in Epic/Cerner/Meditech, and deep knowledge of insurance verification and point-of-service collections. Fully remote, US-based.
First-Line Supervisors of Police and Detectives
Mercor is engaging experienced first-line supervisors of police and detectives for a three to four week AI research project. The role involves creating domain-specific deliverables and reviewing peer work to advance machine learning systems. You'll work remotely at your own pace, with flexible hours scaling from 20 to 40 per week. Payment includes a $100 onboarding fee, then $1,500–$1,700 on first task completion, followed by hourly rates.
Image Evaluation Generalist
Earning $20–$30 per hour on micro1, this contractor role suits those with strong visual judgement and analytical skills. You'll evaluate images against detailed guidelines, document inconsistencies, and write clear assessments to train AI systems. No AI background required; domain expertise and methodical review experience—whether from arts, QA, research, or similar fields—are valued.
Video related professional for AI training
Micro1 seeks experienced video professionals at $65–$80 per hour to train AI systems. Working remotely as a contractor, you'll analyse video content, provide detailed feedback on AI-generated outputs, and assist in curating training datasets. The role suits filmographers, editors, VFX specialists, and motion artists with strong technical knowledge and communication skills. No AI experience is necessary; your domain expertise in video production is what matters.
Video Editor
Micro1 seeks a video editor at $65–$80 per hour on a contract basis to support AI training via remote work. The role involves evaluating animations and visual effects, organising media assets, and developing evaluation frameworks for AI systems in storytelling and robotics. Candidates need a bachelor's degree, proven post-production expertise including audio editing, and experience with animation or educational content. No AI background required; domain knowledge in video and visual media production is essential.
QA Specialist - Audio Annotation & Diarization (French)
Turing seeks a French-native QA specialist to validate high-quality multilingual audio datasets. The role involves reviewing recorded conversations, checking transcription accuracy and speaker diarization, verifying timestamps, and auditing metadata for safety and privacy compliance. Suited to linguists, language teachers, or professional transcriptionists with meticulous attention to detail and expertise in complex conversational dynamics. Four-week freelance contract.
QA Specialist - Audio Annotation & Diarization (Arabic)
Turing seeks a native Arabic speaker to QA transcribed, multi-channel audio recordings for multilingual AI systems. The role involves verifying audio fidelity, validating human-corrected transcriptions against strict accuracy targets, confirming speaker identification and timestamps in complex overlapping dialogue, and auditing metadata and content for safety and privacy compliance. Ideal candidates include linguists, language teachers, or professional transcriptionists with meticulous attention to detail and deep familiarity with natural, unnormalized speech patterns. Four-week contract position.
QA Specialist - Audio Annotation & Diarization (German)
Turing seeks a German-native QA specialist to validate transcribed multi-channel audio recordings and speaker diarization for multilingual AI systems. The role demands meticulous review of audio fidelity, transcription accuracy against strict word-error-rate targets, speaker attribution in complex overlapping dialogue, and metadata integrity. Ideal candidates include linguists, language teachers, or professional transcriptionists with exceptional attention to detail and native German proficiency. Four-week freelance contract.
QA Specialist - Audio Annotation & Diarization (Japanese)
Turing seeks a native Japanese speaker to serve as QA specialist for a four-week contractor assignment validating transcribed, multi-channel audio recordings and diarization outputs. The role demands meticulous review of audio fidelity, transcription accuracy, speaker identification, and metadata integrity across complex multi-speaker conversations. Ideal candidates possess linguistics expertise, professional transcription experience, or language teaching background with demonstrable attention to detail and familiarity with JSON-formatted data structures.
Physics Professor/Researcher (PhD)
Micro1 seeks physics PhDs or advanced doctoral candidates to develop and review physics problems, solutions, and explanatory content for AI training. The role involves evaluating model outputs for scientific accuracy and pedagogical effectiveness, applying teaching and research expertise to ensure rigorous, authentic deliverables. Contractors work remotely and asynchronously with project teams. Pay is £55–£71 per hour (converted from $70–$90). Ideal for academics with strong publication records and interdisciplinary experience.
Generalist
Micro1 is recruiting Generalists at $10–$15 per hour to contribute to AI training workflows. The role suits versatile professionals with strong English skills and multilingual capability who excel at reviewing datasets, crafting task guidelines, and evaluating model outputs. No AI background is required; your cross-disciplinary expertise and ability to work independently in distributed settings are what matter.
AI Jailbreak & Prompt-Injection Security Expert
Earning $50–$90 per hour, this remote contractor role with micro1 suits security researchers and adversarial ML specialists. You'll design methodologies for evaluating AI system safety, conduct LLM red teaming and prompt-injection testing, build regression test suites, and develop evaluation frameworks to stress-test models against real-world threats. The role requires 2+ years in adversarial machine learning or AI safety, with preference for advanced qualifications and community recognition through published research or open-source contributions.
Video related professional for AI training
Micro1 seeks experienced video professionals at $65–$80 per hour to train AI systems. Working remotely as a contractor, you'll analyse video content, provide detailed feedback on AI-generated outputs, and assist in curating training datasets. The role suits filmographers, editors, VFX specialists, and motion artists with strong technical knowledge and communication skills. No AI experience is necessary; your domain expertise in video production is what matters.
Video Editor
Micro1 seeks a video editor at $65–$80 per hour on a contract basis to support AI training via remote work. The role involves evaluating animations and visual effects, organising media assets, and developing evaluation frameworks for AI systems in storytelling and robotics. Candidates need a bachelor's degree, proven post-production expertise including audio editing, and experience with animation or educational content. No AI background required; domain knowledge in video and visual media production is essential.
Member of Technical Staff, Frontier AI
micro1 seeks a Member of Technical Staff for Frontier AI at $100–$130 per hour, offering full-time remote work. This hands-on role bridges research, data, and deployed systems, requiring ownership of evaluation initiatives, ML dataset design, and failure analysis. You'll translate real-world system behaviour into structured research frameworks, work across teams to raise signal quality, and ensure research claims are defensible and production-ready. Suits those with experience in applied research, RL systems, or agentic AI.
Python Developer
Earning $50–$100 per hour, this remote contract role on micro1 invites experienced backend developers to evaluate AI coding tools by testing models in real-world workflows. You'll design and maintain REST and GraphQL APIs, optimise databases, and provide detailed feedback through incident reports and surveys during intensive testing cycles. Requires 3+ years' Python experience and familiarity with Cursor.
Music Expert — AI Evaluation & Annotation (Generative Music)
Mercor pays $60–$100 hourly for music evaluators and annotators working on generative-music AI. Roles suit sharp listeners without formal training, professional musicians, and audio specialists. Work involves comparing AI-generated tracks, writing technical descriptions, or timing lyrics to audio. Flexible hours with initial calibration required. Genres include pop, hip-hop, rock, and electronic.
Computer Vision Expert
Earning $80–$110 hourly, this part-time remote role suits experienced computer vision practitioners in the US. Working roughly 20 hours weekly through Mercor on behalf of a leading AI lab, you'll design demanding vision tasks, build executable tests in Python, and evaluate frontier model performance. The focus spans detection, segmentation, recognition, and multimodal reasoning. You'll identify capability gaps and collaborate with other specialists to maintain evaluation consistency.
Member of Technical Staff, Legal Research
micro1 seeks a Member of Technical Staff for legal AI research. The stated hourly rate is $7–$8; however, the full-time base salary is $200,000–$250,000 with equity and performance bonuses. You'll design evaluation frameworks for AI legal agents, conduct original research on legal reasoning and workflow automation, and develop benchmarks for complex legal tasks. Requires a JD, LL.M., SJD, or equivalent with deep expertise in corporate law, contracts, litigation, compliance, IP, employment law, or policy. Best suited to researchers with interdisciplinary experience and strong analytical writing skills.
Member of Technical Staff, Finance Research
micro1 seeks a Member of Technical Staff for Finance Research to build evaluation frameworks for AI agents in financial domains. This full-time remote role suits PhD-holders or advanced finance professionals with deep expertise in capital markets, risk management, or quantitative finance. You'll conduct original research, develop benchmarks, curate datasets, and collaborate across research and engineering teams to advance enterprise financial AI capabilities.
Russian Audio Generalist Evaluator Expert (San Francisco Bay Area)
Mercor is recruiting a Russian Audio Generalist Evaluator Expert at $50/hour for a short-term engagement in the San Francisco Bay Area. The role involves transcribing and annotating Russian audio content, developing evaluation standards for language models, and conducting quality assurance work. Ideal candidates are fluent in Russian and English with strong analytical skills; college students or those with linguistics, translation, or research backgrounds are particularly suited to this audio AI research project.
Finance Expert (US based)
Turing seeks finance experts (US-based) with 2+ years' experience in capital markets, trading, investment banking, private equity, or related fields to evaluate and improve AI language models. You'll develop assessment rubrics, collaborate with researchers on training methods, and shape benchmarks for financial AI applications. Flexible, 10–30 hours weekly for approximately one month, with potential extension. No AI background required.
Remote Legal Expert
Turing seeks U.S. legal experts holding a J.D. and active or inactive bar admission, with 3+ years' practice or teaching experience. The role involves evaluating AI-generated responses to legal hypotheticals, applying structured rubrics to assess reasoning and accuracy, and providing detailed feedback to refine language models. Flexible freelance engagement of 10–30 hours weekly, initially one month with possible extension.
Remote Finance & Research Analyst
Turing seeks finance specialists with 2+ years' experience across capital markets, trading, investment banking, private equity, or related domains. You'll evaluate AI language models on financial tasks, develop assessment rubrics, and collaborate with researchers to enhance model performance. No AI background required—only strong financial expertise and excellent English communication. Flexible remote work, 10–30 hours weekly for approximately one month.
