ML engineering is judgement about how models are built and where they break. AI systems need engineers who can review that reasoning.
This isn’t advisory work and it isn’t labelling. You’re reviewing real ML and data work and explaining your judgement clearly enough that an AI can learn from it — paid for your expertise, by the hour.
Applied Clinical Judgement connects qualified people to vetted platforms, and Sean Key personally vouches for those he refers. We’re paid a referral fee by the platform on a successful placement — never by you. The roles below are live today.
27 live Machine Learning Engineer roles · updated daily
Machine Learning Engineer Talent Network
Mercor's Machine Learning Engineer Expert Network offers $70–$250 per hour for contract work on AI model training and evaluation projects. Suited to experienced ML engineers with Python, PyTorch/TensorFlow, and MLOps expertise, this rolling application pools candidates for future opportunities of 15–30 hours per week. Once vetted, you work remotely on frontier AI research tasks at your own pace.
Forward Deployed Engineer
micro1 seeks a Forward Deployed Engineer at $180,000–$250,000 base salary (with equity and performance bonuses) for a full-time, US-based role combining research, infrastructure, and partner engagement. You'll work directly with leading AI labs and enterprises designing data systems, implementing ML pipelines, developing LLM applications including multi-agent workflows, and building infrastructure for model inference and evaluation. Suited to experienced Python engineers comfortable with ambiguity, product ownership, and translating AI problems into production systems.
Member of Technical Staff, Coding Research
micro1 seeks a Member of Technical Staff for coding research on a full-time remote basis with a base salary of $140,000–$180,000 USD. You will design evaluation frameworks and benchmarks for frontier coding agents, develop datasets and assessment protocols, and analyse model behaviour to improve performance. The role suits software engineers with 3+ years' experience in ML, AI research, or evaluation, with strong Python or C++ skills and familiarity with LLMs and coding systems.
Member of Technical Staff, Forward Deployed (US Gov)
micro1 seeks a Member of Technical Staff to develop and deploy agentic AI systems for U.S. Government missions at $40–$60 per hour. The hybrid role, based in Washington, D.C., spans model experimentation, infrastructure, and forward-deployed work with strategic partners. You'll build LLM applications, design data pipelines, and own systems from discovery through deployment. Requires strong Python expertise, LLM experience, and comfort in high-security environments. Security clearance eligibility and a government mission background are preferred.
AI/ML Engineer
micro1 seeks an AI/ML Engineer at $70–$200 per hour for a full-time hybrid role in Washington, D.C. You will design production-grade LLM and RAG systems, orchestrate multi-agent frameworks, and build secure data pipelines across government cloud environments. The work combines frontier AI development with mission-critical government initiatives, requiring hands-on expertise in Python, cloud AI services, and modern DevOps practices. U.S. citizenship or Green Card status and security clearance eligibility required.
Forward Deployed Engineer, U.S. Government
micro1 seeks a Forward Deployed Engineer for full-time hybrid work in Washington, D.C., supporting U.S. Government missions. The role combines building agentic AI systems, LLM applications, and data pipelines with forward-deployed collaboration on mission-critical projects. You'll own systems across discovery, architecture, deployment, and iteration. Base salary $200,000–$260,000 USD plus equity and performance bonuses.
Machine Learning Engineer
Contractor role paying $40–$100 per hour via micro1. Machine Learning Engineers will design and develop models using Python, TensorFlow, PyTorch or scikit-learn, whilst managing data pipelines with MongoDB. The work encompasses model evaluation, hyperparameter tuning, and operationalisation in enterprise environments. Suits experienced practitioners with proven track records delivering real-world ML solutions and strong technical documentation skills.
MLE Bench – ML Engineers
Turing seeks Machine Learning Engineers with 3+ years' experience for benchmark-driven evaluation work on production ML systems. You'll build and modify training pipelines, prepare datasets, debug complex codebases, and collaborate on real-world ML engineering tasks. Strong Python proficiency, hands-on pipeline experience, and understanding of ML fundamentals required. Minimum 20 hours weekly with 4-hour PST overlap; 3-month contractor role.
Python Machine Learning Engineer
Turing seeks a Python Machine Learning Engineer with 4+ years' hands-on experience to design and deploy end-to-end ML solutions. The role demands expertise across supervised/unsupervised learning, NLP, computer vision, and statistical modelling, with proven ability to build production-grade systems. You'll translate business objectives into robust architectures, collaborate across functions, and stay abreast of AI research. Competitive ML experience is valued. Contractor basis, minimum 20 hours weekly with PST overlap.
Senior Software Engineer – C++ (LLM Evaluation & Repository Validation)
Turing seeks experienced C++ engineers (3+ years) to evaluate LLM performance on real-world software tasks. You'll analyse GitHub repositories, set up development environments, assess test coverage, and help identify challenging coding problems for AI systems. Work involves hands-on coding, Docker configuration, and collaboration with researchers building evaluation datasets. Fully remote; minimum 20 hours weekly with PST overlap required.
Senior LLM Engineer
Turing seeks a Senior LLM Engineer with 7–12 years' experience for a full-time, hands-on role in India. You'll design and deploy generative AI systems using Python, LangChain, and RAG pipelines, translating business requirements into production-ready solutions. The position demands deep technical proficiency in LLM internals, cloud platforms, and system architecture, alongside cross-functional collaboration with engineering, data, and business teams.
AI Engineer
Contractor AI Engineer role on micro1, paying $30–$90 per hour. You'll design and deploy production machine learning models, build ML pipelines with CI/CD practices, and manage Kubernetes workloads on AWS. The role suits engineers with strong programming skills in Python or Java and proven expertise in ML algorithms and cloud infrastructure. Domain knowledge matters more than prior AI experience.
Human Baseliner for Open-Ended ML Research Tasks
Mercor seeks experienced ML engineers and researchers as human baseliners, paying $75–$90 hourly. You'll complete open-ended ML research tasks in sandboxed environments, establishing performance benchmarks against frontier AI agents. Requires 3+ years' ML experience (including PhD time), top-100 university or FAANG background, expertise in PyTorch/JAX/TensorFlow, and deep hands-on knowledge in pretraining, reinforcement learning, post-training, dataset curation, or model architecture. Minimum 20 hours weekly commitment.
LLM Expert - Chemistry
Turing seeks a Chemistry LLM Expert to develop datasets, benchmarks, and evaluation frameworks for language models in chemistry and materials science. You'll create reference answers and grading rubrics, assess AI-generated responses for scientific accuracy, and build Python-based evaluation pipelines. The 24-week contract requires a Master's or Ph.D. in Chemistry, Chemical Engineering, or Materials Science, plus Python proficiency and familiarity with LLM evaluation and prompt engineering.
Member of Technical Staff, Research Engineering
micro1 seeks a Research Engineer to develop reinforcement learning systems at scale. The full-time remote role, paying $140,000–$180,000 USD base salary, involves architecting RL environments, designing training pipelines, building synthetic data systems, and establishing evaluation frameworks. You'll fine-tune open-source models and contribute to benchmark releases. Suited to engineers with deep RL experience, proven track records scaling RL systems, and familiarity with automated evaluation and data generation workflows.
Member of Technical Staff, Enterprise AI
micro1 seeks a Member of Technical Staff, Enterprise AI on a full-time remote basis, paying $100–$130 hourly ($140,000–$180,000 annually). You'll embed within enterprise AI workflows as a research partner, identifying failure modes and running tight experimental cycles to improve system performance. Requires a master's in computer science, machine learning, or a related field, with proven ability to design datasets, evaluate ML systems, and translate operational problems into structured research. Suits those comfortable with ambiguity, RL environments, and fast iteration.
Computer Vision Expert
Paying $80–$110 per hour, this part-time remote role suits experienced computer vision practitioners in the US. Working roughly 20 hours weekly through Mercor on behalf of a leading AI lab, you'll design demanding vision tasks, build executable tests in Python, and evaluate frontier model performance. The focus spans detection, segmentation, recognition, and multimodal reasoning. You'll identify capability gaps and collaborate with other specialists to maintain evaluation consistency.
Machine Learning & NLP Expert
$80–$110 per hour. A leading AI lab seeks experienced ML and NLP practitioners to design challenging real-world tasks, generate reference solutions, and evaluate frontier model outputs on Mercor. You'll identify reasoning gaps and capability limitations through rigorous testing. Suited to those with applied industry or graduate research backgrounds who can work independently for approximately 20 hours weekly.
What the work looks like
Reviewing the model's work
Read what an AI produced in your field and judge whether the reasoning holds — mark where it went wrong.
Setting hard problems
Write the realistic, demanding tasks that separate competent work from confident-but-wrong.
Judging AI answers
Compare two AI outputs and say which is stronger, and why — your written reasoning is what the model learns from.
Common questions
How much does it pay?
Hourly and contractor-based, varying with seniority and role. Every role card shows its pay band. You invoice as an independent contractor and choose your hours.
Can I do this alongside my current job?
Yes — the work is flexible and part-time by design. Check your employer's policy on outside work first; ACJ can't advise on that.
Who is ACJ, and what's your part in this?
Applied Clinical Judgement is run by Sean Key. We connect qualified people to vetted AI-training platforms (Mercor, micro1, Turing), and Sean personally vouches for the people he refers. We're paid a referral fee by the platform on a successful placement — never by you.
How do I get started?
Find a role below that fits, and apply through the link — it carries Sean's referral. If you'd like him to vouch for you or talk it through first, book a short call.
Sean Key vouches for the people he refers
I’m Sean Key, editor of Applied Clinical Judgement. After 29 years in the NHS I help qualified professionals find legitimate, well-paid AI-training work — and I’ll personally vouch for you when you apply.
Applied Clinical Judgement is a referral intermediary, not an employer or recruiter. We refer candidates to third-party platforms (Mercor, micro1, Turing) and may earn a referral fee on a successful placement. We never charge candidates. Pay rates are set by the platforms and may change. PRAG-DEL-SOL-ONE LTD · Co. 07204925 · VAT 987-3626-64 · ICO ZC086000.
