**Fluent Language Skills Required:** English
_\*\*Blended rate to support multiple geographies_
**Why This Role Exists**
Mercor partners with leading AI teams to improve the quality, usefulness, and reliability of general-purpose conversational AI systems. These systems are used across a wide range of everyday and professional scenarios, and their effectiveness depends on how clearly, accurately, and helpfully they respond to real user questions.
In healthcare-related scenarios, accuracy and clarity are essential. This project focuses on **evaluating and improving how conversational AI systems respond to medical and healthcare topics**. Your expertise helps ensure responses are factually correct, clearly explained, and aligned with real-world healthcare knowledge and communication standards.
**What You’ll Do**
- **Write and refine prompts** to guide model behavior in healthcare scenarios
- **Evaluate LLM-generated responses** to healthcare-related queries for accuracy, reasoning, clarity, and completeness
- **Conduct fact-checking of all medical and healthcare claims** using trusted public sources and authoritative references
- **Annotate model responses** by identifying strengths, areas of improvement, and factual or conceptual inaccuracies
- **Assess tone, completeness, and appropriateness of responses** for real-world healthcare use
- Ensure **model responses align with expected conversational behavior** and system guidelines
- **Apply consistent evaluation standards** by following clear taxonomies, benchmarks, and detailed evaluation guidelines
**Who You Are**
- You have **a minimum of 5 years of real-world professional experience in Healthcare**, supported by an **associated expert degree** (e.g., MD, DO, RN, NP, PA, PharmD, MPH, or equivalent)
- You have experience in **one or more of the following sub-domains**:
  - General Clinical Care
  - Specialty Medicine or Surgery
  - Diagnostics, Imaging & Laboratory Medicine
  - Public Health, Healthcare Systems & Administration
- You have **significant experience using large language models** (LLMs) and understand how and why people use them
- You have **excellent written communication skills for complex medical topics**
- You have **strong attention to detail** and are **comfortable evaluating clinical reasoning and medical explanations**, identifying subtle inaccuracies or gaps that others may overlook
**Nice-to-Have Specialties**
- Prior experience with RLHF, model evaluation, or data annotation work
- Experience writing or editing high-quality medical or healthcare-related content
- Experience in clinical documentation, charting, or patient communication, including explaining medical information to non-clinical audiences
- Familiarity with evaluation rubrics, benchmarks, or quality scoring systems
**What Success Looks Like**
- You identify medical inaccuracies, unclear explanations, or unsafe reasoning patterns
- Your feedback improves the clarity and reliability of healthcare-related AI responses
- You deliver reproducible evaluation artifacts that strengthen model performance
- Mercor customers trust their AI systems in healthcare contexts because you’ve rigorously evaluated them
**Why Join Mercor**
At Mercor, healthcare professionals play a direct role in shaping how AI systems communicate about medical and health-related topics. This role allows you to apply your expertise beyond traditional settings while contributing to the development of more accurate and reliable healthcare AI systems.