Scale AI training operations by evaluating and stress-testing next-generation AI models. Perform high-level RLHF (reinforcement learning from human feedback) and model evaluation: analyze AI-generated content across domains, fact-check complex outputs, catch subtle hallucinations, and provide detailed, constructive feedback to improve model reasoning and accuracy. Simulate real-world scenarios to gauge model utility.

Requirements: native-level English (C1/C2); strong analytical rigor, with the ability to spot logical flaws, edge cases, and anomalies; and the autonomy and organization needed for reliable remote work. A background in research, editorial work, data annotation, or tech evaluation is a bonus, with emphasis on refined judgment and 'good taste' for high-quality content.

Up to 1,000 contractor openings; the hourly rate varies by expertise, complexity, and location. Applying requires completing an AI-driven interview through a link sent by email after you submit your application.

Stealth company
