Description

AI Reviewer available immediately for RLHF & AI evaluation projects — Remote worldwide

Hi, I’m Stéphane — an AI evaluation specialist with 3.5 years of experience in content moderation and policy enforcement for a major social media platform.

I specialize in high-precision decision-making, safety analysis, and complex evaluation frameworks — now applied to AI training and LLM evaluation.

🔍 What I bring to your project:

• Experience working on large-scale moderation systems with strict accuracy and SLA requirements

• Expertise in policy-based evaluation and nuanced decision-making

• Proven ability to maintain high accuracy (95%+) in high-volume environments

• Strong analytical skills for detecting edge cases, inconsistencies, and hidden risks

• Deep understanding of content safety, compliance, and contextual classification

🤖 AI & LLM experience:

• AI response evaluation (quality, safety, factuality, relevance)

• RLHF tasks (ranking, comparison, prompt evaluation)

• Data annotation & labeling (text, image, video, audio)

• Multimodal analysis (image-text coherence, contextual alignment)

🎯 Focus areas:

I’m particularly interested in advanced AI projects involving:

• LLM testing & evaluation

• Safety and alignment

• Complex annotation workflows

💬 Why work with me:

• Reliable, detail-oriented, and a fast learner

• Consistent performance on long-term projects

• Clear communication and professional delivery

💰 Rate: ~$200/day (~$27/hour), flexible depending on project scope.

Available for freelance assignments, with availability adjustable to project needs.

Languages

Portuguese: Native or bilingual
French: Native or bilingual
English: Fluent
Spanish: Basic

Workplace preferences

Remote only

Primarily works remotely

Outlier
AI Trainer / RLHF Evaluator
TECH
March 2026 - Today (3 months)
Contributed to the evaluation and optimization of large language models (LLMs) through high-level Reinforcement Learning from Human Feedback (RLHF) workflows, supporting the development of reliable and production-ready AI systems.

🔍 Core Contributions:

• Performed in-depth evaluation of AI-generated outputs, assessing accuracy, coherence, safety, and contextual relevance.
• Ranked and compared model responses to improve alignment, quality, and user-facing performance.
• Executed advanced data annotation across text and multimodal datasets, including image-text coherence validation.
• Conducted prompt evaluation and stress-testing to identify edge cases, inconsistencies, and potential failure modes.

⚙️ Methodology & Expertise:

• Applied rigorous evaluation frameworks and complex guidelines to ensure consistency and scalability.
• Demonstrated strong analytical judgment in identifying subtle errors, risks, and nuanced contextual issues.
• Contributed to high-quality datasets used for training and refining AI systems.

🎯 Specialization:
• AI Quality Evaluation • RLHF • LLM Testing • Multimodal Validation • AI Safety

💼 Focused on delivering high-precision AI evaluation for advanced and production-level AI systems.
RLHF Trainer (Reinforcement Learning from Human Feedback) • AI Trainer - Data Annotator • Multimodal Evaluator • Multilingual Evaluator • Fact-Checking
Accenture
Content Moderator
SOCIAL NETWORKS
November 2022 - Today (3 years and 7 months)
Lisbon, Portugal
Ensured the quality, safety, and compliance of user-generated content for a major social media platform, operating in high-volume and high-stakes environments.

🔍 Key Contributions:

• Reviewed and evaluated large volumes of content (text, image, video, audio) to enforce platform policies and safety standards.
• Applied advanced contextual judgment to classify complex and ambiguous cases, including sensitive and high-risk content.
• Identified, flagged, and escalated critical issues, ensuring rapid and accurate decision-making.

⚙️ Performance & Impact:

• Achieved 95% evaluation accuracy (target: 90%), demonstrating strong analytical precision and consistency.
• Recognized as Top Performer (French Market) for two consecutive months in 2025, based on quality and reliability metrics.

🎯 Core Strengths:

• Policy-based evaluation & complex decision frameworks.
• Risk detection, edge case analysis & content safety.
• High attention to detail in fast-paced environments.

💡 This experience directly supports AI evaluation tasks such as RLHF, LLM response assessment, and multimodal data analysis.
Multimodal Evaluator • Portuguese • French • Safety Evaluation • Quality Auditing