Stéphane F.

AI Reviewer | RLHF Evaluator | LLM Evaluation

€189/day
Lisbon, PT
0-2 years

Average response time: 1 hour

About Stéphane

AI Reviewer available immediately for RLHF & AI evaluation projects — Remote worldwide

Hi, I’m Stéphane — an AI evaluation specialist with 3.5 years of experience in content moderation and policy enforcement for a major social media platform.

I specialize in high-precision decision-making, safety analysis, and complex evaluation frameworks — now applied to AI training and LLM evaluation.

🔍 What I bring to your project:

• Experience working on high-scale moderation systems with strict accuracy and SLA requirements
• Expertise in policy-based evaluation and nuanced decision-making
• Proven ability to maintain high accuracy (95%+) in high-volume environments
• Strong analytical skills for detecting edge cases, inconsistencies, and hidden risks
• Deep understanding of content safety, compliance, and contextual classification

🤖 AI & LLM experience:

• AI response evaluation (quality, safety, factuality, relevance)
• RLHF tasks (ranking, comparison, prompt evaluation)
• Data annotation & labeling (text, image, video, audio)
• Multimodal analysis (image-text coherence, contextual alignment)

🎯 Focus areas:

I’m particularly interested in advanced AI projects involving:
• LLM testing & evaluation
• Safety and alignment
• Complex annotation workflows

💬 Why work with me:

• Reliable, detail-oriented, and a fast learner
• Strong consistency in long-term projects
• Clear communication and professional delivery

💰 Rate: ~$200/day (~$27/hour), flexible depending on project scope.

Available for freelance missions — flexible depending on project needs.

Languages

  • Portuguese: Native or bilingual
  • French: Native or bilingual
  • English: Fluent
  • Spanish: Basic

Remote only
Primarily works remotely

Experience

  • Outlier
    AI Trainer / RLHF Evaluator
    TECH
    March 2026 - Today (3 months)
    Contributed to the evaluation and optimization of large language models (LLMs) through high-level Reinforcement Learning from Human Feedback (RLHF) workflows, supporting the development of reliable and production-ready AI systems.

    🔍 Core Contributions:

    • Performed in-depth evaluation of AI-generated outputs, assessing accuracy, coherence, safety, and contextual relevance.
    • Ranked and compared model responses to improve alignment, quality, and user-facing performance.
    • Executed advanced data annotation across text and multimodal datasets, including image-text coherence validation.
    • Conducted prompt evaluation and stress-testing to identify edge cases, inconsistencies, and potential failure modes.

    ⚙️ Methodology & Expertise:

    • Applied rigorous evaluation frameworks and complex guidelines to ensure consistency and scalability.
    • Demonstrated strong analytical judgment in identifying subtle errors, risks, and nuanced contextual issues.
    • Contributed to high-quality datasets used for training and refining AI systems.

    🎯 Specialization:
    • AI Quality Evaluation • RLHF • LLM Testing • Multimodal Validation • AI Safety

    💼 Focused on delivering high-precision AI evaluation for advanced and production-level AI systems.
    • RLHF Trainer (Reinforcement Learning from Human Feedback) • AI Trainer - Data Annotator • Multimodal Evaluator • Multilingual Evaluator • Fact-Checking
  • Accenture
    Content Moderator
    SOCIAL NETWORKS
    November 2022 - Today (3 years and 7 months)
    Lisbon, Portugal
    Ensured the quality, safety, and compliance of user-generated content for a major social media platform, operating in high-volume and high-stakes environments.

    🔍 Key Contributions:

    • Reviewed and evaluated large volumes of content (text, image, video, audio) to enforce platform policies and safety standards.
    • Applied advanced contextual judgment to classify complex and ambiguous cases, including sensitive and high-risk content.
    • Identified, flagged, and escalated critical issues, ensuring rapid and accurate decision-making.

    ⚙️ Performance & Impact:

    • Achieved 95% evaluation accuracy (target: 90%), demonstrating strong analytical precision and consistency.
    • Recognized as Top Performer (French Market) for two consecutive months (2025), based on quality and reliability metrics.

    🎯 Core Strengths:

    • Policy-based evaluation & complex decision frameworks.
    • Risk detection, edge case analysis & content safety.
    • High attention to detail in fast-paced environments.

    💡 This experience directly supports AI evaluation tasks such as RLHF, LLM response assessment, and multimodal data analysis.
    • Multimodal Evaluator • Portuguese • French • Safety Evaluation • Quality Auditing
