Danial Khilji

Data Scientist

€289/day
Preston, GB
3-7 years

Average response time: 1 hour

About Danial

I’m a Data Scientist specializing in knowledge representation, graph-based systems, and applied machine learning.

I design and productionize semantic systems for audience and identity mapping, using embeddings, probabilistic matching, and graph-based approaches. My work on the Audience Translator platform has enabled 10K+ users to map taxonomies to platforms like Meta and Google.

I build end-to-end solutions combining LLMs, retrieval (RAG), AI agents, and structured data, including taxonomy enrichment, automated validation workflows, and scalable Python pipelines.

More recently, I’ve focused on AI agent systems, developing a LangGraph-based multi-agent framework to automate ad-tech platform research, extracting API insights, audience taxonomies, and reach estimates.

I’m particularly interested in building intelligent systems at the intersection of structured knowledge, retrieval, and AI agents.

Languages

  • English: Native or bilingual

Can work on-site
Preston (up to 50km)

Experience

  • WPP
    Data Scientist
    TECH
    June 2023 - Today (3 years)
    London, UK
    Designed and optimized a semantic knowledge representation system that models relationships between audiences using semantic similarity and weighted graph similarity, enabling more than 10K users in a single quarter to map source taxonomies to ad-tech platforms such as Meta and Google.

    Partnered with engineers to automate mapping pipelines using Airflow, reducing turnaround time from weeks to under one week through automated similarity scoring, KPI calculations, and reporting.

    Built and maintain a Python package for taxonomy similarity, adopted across teams, with features for data cleaning, KPI generation, model evaluation, results evaluation and configurable weighting of embedding models.

    Integrated pre-trained and fine-tuned BERT-based models, developed a weighted similarity algorithm, and continuously improved mapping performance.

    Developed an evaluation framework with text corruption strategies, ground-truth validation, and confusion matrices to measure model accuracy.

    Implemented LLM-based features (GPT-4, Gemini 1.5 Pro) for taxonomy enrichment and lightweight RAG, improving semantic accuracy without relying on external databases.

    Built an LLM supervision layer, using GPT-4/4o, to pre-validate mappings before human review, significantly reducing manual effort and improving efficiency.

    Investigated geo-targeted advertising inaccuracies in Snowflake, improving device and email matching via probabilistic methods; developed a QA framework for partner data validation and onboarding.

    Consulted on email matching optimization for campaigns using probabilistic matching algorithms and normalization techniques.

    Built a LangGraph-based multi-agent system for ad-tech platform research to retrieve API details, audience taxonomies, and reach estimates, and automatically generate validated markdown reports with details and code snippets.

    Developed an MCP server to expose the semantic similarity system to AI agents.
    Python Data science LLMs LangGraph Snowflake
  • Choreograph (WPP Company)
    Junior Data Scientist
    TECH
    March 2022 - June 2023 (1 year and 3 months)
    London, UK
    Participated in WPP Data Challenges #4 and #5; won Data Challenge #5.

    Contributed to the Audience Knowledge Graph (AKG) project, developing the taxonomy mapping module, data cleaning steps for data clean rooms, and a Looker Studio dashboard for monitoring the progress of each module, among other tasks.

    Created a Python package to make the taxonomy mapping project available to other teams within the company. The package uses pre-trained BERT language models from Hugging Face to generate vector embeddings, which are then compared using cosine similarity.

    Focused on taxonomy mapping work, also known as Named Entity Resolution (NER). Improved the previously developed algorithm by removing loops and reducing time complexity from O(n) to O(1).
    Python python package Embedding models semantic similarity Machine Learning Algorithms
  • WPP
    WPP NextGen Leader
    TECH
    June 2022 - August 2022 (2 months)
    London, UK
    10-week internship at WPP, giving an understanding of how WPP, as a group of agencies, works in the creative world, helps its clients grow, and builds advertising campaigns.
    Advertising Tech Lead marketing campaigns
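
The weighted similarity across embedding models mentioned in the experience above can be pictured with a minimal sketch. It is an illustration under stated assumptions, not the actual package: the function names, the model names `model_a` and `model_b`, and the toy vectors are all hypothetical, and in practice the vectors would be embeddings produced by BERT-based models.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here: a zero vector matches nothing
    return dot / (norm_a * norm_b)


def weighted_similarity(source_vecs, target_vecs, weights):
    """Weighted average of per-model cosine similarities.

    source_vecs / target_vecs: dict of model name -> embedding of one label.
    weights: dict of model name -> non-negative weight (hypothetical config).
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    score = sum(
        w * cosine_similarity(source_vecs[name], target_vecs[name])
        for name, w in weights.items()
    )
    return score / total


# Toy vectors standing in for real embeddings of one source and one target label.
source = {"model_a": [1.0, 0.0], "model_b": [1.0, 0.0]}
target = {"model_a": [1.0, 0.0], "model_b": [0.0, 1.0]}
score = weighted_similarity(source, target, {"model_a": 3.0, "model_b": 1.0})
```

Normalizing by the sum of the weights keeps the combined score on the same scale as a single cosine similarity, so thresholds stay comparable when the weighting is reconfigured.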

Education

  • MSc Applied
    University of Central Lancashire
    2021
  • B.E
    College of EME, National University of Sciences and Technology
    2020
