Description

I’m a Data Scientist specializing in knowledge representation, graph-based systems, and applied machine learning.

I design and productionize semantic systems for audience and identity mapping, using embeddings, probabilistic matching, and graph-based approaches. My work on the Audience Translator platform has enabled 10K+ users to map taxonomies to platforms like Meta and Google.

I build end-to-end solutions combining LLMs, retrieval (RAG), AI agents, and structured data, including taxonomy enrichment, automated validation workflows, and scalable Python pipelines.

More recently, I’ve focused on AI agent systems, developing a LangGraph-based multi-agent framework to automate ad-tech platform research, extracting API insights, audience taxonomies, and reach estimates.

I’m particularly interested in building intelligent systems at the intersection of structured knowledge, retrieval, and AI agents.

My portfolio:

Industry field of expertise

Languages

English
Native or bilingual

Workplace preferences

Can work on-site

Preston (up to 50km)

WPP
Data Scientist
TECH
June 2023 - Today (3 years)
London, UK
Designed and optimized a semantic knowledge representation system that models relationships between audiences using semantic similarity and weighted graph similarity, enabling more than 10K users in a single quarter to map source taxonomies to ad-tech platforms like Meta and Google.

Partnered with engineers to automate mapping pipelines using Airflow, reducing turnaround time from weeks to under one week through automated similarity scoring, KPI calculations, and reporting.

Built and maintain a Python package for taxonomy similarity, adopted across teams, with features for data cleaning, KPI generation, model evaluation, results evaluation and configurable weighting of embedding models.

Integrated pre-trained and fine-tuned BERT-based models, developed a weighted similarity algorithm, and continuously improved mapping performance.
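
As a rough illustration of the weighted similarity idea, the sketch below blends cosine similarities from several embedding models using configurable weights. The model names, weights, and toy vectors are hypothetical stand-ins, not the production package's API or data.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def weighted_similarity(source_vecs, target_vecs, weights):
    """Blend per-model cosine similarities into one score.

    source_vecs / target_vecs: {model_name: embedding} for one label each.
    weights: {model_name: non-negative weight}; normalised to sum to 1.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(
        (w / total) * cosine(source_vecs[name], target_vecs[name])
        for name, w in weights.items()
    )


# Toy embeddings standing in for the outputs of two hypothetical models.
source = {"model_a": [1.0, 0.0], "model_b": [1.0, 1.0]}
target = {"model_a": [1.0, 0.0], "model_b": [1.0, -1.0]}
score = weighted_similarity(source, target, {"model_a": 3.0, "model_b": 1.0})
```

Here model_a sees the two labels as identical and model_b sees them as unrelated, so the 3:1 weighting yields a blended score of 0.75.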

Developed an evaluation framework with text corruption strategies, ground-truth validation, and confusion matrices to measure model accuracy.
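
A minimal sketch of the text-corruption idea, under stated assumptions: labels are corrupted by randomly dropping characters, then matched back against the clean list, and the share that still maps to its original is reported. The matcher here is a difflib stand-in rather than an embedding model, and the labels, rate, and function names are hypothetical.

```python
import difflib
import random


def corrupt(text, rate, rng):
    """Randomly drop characters to simulate noisy labels."""
    kept = [ch for ch in text if rng.random() >= rate]
    return "".join(kept) or text  # fall back to the original if everything was dropped


def best_match(query, candidates):
    """Stand-in matcher: candidate with the highest difflib ratio."""
    return max(
        candidates,
        key=lambda c: difflib.SequenceMatcher(None, query, c).ratio(),
    )


def accuracy_under_corruption(labels, rate, seed=0):
    """Share of corrupted labels that still map back to their original."""
    rng = random.Random(seed)
    hits = sum(
        best_match(corrupt(label, rate, rng), labels) == label
        for label in labels
    )
    return hits / len(labels)


labels = ["Sports Fans", "Luxury Travellers", "Home Cooks", "Pet Owners"]
clean_accuracy = accuracy_under_corruption(labels, rate=0.0)
noisy_accuracy = accuracy_under_corruption(labels, rate=0.3)
```

With a corruption rate of zero every label matches itself, which gives a useful sanity check before comparing models at higher rates.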

Implemented LLM-based features (GPT-4, Gemini 1.5 Pro) for taxonomy enrichment and lightweight RAG, improving semantic accuracy without relying on external databases.

Built an LLM supervision layer, using GPT-4/4o, to pre-validate mappings before human review, significantly reducing manual effort and improving efficiency.

Investigated geo-targeted advertising inaccuracies in Snowflake, improving device and email matching via probabilistic methods; developed a QA framework for partner data validation and onboarding.

Consulted on email matching optimization for campaigns using probabilistic matching algorithms and normalization techniques.
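
The normalization side of email matching can be sketched as follows. This covers deterministic normalization only, not the probabilistic matching itself, and the specific rules are assumptions for illustration: stripping "+tag" suffixes is not valid for every provider, and the function names are hypothetical.

```python
import hashlib


def normalize_email(raw):
    """Lowercase, trim, and drop '+tag' suffixes; strip dots for Gmail only."""
    local, _, domain = raw.strip().lower().partition("@")
    local = local.split("+", 1)[0]  # assumption: provider ignores '+tag'
    if domain in {"gmail.com", "googlemail.com"}:
        local = local.replace(".", "")  # Gmail ignores dots in the local part
    return f"{local}@{domain}"


def email_key(raw):
    """SHA-256 of the normalised address, usable as a join key."""
    return hashlib.sha256(normalize_email(raw).encode("utf-8")).hexdigest()
```

For example, `normalize_email("  Jane.Doe+news@Gmail.com ")` returns `janedoe@gmail.com`, so both spellings hash to the same key.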

Built a LangGraph-based multi-agent system for ad-tech platform research to retrieve API details, audience taxonomies, and reach estimates, and automatically generate validated markdown reports with details and code snippets.

Developed an MCP server to expose the semantic similarity system to AI agents.
Python Data science LLMs LangGraph Snowflake
Choreograph (WPP Company)
Junior Data Scientist
TECH
March 2022 - June 2023 (1 year and 3 months)
London, UK
Participated in WPP Data Challenge #4 and #5. Winner of Data Challenge #5.

Contributed to the Audience Knowledge Graph (AKG) project, developing a taxonomy mapping module, data cleaning steps for data clean rooms, and a Looker Studio dashboard for monitoring each module's progress, among other tasks.

Created a Python package to expose the taxonomy mapping project to different teams within the company. The package uses pre-trained BERT language models from the Hugging Face library to generate vector embeddings, which are then compared using cosine similarity.

Focused on taxonomy mapping, also known as Named Entity Resolution (NER). Improved the previously developed algorithm by removing loops, reducing time complexity from O(n) to O(1).
Python python package Embedding models semantic similarity Machine Learning Algorithms
WPP
WPP NextGen Leader
TECH
June 2022 - August 2022 (2 months)
London, UK
10-week internship at WPP, designed to give an understanding of how WPP, as a group of agencies, works in the creative world, helps its clients grow, and builds advertising campaigns.
Advertising Tech Lead marketing campaigns

MSc Applied
University of Central Lancashire
2021
B.E
College of EME, National University of Sciences and Technology
2020

Hands-On Essentials: Data Warehousing Workshop
Snowflake
2022
https://achieve.snowflake.com/375fe8fb-ad6b-4a6f-805c-06a409fad00d#acc.vHPUZag2
Data analysis Databases Snowflake Data science Data Warehouse
Hands-On Essentials: Data Application Builders Workshop
Snowflake
2022
https://achieve.snowflake.com/974d0ca9-5896-41c0-878d-404b6b993663#acc.fvT97zFX
Data analysis Databases Snowflake Data science Data Warehouse

Danial Khilji

Data Scientist
