Wendy Ran Wei
New book: Advanced Retrieval-Augmented Generation is ready for pre-order

Machine Learning Engineer & Book Author

Wendy is an ML Engineer specializing in search, recommendation, and personalization systems at scale, with 10 years of experience spanning social networks (Twitter/X, Pinterest, Meta), e-commerce (Coupang), and marketplace platforms. She currently works at Airbnb building search and recommendation systems.

Her academic background includes a PhD in Statistics with a minor in Computer Science from The Ohio State University, with doctoral research in network sampling and estimation theory. She founded TigerLab AI, an open-source toolkit for LLM safety evaluation, and has taught online courses on Python and machine learning.

Wendy is the author of Advanced Retrieval-Augmented Generation: Bridging Large Language Models and Knowledge Graphs (Wiley-IEEE Press) and Data Science: Methods and Practice (China Machine Press) — two technical books bridging research and industry practice. Beyond technical work, she is committed to teaching, mentoring, and advancing best practices for scalable, production-grade ML systems.


Books

Publications

English Book — Wiley-IEEE Press · August 2026

Advanced Retrieval-Augmented Generation: Bridging Large Language Models and Knowledge Graphs

A complete guide from the foundations of information retrieval to the cutting-edge frontiers of RAG. Bridging LLMs and knowledge graphs, this book provides theoretical principles, practical techniques, and hands-on frameworks for building reliable AI systems that minimize hallucinations and improve factual correctness. Covers Graph-RAG architectures, KG construction, RAG pipeline engineering, and production-ready implementations using LlamaIndex, Neo4j, and leading Graph-RAG frameworks.

ISBN: 978-1-394-37468-7 · 1st Edition
Chinese Book — China Machine Press · Nov 2025

Data Science: Methods and Practice
数据科学:方法与实践

A comprehensive data science textbook spanning 5 parts and 16 chapters — covering statistics, machine learning, deep learning, data engineering, product analytics, A/B experimentation, and domain applications in search, recommendation, advertising, NLP, and large language models. Combines methods with real-world industry practice.

ISBN: 978-7-111-79219-2

Background

Education

The Ohio State University — US

PhD, Statistics · Minor in Computer Science

Dissertation: On Estimation Problems in Network Sampling

Research: Statistical methods for estimation challenges arising from sampled network data structures

Award: Winner, 2013 Capital One National Data Analytics/Modeling Competition

Zhejiang University — China

B.S., Mathematics & Statistics

Uppsala University — Sweden

Statistics (Exchange)

Career

Experience

Search & Retrieval
Recommendation Systems & Ranking
LLMs & GenAI · RAG · Agents
Deep Learning · Neural Networks
Product Analytics · A/B Testing · Metrics
Data Mining · Personalization
NLP & Knowledge Graphs · Embeddings
Statistics & Network Analysis
Social Networks
Twitter/X · Pinterest · Meta
E-Commerce & Marketplace
Airbnb · Coupang
Startups & Consulting
Founder · Advisor · Consultant
Author & Educator
Books · Courses · Teaching
2025 — Present
ML Engineer
Airbnb
Ranking, Relevance & Personalization. Retrieval and ranking models, GenAI-powered search.
2023 — 2025
ML Engineer
Coupang
Search retrieval & relevance. GenAI/LLM initiatives for search quality.
2023
Founder & Advisor
Stealth AI Startup
Founded TigerLab AI. Mentored in the AI/LLM space. Taught Python courses.
2021 — 2023
ML Engineer
Facebook/Meta
Video recommendation systems. Producer value estimation & creator growth.
2019 — 2020
ML Engineer
Pinterest
Ads quality ML models. Advertiser recommendations & workflow readiness.
2016 — 2019
ML Engineer
Twitter/X
Explore Tab personalization. Topic relevance, Trends & Events. Intern 2014–15.

Innovation

Patents

Patent · US & European

Detecting Scripted or Otherwise Anomalous Interactions with Social Media Platform

Methods and systems for identifying automated, scripted, or malicious interactions on social media platforms using machine learning-based detection models.

US20180046475A1 · EP3497609B1

Projects

Open Source

TigerLab AI

Open-source LLM toolkit featuring AI safety evaluation metrics (TASS & TAST), benchmarking across major language models, and tools for responsible AI development.

More on GitHub

Repositories covering ML research, algorithm explorations, course materials, and data science projects.


Beyond Work

Hobbies

🏔️ Sports

⛷️ Skiing 🏂 Snowboarding ⛸️ Ice Skating 🏄 Surfing 🏊 Swimming 🤿 Scuba Diving 🚣 Stand-Up Paddleboarding 🪂 Skydiving 🧗 Rock Climbing 🏓 Pickleball 🏸 Badminton

🎨 Art, Music & Dance

🎹 Piano 🎵 Guzheng (Chinese Zither) 🎤 Singing 🎶 A Cappella 💃 Dancing 🩰 Ballet 🧘 Yoga 🖌️ Acrylic Painting ✒️ Chinese Calligraphy

🌿 Life

🐱 Cats 🌱 Gardening 🎲 Board Games 📚 Reading ✍️ Writing 🍵 Tea 🕯️ Meditation ✈️ Travel 🚗 Road Trips 🎭 Cultural Festivals