Ruowang (Jackie) Zhang

Tech Lead Manager · Staff Research Engineer · AI Data Systems

EXPERIENCE


Contextual AI

Tech Lead Manager / Staff Research Engineer — AI Data Systems

Jul 2025 – Present · Mountain View, CA

  • TLM of the data platform team, owning the large-scale multimodal document parsing, embedding, and indexing pipeline, as well as the underlying vector search and model inference infrastructure
  • Co-designing advanced tools with the research team for dynamic and agentic index building and exploration
  • Collaborating on building and evaluating in-house SOTA AI models (parsing, embedding, reranking, GLM) to maximize end-to-end RAG performance

Databricks

Senior Software Engineer — Distributed Data Systems

Aug 2021 – Jul 2025 · Mountain View, CA

  • One of the top contributors to the open-source Delta Lake storage format, and a contributor to Apache Spark
  • Built core abstractions for cost-free schema updates (column mapping) in Delta Lake tables
  • Tech lead of schema inference/evolution and the file-source connector across multiple ingestion product surfaces; engineered solutions ingesting O(100) PB/month

Reactive.xyz

Founder — Web & App Development Studio

Jan 2015 – Present · Melbourne / Remote

  • Managed up to 5 student developers from Australia and the US using an agile workflow
  • Led and delivered 20+ web/app projects to production — see reactive.xyz

EDUCATION


Purdue University

Doctor of Engineering (D.Eng) — Part-time / Online

Research: Agentic LLM with RL, Multimodal knowledge representation & retrieval · Advised by Prof. Jing Gao

May 2025 – Present

Columbia University

M.S., Computer Science — Machine Learning Track

GPA: 4.07 · Columbia Video Network (Part-time / Online)

May 2021 – Feb 2024 · Remote

University of Michigan, Ann Arbor

B.S., Computer Science & Data Science

GPA: 3.765 · Summa Cum Laude · James B. Angell Scholar · Research with Prof. Mao, Prof. Koutra, Prof. Mozafari

Sep 2016 – May 2020 · Ann Arbor, MI

SKILLS


LANGUAGES

C++ · Java · Scala · Python · Go
Swift · Obj-C · P4 · SQL · TypeScript

DATA & ML

Spark · Hadoop · Delta Lake · Kafka
TensorFlow · PyTorch · RAG / LLMs
MySQL · MongoDB · Cassandra · Redis

FRAMEWORKS & INFRA

Spring · Django · React · Vue
Docker · Kubernetes
AWS · GCP · Azure

GET IN TOUCH

Open to interesting problems in data systems, AI infrastructure, and applied ML.