Hoodie was originally developed at Uber to achieve low-latency database ingestion with high efficiency. It has been in production since August 2016, powering ~100 highly business-critical tables on Hadoop worth hundreds of terabytes, including the top 10 tables such as trips, riders, and partners. It also powers several incremental Hive ETL pipelines and is currently being integrated into Uber’s data dispersal system.

Talks & Presentations

  1. “Hoodie: Incremental processing on Hadoop at Uber” - by Vinoth Chandar & Prasanna Rajaperumal. Mar 2017, Strata + Hadoop World, San Jose, CA.

  2. “Hoodie: An Open Source Incremental Processing Framework From Uber” - by Vinoth Chandar. Apr 2017, DataEngConf, San Francisco, CA. Slides Video

  3. “Incremental Processing on Large Analytical Datasets” - by Prasanna Rajaperumal. Jun 2017, Spark Summit 2017, San Francisco, CA. Slides Video


Articles

  1. “The Case for incremental processing on Hadoop” - O’Reilly Ideas article by Vinoth Chandar
  2. “Hoodie: Uber Engineering’s Incremental Processing Framework on Hadoop” - Uber Engineering blog by Prasanna Rajaperumal