
This year, I put reading data science Medium and blog posts on my resolution list. The plan is to read three posts on Friday night and three on Sunday night every week; the topic can vary, as long as the title catches my eye :). Very fortunately, I also found a friend to do this with me, so we can share thoughts and great articles and brainstorm together. Not surprisingly, in just two months my reading notes have grown to 20+ pages, so I decided to pick the best posts I read in that time and share them here.

Product Experimentation and Causal Inference

  1. Experimentation Analysis at Lime: A systematic walkthrough of Lime's experimentation framework, process, and considerations
  2. Causal Impact @ Coursera Series: Introduces the four most common causal inference techniques, with practical examples from Coursera
    I - Controlled Regression
    II - Instrumental Variables
    III - Regression Discontinuity
    IV - Difference-in-Differences
  3. Key challenges with Quasi Experiments at Netflix: The main difficulties of running quasi-experiments at Netflix and how the team addresses them
  4. There is more to experimentation than A/B: How Booking.com built its platform for non-randomised experiments
  5. Causal Inference Cheatsheet (related reading): A very good summary of causal inference techniques, robustness, pros and cons
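
To give a feel for the last technique in the Coursera series, here is a minimal sketch of a difference-in-differences estimate. It is my own illustration rather than code from any of the articles: the function name `diff_in_diff` and all the numbers are made up, and the estimate is only meaningful if the treated and control groups would have followed parallel trends without the treatment.

```python
from statistics import mean


def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Difference-in-differences estimate of a treatment effect.

    Subtracts the control group's before/after change from the treated
    group's before/after change. Assumes parallel trends.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical weekly conversion rates (%), invented for illustration only.
treated_pre = [10.0, 11.0, 12.0]
treated_post = [15.0, 16.0, 17.0]
control_pre = [9.0, 10.0, 11.0]
control_post = [11.0, 12.0, 13.0]

effect = diff_in_diff(treated_pre, treated_post, control_pre, control_post)
print(f"Estimated effect: {effect:.1f} percentage points")
```

In practice the same estimate usually comes from a regression with a treatment-by-period interaction term, which also gives standard errors; the arithmetic above is just the intuition.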

Customer Lifetime Value

  1. Rethinking Customer Lifetime Value using Machine Learning at Hellofresh: A detailed look at how Hellofresh calculates LTV given its flexible subscription, which can be paused at any time
  2. Calculating customer lifetime value: A Python solution: An LTV implementation at Azure that takes a more traditional approach
  3. Customer Behavior Modeling: Buy-til-you-Die Models: Introduces Buy-Till-You-Die (BTYD) models, a family of statistical models built specifically to address CLV problems

Sentiment Analysis

  1. Sentiment Analysis: Concept, Analysis and Applications: The general ideas behind sentiment analysis, with examples
  2. Simplifying Sentiment Analysis using VADER in Python (on Social Media Text): VADER is a rule-based sentiment analysis model specifically designed for social media text

Data Infrastructure

  1. Data Quality at Airbnb (1 and 2): How Airbnb improved data quality across the company
  2. Growth Engineering at Netflix: Introduces the Growth Engineering team at Netflix and some of its use cases
    1 - Accelerating Innovation
    2 - Automated Imagery Generation
    3 - Creating a Scalable Offers Platform
  3. Analytics at Netflix: Who We Are and What We Do: Introduces the data-related functions and teams at Netflix
  1. Supporting Content Decision Makers with Machine Learning: How Netflix uses transfer learning to optimize content display
  2. Improving Deep Learning for Ranking Stays at Airbnb: The considerations and solutions behind improving Airbnb's listing ranking model