Reading Notes 2021 Jul - Aug
This is the fourth blog in this series, summarising the great posts Elise and I came across during our Friday and Sunday night reading sessions. As you can see, we have been reading a lot about experimentation and causal inference recently, as we keep finding more interesting posts the deeper we dig into this field. Hope you enjoy the reading as well :)
Experimentation and Causal Inference
- Experiments at Airbnb: A pretty old post by Airbnb covering their experiment infra and common considerations
- A/B Tests for Lyft Hardware: Different from most experimentation posts that focus on UI testing, this article talks about how Lyft tests hardware
- An A/B Test Loses Its Luster If A/A Tests Fail: An introduction to why A/A tests are important and how to do them properly
- How User Interference May Mess Up Your A/B Tests: How user interference impacts A/B tests and potential solutions to it
- Switchback Tests and Randomized Experimentation Under Network Effects at DoorDash: Following the above one, a more detailed explanation of switchback tests by DoorDash
- Experimentation in a Ridesharing Marketplace (Part I Interferences Across a Network, Part II Simulating a Ridesharing Marketplace, Part III Bias and Variance): A series of articles from Lyft discussing the challenges of experimentation in a ridesharing marketplace and how they tackled them
- How Wish A/B Tests Percentile: Wish explains how it tests percentile metrics in experiments
- How Not To Run an A/B Test: Why you should not peek at test results early and how to do it correctly
- Why We Moved Away from Conversion Rate As a Primary Metric: An interesting real case of how to choose the best metric for experiments
- How to Use Causal Inference In Day-to-Day Analytical Work (Part 1, Part 2): A comprehensive introduction to basic causal inference techniques
- A collection of articles on common causal inference approaches:
a. A Practitioner’s Guide To Difference-In-Differences Approach
b. An Ultimate Guide to Matching
c. Regression Discontinuity Design: The Crown Jewel of Causal Inference
d. A Practitioner’s Guide To Interrupted Time Series
- How Airbnb Measures Future Value to Standardize Tradeoffs: Airbnb introduces how it sets up a standardized framework to measure future value using propensity score matching
- Causal Inference — Estimating Long-term Engagement: Compares the techniques to estimate long-term engagement with causal inference
Others
- 3 Common Mistakes When Fighting Customer Churn: Discussion of common mistakes made when trying to reduce customer churn
- Using Sentiment Score to Assess Customer Service Quality: How Airbnb used a sentiment model to complement NPS
- Task-Oriented Conversational AI in Airbnb Customer Support: A case study of how Airbnb used a task-oriented dialog system to handle mutual cancellations
- Using Chatbot to Provide Faster COVID-19 Community Support: An AI-based helpbot application built to facilitate COVID-19 community support
- Possible Bias in Surveys: Three types of common bias in surveys and how we should improve the design to avoid them
- Statistics: Are you Bayesian or Frequentist?: An easy-to-understand article on the difference between the Bayesian and frequentist approaches
- Product Requirements Document (PRD): A Guide for Product Managers: How to write a good PRD
Snack – A Small Case Discussion
Last month, I saw a post on LinkedIn:
Today I learned that LinkedIn helps 6 individuals find a job every minute. This isn’t a random number, but a rigorously computed metric that’s tracked constantly. Isn’t that amazing?
This post immediately caught my eye, so Elise and I spent some time brainstorming how this metric might have been defined. I am sharing our thoughts here.
There are two major components to this metric – whether a user is searching for opportunities on LinkedIn, and whether the user found a job. We started by finding proxies for both components:
- whether a user is searching for opportunities on LinkedIn:
a. P0 (highest confidence) - active messaging or positive reactions to a recruiter’s messages (for example, clicking the ‘schedule time to chat’ link in a message)
b. P1 - LinkedIn apply actions (applying via LinkedIn or clicking on the apply button that leads to the job site)
c. P2 - Other actions, including browsing job boards, reacting to job opportunity posts on LinkedIn, subscribing to LinkedIn Premium, turning on the ‘open to opportunity’ badge, etc.
- whether the user found a job:
a. Most direct – a job change on the profile (when the new job’s start time is later than the job-searching start time)
b. Some indirect proxies (less accurate) – turning off the ‘open to opportunity’ badge, a drop in LinkedIn apply actions or overall activity, etc. – but these could also be the result of losing hope in finding a job on LinkedIn, so they are not very helpful here
As the next step, we can link these proxies together and use a sort of backward attribution method to build up the metric:
If a user updated the job on their LinkedIn profile, we can use the signals below to attribute the job-hunting success to LinkedIn:
- Strong signals: They once used ‘LinkedIn apply’ for the new employer, clicked a job post from that employer, or had messages with someone (a recruiter or employee) from the new employer;
- Moderate signals: They once interacted with posts or people from the new company;
- Weak signals: Other activities on LinkedIn (the proxies discussed in the section above, but not related to the new employer).
As you can see, the strong signals give us the most confidence that the job-hunting success can be attributed to LinkedIn, while we might not want to include the weak signals, as that could over-estimate the metric.
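To make the attribution idea a bit more concrete, here is a minimal sketch in Python. The event names, the signal tiers and the 180-day look-back window are all assumptions we made up for illustration – nothing here reflects how LinkedIn actually computes the metric.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical activity record: what the user did and which company it was aimed at.
@dataclass
class Event:
    kind: str               # e.g. 'linkedin_apply', 'job_post_click', 'recruiter_message'
    company: Optional[str]  # None for generic activity not tied to an employer
    timestamp: datetime

# Illustrative signal tiers, mirroring the strong/moderate buckets above.
STRONG = {"linkedin_apply", "job_post_click", "recruiter_message"}
MODERATE = {"post_interaction", "employee_interaction"}

def attribute_to_linkedin(events, new_company, job_start, window_days=180):
    """Return 'strong', 'moderate' or None for a single profile job update."""
    window_start = job_start - timedelta(days=window_days)
    recent = [e for e in events if window_start <= e.timestamp <= job_start]

    # Strong: direct engagement with the new employer (apply, job post click, messages).
    if any(e.kind in STRONG and e.company == new_company for e in recent):
        return "strong"
    # Moderate: interactions with posts or people from the new company.
    if any(e.kind in MODERATE and e.company == new_company for e in recent):
        return "moderate"
    # Weak signals (generic activity not tied to the new employer) are deliberately
    # left out so the metric is not over-estimated.
    return None
```

With something like this, the headline number would simply be the count of job updates attributed at the strong (and perhaps moderate) level, divided by the number of minutes in the reporting window.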
We also talked about potentially using surveys and pop-ups to collect ground-truth flags – for example, asking users whether they found the job with the help of LinkedIn when they update their profile – and then combining these flags with the signals discussed above to train a classification model that improves the accuracy of this metric.
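As a rough illustration of that last idea, the sketch below trains a classifier on survey-labelled job updates, using counts of the strong/moderate/weak signals as features. The feature design, the toy data and the choice of logistic regression are all assumptions on our part, purely to show the shape of the approach.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical training data: one row per profile job update, with counts of
# strong, moderate and weak signals observed before the job change.
X = np.array([
    [3, 1, 5],
    [0, 2, 8],
    [1, 0, 2],
    [0, 0, 7],
])
# Labels collected from the pop-up survey: 1 = 'found this job with the help of LinkedIn'.
y = np.array([1, 1, 1, 0])

model = LogisticRegression()
model.fit(X, y)

# Score an unlabelled job update; the metric can then be built from the predicted
# probabilities (or thresholded predictions) aggregated per unit of time.
print(model.predict_proba([[2, 0, 3]])[:, 1])
```

In other words, the survey gives us the labels, the behavioural signals give us the features, and the model fills in the cases where no survey answer is available.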