Reading Notes 2022 Sep - Oct
This is a summary of the great Medium posts I came across in the past two months. Hope you enjoy it :)
Causal Inference
- Causal Forecasting at Lyft (Part I, Part II): The Lyft team introduces their causal forecasting framework with real examples and explanations
- Beyond A/B Test : Speeding up Airbnb Search Ranking Experimentation through Interleaving: How Airbnb uses interleaving techniques to test search ranking algorithms, and the benefits of doing so
- Don’t Be Seduced by the Allure: A Guide for How (Not) to Use Proxy Metrics in Experiments: A great framework by Meta on when, why, and how to use proxy metrics in experiments
- Mean vs Median Causal Effect: How to estimate treatment effects on quantiles using quantile regression
- How Product Teams Can Build Empathy Through Experimentation: A great interview with Travis Brooks, Netflix Product Manager for Experimentation Platform, on how to build products that users like through experimentation
Machine Learning
- Why SHAP Values Might not be Perfect: How SHAP values lack causal structure, and potential solutions to that problem
- SHAP for Categorical Features with CatBoost: How to use SHAP to interpret categorical variables in CatBoost
- How to Use UMAP For Much Faster And Effective Outlier Detection: How UMAP can be used to speed up outlier detection
- 5 Unusual Ways Bias Can Sneak into Your Models: Common sources of bias when building ML models
- Managing Biases in Recommender Systems: Common biases in recommender systems and how to handle them
- A Curated List of Important Time Series Forecasting Concepts: A quick refresher on time series concepts
- Top Python libraries for Time Series Analysis in 2022: Walkthroughs of popular time series packages in Python
- Machine Learning for Fraud Detection in Streaming Services: The Netflix team discusses considerations and lessons learned for fraud detection in streaming services
- Don’t use One-Hot Encoding Anymore: Alternatives to One-Hot Encoding when dealing with categorical variables
- How Instacart Uses Embeddings to Improve Search Relevance: An introduction to ITEMS (the Instacart Transformer-based Embedding Model for Search), a framework to improve search performance
- Forecasting Something That Never Happened: How We Estimated Past Promotions Profitability: A very detailed case study on how to estimate the impact of promotions run in the past
Analytics
- Why does Self-Service BI Fail and What could Enterprises Do to Turn the Tide?: Common barriers that make self-service BI inefficient or cause it to fail
- Is “Self-Service” Data’s Biggest Lie?: A debate on why self-service analytics does or does not work
- Analytics and Product-Market Fit: A great framework to measure the product-market fit of a new product
- The 10 Best Data Visualizations of 2022: Ten great hand-picked data visualizations from Reddit this year
- Detailed Dashboard Design Guidelines Used by Professionals: Great guidance on how to design dashboards that convey information clearly
- Visualization Tools with Python: Popular visualization packages in Python
- Not All Data Requests Are Urgent, So Start by Asking These 5 Questions: Five important questions to ask when you get ad hoc data requests