Reading Notes 2023 Jan - Feb

This is the first article in the 2023 reading notes series. It summarises the Medium blog posts I read over the past two months. I hope you enjoy reading them as well.

Machine Learning and Causal Inference

Building a Dynamic Pricing Capability (in under 90 days): Detailed walkthrough of building a Competitive Price Index Elasticity model for dynamic pricing
The Science (and Art) of Estimating Price Elasticities: Different ways to estimate price elasticities
Using Rideshare Data to Evaluate Racial Bias in the Issuance of Speeding Citations: Data Scientists at Lyft used rideshare data to estimate the racial inequities in traffic-related police punishment
How to Build a Causal Inference Machine Learning Model to Explore Whether Global Warming is Caused by Human Activity: A case study that uses causal inference techniques and the DoWhy package to evaluate whether human activity causes global warming
Causal Machine Learning for Creative Insights: How Netflix used Causal Machine Learning to establish causality between artwork and its success
Understanding Causal Trees: How to use causal trees to estimate heterogeneous treatment effects
Matching, Weighting, or Regression?: Compares matching, weighting, and regression as approaches to causal inference
Understanding Meta Learners: How to use meta-learners (S-learner, T-learner and X-learner) to understand whether a causal effect differs across users
Multi-touch Attribution: The Fundamental to Optimizing Customer Acquisition: An introduction to the multi-touch attribution framework
Using Sklearn Pipelines to Streamline your Machine Learning Process: A very clear step-by-step walkthrough of Sklearn pipelines
Learning to Rank Using XGBoost: How to use XGBoost to train a Learning to Rank model
Is There Always a Tradeoff Between Bias and Variance?: What the bias-variance tradeoff is and whether it always applies
Overfitting, Underfitting, and Regularization: Explains the basic machine learning concepts of overfitting, underfitting, and regularization
Understanding Gradient Boosting: A Data Scientist’s Guide: A clear explanation of gradient boosting and why it works
Scaling Media Machine Learning at Netflix: Netflix talks about their media machine learning framework
Discovering Creative Insights in Promotional Artwork: Netflix talks about top-down and bottom-up approaches to discover creative insights
Grid Search and Random Search Are Outdated. This Approach Outperforms Both.: Introduces Bayesian search and compares its performance with Grid Search and Random Search
A Quick Guide to Design Rigorous Machine Learning Experiments: Different things to consider when evaluating a domain-specific machine learning approach vs. a generic machine learning technique
Uncovering the Limitations of Traditional DiD Method: The traditional difference-in-differences (DiD) method may give significantly misleading estimates of treatment effects when there are multiple time periods and the treatment timing varies across units
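
To make the meta-learner entry above a bit more concrete, here is a minimal sketch of the T-learner idea: fit one outcome model on the treated users and one on the control users, then take the difference of their predictions as the estimated effect for a given user. This is not code from the linked article. The helper names (fit_line, t_learner) and the toy data are made up for illustration, and a one-feature least-squares line stands in for whatever model you would really use.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b

def t_learner(xs, treated, ys):
    """Fit a separate outcome model per arm and return an effect estimator."""
    x1 = [x for x, t in zip(xs, treated) if t]
    y1 = [y for y, t in zip(ys, treated) if t]
    x0 = [x for x, t in zip(xs, treated) if not t]
    y0 = [y for y, t in zip(ys, treated) if not t]
    a1, b1 = fit_line(x1, y1)  # model for treated users
    a0, b0 = fit_line(x0, y0)  # model for control users
    # Estimated effect at x = treated prediction minus control prediction
    return lambda x: (a1 + b1 * x) - (a0 + b0 * x)

# Hypothetical toy data: control follows y = 1 + x, treated follows y = 1 + 3x,
# so the true effect grows with the feature (effect = 2x).
xs      = [0, 1, 2, 3, 0, 1, 2, 3]
treated = [0, 0, 0, 0, 1, 1, 1, 1]
ys      = [1, 2, 3, 4, 1, 4, 7, 10]

cate = t_learner(xs, treated, ys)
for x in (0, 1, 3):
    print(f"feature={x}: estimated effect={cate(x):.2f}")
```

Because the estimated effect changes with the feature value, this toy case is an example of a heterogeneous treatment effect. The S-learner and X-learner differ in how they combine the models, which the sketch does not cover.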

DS Career

The One Metric that All Data Teams Need to Track for Success: Discusses the best north star metric for data teams
Data ROI: How to Estimate the Value of Your Data & Analytics Projects: Different approaches to estimating the value of DS projects
What I’ve Learned from Interviewing more than 300 Data Scientists: What it takes to stand out as a DS candidate
What’s Next for Analytics in 2023?: Some analytics trends to watch
The UX of Data: How to empower everyone with data
Product Thinking for Data Teams: Use product thinking to drive data projects
Data Storytelling 101: Essential Strategies for Data Scientists and AI Practitioner: An effective framework for storytelling in data science

Others

Can ChatGPT Write Better SQL than a Data Analyst?: An interesting experiment in getting ChatGPT to write SQL
6 Ways ChatGPT Can Help Your Data & Analytics Team: How ChatGPT can empower data and analytics teams
Data Science and ChatGPT: Five things ChatGPT can help with in day-to-day data science work
The New Google Analytics 4: Differences between the old Universal Analytics tag and the new Google Analytics 4
A Beginner’s Guide to Markov Chains, Conditional Probability, and Independence: A detailed explanation of the basics of Markov chains
How We Cut ~95% Cost for Analytics Reporting and What We Have Learned: A great case study of how an organization can cut data infra costs by optimizing data storage, pipeline and queries
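
As a small companion to the Markov chains entry above, here is a minimal sketch of the core idea: the next state depends only on the current one, so a distribution over states can be advanced one step by weighting each row of the transition matrix by the current probabilities. The two weather states and their transition probabilities are made up for illustration and are not taken from the linked article.

```python
def step(dist, P):
    """Advance a distribution over states by one step of the chain.

    P[i][j] is the probability of moving from state i to state j,
    so each row of P sums to 1.
    """
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

# Hypothetical two-state chain: index 0 = sunny, index 1 = rainy
P = [
    [0.9, 0.1],  # from sunny: stay sunny 90%, turn rainy 10%
    [0.5, 0.5],  # from rainy: turn sunny 50%, stay rainy 50%
]

today = [1.0, 0.0]            # we know today is sunny
tomorrow = step(today, P)     # distribution after one step
day_after = step(tomorrow, P) # distribution after two steps

print("tomorrow:", [round(p, 2) for p in tomorrow])
print("day after:", [round(p, 2) for p in day_after])
```

Applying step repeatedly shows how the influence of the starting state fades over time for this particular chain.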

Yu Dong
