Reading Notes 2022 Nov - Dec
This is a summary of the great blog posts I read in November and December. It is the last one of the year, and I am glad I kept up the habit of reading blog posts every Friday and Sunday night for another year. I hope you enjoy the reading as well. Happy new year :)
Causal Inference
- What to Do When Your Experiment Returns a Non-Statistically Significant Result: Common reasons for a non-statistically-significant A/B test result and ways to handle and communicate it
- Fooled by Statistical Significance: What statistical significance really means and common misunderstandings about it
- Uplift Modeling — A Bridge between Causal Inference, Machine Learning and Personalization: How to use Uplift Modeling to measure the impact of a marketing campaign
- We Increased Conversion Rates by Over 20% Doing This: A great case study of increasing conversion rates with great design, user testing, and iterative experiments
- Using Causal ML Instead of A/B Testing: Why Causal ML could be more handy than A/B testing in some cases
- Why Spillover Effects Bias Your AB Testing Results and Ways to Overcome them: An explanation of spillover effects and common solutions to them
- Notifications: why less is more — how Facebook has been increasing both user satisfaction and app usage by sending only a few notifications: The Facebook Notifications Data Science team shares findings from long-term experiments on how notification volume and categories affect user satisfaction and app usage
- Experiments on Returns on Investment: How to estimate the impact and confidence interval on ROI using experiments and the Delta Method
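For readers who have not met the Delta Method mentioned in the last item, here is a minimal sketch of the general idea, not the linked post's implementation. It assumes ROI can be written as a ratio of two per-unit means (for example, return per user over cost per user) and uses a first-order Taylor approximation of the variance of that ratio. The function name `ratio_delta_ci` and the sample data are hypothetical.

```python
import math
from statistics import mean


def ratio_delta_ci(x, y, z=1.96):
    """Approximate CI for mean(x) / mean(y) using the delta method.

    x and y are paired per-unit observations (hypothetical example:
    return and cost for each user). z=1.96 gives a ~95% normal interval.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be paired and have at least 2 units")
    mx, my = mean(x), mean(y)
    if my == 0:
        raise ValueError("mean of y must be non-zero")

    # Sample variances and covariance (n - 1 denominator)
    var_x = sum((a - mx) ** 2 for a in x) / (n - 1)
    var_y = sum((b - my) ** 2 for b in y) / (n - 1)
    cov_xy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)

    # First-order delta-method variance of the ratio of means
    var_ratio = (
        var_x / my**2
        - 2 * mx * cov_xy / my**3
        + mx**2 * var_y / my**4
    ) / n
    se = math.sqrt(max(var_ratio, 0.0))  # guard against tiny negative rounding

    ratio = mx / my
    return ratio, ratio - z * se, ratio + z * se


# Hypothetical per-user returns and costs
returns = [10.0, 12.0, 9.0, 11.0]
costs = [5.0, 4.0, 6.0, 5.0]
estimate, lower, upper = ratio_delta_ci(returns, costs)
print(estimate, lower, upper)
```

The covariance term matters: when return and cost move together across units, ignoring it overstates the uncertainty of the ratio.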
Machine Learning
- 5 Biggest Trends In Data Science In 2022: Five notable trends in DS, including Tiny ML, Auto ML, Data-Driven Customer Experience, AIaaS (AI as a Service), and Augmented Analytics
- Why is Mean Squared Error (MSE) So Popular?: Why people like MSE and where we should use MSE
- What’s the Difference Between a Metric and a Loss Function?: How to tell the two apart and which measures work best in each case
- What’s your computer’s favorite metric?: From the same series, why MSE is the easiest metric for computers to optimize
- Why is MSE = Bias² + Variance?: A great walkthrough and breakdown of MSE
- Difference Between Normalization and Standardization: The different normalization and standardization methods and when to use each
- D.A.R.T — Your New Weapon Against Overfitting in Boosting Models: How to use D.A.R.T (Dropouts meet Multiple Additive Regression Trees) to avoid overfitting in boosting models
- PyCaret 3 is coming… What’s New?: New functions in PyCaret 3
- Top Python Packages for Feature Engineering: Three useful Python packages for feature engineering – featuretools, feature-engine, and tsfresh
- New Series: Creating Media with Machine Learning: The first post of a blog series highlighting Machine Learning efforts for content creation at Netflix
- Match Cutting at Netflix: Finding Cuts with Smooth Visual Transitions: How Netflix uses machine learning to find pairs of shots that cut together with a smooth visual transition
- Building Airbnb Categories with ML and Human-in-the-Loop: The ML and human efforts behind the launch of the Airbnb Categories feature
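As a quick companion to the "MSE = Bias² + Variance" item above, here is the standard derivation in brief. The notation is my own assumption and not taken from the linked post: $\hat{\theta}$ is an estimator of a fixed quantity $\theta$.

```latex
\begin{aligned}
\mathrm{MSE}(\hat{\theta})
  &= \mathbb{E}\big[(\hat{\theta} - \theta)^2\big] \\
  &= \mathbb{E}\big[\big(\hat{\theta} - \mathbb{E}[\hat{\theta}]
       + \mathbb{E}[\hat{\theta}] - \theta\big)^2\big] \\
  &= \mathbb{E}\big[(\hat{\theta} - \mathbb{E}[\hat{\theta}])^2\big]
     + 2\,\big(\mathbb{E}[\hat{\theta}] - \theta\big)\,
       \mathbb{E}\big[\hat{\theta} - \mathbb{E}[\hat{\theta}]\big]
     + \big(\mathbb{E}[\hat{\theta}] - \theta\big)^2 \\
  &= \mathrm{Var}(\hat{\theta}) + \mathrm{Bias}(\hat{\theta})^2
\end{aligned}
```

The cross term vanishes because $\mathbb{E}\big[\hat{\theta} - \mathbb{E}[\hat{\theta}]\big] = 0$, which leaves only the variance and the squared bias.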
Data Career
- 6 Habits to Include in Your Daily Routine for a Long, Happy Career as a Data Scientist: Six great pieces of advice that will benefit your DS career
- Top 3 Tools to Promote Your Work in Analytics and Data Science: How to communicate and promote your great DS work to gain more visibility and impact
- 12 Books to Expand Your Worldview as a Data Professional: A great list of data science-related books
- The 3 Stages of Data Maturity in an Organization: Starting with data -> Scaling with data -> Leading with data
- Making the case for Analytics Product Managers: What an Analytics Product Manager is and why the position is needed
- 6 Reasons Why Companies Fail at Data Governance: Why data governance initiatives fail in companies
Others
- How to Get Actionable Insights from Customer Feedback: A framework for collecting and using customer feedback in product development
- A Non-Exhaustive List Of ‘Silent’ Mistakes in SQL That Can Ruin Your Analysis: Some common SQL mistakes that you should keep an eye on
- SQL Query Optimization: Level Up Your SQL Performance Tuning: A great list of SQL query optimization tips
- I Modified An SQL Query From 24 Mins Down To 2 Seconds - A Tale of Query Optimization: A very good real-life case study of optimizing a SQL query
- 5 SQL Bad Habits You Need to Break: Some common bad ways to write SQL queries
- Start Using Google Trends as part of Our Data Analysis: A great example of how to derive some actionable business insights from Google Trends
- Introducing ChatGPT!: A great read on what ChatGPT is and its limitations (with a real example)
- Creating a Customer Health Index: A Terrific Tool to Measure and Improve Customer Experience and Drive Growth: Why we need a Customer Health Index and how to create one
- The Many Layers of Data Lineage: How to better structure data lineage to make it more understandable and useful
- Top 10 Data Visualizations of 2022 Worth Looking at!: 10 great visualizations that are worth checking out
- BONUS – Five Great Programming and Data Science Memes