This is a summary of the great blog posts I read in November and December. It is the last one of the year, and I am glad I kept up the habit of reading blog posts every Friday and Sunday night for another year. I hope you enjoy the reading as well. Happy new year :)

Causal Inference

  1. What to Do When Your Experiment Returns a Non-Statistically Significant Result: Common reasons for a non-statistically significant A/B test result and ways to handle and communicate it
  2. Fooled by Statistical Significance: What statistical significance really means and common misunderstandings of it
  3. Uplift Modeling — A Bridge between Causal Inference, Machine Learning and Personalization: How to use Uplift Modeling to measure the impact of a marketing campaign
  4. We Increased Conversion Rates by Over 20% Doing This: A great case study of increasing conversion rates with great design, user testing, and iterative experiments
  5. Using Causal ML Instead of A/B Testing: Why Causal ML could be more handy than A/B testing in some cases
  6. Why Spillover Effects Bias Your AB Testing Results and Ways to Overcome them: An explanation of spillover effects and common solutions to them
  7. Notifications: why less is more — how Facebook has been increasing both user satisfaction and app usage by sending only a few notifications: The Facebook Notifications Data Science team shares findings from long-term experiments on how notification volume and categories affect user satisfaction and app usage
  8. Experiments on Returns on Investment: How to estimate the impact and confidence interval on ROI using experiments and the Delta Method

Machine Learning

  1. 5 Biggest Trends In Data Science In 2022: Five notable trends in data science: Tiny ML, Auto ML, Data-Driven Customer Experience, AIaaS (AI as a Service), and Augmented Analytics
  2. Why is Mean Squared Error (MSE) So Popular?: Why people like MSE and where we should use MSE
  3. What’s the Difference Between a Metric and a Loss Function?: How to tell the two apart and which measures work best in each case
  4. What’s your computer’s favorite metric?: From the same series, why MSE is the easiest metric for computers to optimize
  5. Why is MSE = Bias² + Variance?: A great walkthrough and breakdown of MSE
  6. Difference Between Normalization and Standardization: What are the different normalization and standardization methods and when to use them
  7. D.A.R.T — Your New Weapon Against Overfitting in Boosting Models: How to use D.A.R.T (Dropouts meet Multiple Additive Regression Trees) to avoid overfitting in boosting models
  8. PyCaret 3 is coming… What’s New?: New functions in PyCaret 3
  9. Top Python Packages for Feature Engineering: Three useful Python packages for feature engineering – featuretools, feature-engine, and tsfresh
  10. New Series: Creating Media with Machine Learning: The first post of a blog series highlighting Machine Learning efforts for content creation at Netflix
  11. Match Cutting at Netflix: Finding Cuts with Smooth Visual Transitions: How Netflix uses machine learning to find pairs of shots that cut together with a smooth visual transition
  12. Building Airbnb Categories with ML and Human-in-the-Loop: The ML and human efforts behind the launch of the Airbnb Categories feature

Data Career

  1. 6 Habits to Include in Your Daily Routine for a Long, Happy Career as a Data Scientist: Six great pieces of advice that will benefit your DS career
  2. Top 3 Tools to Promote Your Work in Analytics and Data Science: How to communicate and promote your great DS work to gain more visibility and impact
  3. 12 Books to Expand Your Worldview as a Data Professional: A great list of data science-related books
  4. The 3 Stages of Data Maturity in an Organization: Starting with data -> Scaling with data -> Leading with data
  5. Making the case for Analytics Product Managers: What an Analytics Product Manager is and why we need this position
  6. 6 Reasons Why Companies Fail at Data Governance: Why data governance initiatives fail in companies

Others

  1. How to Get Actionable Insights from Customer Feedback: A framework for collecting and using customer feedback in product development
  2. A Non-Exhaustive List Of ‘Silent’ Mistakes in SQL That Can Ruin Your Analysis: Some common SQL mistakes that you should keep an eye on
  3. SQL Query Optimization: Level Up Your SQL Performance Tuning: A great list of SQL query optimization tips
  4. I Modified An SQL Query From 24 Mins Down To 2 Seconds - A Tale of Query Optimization: A very good real-life case study of optimizing a SQL query
  5. 5 SQL Bad Habits You Need to Break: Some common bad ways to write SQL queries
  6. Start Using Google Trends as part of Our Data Analysis: A great example of how to derive some actionable business insights from Google Trends
  7. Introducing ChatGPT!: An excellent read on what ChatGPT is and its limitations (with a real example)
  8. Creating a Customer Health Index: A Terrific Tool to Measure and Improve Customer Experience and Drive Growth: Why we need a Customer Health Index and how to create one
  9. The Many Layers of Data Lineage: How to better structure data lineage to make it more understandable and useful
  10. Top 10 Data Visualizations of 2022 Worth Looking at!: 10 great visualizations that are worth checking out
  11. BONUS – Five Great Programming and Data Science Memes