
This is the fifth post in this series (and the second to last for this year), summarising the great posts Elise and I came across during our Friday and Sunday night reading sessions. Hope you enjoy reading them as well :)

Experimentation

  1. Online Experiments Tricks — Variance Reduction: Covers common variance reduction methods in experimentation
  2. Improving Experimentation Efficiency at Netflix with Meta Analysis and Optimal Stopping: Two techniques Netflix use to improve their experimentation efficiency
  3. How to Reduce A/B Testing Duration using Surrogate Metrics: How to use surrogate metrics to better estimate long-term impact
  4. Bayesian A/B Testing in 5 Minutes: Quick walkthrough of Bayesian A/B testing steps
  5. Bayesian A/B Testing - Part 0 - Introduction, Part I - Conversions, Part II - Revenue, Part III - Test Duration, Part IV - Choosing a Prior: More on the same topic

Machine Learning & Analytics

  1. Product Analytics: Engagement Model: How to build an engagement model to gain insights into user behavior and product development
  2. 5 Techniques to Work with Imbalanced Data in Machine Learning: Common techniques to handle imbalanced data
  3. 7 Oversampling Techniques to Handle Imbalanced Data: Same topic as the one above, but dives deeper into the various oversampling techniques
  4. A Zero Math Understanding of Bayesian Optimization: A very good analogy and explanation of Bayesian Optimization
  5. Advertiser Recommendation Systems at Pinterest: Talks about potential product opportunities built on insights from the machine learning models in Pinterest's ads product – a very good example of how to create user-facing value from data science
  6. The Machine Learning Behind Delivering Relevant Ads: Also from the Pinterest ads team, but touches more on how the ads delivery algorithm is implemented
  7. Marketing Mix Modeling - Introduction to Marketing Mix Modeling in Python, An Upgraded Marketing Mix Modeling in Python: A series of two articles that explain marketing mix modeling very clearly, with great insights into how to incorporate marketing concepts into it
  8. Top 5 Time Series Analytics: General introduction to very useful time series analytics techniques
  9. Hate Black-box Models? Time to Change That With SHAP: A great overview of what SHAP values are and how they help with model interpretability
  10. 10 Exciting Examples of Machine Learning Applications in Healthcare: Examples of how machine learning could be used to improve healthcare
  11. Why You’ll Regret Training ML Models: General considerations before you start training your machine learning models
  12. Predict Customer Churn (the right way) using PyCaret: A detailed walkthrough on how to use PyCaret to quickly build a churn prediction model
  13. MOUSE Movement Modelling to Predict Online Fraud: A very interesting idea of tracking mouse movement data to detect fraud

Data Platform

  1. Automating Data Protection At Scale - Part 1: How Airbnb designed a data protection platform
  2. Signs You Are Using Data Visualization Tools Wrong: Discussion on how you should use your data visualization tools within an organization

Others

  1. Emojis in Your Data: A fun read on how emojis are stored in databases
  2. Virtual Presentation Tips for Data Scientists: Great advice for data scientists giving virtual presentations
  3. The 10 Best Data Visualizations of 2021: 10 insightful data visualizations the author selected from Reddit
  4. Key Metrics for Data Science Team Success: A great article on how to measure the success of the data science team as a leader
  5. Designing and evaluating metrics: Key principles for designing good metrics