
This post summarises the Medium posts I read over the past two months. I hope you enjoy reading them as well.

DS, Analytics

  1. Creative Fatigue: How advertisers can improve performance by managing repeated exposures: Meta describes its analysis of creative fatigue and how to manage it
  2. Using Graphs to Model and Analyze the Customer Journey: How the DS team at Microsoft uses graphs to represent the customer journey
  3. Warden: Real Time Anomaly Detection at Pinterest: How Pinterest uses Warden, its real-time anomaly detection tool, to detect ML model drift and spam in real time
  4. Innovating Faster on Personalization Algorithms at Netflix Using Interleaving: How Netflix uses interleaving to test personalization algorithms, and why it is faster than traditional A/B testing
  5. When You Should Prefer “Thompson Sampling” Over A/B Tests: What Thompson Sampling is and why it can be a better choice than A/B tests
  6. Choosing the Right Path: Churn Models vs. Uplift Models: How to create an uplift model to better handle the churn problem

Machine Learning

  1. Visualizing Shapley Values Over Time: Introduces several good ways to visualize Shapley values to help with model interpretation
  2. Why You Should Stop Using the ROC Curve: A detailed explanation of the differences between the ROC curve and the PR curve, with examples
  3. An ML Based Approach to Proactive Advertiser Churn Prevention: How the Pinterest team used GBDT to predict advertisers’ churn likelihood and validated the model through experimentation
  4. From Clusters To Insights; The Next Step: How to detect the driving features behind the cluster labels
  5. Twitter’s recommendation algorithm is now open source. What does it tell us?: Some observations from the recommendation algorithm that Twitter open sourced
  6. Representation Online Matters: Practical End-to-end Diversification in Search and Recommender Systems: The Pinterest team walks through how it ensures diversification in search and recommender systems
  7. 19 Most Elegant Sklearn Tricks I Found After 3 Years of Use: Lesser-known but very helpful Sklearn methods and tips

DS Career

  1. Build More Analyses, Build Less Dashboards: Why and how to move away from the mindset of building too many dashboards
  2. What I Am Doing to Stay Relevant as a Data Analyst: Several ways to keep your data analytics skills up to date
  3. The Role of Product Data Science: A good summary of the main responsibilities of a Product DS
  4. Crossing the Bridge: A Comparison of Data Science in Academia and Industry: Compares how DS work differs between academia and industry
  5. 12 Mental Models for Data Science: Important things to keep in mind as a data scientist
  6. What I Look For in Every Data Analyst Candidate: Important characteristics of a data analyst, from the perspective of a hiring manager
  7. Why Data Scientists don’t get a seat at the table and what they can do about it: How to get involved in product and strategy conversations as a data scientist
  8. Transform Your 1:1 Meetings into a Source of Insight: Suggestions on improving 1:1s

LLM

  1. Pandas AI — The Future of Data Analysis: An interesting new package that uses the OpenAI API to run analyses from natural-language prompts
  2. How GPT Models Work: Explains the algorithm behind GPT models at a high level
  3. Will Generative AI Replace the Need for Data Analysts?: Discusses the analytics use cases of generative AI and whether it will actually replace data analysts
  4. I Used ChatGPT (Every Day) for 5 Months. Here Are Some Hidden Gems That Will Change Your Life: Some great tips on using ChatGPT better
  5. From Chaos to Clarity: Streamlining Data Cleansing Using Large Language Models: An example that uses an LLM to process and clean messy data

Others

  1. Metis: Building Airbnb’s Next Generation Data Management Platform: An introduction to Airbnb’s data management platform and how it evolved
  2. Which Team Should Own Data Quality?: Discusses different options for managing data quality in industry
  3. Why You Should Become A Data Product Manager In 2023: What a data product manager is and why it could be a good career choice
  4. The Unforgettable 15: Exploring the Best Data Visualizations of All Time (2023): Great visualizations to check out