Reading Notes 2025 May - Jun
My Medium Articles!
In the past two months, I kept writing articles each month on Medium and Towards Data Science. Here you go:
- The Secret Power of Data Science in Customer Support: Most data science stories focus on product or marketing. But Customer Support is another data goldmine. In my latest article, I walk through how I have worked closely with our Customer Support (CX) team, helping them track performance, plan resources, optimize internal processes, and identify customer pain points.
- Rethinking Data Science Interviews in the Age of AI: AI is rewriting the day-to-day of data scientists. This transformation also poses a challenge to hiring managers: how to find the best talent that will thrive in the AI era? In this article, I discuss what hiring managers and candidates should do to adapt.
Reading List in the Past Two Months
Now, let’s talk about the great articles I came across in the past two months.
Data Science & Analytics
- Using Causal Inference for Measuring Marketing Impact: How BBC Studios Utilises Geo Holdouts and CausalPy: How BBC Studios uses Geo Holdout-Based Bayesian Synthetic Control to evaluate the impact of out-of-home (OOH) campaigns.
- Anomaly Detection in Time Series Using Statistical Analysis: Engineers at Booking.com talk about how they used statistical methods to build an anomaly detection system.
- Statistically Speaking: How To (Properly) Report A/B Testing Results: Common errors in reporting A/B testing results, including overstating certainty, confusing test settings with test results, misinterpreting p-values, misinterpreting confidence intervals, and ignoring external validity.
- I Teach Data Viz with a Bag of Rocks: Using a bag of rocks to illustrate the principles of data visualization.
- Time Series Linear Regression Explained: How linear regression can work for time series forecasting.
- You can have it all: Parallel Testing in A/B Testing: Explains why running multiple experiments simultaneously is not only feasible but also beneficial, explores its key advantages and potential challenges, and shares best practices for successful implementation.
- How to Find the Right Distribution for Your Data: A Practical Guide for Non-Statistician: How the author created a visual tool that lets people test the distribution of their data.
- Get More Explainability Than Just SHAP With ALIBI In Python: Alibi is a new open-source Python library that helps with model explainability and supports both black-box and white-box methods for local and global insights.
- The Metric Tree Trap: A Metric Tree is a hierarchical decomposition of a top-level business goal into actionable sub-metrics. The article explains why it can be misleading and may not work as expected.
- How to Lose Money With “Statistically Significant” Decisions: Trade-offs to consider in experimentation beyond statistical significance.
- The 10 Weirdest, Most Brilliant Algorithms Ever Devised and What They Actually Do: 10 unconventional yet brilliant algorithms—from Marching Cubes to quantum‐inspired methods—that have (or could) revolutionize fields like graphics, cryptography, optimization, and fault tolerance.
- Compelling New Visualization Picks for Inspiration — DataViz Weekly: Five interesting new visualizations across different topics.
- From Default Python Line Chart to Journal-Quality Infographics: A very practical step-by-step breakdown of how to turn a default matplotlib line chart into a clean, professional-looking visualization.
- Meta’s Centralized Approach to Decision Record-Keeping: How Meta built a centralized catalog of experiment decisions to enable record-keeping and long-term decision making.
- Holdout Groups Need Not Be a Lost Opportunity: How to determine the control group percentage under different scenarios.
- How Did Airbnb Build Their Semantic Layer?: A great walkthrough of Airbnb’s data infrastructure evolution and the design principles of Airbnb Minerva.
Data Career
- Should Data Scientists Pivot to AI?: The ML Researcher to AI Engineer profession spectrum and how to take a leap into AI.
- What Data Engineers Honestly Want To Tell Data Analysts: Important data engineering knowledge that can help data analysts collaborate better with data engineers.
- The Dark Side of Data Science Jobs: The reality behind the fancy data scientist title.
- 3 Reasons Why Data Science Projects Fail: Things you should avoid to thrive as a data scientist: 1. The solution was not actionable. 2. The observational data and causal insight conundrum. 3. The solution was overly complex.
AI and LLM
- The Biggest Problem With Text To SQL Workloads, And How To Fix It.: How the challenges with text-to-SQL systems come from end-user behavior, and how metadata can help.
- 99% of AI Startups Will Be Dead by 2026 — Here’s Why: Why OpenAI API wrapper startups have a fragile foundation and can easily die.
- Let Users Talk to Your Databases: Build a RAG-Powered SQL Assistant with Streamlit: Build a database-agnostic RAG pipeline allowing users to access data in Amazon Redshift, BigQuery, and a SQLite database.
- What is the Future of Power BI and Business Intelligence?: Six trends in BI tools as data roles evolve in the age of AI.
- The BI Industry Is Missing Its ChatGPT Moment: Generative AI is starting to creep into BI but not in the way we need it to.
- How I Automated 80% of My Data Analysis Using AI Tools: A great example of how to use AI to improve data analysis workflows, plus more ideas to try.
- Stop Chasing “Efficiency AI.” The Real Value Is in “Opportunity AI.”: Opportunity AI means using artificial intelligence to solve previously impossible problems and create entirely new business and operating models. The article also discusses how to make your company AI-native.