Reading Notes 2025 Jan - Feb
My Medium Articles!
In the past two months, I kept publishing articles every month on Medium and Towards Data Science. Here they are:
- Unlocking the Power of Machine Learning in Analytics: Practical Use Cases and Skills: Ever confused by all the different titles in data science today? Do jobs with Data Scientist or Analytics titles also require Machine Learning skills? In this article, I share how Machine Learning skills play a role in analytics, with real-world use cases.
- DeepSeek V3: A New Contender in AI-Powered Data Science: DeepSeek is a powerful new player in the AI space. Its performance is on par with ChatGPT and Claude, but at a much lower cost. In this article, I evaluate its capabilities in data science use cases, including writing and optimizing SQL queries, conducting Exploratory Data Analysis (EDA), and training machine learning models.
- Mastering 1:1s as a Data Scientist: From Status Updates to Career Growth: Early in my career, I sometimes felt lost in 1:1s, unsure of what to discuss beyond status updates. Now, after being on both sides (as an IC and a manager), I’ve learned how great 1:1s can drive career growth, alignment, and impact. In this article, I share my takeaways on running effective 1:1s as a data scientist.
Reading List in Past Two Months
Now, let’s talk about the great articles I came across in the first two months of 2025.
Data Science & Analytics
- Don’t Be Afraid to Use Machine Learning for Simple Tasks: Machine learning isn’t a complicated tool that can only be used for advanced use cases. It is a fantastic tool for creating robust and easy-to-maintain solutions to simple problems.
- Was your Marketing Campaign Effective? Let Regression Discontinuity Design Help You! — A Practical Python Tutorial: Explains Regression Discontinuity method for causal inference with a marketing campaign evaluation example.
- SHAP Value Dilution: How XGBoost Feature Sampling Misleads: If your XGBoost model uses feature sampling (colsample_bytree) and has highly correlated features, SHAP values can be misleadingly diluted. The author explains the effect with real examples.
- 4-Dimensional Data Visualization: Time in Bubble Charts: An innovative visualization type to plot 4-dimensional data.
- Z-Score and Modified Z-Score: Outlier detection with Z-score and the modified version.
- What are Isolation Forests?: Explains the isolation forest method for outlier detection.
- Advanced A/B testing techniques: CUPED, interleaving, and multi-armed bandits: An overview of CUPED, interleaving, and multi-armed bandits as ways to overcome the challenges of traditional A/B testing.
- Mastering Data Visualization: Practical Tips You Need To Know: A great list of tips to remember when making data visualizations.
- Effective Data Visualization: 9 Valuable Tips to Increase the Quality of Your Charts: Important things to remember for intuitive visualizations.
- Forecasting@Meta: Balancing Art and Science: The DS team at Meta explains how they validate forecasts with quantitative methods and qualitative criteria.
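
To make the Z-Score and Modified Z-Score entry in the list above a bit more concrete, here is a minimal Python sketch of both scores. It is not taken from the linked article: the function names, the sample data, and the cutoffs (3 for the Z-score, 3.5 for the modified Z-score) are assumptions chosen for illustration. The constant 0.6745 is the usual scaling factor that makes the median absolute deviation comparable to a standard deviation for normally distributed data.

```python
import statistics


def z_scores(values):
    """Classic Z-score: distance from the mean in standard deviations."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        raise ValueError("standard deviation is zero; Z-score is undefined")
    return [(v - mean) / std for v in values]


def modified_z_scores(values):
    """Modified Z-score: uses the median and the median absolute deviation (MAD)."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; modified Z-score is undefined")
    return [0.6745 * (v - med) / mad for v in values]


def flag_outliers(scores, threshold):
    """Return the indices whose absolute score exceeds the threshold."""
    return [i for i, s in enumerate(scores) if abs(s) > threshold]


if __name__ == "__main__":
    # Hypothetical sample: six ordinary values and one extreme value.
    data = [10, 12, 11, 13, 12, 11, 100]
    print("Z-score outliers:", flag_outliers(z_scores(data), 3.0))
    print("Modified Z-score outliers:", flag_outliers(modified_z_scores(data), 3.5))
```

On this small sample the extreme value inflates the mean and standard deviation enough that the classic Z-score stays below 3, while the median-based version flags it clearly. That robustness to the outlier itself is the usual reason to prefer the modified score on small or skewed samples.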
Data Career
- Three Crucial Data Lessons That I Learned from a Data Conference That’s Not Related to AI: Learnings about cost containment, value of the data teams, and data storytelling.
- The Politics of Analytics: Data analytics in the industry is about controlling information, controlling resources, and controlling decision rights.
- Data vs. Business Strategy: How do you truly make your organization data-driven? Understand how to implement both a business strategy and a data strategy.
- How to Build a Competency Framework for Data Science Teams: How to design an appropriate career ladder for the data science team.
- The Meta Mistake And Why Nobody Is Immune To Low Performance: A discussion of Meta’s company culture and why anyone could become the ‘low performer’.
- These Are the Jobs AI Will Replace: A reflection on the past two years and which jobs are more likely to be replaced by AI.
AI and LLM
- How to Evaluate LLM Summarization: A comprehensive quantitative framework for evaluating summaries generated by LLMs.
- LLM for Data Visualization: How AI Shapes the Future of Analytics: An exploration of using AI agents to create data visualizations based on a business question.
- How GenAI Tools Have Changed My Work as a Data Scientist: The writer talks about how to use GenAI to increase productivity, focus on what is important, improve the quality of work, and learn faster and more easily.
- Synthetic Data Generation with LLMs: How to use LLM-generated synthetic data to evaluate RAG.