My Medium Articles!

In the past two months, I kept writing articles each month on Medium and Towards Data Science. Here they are:

  1. From Data Scientist IC to Manager: One Year In: I just hit my one-year mark as a manager! It’s been a busy, challenging, and incredibly rewarding year. In this article, I reflect on three key themes that have shaped my experience leading a data team: Prioritization, empowerment, and recognition.
  2. What Being a Data Scientist at a Startup Really Looks Like: Should you join a startup as a data scientist? I share my five years of experience at Brex here. Startups offer speed, variety, visibility, and cutting-edge exposure. But they also bring chaos, shifting priorities, and the need to wear hats you may not enjoy.

Reading List from the Past Two Months

Now, let’s talk about the great articles I came across in the past two months.

Data Science & Analytics

  1. The Evolution of A/B Testing: scale, methodology, technology, pitfalls and debates: A great overview of everything about A/B testing.
  2. How Meta Solves Data Lineage At Scale: Data lineage is the process of tracing data’s journey through various systems, from its source to its final destination. This article covers Meta’s solution for data lineage.
  3. Why is dbt So Popular?: dbt is a CLI tool that lets us efficiently transform data with SQL. It encourages data analysts to take more responsibility for managing data transformations with a simple structure.
  4. Beyond Predictions: Uplift Modeling & the Science of Influence (Part I): How to build an uplift model with tree-based models.
  5. Switchback Tests and Randomized Experimentation Under Network Effects at DoorDash: How to use switchback tests to solve the network effect problem in a marketplace business like DoorDash.
  6. Table Compare: Safeguarding Data Integrity at Meta: How Meta ensures data reliability with Automated Data Output Testing.
  7. Managing Technical Debt in the Age of Vibe Coding: How unreviewed AI-assisted code rapidly piles up mountains of technical debt, and how to get the best from AI coding assistants.
  8. 12 Visualization Hacks That Turn Data Into Stories: A good collection of useful visualization examples.
  9. Marginal Effect of Hyperparameter Tuning with XGBoost: A technical explanation of how hyperopt works when tuning hyperparameters and the tradeoff between large search spaces and narrower search spaces.
  10. How I’d Learn Machine Learning Again, After 6 Years: How to learn ML and Deep Learning the best way today.
  11. Common Data Science Mistakes and How to Avoid Them: Five mistakes to avoid in machine learning and analytics projects.

Data Career

  1. What’s Wrong With Data Teams: Why most data teams are expensive consulting shops, and how to fix it.
  2. How We Oversaturated the Data Science Job Market: Why the data science job market is now oversaturated, what’s wrong with bootcamps, and how to be more competitive.
  3. Why Your Data Analysts Aren’t Making Recommendations: The conflict between what’s desired and what the DAs are asked to do, and how to better understand your requests.
  4. How Top 1% Data Science Candidates Land More Interviews: Four rules to help DS candidates land more interviews.
  5. How to Write Insightful Technical Articles: The motivations behind writing technical articles, different types, and how to find your niche.
  6. Meta’s Data Scientist’s Framework for Navigating Product Strategy as Data Leaders: The four quadrants of product data science questions and how to navigate through them.
  7. How to Prioritize: Knowing how to stack rank all the things on your plate in terms of priority is extremely important at all levels of the team.
  8. What I Actually Do As An Applied Scientist at Twitch/Amazon: What an Applied Scientist is and how the role differs from other data professions.
  9. “Data Analysts, stop playing small!” — You have one of the most creative jobs in tech.: Data analysts are actually at the heart of the business, solving business problems.
  10. Dashboard Dysfunctorrhea: How The Best Leaders Actually Use Data: Are all the dashboards being used? Why do they create problems? What are the alternatives?
  11. Quantitative vs Qualitative Data in Data Analytics: The statement that “quantitative data is objective and qualitative data is subjective” is wrong.
  12. “Connecting the Dots” in Data Analytics: What “connecting the dots” actually means and the paths to it.
  13. The Generalist: The New All-Around Type of Data Professional?: Data generalists are now flourishing due to the emergence of cloud services, the explosion of startups, and the evolution of AI tools.

AI and LLM

  1. How to Learn LLMs From Scratch: Five essential stages for learning LLMs, with relevant resources.
  2. Large Language Models: A Short Introduction: A quick walkthrough of key LLM concepts.
  3. LLM Evaluation: Practical Tips at Booking.com: The LLM-as-a-judge framework for evaluating LLM performance.
  4. Toward Digital Well-Being: Using Generative AI to Detect and Mitigate Bias in Social Networks: A machine learning pipeline designed to detect and mitigate bias in user-generated content.
  5. How to Perform Comprehensive Large Scale LLM Validation: How to conduct LLM validation and evaluation.
  6. Water Cooler Small Talk, Ep 8: Should ChatGPT Be Blocked at Work?: The root problem behind people’s inclination to block ChatGPT at work.
  7. How to Develop Powerful Internal LLM Benchmarks: How to create company-internal benchmarks to understand LLM performance on your own use cases.