
My Medium Articles!

Over the past two months, I kept writing data science and AI content on Medium. I am super excited to have more than 1k followers now! Below are the articles I posted lately. You can also find a copy on my blog.

  1. Building a Standout Data Science Portfolio: A Comprehensive Guide: My tips on how to set up a data science portfolio, its content strategy, and what makes a good portfolio.
  2. Evaluating ChatGPT’s Data Analysis Improvements: Interactive Tables and Charts: My evaluation of ChatGPT’s new interactive tables and charts feature, with my speculation about ChatGPT’s future development.
  3. Navigating Data Science: B2C vs. B2B Analytics: Differences between data science and analytics at B2C and B2B businesses, based on my industry experience.
  4. ChatGPT vs. Claude vs. Gemini for Data Analysis (Part 1): Evaluation of which AI tool writes the best SQL queries, based on their accuracy, efficiency, formatting, and explanation.
  5. Build a RAG-Based Chatbot to Retrieve Visualizations in 3 Steps: A step-by-step guide to creating a visualization discovery chatbot with OpenAI API, FAISS, and Streamlit
  6. ChatGPT vs. Claude vs. Gemini for Data Analysis (Part 2): Who’s the Best at EDA?: A comparison of how ChatGPT, Claude, and Gemini tackle Exploratory Data Analysis
  7. ChatGPT vs. Claude vs. Gemini for Data Analysis (Part 3): Best AI Assistant for Machine Learning: How AI can accelerate your ML projects from feature engineering to model training

Reading List from the Past Two Months

Now, let’s talk about the great articles I came across in July and August:

Data Science & Analytics

  1. Rethinking How We Evaluate The New York Times Subscription Performance: An exploration into The New York Times Growth Data team’s process of designing and building a new subscription reporting model.
  2. Forget Statistical Tests: A/B Testing Is All About Simulations: How to understand A/B testing intuitively with simulations
  3. My First Billion (of Rows) in DuckDB: An experiment with DuckDB that shows its strengths
  4. The Ultimate Guide to Finding Outliers in Your Time-Series Data (Part 1): This article explores both visual and statistical methods to identify outliers effectively in time-series data
  5. The Ultimate Guide to Finding Outliers in Your Time-Series Data (Part 2): Builds on the previous article to cover machine learning methods for outlier detection.
  6. 9 Key Differences Between B2B and B2C Marketing: A great summary of marketing in B2B vs. B2C, informing different data strategies.
  7. Delivering Faster Analytics at Pinterest: Pinterest shares their experience of launching their Analytics app on StarRocks.
  8. Friendly Introduction to Deep Learning Architectures: Short but easy-to-understand summary of CNN, RNN, GAN, Transformers, and Encoder-Decoder Architectures.
  9. Predictive Marketing Mix Modeling with GLOP: The Perfect Cocktail Shaker: How to use GLOP (Google Linear Optimization Package) to optimize Return on Ad Spend (ROAS).
  10. Why Polars Destroy Pandas in All Possible Ways for Data Scientists?: Compares Polars and Pandas and explains why Polars has better performance.
  11. Improve Your Next Experiment by Learning Better Proxy Metrics From Past Experiments: Netflix’s new method to establish the causal relationship for long-term outcomes.

Data Career

  1. A Product Manager’s Guide to Roadmap Prioritization for Data Analytics Team: How to prioritize data analytics work using frameworks like RICE Score Ranking and Stack Ranking.
  2. The 4 Boring Ways I Doubled My Company’s Revenue In Less Than 30 Days: Simple and data-driven revenue growth strategy.
  3. Leading by Doing: Lessons Learned as a Data Science Manager and Why I’m Opting for a Return to an Individual Contributor Role: The author’s experience as an IC vs. a manager in data science, and why they decided to return to an IC role.
  4. How Do I Become Chief Analytics Officer?: What the career path looks like in data science and analytics, and what makes a good Chief Analytics Officer.

AI and LLM

  1. 17 (Advanced) RAG Techniques to Turn Your LLM App Prototype into a Production-Ready Solution: Important techniques to improve the performance of a RAG pipeline.
  2. Building RAG application using Langchain 🦜, OpenAI 🤖, FAISS: Walks through an example of creating a PDF chatbot with RAG.
  3. From Data to Visualization with the OpenAI Assistants API and GPT-4o: How to create an AI assistant to conduct data visualization tasks.
  4. Claude-3: Data Analysts, Prepare for a New Challenger!: Reviews the Claude-3 models’ capabilities in tasks like writing SQL queries, image analysis, and web page summaries.
  5. How I Built ‘University Course Finder’ Using RAG: A real example of creating a RAG application with Verba.
  6. Multimodal RAG — Intuitively and Exhaustively Explained: A brief introduction to RAG, plus a discussion of various methods to build a RAG application from different types of data.
  7. How I Built My First RAG Pipeline: An overview of the RAG framework with code examples.
  8. Let’s Build AI-Powered Case Discovery for Law Firms From Scratch: Use AI to retrieve law case documents.
  9. 5 Proven Query Translation Techniques to Boost Your RAG Performance: 5 very practical tips to improve RAG performance, with clear examples.
  10. Start With Why AI: When AI solutions are appropriate.
  11. The Evolution of SQL: How to design a text-to-SQL solution with AI.
  12. Don’t Limit Your RAG Knowledgebase to Just Text: Use images as the data source of your RAG.
  13. A busy person’s Intro to AI Agents: The history of AI agents and what AI agents can do.
  14. Is Prompt Engineering Dead?: Introduces Anthropic’s prompt generator, a tool designed to simplify the process of creating effective prompts for AI models like Claude.