
Company Size vs. DS Team Size

This week, let’s continue exploring the 2021 Kaggle ML&DS Survey. My initial thought this week was to look at the pay distribution between male and female respondents, controlling for degree, years of coding, country, etc. However, I realized that after controlling for those factors, there are too few data points left to draw any reliable conclusion. Therefore, I had to switch to another topic: the correlation between company size and DS team size. Both variables come from the categorical answers to two questions: “What is the size of the company where you are employed?” and “Approximately how many individuals are responsible for data science workloads at your place of business?”.

My Visualization

When looking at two ordinal categorical variables, I think some common choices are a heatmap, a stacked percentage bar chart, and a stacked area plot. Since we want to compare the distribution of DS team size within each company size band, let’s use a simple stacked area plot here.
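For readers who want to reproduce the numbers behind such a chart, here is a minimal sketch of the underlying calculation: within each company size band, the percentage of respondents in each DS team size band. It is not the code behind the dashboard. The band labels, the `team_size_share` function, and the toy responses are all assumptions made up for illustration; plain Python is used so it runs without extra dependencies.

```python
from collections import Counter

# Illustrative band labels in ascending order (assumed, not copied from the survey file).
COMPANY_BANDS = ["0-49", "50-249", "250-999", "1,000-9,999", "10,000+"]
TEAM_BANDS = ["0", "1-2", "3-4", "5-9", "10-14", "15-19", "20+"]


def team_size_share(responses):
    """Return {company_band: {team_band: percent}} from (company_band, team_band) pairs.

    Each row is normalized to 100%, which is what a stacked percentage
    chart displays for one company size band.
    """
    counts = Counter(responses)
    shares = {}
    for company in COMPANY_BANDS:
        total = sum(counts[(company, team)] for team in TEAM_BANDS)
        if total == 0:
            continue  # skip bands with no respondents instead of dividing by zero
        shares[company] = {
            team: 100 * counts[(company, team)] / total for team in TEAM_BANDS
        }
    return shares


# Made-up responses, NOT survey data.
toy = [
    ("0-49", "0"), ("0-49", "1-2"), ("0-49", "1-2"), ("0-49", "20+"),
    ("10,000+", "5-9"), ("10,000+", "20+"),
]
shares = team_size_share(toy)
for company, row in shares.items():
    print(company, {team: round(pct, 1) for team, pct in row.items() if pct})
```

Normalizing per company size band, rather than over the whole sample, is what makes the bands comparable even though far more respondents may fall into some bands than others.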

Please note that all the visualizations are designed for desktop view, so it is recommended to view them on a desktop device.

Dashboard link

Insights

  • As one can imagine, there is a strong positive correlation between company size and DS team size;
  • Among small companies with fewer than 50 people, almost 90% have either no DS team or one with fewer than 5 people;
  • But among companies with 1k to 10k people, 30%+ have a DS team of 20+ people, and the number increases to almost 60% for companies with over 10k people;
  • It is also interesting that in every size band, 10%+ of companies do not have anyone doing data science work, and I would imagine that is highly correlated with industry;
  • Last but not least, given that this survey was run on Kaggle, I think some bias could be introduced, as respondents are more likely to work in a DS function.
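The correlation in the first insight is read off the chart rather than computed. One way to put a number on the association between two ordinal variables is Goodman and Kruskal's gamma, which compares concordant and discordant pairs of respondents. This is a swapped-in technique for illustration, not something used in the post, and the counts in the sketch below are made up, not survey results.

```python
def goodman_kruskal_gamma(table):
    """Gamma for a contingency table of two ordinal variables.

    table[i][j] is the count for row band i and column band j, with both
    sets of bands listed in ascending order. Returns a value in [-1, 1];
    values near 1 mean larger row bands go with larger column bands.
    """
    n_rows, n_cols = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):  # only pairs in a higher row band
                for l in range(n_cols):
                    if l > j:      # higher on both variables
                        concordant += table[i][j] * table[k][l]
                    elif l < j:    # higher on one, lower on the other
                        discordant += table[i][j] * table[k][l]
    total = concordant + discordant
    if total == 0:
        raise ValueError("gamma is undefined when all pairs are tied")
    return (concordant - discordant) / total


# Made-up 2x2 counts: rows are small/large company, columns are small/large DS team.
toy_table = [
    [3, 1],
    [1, 3],
]
gamma = goodman_kruskal_gamma(toy_table)
print(gamma)  # 0.8 for this toy table
```

Gamma ignores tied pairs, so it suits banded survey answers where many respondents share the same band, but it tends to look stronger than measures that penalize ties.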

Follow this link to find more weekly vizzes :)