Data Science Double BAM! | Joshua Starmer of StatQuest on The Artists of Data Science Podcast
On this episode of The Artists of Data Science, we get a chance to hear from Josh Starmer, a data scientist who has helped empower learners from all over the globe by breaking down complicated statistics and machine learning topics into small, bite-sized pieces that are easy to understand.
You may know Joshua from his YouTube channel, StatQuest, where he’s beloved by his audience of over 320,000 subscribers and 15 million viewers.
Joshua shares with us his powerful journey from being a cellist and music composer to getting his PhD in computational biology and then creating StatQuest.
This episode is packed with advice, wisdom, and tips for developing a creative process and facing your fears. It was a great honor interviewing Joshua!
Some notable segments from the show
[9:05] How music has helped Joshua become more creative
[17:19] Inspiration for StatQuest
[24:00] The most challenging part of creating content
[28:02] The most misunderstood concept from statistics and machine learning
[36:38] How Joshua approaches his creative endeavours
Where to listen to the show
Joshua’s journey in statistics and data science
Joshua’s journey into stats began when he took a statistics class as a graduate student because he thought the women in the class were cute. He thought that he had to become good at stats to get their attention.
[3:20] “It’s a little embarrassing. So I first got interested in statistics as a graduate student. I wasn’t a statistics major, but I had to take statistics classes and I thought the women in the statistics program were pretty cute. And I thought that if I wanted to get their attention, I had to be good at statistics. So I studied a lot. I mean that’s kind of how it all started. I was in these classes and I was like, how do I get. How do I get these people attention?”
Where is the field headed in 2–5 years?
Joshua sees the field of data science becoming more and more important. He believes that the field is currently like the Wild West: a lot of data is being generated, and people don’t know exactly what to do with it.
[4:18] “I just assume that next year they’re gonna have a whole new way of generating crazy amounts of data, doing something new, and they’re going to need us to come in and make sense of it. Some of that’s gonna be using established statistics, some of that is just making stuff up as we go. I just see it becoming more and more important.”
What will separate great data scientists from the rest of them?
Joshua believes that the great data scientists are the ones that understand the main concepts, without getting lost in the details. This allows them to stay focused on the more important ideas without getting caught in the hype.
[5:23] “I think, and this is true of any field. I feel like the great people are the ones that understand the main ideas and don’t get lost in the details, because when you understand the main ideas, you can see a tool for what it truly is and what it’s truly worth. And you don’t get swept up in all the hype. And our field is full of hype and that’s good and bad. You know it attracts people to the field. Smart people get sucked into it as well. And then the bad thing is, you have to kind of recognize despite the hype, tools are only good at doing certain things. And if you know the main ideas, you will know what tool is the right one for the right job. I think those are going to be the great Data scientists.”
Key takeaways from the episode
[6:38] Music theory is a way to break down music into its components of harmony, rhythm and melody. Learning how to break down music into its individual components has allowed Joshua to carry over this skillset into machine learning and data science, where he now does this for algorithms.
Inspiration for StatQuest
[17:19] StatQuest began as a way to teach statistical fundamentals to others. Joshua recorded the videos as a reference so that new employees at the genetics lab where he worked could learn these stats topics. The videos snowballed on YouTube and grew into what StatQuest is today.
Most challenging part of creating content
[24:00] There are many aspects of creating content that can be terrifying. But the most terrifying aspect is beginning. Staring at the blank page and creating the rough draft can be a daunting task. It is so simple, yet so scary.
Most misunderstood concept from statistics and machine learning
[28:04] The most misunderstood concept in statistics is that the probability of one specific measurement can be zero. For a continuous quantity, the more precisely you specify a measurement, the smaller its probability becomes, and for an exact value it shrinks all the way to zero.
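To make this idea concrete, here is a minimal sketch (not something shown in the episode). It assumes a hypothetical measurement, height in centimetres, that follows a normal distribution with mean 170 and standard deviation 10; the function names and the numbers are illustrative only. It computes the probability of landing within a window around 170 cm and shows that probability shrinking as the window gets narrower.

```python
import math


def normal_cdf(x, mean=0.0, sd=1.0):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def prob_within(center, half_width, mean=0.0, sd=1.0):
    """Probability that a measurement falls in [center - half_width, center + half_width]."""
    upper = normal_cdf(center + half_width, mean, sd)
    lower = normal_cdf(center - half_width, mean, sd)
    return upper - lower


# Assumed, made-up setting: heights ~ Normal(mean=170 cm, sd=10 cm).
# Narrow the window around 170 cm and watch the probability fall.
for half_width in (10, 1, 0.1, 0.01, 0.0):
    p = prob_within(170, half_width, mean=170, sd=10)
    print(f"170 +/- {half_width} cm: probability {p:.6f}")
```

The last line of output uses a window of width zero, meaning "exactly 170 cm", and the probability is zero even though 170 cm is the most typical height in this made-up example.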
Data Science: Art or science?
[34:36] It’s a little bit of both. Some tools that have been around for many years will always have their place in data science, and these tools are grounded in solid statistical theory. On the other hand, there is an art to how the data is presented.
The creative process in data science
[36:38] Being creative is critical in data science. All data scientists create insights out of data. That is the very nature of the field. If you are someone who doesn’t believe that you are creative, think again. Humans are innately creative. We create things all the time.
Everything begins with the blank page. You need to start somewhere to begin creating. This part can be very scary. This part is similar for all creative endeavours.
The key difference is that with music, you never know when you are done. With a publication or a video, you know when you are done because you are targeting a few main points. With music, you have to decide for yourself when you are done.
[9:38] “I pick up my guitar, my ukulele, and I start playing, and my head just completely clears.”
[19:52] “what I really want people to take home is that anyone can understand these things [statistics]. Ninety nine times out of 100, the only thing between them and understanding is fancy terminology and fancy notation”
[23:31] “It’s probably a good thing that I’m a little nervous…because it pushes me just a little harder to make sure that what I’m talking about is correct”
[33:16] “…if you want to educate someone…you have to relate with them and you have to see the material from their perspective.”
The one thing that Josh wants you to learn from his story
[39:46] You don’t have to be the smartest kid in math class to be good at data science. I struggle with learning certain concepts, and I need to take things slowly. But taking things slowly has allowed me to be able to break concepts down into understandable and easy to explain pieces for others. If I can do it, so can anyone else.
From the lightning round
Future of StatQuest in the next 2–4 years
Josh wants to see StatQuest grow! Right now it’s just him running the show, which is no doubt a lot of work. “If I can work with like-minded people, I can cover a lot more information.” He has a vision of creating a StatQuest curriculum and maybe even a StatQuest University.
From my dad: Do something you are passionate about, and focus on the main idea.
From my boss: Do the most important thing you can do.
Advice to 18-year-old self
“When I was 18, I wanted to be a professional cello player. I wanted to write music and create film scores. I wouldn’t give any advice to my younger self, because I probably wouldn’t believe what I would end up being passionate about.”
Neal Stephenson’s books. They contain action and philosophy, and who doesn’t love that combination?
Source of motivation
“I won’t be here forever, so I need to get things done now.”
Find Joshua Online
You are welcome to share the below transcript (up to 500 words) in media articles (e.g., The New York Times, LA Times, The Guardian), on your personal website, in a non-commercial article or blog post (e.g., Medium), and/or on a personal social media account for non-commercial purposes, provided that you include attribution to “The Artists of Data Science” and link back to the https://theartistsofdatascience.fireside.fm/articles URL.
For the sake of clarity, media outlets with advertising models are permitted to use excerpts from the transcript per the above.
The transcript for this episode can be found here.