Episode 57 - Data Science, Analogies, and Nearest Neighbors
There's a lot of confusion these days over what a "data scientist" is, and the job title can mean different things to different people. It's also getting tougher for new entrants to the job market. Max breaks it down and gives advice for those looking to break in.
On the technical side, Max starts with the argument from analogy, a form of philosophical reasoning that works alongside Bayes' rule and that we use every single day without realizing it. He then connects it to machine learning by describing the k-Nearest Neighbors (kNN) algorithm and a specific use case at Foursquare.
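For readers who want to see the idea in code, here is a minimal sketch of kNN as an argument from analogy: a new point is assumed to share the label of the known points it most resembles. The function name `knn_predict`, the toy coordinates, and the venue categories are all made up for illustration. This is not Foursquare's implementation, only a generic majority-vote version of the algorithm.

```python
from collections import Counter
import math


def knn_predict(points, labels, query, k=3):
    """Label `query` by majority vote among its k nearest known points."""
    # Rank the known points by Euclidean distance to the query.
    order = sorted(range(len(points)), key=lambda i: math.dist(points[i], query))
    # Count the labels of the k closest points and return the most common one.
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (x, y) positions tagged with a venue category.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.8, 5.2)]
labels = ["coffee", "coffee", "coffee", "pizza", "pizza", "pizza"]

print(knn_predict(points, labels, (0.3, 0.3)))  # nearest neighbors are all "coffee"
print(knn_predict(points, labels, (4.9, 5.0)))  # nearest neighbors are all "pizza"
```

The choice of `k` is the main knob: a small `k` follows the closest examples very literally, while a larger `k` smooths the vote over a wider neighborhood.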
Related Links
The main article, by Vicki Boykis, is called Data Science is Different Now, posted on her GitHub blog last month.
For a good breakdown on all the roles related to data science and machine learning engineering, check out the article Why you shouldn’t be a Data Science Generalist
And related - How a Data Scientist can Improve his productivity
Foursquare Grows up and Beyond the Check-In: an example of an article about data scientists, written in 2011 around the time I was interviewing with and starting at Foursquare.
For the argument by analogy, here is an in-depth article about it from the Stanford Encyclopedia of Philosophy. Or if you prefer, a YouTube video on Arguments from Analogy by Philosophy classroom.
For examples of k-Nearest Neighbors, read more about Foursquare’s Snap-to-Place technology.
An interesting example of kNN can also be seen in the visualization of Foursquare’s Taste Map - YUM!
Related Episodes
Episode 31 on Causality, which Analogy arguments can overlook
Episode 23 on Natural Language Processing
Episode 21 on the Philosophy behind Bayes Rule
Episode 16 on Overfitting and Underfitting
Episode 7 on working with Dennis Crowley