We write and speak about the intersection of data science research, practice, and leadership.


Creating a Next-Generation Financial Dataset from Scratch with NLP and Active Learning
I was delighted to be invited to give a talk at the inaugural spaCy IRL conference in Berlin, Germany. My talk highlighted one of my group's projects: using natural language processing and active learning to collect data about companies' environmental, social, and governance practices.
Managing Data Science in the Enterprise
Josh Poduska, Chief Data Scientist at Domino Data Lab, and I teamed up at the Strata Data Conference in New York to talk about how to manage data science as an organizational capability, the data science project lifecycle, common organizational structures for data science teams, and more.
Facilitating Data Science Collaboration and Faster Innovation: Panel Discussion
I joined Elena Grewal, Head of Data Science at Airbnb; Sivan Aldor-Noiman, Head of Data Science and Data Engineering at Wellio; and moderator Nancy Hersh to talk about ideas for accelerating data science progress at startups, tech companies, and established enterprises.
Word Embeddings Under the Hood: How Neural Networks Learn from Language
This talk provides a clear, first-principles explanation of how neural networks learn rich vector representations for words. Along the way, we get a "minimum viable introduction" to the basic concepts of how neural networks work.
Modern NLP in Python at PyData DC
This 90-minute PyData tutorial provides an overview of powerful tools and modeling techniques used for natural language processing in Python.