Ask a Data Scientist: In conversation with Rodrigo Araujo

Rodrigo Araujo is a Data Scientist at Stradigi AI. With a PhD in Machine Learning from the University of Waterloo and a wealth of industry experience, Rodrigo makes significant contributions to our Research and Data Science team, which is responsible for building, maintaining, and optimizing our Automated Data Science Workflows.

Here, Rodrigo dives into the importance of interpretability and answers a handful of frequently asked questions, highlighting the impact that subject matter experts from all categories (sales, marketing, operations, finance, HR and more) can have with Kepler, all thanks to the powerful capabilities of ready-made Machine Learning.

Within Stradigi AI’s Data and Research Science team, there are groups of experts who work on specific elements of Kepler. What’s your main priority and how do you fit into the world of Automated Data Science Workflows?

Right now, my main focus is the interpretability of models. Every user needs to understand why decisions are made by a specific ML model. So, I ensure that every task has a clear associated explanation, to deconstruct the black box of Machine Learning. It’s important for users to understand the ‘why’ behind their results, not only to satisfy their curiosity, but also to help enrich their conversations about results on predictions, forecasts and so on, when they’re speaking to other team members. Interpretability provides essential context around ML models.

Can you describe your approach to developing interpretability for Kepler?

My job is to ensure that we’re using the best approach to interpretability at all times. As we began to build out the key features of Kepler, we conducted extensive research to examine the different approaches. We benchmarked and compared the tactics we could employ before choosing our specific technique and integrating it within Kepler.

Kepler has a unique “no ML experience required” capability built into it. What do you do to ensure Kepler balances the right level of ML complexity and sophistication with seamless user experience?

Basically, our job is abstracting the technicalities in all of the right places. Even though there is complex and rigorous data science and software engineering within the platform, we focus on automating the very specific places where a data scientist would typically intervene to make a decision or perform a data science task, interventions that are invariably affected by data size, data type, and data cleanliness. With Kepler, those interventions are already taken care of.

Our goal is to automate where it makes sense, and involve the user where it makes sense, too. For example: users answer key questions about their business before we begin automated feature engineering, as internal experts know their data and challenges better than anyone else.

Where do you see the future of data science and Kepler?

In the future, people will continue to find tools and smarter ways of working that empower them to do the work that means the most to them. For example, business analysts are great at understanding what insights are meaningful and how they might impact an organization, but they don’t necessarily have data science knowledge or experience. Kepler helps them employ more complex models to get the insights they care about, without necessarily having to make a huge change in the way they work. If Kepler can help people do what they do best at work, then we’re really getting somewhere.

This is an excerpt from our latest white paper, Inside Kepler’s Automated Data Science Workflows. For more rich Machine Learning insights, you can download it here.