Automate your data science project structure in three easy steps

Streamline your data science code repository and tooling quickly and efficiently. Originally published here. Good code is its own best documentation. Dr. Rachael Tatman, in one of her presentations, highlighted the importance of code reproducibility in a very subtle way: “Why should you care about reproducibility? Because the person most likely to need to reproduce …

Interpretable or Accurate? Why Not Both?

Building interpretable boosting models with InterpretML. Originally published here. As summed up by Miller, interpretability refers to the degree to which a human can understand the cause of a decision. A common notion in the machine learning community is that a trade-off exists between accuracy and interpretability. This means that the learning methods that are more …

A better way to visualize Decision Trees with the dtreeviz library

An open-source package for decision tree visualization and model interpretation. Originally published here. It is rightly said that a picture is worth a thousand words. This axiom applies equally to machine learning models. If one can visualize and interpret the result, it instills more confidence in the model’s predictions. Visualizing how a machine learning …

What it takes to become a World No 1 on Kaggle

In conversation with Guanshuo Xu: a Data Scientist, Kaggle Competitions Grandmaster (Rank 1), and a Ph.D. in Electrical Engineering. In this series of interviews, I present the stories of established Data Scientists and Kaggle Grandmasters at H2O.ai, who share their journeys, inspirations, and accomplishments. The intention behind these interviews is to motivate and encourage others who want …