Streamline your data science code repository and tooling quickly and efficiently Originally published here Good Code is its own best documentation Dr. Rachael Tatman, in one of her presentations, highlighted the importance of code reproducibility in a very subtle way: “Why should you care about reproducibility? Because the person most likely to need to reproduce … Continue reading Automate your data science project structure in three easy steps
Building interpretable Boosting Models with InterpretML Originally published here As summed up by Miller, interpretability refers to the degree to which a human can understand the cause of a decision. A common notion in the machine learning community is that a trade-off exists between accuracy and interpretability. This means that the learning methods that are more … Continue reading Interpretable or Accurate? Why Not Both?
An open-source package for decision tree visualization and model interpretation Originally published here It is rightly said that a picture is worth a thousand words. This axiom applies equally to machine learning models. If one can visualize and interpret the result, it instills more confidence in the model’s predictions. Visualizing how a machine learning … Continue reading A better way to visualize Decision Trees with the dtreeviz library
Making the most of Google Colab notebooks Colaboratory, or “Colab” for short, is a hosted Jupyter Notebook service by Google. It allows you to write and execute Python code via your browser. It is effortless to spin up a Colab notebook since it is directly integrated with your Google account. Colab provides free access to GPUs and TPUs, requires … Continue reading Use Colab more efficiently with these hacks
A tutorial on creating Plotly and Bokeh plots directly with Pandas plotting syntax Data exploration is by far one of the most important aspects of any data analysis task. The initial probing and preliminary checks that we perform, using the vast catalog of visualization tools, give us actionable insights into the nature of the data. However, the … Continue reading Get Interactive plots directly with pandas.
In conversation with Guanshuo Xu: a Data Scientist, Kaggle Competitions Grandmaster (Rank 1), and Ph.D. in Electrical Engineering. In this series of interviews, I present the stories of established Data Scientists and Kaggle Grandmasters at H2O.ai, who share their journeys, inspirations, and accomplishments. The intention behind these interviews is to motivate and encourage others who want … Continue reading What it takes to become a World No 1 on Kaggle