Create a GitHub-style contributions plot for your Time Series data

Originally published here. The GitHub contribution graph shows your repository contributions over the past year. A filled-up contribution graph is not only pleasing to the eye but also points to your hard work (unless you have hacked it). The graph, though pretty, also displays considerable information about your performance. However, if you look closely, it is …

The curious case of Simpson’s Paradox

Statistical tests and analysis can be confounded by a simple misunderstanding of the data. “Statistics rarely offers a single ‘right’ way of doing anything.” (Charles Wheelan, Naked Statistics) In 1996, Appleton, French, and Vanderpump conducted an experiment to study the effect of smoking on a sample of people. The study was conducted over twenty years and included 1314 …

There is more to ‘pandas.read_csv()’ than meets the eye

A deep dive into some of the parameters of the read_csv function in pandas. Pandas is one of the most widely used libraries in the Data Science ecosystem. This versatile library gives us tools to read, explore, and manipulate data in Python. The primary tool used for data import in pandas is read_csv(). This function accepts the file path of a …
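As a rough illustration of the kind of parameters such a deep dive might cover, here is a minimal sketch. The CSV contents, column names, and the "missing" marker are made up for this example, and an in-memory buffer stands in for a file path so the snippet runs on its own.

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for a file on disk (illustrative data only).
csv_text = (
    "date,city,temp\n"
    "2021-01-01,Pune,21.5\n"
    "2021-01-02,Delhi,missing\n"
)

df = pd.read_csv(
    io.StringIO(csv_text),     # read_csv takes a file path or a file-like object
    usecols=["date", "temp"],  # load only the columns we need
    parse_dates=["date"],      # parse this column as datetimes
    na_values=["missing"],     # treat this custom marker as NaN
)

print(df.dtypes)
print(df)
```

Passing a path string such as a local file name instead of the buffer works the same way; the keyword arguments shown are only a small subset of what read_csv accepts.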

A hands-on guide to ‘sorting’ dataframes in Pandas

My tryst with the pandas library continues. Of late, I have been trying to look deeper into this library and consolidate some of the pandas features in byte-sized articles. I have written articles on reducing memory usage while working with pandas, converting XML files into a pandas dataframe easily, getting started with time series in pandas, and many more. In this article, …

5 Real World datasets for honing your Exploratory Data Analysis skills

The best way to learn data science is by doing it. https://www.freepik.com/vectors/data If you are just getting started in Data Science and are looking for some cool datasets to play with, this might be the article for you. A lot of courses and books never really move beyond the classic Titanic and Iris datasets. Not that …