Useful pip commands for Data Science

A look at the most used package management system in Python

An in-depth article published in February 2020 by Sebastian Raschka et al. studies the role and importance of Python in the machine learning ecosystem. The paper, titled Machine Learning in Python: Main Developments and Technology Trends in Data Science, Machine Learning, and Artificial Intelligence, puts forward a fascinating observation, which I’d like to quote here:

Historically, a wide range of different programming languages and environments have been used to enable machine learning research and application development. However, as the general-purpose Python language has seen a tremendous growth of popularity within the scientific computing community within the last decade, most recent machine learning and deep learning libraries are now Python-based.

Python has truly changed the data science landscape and has emerged as one of the most used languages in the field today. This is also evident from the sheer number of Python packages being created and used. As of July 2020, over 235,000 Python packages could be accessed through PyPI. So what is PyPI?

The Python Package Index (PyPI) is a repository of software for the Python programming language. This repository houses the packages created and shared by the ever-growing Python community. You can install any package from PyPI using pip, the package installer for Python. Every Python programmer, new or old, has typed pip install <package name> many times. However, there are other pip commands that can be extremely useful, especially from a data science perspective. This article explains some of the commonly used pip commands along with their frequently used options.

Getting Started

To begin with, we’ll create a virtual environment. This way, it’ll be easier to show the various pip commands in action. Let’s use venv to create this new virtual environment and name it env. Python’s venv module is used for creating lightweight “virtual environments.”

# Creating virtual environment in Mac/Linux
python3 -m venv env

# Creating virtual environment in Windows
py -m venv env
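
The same step can also be scripted with Python’s standard-library venv module, which is what the python3 -m venv command runs. The snippet below is a minimal sketch rather than a replacement for the command above: it builds the environment in a temporary directory so that nothing is left behind, and it passes with_pip=False only to keep the example quick (the command line bootstraps pip by default).

```python
import tempfile
import venv
from pathlib import Path

# Build a throwaway environment; the directory is removed when the block ends.
with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / "env"
    # with_pip=False is an assumption made for speed; use with_pip=True
    # to get pip inside the environment, as `python3 -m venv env` does.
    venv.create(env_dir, with_pip=False)
    # Every environment created by venv contains a pyvenv.cfg file.
    created = (env_dir / "pyvenv.cfg").is_file()
    print("environment created:", created)
```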

Once the env environment has been created, we’ll activate it (source env/bin/activate on Mac/Linux, or env\Scripts\activate on Windows), and then we are good to go.

Let’s start by checking whether pip is installed in our environment. Technically, if you are using Python 2 >=2.7.9 or Python 3 >=3.4, pip should already be installed. The pip --version command returns the version of the installed pip as well as its location.

Since everything is in place, let’s now look at some of the most important and commonly used pip commands, one by one.

1. pip help

If you type pip help in your terminal, you’ll get a single-page scrollable document.

It displays the various commands that can be used with pip, as well as how the commands can be used. Additionally, if you wish to see details concerning a single pip command, you can do:

$ pip help <command_name>
example: pip help install

This brings up information on the single command whose details you are interested in.

2. pip list

If you want to take a look at all the installed packages, you can run pip list, and it will output all the packages that are currently installed in the environment.

Syntax:

$ pip list

The output shows that we currently have only two packages installed, and of those, pip itself is outdated.

Options

pip list can be used with a number of options, for instance:

  • --outdated / -o for listing all the outdated packages
$ pip list --outdated
or
$ pip list -o

It looks like both the installed packages are outdated.

  • --uptodate / -u for listing all the up-to-date packages
$ pip list --uptodate
or
$ pip list -u
  • --format selects the output format for displaying installed packages on the screen. The available options are columns (default), freeze, and json.
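
The json format is the convenient one when the package list has to be read by a script rather than by a person. Here is a minimal sketch of that idea. The helper names and the sample text are made up for illustration; the sample only mimics the shape of pip’s JSON output, a list of objects with name and version keys.

```python
import json
import subprocess
import sys


def pip_list_json():
    """Run `pip list --format=json` with the current interpreter's pip."""
    result = subprocess.run(
        [sys.executable, "-m", "pip", "list", "--format=json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def parse_pip_list(json_text):
    """Turn pip's JSON output into a {name: version} dictionary."""
    return {entry["name"]: entry["version"] for entry in json.loads(json_text)}


# Hypothetical sample with the same shape as pip's output; call
# parse_pip_list(pip_list_json()) to inspect the real environment.
sample = (
    '[{"name": "pip", "version": "19.2.3"}, '
    '{"name": "setuptools", "version": "41.2.0"}]'
)
packages = parse_pip_list(sample)
print(packages)
```

Keeping the parsing separate from the subprocess call makes the parsing step easy to try out without touching the environment.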

3. pip install

The pip install command is used to install a new package. Let’s install pandas, the bread-and-butter package for data science, in our virtual environment.

To check whether the pandas package has been installed, we can run a quick pip list to have a look at all the installed packages.

We can see that pandas, along with its dependencies, has been installed in the virtual environment.

Options

pip install also has a few useful options:

  • --upgrade / -U for upgrading all specified packages to the newest available version.
  • --requirement <file> / -r <file> for installing from the given requirements file. A requirements file is a list of all of a project’s dependencies. This text file contains all the required packages, typically including the specific version of each dependency. Here is what a requirements file typically looks like:
A snapshot of a requirements.txt file | Image by Author

To install all the packages mentioned in the requirements.txt file, you can simply do:

$ pip install -r requirements.txt
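
A requirements file is plain text with one requirement per line, most often pinned as name==version. As a rough sketch, the snippet below reads such lines into a dictionary. The file contents and the read_pinned helper are hypothetical, and the helper only handles the simple pinned form and comments, not the full requirements syntax that pip accepts.

```python
def read_pinned(text):
    """Collect name==version lines; skip blanks, comments, and anything else."""
    pinned = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        pinned[name.strip()] = version.strip()
    return pinned


# Hypothetical requirements.txt contents; versions are illustrative only.
sample_requirements = """\
# core data science stack
numpy==1.19.0
pandas==1.0.5

matplotlib==3.2.2  # plotting
"""

print(read_pinned(sample_requirements))
```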

4. pip show

This command shows information about installed packages. One can choose the amount of information to be displayed on the screen. Let’s say we want to know details about the pandas package, which we know is installed in our environment. To show limited details, we can do pip show pandas:

$ pip show pandas

In case you want the complete details, you can use the verbose option with the pip show command:

$ pip show --verbose pandas
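
Part of what pip show prints comes from the metadata that every installed package carries, and Python 3.8+ exposes that metadata through the standard-library importlib.metadata module. The following is a sketch of a lookup in that spirit, not pip’s own implementation; the describe helper is a hypothetical name, and it returns None when the package is not installed.

```python
from importlib import metadata


def describe(package_name):
    """Return a few metadata fields for an installed package, or None."""
    try:
        info = metadata.metadata(package_name)
    except metadata.PackageNotFoundError:
        return None
    return {
        "Name": info.get("Name"),
        "Version": info.get("Version"),
        "Summary": info.get("Summary"),
    }


# pandas is used here because it is the package installed earlier in the
# article; the result is None in an environment that does not have it.
print(describe("pandas"))
```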

5. pip uninstall

As the name suggests, pip uninstall will uninstall the desired package. As per the documentation, there are a few exceptions that cannot be uninstalled. They are:

  • Pure distutils packages installed with python setup.py install, and
  • Script wrappers installed by python setup.py develop.

We’ll now uninstall the pandas package that we installed earlier. The process is pretty straightforward, as follows:

$ pip uninstall pandas

Options

pip uninstall has two options, namely:

  • --requirement <file> / -r <file> for uninstalling all the packages listed in the given requirements file.
  • --yes / -y for skipping the confirmation prompt when uninstalling a package.

6. pip freeze

In section 3, we touched upon the need for a requirements file in a project. Well, pip freeze lets you easily create one. It outputs all the installed packages and their version numbers in requirements format.

The output of the freeze command can then be redirected into a requirements file, as follows:

$ pip freeze > requirements.txt
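
For a feel of what “requirements format” means, the sketch below builds similar name==version lines from Python itself, using the standard-library importlib.metadata module (Python 3.8+). It is only an approximation written for illustration: the freeze_like helper is hypothetical, and pip freeze remains the tool to use, since it also deals with cases such as editable installs that this sketch ignores.

```python
from importlib import metadata


def freeze_like():
    """Return sorted 'name==version' lines for the installed distributions."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:  # skip entries whose metadata is incomplete
            lines.add(f"{name}=={version}")
    return sorted(lines, key=str.lower)


for line in freeze_like():
    print(line)
```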

Conclusion and additional resources

These were some of the useful pip commands that I use in my day-to-day activities. This article can serve as a handy resource for learning about pip. There are other commands, too, that have not been covered here. The official documentation is an excellent resource if you want to go deeper into the details.

Originally published here
