Use Colab more efficiently with these hacks

Making the most of Google Colab notebooks


Colaboratory, or “Colab” for short, is Google's hosted Jupyter Notebook service. It lets you write and execute Python code in your browser. Spinning up a Colab notebook is effortless since it is directly integrated with your Google account. Colab provides free access to GPUs and TPUs, requires zero configuration, and makes sharing code seamless.

Colab has an interesting history. It started as an internal tool for data analysis at Google. It was later launched publicly, and since then many people have used it for their machine learning tasks. Many students and others who do not have a GPU rely on Colab's free resources to run their machine learning experiments.

This article compiles some useful tips and hacks that I use to get my work done in Colab. I have tried to credit the sources where I first came across them. Hopefully, these tricks will help you make the most of your Colab notebooks.


1. Using local runtimes 🖥

Typically, Colab provides you with free GPU resources. However, if you have your own GPUs and still want to use the Colab UI, there is a way. You can use the Colab UI with a local runtime as follows:

Using local runtimes in Colab | Image by Author

This way, you can execute code on your local hardware and access your local file system without leaving the Colab notebook. The following documentation goes deeper into the way it works.

Colaboratory – Google
Colaboratory lets you connect to a local runtime using Jupyter. This allows you to execute code on your local hardware… (research.google.com)
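
As a rough sketch of the local side of this setup: Colab's local runtime documentation describes starting a Jupyter server that accepts connections from the Colab frontend. The helper below only builds that command. The function name and default port are illustrative assumptions, the flags are the ones I recall from that documentation, and newer Jupyter releases may expect `ServerApp` in place of `NotebookApp`.

```python
import shlex

COLAB_ORIGIN = "https://colab.research.google.com"


def local_runtime_command(port: int = 8888) -> list[str]:
    """Build the Jupyter command for a server that Colab can connect to."""
    return [
        "jupyter", "notebook",
        f"--NotebookApp.allow_origin={COLAB_ORIGIN}",  # trust the Colab frontend
        f"--port={port}",
        "--NotebookApp.port_retries=0",  # fail instead of silently picking another port
    ]


# Print the command so it can be pasted into a local terminal.
print(shlex.join(local_runtime_command()))
```

Once the server is running, the URL it prints (including its token) is what you would paste into Colab's local runtime connection dialog.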


2. Scratchpad 📃

Do you end up creating multiple Colab notebooks with names like “untitled 1.ipynb” and “untitled 2.ipynb”? I guess most of us are in the same boat. If that’s the case, then the Cloud scratchpad notebook might be for you. The Cloud scratchpad is a special notebook, available at https://colab.research.google.com/notebooks/empty.ipynb, that is not automatically saved to your Drive account. It is great for quick experiments and throwaway work, and it doesn’t take up space in Google Drive.

Scratchpad in Colab | Image by Author

3. Open GitHub Jupyter Notebooks directly in Colab 📖

Colab notebooks are designed to integrate easily with GitHub. This means you can both load notebooks from GitHub and save them to GitHub directly. There is a handy way to do that, thanks to Seungjae Ryan Lee.

When you’re on a notebook on GitHub which you want to open in Colab, replace github with githubtocolab in the URL, leaving everything else untouched. This opens the same notebook in Colab.
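
If you do this often, the rewrite is easy to script. The sketch below is one way to do it: the helper name `to_colab_url` and the example URL are hypothetical, and it assumes the notebook lives on `github.com`.

```python
from urllib.parse import urlsplit, urlunsplit


def to_colab_url(github_url: str) -> str:
    """Swap the github.com host for githubtocolab.com, leaving the rest untouched."""
    parts = urlsplit(github_url)
    if parts.netloc.lower() != "github.com":
        raise ValueError(f"not a github.com URL: {github_url}")
    return urlunsplit(parts._replace(netloc="githubtocolab.com"))


# Hypothetical notebook URL, used only for illustration.
print(to_colab_url("https://github.com/user/repo/blob/main/notebook.ipynb"))
```

Only the host is replaced, rather than every occurrence of "github" in the string, so a user or repository name that happens to contain "github" is left alone.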

Open GitHub Jupyter Notebooks in Colab | Image by Author

4. Get Notified of completed cell executions 🔔

Colab can notify you of completed executions even if you switch to another tab, window, or application. To enable this, go to Tools → Settings → Site → Show desktop notifications, and allow browser notifications when prompted.

Enabling cell notifications in Colab | Image by Author

Here is a demo of how the notification appears even if you navigate to another tab.

Get Notified of completed cell executions | Image by Author

Additional Tip

Do you want the same functionality in your Jupyter Notebooks as well? I have you covered. You can also enable cell-completion notifications in Jupyter notebooks. For details, read 👇

Enabling notifications in your Jupyter notebooks for cell completion
Get notified when your long-running cell finishes execution. (towardsdatascience.com)


5. Search for all notebooks in drive 🔍

Do you want to search for a specific Colab notebook in Drive? Navigate to the Drive search box and type:

application/vnd.google.colaboratory

This will list all the Colab notebooks in your Google Drive. You can also specify the title and owner of a specific notebook. For instance, to search for a notebook that I created and that has ‘Transfer’ in its title, I would enter the following:

Search for all notebooks in drive | Image by Author

6. Kaggle Datasets into Google Colab 🏅

If you are on a budget and have exhausted your GPU resources quota on Kaggle, this hack might come as a respite for you. It is possible to download any dataset seamlessly from Kaggle onto your Colab infrastructure. Here is what you need to do:

1. Download your Kaggle API Token

Accessing your Kaggle API Token | Image by Author

Clicking the ‘Create New API Token’ tab generates a kaggle.json file that contains your API token. Create a folder named Kaggle in your Google Drive and store the kaggle.json file in it.

Kaggle folder containing the kaggle.json file | Image by Author

2. Mount Drive in Colab Notebook

Mounting Drive in Colab Notebook | Image by Author

3. Provide the config path to `kaggle.json` and change the current working directory

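import os
# Tell the Kaggle CLI where to find kaggle.json
os.environ['KAGGLE_CONFIG_DIR'] = "/content/drive/MyDrive/Kaggle"
# Work inside the Kaggle folder so that downloads land there
%cd /content/drive/MyDrive/Kaggle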

4. Copy the API command of the dataset to be downloaded.

For standard datasets, the API command can be accessed as follows:

Forbes Billionaires 2021 dataset publicly available on Kaggle | Image by Author

For datasets linked to competitions, the API command is present under the ‘Data’ tab:

IEEE-CIS Fraud Detection competition publicly available on Kaggle | Image by Author

5. Finally, run the following command to download the datasets:

!kaggle datasets download -d alexanderbader/forbes-billionaires-2021-30

or

!kaggle competitions download -c ieee-fraud-detection

Kaggle Datasets into Google Colab | Image by Author
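
Steps 2 and 3 above can also be sketched in plain Python. This is a minimal sketch under a few assumptions: `drive.mount` from the `google.colab` package (only importable inside Colab) performs the mount, Drive appears under `/content/drive/MyDrive`, and the helper names `mount_drive` and `configure_kaggle` are mine, introduced for illustration.

```python
import os
from pathlib import Path

# Assumed location of the Kaggle folder created in step 1.
KAGGLE_DIR = "/content/drive/MyDrive/Kaggle"


def mount_drive(mount_point: str = "/content/drive") -> bool:
    """Mount Google Drive when running in Colab; return False elsewhere."""
    try:
        from google.colab import drive  # only available inside a Colab runtime
    except ImportError:
        return False
    drive.mount(mount_point)  # asks for authorization on first use
    return True


def configure_kaggle(config_dir: str = KAGGLE_DIR) -> bool:
    """Point the Kaggle CLI at config_dir; return True if kaggle.json is there."""
    os.environ["KAGGLE_CONFIG_DIR"] = config_dir
    return (Path(config_dir) / "kaggle.json").is_file()


if mount_drive():
    if not configure_kaggle():
        print("kaggle.json not found in", KAGGLE_DIR)
```

Checking for kaggle.json up front gives a clearer message than a failed download when the token was saved to a different folder.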

7. Accessing Visual Studio Code (VS Code) on Colab 💻

Do you want to use Colab’s infrastructure without using notebooks? Then this tip might be for you. Thanks to the community’s efforts in creating a package called ColabCode, it is now possible to run VS Code in Colab. Technically, this is accomplished via Code Server, a Visual Studio Code instance running on a remote server and accessible through any web browser. Detailed instructions for installing the package can be found below.

abhi1thakur/colabcode
Installation is easy! $ pip install colabcode Run code server on Google Colab or Kaggle Notebooks ColabCode also has a… (github.com)
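
For orientation, launching the server from a notebook cell looks roughly like the sketch below. It assumes `colabcode` has been installed with `pip install colabcode` and that the package exposes a `ColabCode` class taking a `port` argument, as I recall from the project README. The wrapper name `start_vscode` and the port number are illustrative, and the README remains the authoritative reference for the current options.

```python
def start_vscode(port: int = 10000):
    """Start a Code Server instance via ColabCode (assumed API, see the note above)."""
    try:
        from colabcode import ColabCode  # third-party package, installed via pip
    except ImportError as exc:
        raise RuntimeError("Install the package first: pip install colabcode") from exc
    # Assumption: constructing ColabCode starts the server and prints a URL to open.
    return ColabCode(port=port)


# In a Colab cell, after installing the package:
# start_vscode()
```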

Here is a quick demo of the process.

Accessing Visual Studio Code (VS Code) on Colab | Image by Author

8. Data Table extension 🗄

Colab includes an extension that renders pandas DataFrames as interactive displays that can be filtered, sorted, and explored dynamically. To enable the Data Table display for pandas DataFrames, type the following in a notebook cell:

%load_ext google.colab.data_table
# To disable the display
%unload_ext google.colab.data_table
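
If you would rather not change the display of every DataFrame, a single one can be wrapped explicitly. The sketch below assumes the `DataTable` class from `google.colab.data_table` with its `include_index` and `num_rows_per_page` arguments. The helper name `as_data_table` is hypothetical, and outside Colab it simply returns its input unchanged.

```python
def as_data_table(df, rows_per_page: int = 10):
    """Wrap df in Colab's interactive DataTable; return df unchanged elsewhere."""
    try:
        from google.colab import data_table  # only available inside Colab
    except ImportError:
        return df
    return data_table.DataTable(
        df, include_index=False, num_rows_per_page=rows_per_page
    )
```

Calling `as_data_table(my_dataframe)` as the last expression of a cell displays the interactive table for that DataFrame only.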

Here is a quick demo of the same: https://colab.research.google.com/notebooks/data_table.ipynb

Data Table extension in Colab | Image by Author

9. Comparing Notebooks 👀

Colab makes it easy to compare two notebooks. Use View > Diff notebooks from the Colab menu, or navigate to https://colab.research.google.com/diff and paste the Colab URLs of the notebooks to be compared into the input boxes at the top.

Comparing notebooks in Colab | Image by Author

Wrap Up

These were some of the Colab tricks that I have found very useful, especially when it comes to training machine learning models on GPUs. Although Colab notebooks can run for at most 12 hours, the hacks shared above should help you make the most of your session.


Originally published here
