Making the most of Google Colab notebooks
Colaboratory, or “Colab” for short, is Google's hosted Jupyter Notebook service. It lets you write and execute Python code in your browser. Spinning up a Colab notebook is effortless since it is directly integrated with your Google account. Colab provides free access to GPUs and TPUs, requires zero configuration, and makes sharing code seamless.
Colab has an interesting history. It started as an internal tool for data analysis at Google. It was later launched publicly, and since then many people have used it for their machine learning tasks. Many students and people who do not have a GPU rely on Colab's free resources to run their machine learning experiments.
This article compiles some useful tips and hacks that I use to get my work done in Colab. I have tried to list most of the sources where I read them first. Hopefully, these tricks should help you to make the most of your Colab notebooks.
1. Using local runtimes 🖥
Typically, Colab provides you with free GPU resources. However, if you have your own GPUs and still want to use the Colab UI, there is a way: you can connect the Colab UI to a local runtime.
This way, you can execute code on your local hardware and access your local file system without leaving the Colab notebook. The following documentation goes deeper into the way it works.
Colaboratory – Google: Colaboratory lets you connect to a local runtime using Jupyter. This allows you to execute code on your local hardware… (research.google.com)
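As a rough sketch of the local side of this setup: the documentation linked above describes starting a Jupyter server that accepts connections from the Colab origin. The helper below only builds that launch command. The function name `local_runtime_command` is hypothetical, and the flag names are quoted from memory of Colab's local-runtime documentation, so please check them against the current docs for your Jupyter version.

```python
import shlex

# Origin that the local Jupyter server must trust so the Colab UI can connect.
COLAB_ORIGIN = "https://colab.research.google.com"


def local_runtime_command(port=8888):
    """Build the Jupyter launch command as a list of arguments.

    Hypothetical helper: it does not start anything, it only assembles
    the command so you can inspect it or pass it to subprocess.run().
    """
    return [
        "jupyter",
        "notebook",
        f"--NotebookApp.allow_origin={COLAB_ORIGIN}",
        f"--port={port}",
        # Fail instead of silently picking another port.
        "--NotebookApp.port_retries=0",
    ]


# Print a copy-pasteable shell command.
print(shlex.join(local_runtime_command()))
```

Run the printed command in a terminal on your own machine, then use the Colab option to connect to a local runtime and paste the URL that Jupyter prints on startup.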
2. Scratchpad 📃
Do you end up creating multiple Colab notebooks with names like “untitled 1.ipynb” and “untitled 2.ipynb”? I guess most of us sail in the same boat in this regard. If that’s the case, then the Cloud scratchpad notebook might be for you. The Cloud scratchpad is a special notebook, available at https://colab.research.google.com/notebooks/empty.ipynb, that is not automatically saved to your Drive account. It is great for experimentation or throwaway work and doesn’t take up space in Google Drive.
3. Open GitHub Jupyter Notebooks directly in Colab 📖
Colab notebooks are designed to integrate easily with GitHub. This means you can both load notebooks from GitHub and save them to GitHub directly. There is a handy way to do that, thanks to Seungjae Ryan Lee.
When you’re on a notebook on GitHub that you want to open in Colab, replace github with githubtocolab in the URL, leaving everything else untouched. This opens the same notebook in Colab.
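The URL swap described above is simple enough to script if you have many links to convert. The following is a minimal sketch: the function name `github_to_colab` and the example repository path are hypothetical, and it only rewrites the host name, exactly as the tip describes.

```python
from urllib.parse import urlsplit, urlunsplit


def github_to_colab(url):
    """Rewrite a github.com notebook URL to its githubtocolab.com equivalent.

    Only the host changes; the path, query, and fragment are left untouched.
    """
    parts = urlsplit(url)
    if parts.netloc != "github.com":
        raise ValueError(f"Not a github.com URL: {url}")
    return urlunsplit(parts._replace(netloc="githubtocolab.com"))


# Hypothetical notebook path, used purely for illustration.
print(github_to_colab("https://github.com/user/repo/blob/main/demo.ipynb"))
```

Raising an error for non-GitHub hosts keeps the helper from silently returning a URL that will not open in Colab.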
4. Get Notified of completed cell executions 🔔
Colab can notify you of completed executions even if you switch to another tab, window, or application. You can enable it via Tools → Settings → Site → Show desktop notifications (and allow browser notifications once prompted) to check it out.
Here is a demo of how the notification appears even if you navigate to another tab.
Do you want the same functionality in your Jupyter Notebooks as well? I have you covered: you can also enable cell-completion notifications in Jupyter Notebooks. For details read 👇
5. Search for all notebooks in drive 🔍
Do you want to search for a specific Colab notebook in Drive? Navigate to the Drive search box and add:
This will list all the Colab notebooks in your Google Drive. Additionally, you can also specify the title and ownership of a specific notebook. For instance, if I want to search for a notebook created by me, having ‘Transfer’ in its title, I would mention the following:
6. Kaggle Datasets into Google Colab 🏅
If you are on a budget and have exhausted your GPU resources quota on Kaggle, this hack might come as a respite for you. It is possible to download any dataset seamlessly from Kaggle onto your Colab infrastructure. Here is what you need to do:
1. Download your Kaggle API token:
On clicking the ‘Create New API Token’ tab, a kaggle.json file will be generated that contains your API token. Create a folder named Kaggle in your Google Drive and store the kaggle.json file in it.
2. Mount Drive in Colab Notebook
3. Provide the config path to `kaggle.json` and change the current working directory
import os
os.environ['KAGGLE_CONFIG_DIR'] = "/content/drive/My Drive/Kaggle"
4. Copy the API command of the dataset to be downloaded.
For standard datasets, the API command can be accessed as follows:
For datasets linked to competitions, the API command is present under the ‘Data’ tab:
5. Finally, run the following commands to download the datasets:
!kaggle datasets download -d alexanderbader/forbes-billionaires-2021-30
!kaggle competitions download -c ieee-fraud-detection
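Steps 2 and 3 above can be sketched in a single cell. This is a minimal sketch under a few assumptions: the helper name `configure_kaggle` is hypothetical, the folder path is the one used in step 3, and the `try`/`except` guard is only there so the cell does not crash when run outside Colab, where `google.colab` is not importable.

```python
import os

# Drive folder that holds kaggle.json (created in step 1).
KAGGLE_DIR = "/content/drive/My Drive/Kaggle"


def configure_kaggle(config_dir=KAGGLE_DIR):
    """Point the Kaggle CLI at config_dir and make it the working directory."""
    os.environ["KAGGLE_CONFIG_DIR"] = config_dir
    # Only change directory if Drive is actually mounted and the folder exists.
    if os.path.isdir(config_dir):
        os.chdir(config_dir)
    return os.environ["KAGGLE_CONFIG_DIR"]


try:
    # Step 2: mount Google Drive (prompts for authorization inside Colab).
    from google.colab import drive
    drive.mount("/content/drive")
except ImportError:
    print("Not running in Colab; skipping Drive mount.")

# Step 3: tell the Kaggle CLI where kaggle.json lives.
configure_kaggle()
```

Once this cell has run inside Colab, the `!kaggle ...` commands from step 5 download into the Kaggle folder on Drive, so the files survive after the session ends.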
7. Accessing Visual Studio Code (VS Code) on Colab 💻
Do you want to use Colab’s infrastructure without using notebooks? Then this tip might be for you. Thanks to the community’s efforts in creating a package called ColabCode, it is now possible to run VS Code in Colab. Technically, this is accomplished via Code Server, a Visual Studio Code instance running on a remote server and accessible through any web browser. Detailed instructions for installing the package can be found below.
abhi1thakur/colabcode: Installation is easy! $ pip install colabcode Run code server on Google Colab or Kaggle Notebooks ColabCode also has a… (github.com)
Here is a quick demo of the process.
8. Data Table extension 🗄
Colab includes an extension that renders pandas DataFrames into interactive displays that can be filtered, sorted, and explored dynamically. To enable the Data Table display for pandas DataFrames, type the following in a notebook cell:
%load_ext google.colab.data_table
# To disable the display
%unload_ext google.colab.data_table
Here is a quick demo of the same: https://colab.research.google.com/notebooks/data_table.ipynb
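If you prefer to toggle the display from Python code, the `google.colab.data_table` module also exposes formatter functions. The wrapper below is a minimal sketch: `set_data_table` is a hypothetical helper name, and it returns False instead of failing when the module is unavailable, which is the case anywhere outside Colab.

```python
def set_data_table(enabled=True):
    """Turn Colab's interactive DataFrame display on or off.

    Returns True if the setting was applied, False when google.colab
    is not importable (for example, on a local machine).
    """
    try:
        from google.colab import data_table
    except ImportError:
        return False
    if enabled:
        data_table.enable_dataframe_formatter()
    else:
        data_table.disable_dataframe_formatter()
    return True


# Enable the interactive table; the result tells you whether it took effect.
applied = set_data_table(True)
print("Data table display enabled" if applied else "Not running in Colab")
```

After enabling it inside Colab, any DataFrame shown as a cell's output is rendered as an interactive table.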
9. Comparing Notebooks 👀
Colab makes it easy to compare two notebooks. Use View > Diff notebooks from the Colab menu, or navigate to https://colab.research.google.com/diff and paste the Colab URLs of the notebooks to be compared in the input boxes at the top.
These were some of the Colab tricks that I have found very useful, especially when it comes to training machine learning models on GPUs. Even though Colab notebooks can run for at most 12 hours, with the hacks shared above you should be able to make the most of your session.
Originally published here