01.00 Jupyter Lab and Notebooks

In data science, presenting our results is as important as achieving them. The Jupyter project (previously called IPython Notebook) provides a notebook application that allows runnable pieces of code to be mixed with the text and images explaining them. Support for languages other than Python was added several years ago, although the notebook is still mostly a Python niche.

There are several flavors of notebooks in Jupyter. Some companies still use old Python 2 libraries and IPython notebooks, but Python 2 is dead and buried: time to stop practicing necromancy and update to a modern language. The Jupyter Notebook application superseded the IPython Notebook and has been in use for a long time, although it is finally starting to show its age. If you are familiar with the Jupyter Notebook application, feel free to use it to run the examples here; they all work in it.

What we will use and describe is the Jupyter Lab application, which can run notebooks, open terminals and edit files, and has many features on top of the Jupyter Notebook application. Jupyter Lab will soon supersede the Jupyter Notebook application, therefore we shall learn it for the future. First a note on terminology: Jupyter Lab is the application that runs, among other things, jupyter notebooks; the Jupyter Notebook application is the older program used to run jupyter notebooks; a jupyter notebook is a file that can be run by either the Jupyter Lab application or the Jupyter Notebook application. Yes, that's as confusing as it looks. Try to read this paragraph thrice and differentiate the concepts in your mind.

From now on we will use jupyter notebook to mean the notebook running inside the application (preferably Jupyter Lab) on your machine. Similar to some well known editors (e.g. vi), the jupyter notebook is modal, i.e. it has modes in which different commands are accepted. By default it has two modes: an edit/input mode and a command/run mode. Some extensions provide more modes.

IPython Kernel

Behind the scenes the notebook is connected to a process which executes the commands and returns their results. The engine (called the kernel by the jupyter project) runs inside a local (or remote) webserver talking to the notebook. Communication between the notebook and the kernel is asynchronous, making for a very responsive interface.
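To make the idea of asynchronous communication more concrete, here is a toy sketch built on Python's asyncio. It is not Jupyter's actual protocol or code: the real kernel is a separate process reached through a messaging layer, while this model uses two in-process queues. The names toy_kernel, requests and replies, and the use of None as a shutdown signal, are all assumptions made for the illustration.

```python
import asyncio


async def toy_kernel(requests: asyncio.Queue, replies: asyncio.Queue) -> None:
    """Evaluate expressions one at a time, like a kernel serving a notebook."""
    namespace = {}
    while True:
        source = await requests.get()
        if source is None:  # hypothetical shutdown signal
            break
        await asyncio.sleep(0.01)  # stand-in for a slow computation
        await replies.put((source, eval(source, namespace)))


async def main() -> list:
    requests, replies = asyncio.Queue(), asyncio.Queue()
    kernel = asyncio.create_task(toy_kernel(requests, replies))
    # The front end sends both "cells" without waiting for the first result.
    for source in ["1 + 1", "sum(range(10))"]:
        await requests.put(source)
    await requests.put(None)
    await kernel
    # Collect the replies in the order the kernel produced them.
    results = []
    while not replies.empty():
        results.append(replies.get_nowait())
    return results


results = asyncio.run(main())
print(results)  # [('1 + 1', 2), ('sum(range(10))', 45)]
```

The point of the sketch is that the sender never blocks on a running computation: requests are queued and results arrive when they are ready, which is what keeps the interface responsive.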

The Interface

Opening Jupyter Lab, one sees a left sidebar showing the contents of the file system and, to its right, a work area presenting buttons to create some of the file types supported by Jupyter Lab. The general interface looks like the following:

Jupyter Lab Interface

py-jupyter-lab-interface.svg

Below we will describe a typical work flow through the interface. But first let's write out a reference for each part so we have something to come back to.

Left Sidebar

  • Shows files in the current directory/folder.
  • Interactive view of a portion of the file system, you can enter/exit directories.
  • Starts in the directory where the notebook was started.
  • Will not go above where it was started.
  • Has options to create files (through a launcher) and directories.
  • The context menu (right click) has options for files (e.g. rename, download, delete, ...).
  • Extra tabs on the far left provide information on running notebook kernels.

Menu Bar

  • Contains common mnemonic operations, e.g. save the work.
  • Some keyboard shortcuts are displayed next to the actions.

Work Area

  • Shows currently running notebooks, terminals and file editors.
  • Divided into tabs, each tab may hold a different notebook, terminal or file.
  • Displays the language/kernel of each notebook.
  • Below the tabs action buttons are displayed for notebooks.
  • A floppy disk to save, a play button to run cells, a stop button and some edit buttons.
  • A drop down menu is available to select the type of a cell.

A typical work flow through a session goes as follows. First, open Jupyter Lab and navigate in the left sidebar to the place where a notebook is to be created. Then create the notebook from the launcher in the main work area. By default the notebook will be named Untitled.ipynb; one can rename it by right clicking the file in the left sidebar.

Once the notebook is running one would add cells, often interleaving markdown and code cells. Code cells are the default when adding a cell; they contain Python code and can be run with the run button or with a handful of shortcuts. To run a code cell from the keyboard one can do:

  • Alt + Enter to run the cell and insert a new code cell below it.
  • Shift + Enter to run the cell and advance to the next cell.
  • Ctrl + Enter to run the code cell without moving.

Code cells may produce output which is then displayed below the cell. The output can be a printout from the code or be more complex such as displaying an image or a graph. The last shortcut only works with code cells, yet it is probably the most useful shortcut when exploring notebooks written by someone else.

Ctrl + Enter

py-ctrl-enter.svg
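As a small illustration of cell output, the hypothetical cell below prints a line and ends with a bare expression. In a notebook the printed text appears below the cell, and the value of the last expression is displayed as the cell's result. The variable name squares is made up for the example.

```python
# Contents of a single code cell.
squares = [n * n for n in range(5)]

# Printed output shows up below the cell.
print("sum of squares:", sum(squares))

# The last expression of a cell: the notebook displays its value.
squares
```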

Markdown cells exist to annotate the document. The text you are reading was originally written in markdown cells in a jupyter notebook. One changes from a code cell to a markdown cell by using the drop down menu at the top, or by using the shortcuts:

  • Ctrl + 1 makes the cell a code cell
  • Ctrl + 2 makes the cell a markdown cell

There also exist raw cells; these are for jupyter extensions that may create other cell types.

Markdown is a very simple plain text format that can be easily transformed into an HTML presentation. It is similar to LaTeX in that paragraphs are separated by blank lines, and similar to plain text emails in that emphasis is done by surrounding words with asterisks or underscores. There are several flavors of markdown, but one can simply run the cell to see how the syntax presents itself.
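To give a feel for how plain text turns into HTML, here is a deliberately tiny converter. It is only a sketch: it handles blank-line separated paragraphs and asterisk emphasis and nothing else, and it is not how Jupyter renders markdown. The function name toy_markdown_to_html is made up for the example.

```python
import html
import re


def toy_markdown_to_html(text: str) -> str:
    """Convert a tiny subset of markdown (paragraphs, emphasis) to HTML."""
    # Paragraphs are separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    rendered = []
    for paragraph in paragraphs:
        # Join wrapped lines and escape characters special to HTML.
        body = html.escape(" ".join(paragraph.split()))
        # Handle **strong** before *emphasis* so the pairs do not clash.
        body = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", body)
        body = re.sub(r"\*(.+?)\*", r"<em>\1</em>", body)
        rendered.append(f"<p>{body}</p>")
    return "\n".join(rendered)


sample = """Notebooks mix *text* and code.

A blank line starts a **new** paragraph."""
print(toy_markdown_to_html(sample))
# <p>Notebooks mix <em>text</em> and code.</p>
# <p>A blank line starts a <strong>new</strong> paragraph.</p>
```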

The navigation across cells is modal, which means that a different set of commands works when one is editing a cell than when one is moving between cells. One can click on a selected cell or press Enter to enter the edit mode and modify the contents of the cell. By selecting another cell or pressing Esc one exits the edit mode and goes back to command mode, where cells can be moved up and down by dragging.
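The modal behaviour can be pictured as a small state machine. The class below is a hypothetical model written only for illustration, not Jupyter's implementation: the same key press either types a character or is treated as a command, depending on the current mode.

```python
class ToyCell:
    """Hypothetical model of one modal cell; not Jupyter's own code."""

    def __init__(self) -> None:
        self.mode = "command"
        self.text = ""

    def press(self, key: str) -> None:
        if self.mode == "command":
            if key == "Enter":
                self.mode = "edit"
            # Any other key would be a command shortcut; ignored in this toy.
        elif key == "Esc":
            self.mode = "command"
        else:
            # In edit mode ordinary keys are typed into the cell.
            self.text += key


cell = ToyCell()
for key in ["x", "Enter", "x", "=", "1", "Esc", "y"]:
    cell.press(key)
print(cell.mode, repr(cell.text))  # command 'x=1'
```

The first "x" and the final "y" are pressed in command mode, so they never reach the cell's text; only the keys pressed between Enter and Esc are typed.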

When one closes the tab containing a notebook, the code running in it keeps going. In order to stop the Python code one must either shut down the entire Jupyter Lab or select Close and Shutdown Notebook in the File menu. There are many more shortcuts in Jupyter Lab, but here is a handful of useful ones:

  • Ctrl + Shift + D toggles Single Document Mode, which hides the tabs until you execute the shortcut again.
  • Ctrl + Shift + Q closes and shuts down the notebook, as opposed to keeping it running in the background.
  • Ctrl + S saves the notebook.

Interface Extras

Apart from notebooks Jupyter Lab has other uses. Two significant features are the ability to edit text files directly on the file system and the ability to open a terminal in order to perform more advanced operations by hand. Here is a quick reference for these two:

Text Editor

  • Edits a plain text file.
  • The default extension is .txt; changing it makes the editor guess the file syntax from the new extension.
  • The syntax can be explicitly set in the Text Editor Syntax Highlighting menu.
  • Useful to persist pure Python scripts (or scripts in other languages).
  • For Python packages the __init__.py file is still needed.
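As a sketch of how scripts saved this way can be reused from a notebook, the example below builds a throwaway package in a temporary directory and imports from it. In practice the two files would be created with the text editor. The package name mytools and the module stats.py are hypothetical, chosen only for the illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package; in Jupyter Lab these two files would be
# created and edited with the text editor instead.
root = Path(tempfile.mkdtemp())
package = root / "mytools"  # hypothetical package name
package.mkdir()
(package / "__init__.py").write_text("")  # marks the directory as a package
(package / "stats.py").write_text(
    "def mean(values):\n"
    "    return sum(values) / len(values)\n"
)

# Make the temporary directory importable; this is only needed here
# because the package lives outside the current working directory.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

from mytools.stats import mean

print(mean([1, 2, 3, 4]))  # 2.5
```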

Terminal View

  • An actual PTY connected through a websocket.
  • A full xterm emulator on Linux/MacOS.
  • PowerShell on MS Windows.

Other Jupyter Lab features include a CSV (Comma Separated Value) visualizer and a display for several formats of images and graphs.

IPython (the command)

The ipython program (improved Python) is a command line interface that can be understood as another interface to an IPython Kernel (in reality the IPython Kernel is a modified ipython binary, since ipython is the older project). For quick exploration, or just for people who prefer command line tools (yours truly included), ipython is a good option. Later, after some experimentation, one can move the results to a notebook for presentation.

Extra References