IPython, and by extension Jupyter, extends the Python language with a handful of commands to streamline interactive work. These commands are divided into completions and magics.
Completions (<Tab> completion) work in IPython and in insert (edit) mode
of a Jupyter notebook code cell.
The completions understand the Python language and also know about the functions
and variables currently defined in the kernel.
The only way to get used to the completion is to try it out.
Go on, open a code cell and try some completions.
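To get a feel for what a completer does with the live namespace, here is a minimal sketch using the standard library's rlcompleter module. This is not IPython's own completer, which is considerably richer; the namespace dictionary, its variable names and the completions helper are made up for illustration.

```python
import rlcompleter

# Hypothetical "kernel state": names a user might have defined in earlier cells.
namespace = {"temperature_c": 21.5, "temperature_f": 70.7}

completer = rlcompleter.Completer(namespace)


def completions(prefix):
    """Collect every candidate the completer offers for a prefix."""
    found = []
    state = 0
    while True:
        candidate = completer.complete(prefix, state)
        if candidate is None:
            break
        found.append(candidate)
        state += 1
    return found


# Completing a bare name searches the namespace (plus keywords and builtins).
print(completions("temp"))

# Completing after a dot inspects the attributes of the live object.
print(completions("temperature_c.is"))
```

The second call works only because the completer can look at the actual float object, which is the same reason notebook completions know about variables defined in the kernel.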
Python has a built-in help() function, but typing it is lengthy
(6 characters including the brackets).
In IPython you can simply use the ? character to access the help.
The code cells below output a lot of text;
to save space we will let you try this code on your own instead.
Where the amount of text is not unreasonable we will still output it directly.
import urllib.request  # a bare "import urllib" does not guarantee the request submodule is loaded
urllib.request.urlopen?
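Outside IPython, a rough counterpart of the ? lookup can be put together with the standard inspect module. The sketch below is only an approximation of what the help display contains (signature and docstring), not a reproduction of IPython's output.

```python
import inspect
import urllib.request

target = urllib.request.urlopen

# Signature and docstring are the core of what a help lookup displays.
signature = inspect.signature(target)
docstring = inspect.getdoc(target)

print(f"{target.__module__}.{target.__qualname__}{signature}")
print(docstring.splitlines()[0])  # first docstring line only, to keep the output short
```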
By using two question marks (??) one gets the source code of the object.
One could go and look into the source file instead, but
being able to bring up the code without knowing where the source file is
located is quite convenient.
import urllib.request
urllib.request.urlopen??
The double question mark brings up the source of the object,
not the contents of the entire file containing the code for the object.
Depending on what one is searching for this may be nicer (less to read)
or worse (cannot search within the file)
than searching for the source file itself and browsing it.
In Python, modules on the import line are files,
hence one can bring up the entire file by asking for the code
of the object representing the module file.
In the case above, to bring up the full source file one could do urllib.request??.
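The difference between the source of one object and the source of its whole module can also be seen in plain Python. The following sketch uses inspect.getsource as a stand-in for ?? and assumes the standard library is installed with its .py source files, which is the usual case.

```python
import inspect
import urllib.request

# Source of a single object: only the function definition itself.
function_source = inspect.getsource(urllib.request.urlopen)

# Source of the module object: the entire file that defines it.
module_source = inspect.getsource(urllib.request)

print(len(function_source.splitlines()), "lines for the function")
print(len(module_source.splitlines()), "lines for the whole module")
print("module file:", inspect.getsourcefile(urllib.request))
```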
But one may not know what to display help for.
In that case you can use wildcards to get a list of available objects.
(This is similar to searching with filter(..., dir(object)) in plain Python.)
import urllib.request
urllib.request.*open*?
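The plain-Python search mentioned above can be sketched as follows. The use of fnmatch for the wildcard is an assumption made for this example; IPython's own matching rules (case sensitivity, for instance) may differ.

```python
import fnmatch
import urllib.request

pattern = "*open*"

# Keep only the attribute names of the module that match the wildcard pattern.
matches = list(filter(lambda name: fnmatch.fnmatchcase(name, pattern),
                      dir(urllib.request)))

for name in matches:
    print(name)
```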
The special commands that only exist inside IPython/Jupyter start with a % sign.
They are objects within IPython and are called magics.
A magic is not a plain Python function; it is a special callable object that IPython
invokes itself, so the % syntax never reaches the Python compiler.
Line magics (that affect a single line of code) start with a single %;
cell magics (for the entire cell) start with two signs (%%).
A full tutorial about magics can be viewed by invoking:
%magic
And a reference by running:
%quickref
Or an even shorter printout listing the available magics by using:
%lsmagic
A handful of useful magics when working inside a Jupyter notebook are:
%%writefile - writes the entire cell to a file
%save - saves a range of input history lines to a file
%history - prints the IPython command history (including magics)
%xmode - defines how exceptions are displayed (see exercises that follow)
%timeit - times a single line (or entire cell with %%timeit) of code
%pdb - the Python DeBugger (when enabled, will start automatically on exceptions)
%prun - profiles a function call in a line
The full list of magics in IPython is quite long, although most magics are not overly useful when working within a Jupyter notebook.
Figuring out which algorithm runs faster is a common task in data science;
therefore we will have a quick look at the %timeit magic.
This magic can be used both as a line magic - to evaluate how fast a single
line of code runs - or as a cell magic - to evaluate the time of the whole cell.
As a reminder, line magics use a single percent sign (%):
%timeit urllib.request.urlopen('https://www.city.ac.uk')
And cell magics use a double percent sign (%%):
%%timeit
urllib.request.urlopen('https://www.city.ac.uk')
urllib.request.urlopen('https://www.bbc.co.uk')
The timeit magic runs the code several times and takes the mean time of all runs.
How many runs are performed is heuristically selected.
It is the most commonly used timing procedure,
since running the code several times reduces the influence of
other programs that happen to run on the machine at the same time.
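The same idea (several runs, then an average) is available outside IPython through the standard timeit module. The sketch below times a small local computation rather than a network call; build_squares is a made-up workload, and the way Timer.autorange picks a loop count is only an analogy for the magic's heuristic, not a description of it.

```python
import timeit


def build_squares(n=1000):
    """Toy workload standing in for the code being timed."""
    return [i * i for i in range(n)]


# Let timeit pick a loop count large enough to give a measurable total time.
loops, _ = timeit.Timer(build_squares).autorange()

# Take five independent measurements, each running the workload `loops` times.
timings = timeit.repeat(build_squares, number=loops, repeat=5)

per_call = [total / loops for total in timings]
mean_per_call = sum(per_call) / len(per_call)

print(f"{loops} loops per run, {len(per_call)} runs")
print(f"mean time per call: {mean_per_call * 1e6:.1f} microseconds")
```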
Another timing magic is time, which runs the code only once.
One needs to run it several times in order to have a good estimate.
The time magic is analogous to the *nix shell command of the same name:
it provides the running time subdivided into system and user time.
Unless there is a specific need to distinguish between system calls,
CPU running time and IO waits,
timeit is the easier option to use.
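To illustrate the split between user, system and elapsed (wall clock) time, here is a minimal single-run sketch in plain Python. It relies on os.times and time.perf_counter; busy_work and the 0.05 second sleep are arbitrary choices for the example, and the printed layout is not meant to match the magic's output.

```python
import os
import time


def busy_work(n=200_000):
    """CPU-bound toy workload (hypothetical, for illustration)."""
    return sum(i * i for i in range(n))


wall_start = time.perf_counter()
cpu_start = os.times()

busy_work()        # consumes CPU time
time.sleep(0.05)   # waits without consuming CPU time

cpu_end = os.times()
wall = time.perf_counter() - wall_start
user = cpu_end.user - cpu_start.user
system = cpu_end.system - cpu_start.system

print(f"user {user:.3f} s, sys {system:.3f} s, wall {wall:.3f} s")
```

The sleep adds to the wall time but not to the user or system time, which is the kind of distinction a single timed run can reveal.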
Note: evaluating the timing of network connections, as we have done in the examples, is best done in other ways than plain system timing (e.g. timing at each network hop). Yet, for simplicity we ignore network complexities here.
Magics can do quite complex things, and they can work in a different fashion than plain Python.
Next we see some magics that accept arguments in a similar way to shell commands.
The following saves the first 10 lines of history to a file called hist.py.
Try to figure out how it works.
%save hist %history -n 1-10
The history shown above gives a hint of what is happening under the hood. A magic is invoked through a method of an IPython object, which exists within the IPython kernel session.
If things go wrong one can enable the debugger; Python ships with a built-in debugger: pdb.
Once enabled, the debugger will kick in whenever an uncaught exception is raised.
Describing what a debugger is and how to use one is far beyond our goals; moreover, a debugger is not necessary for the majority of data analytics tasks. Therefore, if you have never used a debugger before, feel free to ignore this section and also the exercises that involve debugging.
def answer(x):
    return x.question
%pdb on
answer(42)
When a code cell fails one can then re-run the cell after adding %pdb on to the cell.
Re-running a cell in this fashion allows one to find the offending code
and even change the values of variables and continue execution.
But what if a cell is very expensive to run,
one which may take several hours before the exception happens?
Another magic, %debug, is designed for exceptions
in expensive (long-running) cells.
%debug explicitly invokes the debugger.
In other words, after an unexpected exception occurs
one can invoke %debug by itself
in a new cell to start the debugger on the current trace.
Debugging a stack trace that has already executed
will not allow one to continue the code after correcting the
failure, but may allow one to find the problem and correct it
in the cell before re-running it.
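What remains available after a failure can be seen without an interactive debugger. The sketch below is a non-interactive illustration, not what %debug itself does: it catches the exception from the answer function used earlier, walks the traceback to the frame where the error occurred and reads that frame's local variables, which is the kind of information a post-mortem debugging session lets one browse.

```python
def answer(x):
    return x.question  # fails for objects without a `question` attribute


try:
    answer(42)
except AttributeError as error:
    message = str(error)
    tb = error.__traceback__  # keep the traceback for inspection after the failure

# Walk down to the innermost frame, i.e. the place where the exception was raised.
innermost = tb
while innermost.tb_next is not None:
    innermost = innermost.tb_next

frame = innermost.tb_frame
print("failed in function:", frame.f_code.co_name)
print("at line:", innermost.tb_lineno)
print("local variables:", frame.f_locals)
print("error message:", message)
```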
The debugger session in the notebook uses the input Python function,
although the function is revamped into a Jupyter interface.
You can use the input function directly.
Below we create a simple prompt.
food_can = input('Which brand of cat food did you buy today? ')
print('There is a', food_can, 'can in the fridge today.')
There are many more magics out there. Most magics are meant for interactive sessions inside the IPython command line, therefore such magics are not commonly used in notebooks. That said, a few libraries that we will see do use magics to perform some notebook operations.