ipython

Set of utility functions to be used in Jupyter and Jupyter Lab notebooks.

System and CLI


run_cli

 run_cli (cmd:str='ls -l')

Runs a CLI command from a Jupyter notebook and prints the shell output.

Uses subprocess.run to execute the command passed in cmd.

Parameter  Type  Default  Details
cmd        str   ls -l    command to execute in the cli

run_cli('pwd')
/home/vtec/projects/ec-packages/ecutilities/nbs-dev
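
As a rough illustration of the subprocess.run approach described above, a function of this kind could look like the following. This is a sketch, not the package's actual implementation: the name run_cli_sketch, the use of shlex.split, and the returned string are assumptions made for the example.

```python
import shlex
import subprocess


def run_cli_sketch(cmd: str = "ls -l") -> str:
    """Run `cmd` without a shell, print its output, and return it."""
    # Assumption: the command string is split into an argument list,
    # so no shell is involved and quoting is handled by shlex.
    result = subprocess.run(
        shlex.split(cmd),
        capture_output=True,  # collect stdout and stderr
        text=True,            # decode bytes to str
    )
    output = result.stdout + result.stderr
    print(output)
    return output  # hypothetical: returned only to make the sketch testable


out = run_cli_sketch("echo hello")
```

Splitting the string avoids shell=True, at the cost of not supporting pipes or redirections in cmd.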

Notebook setup


nb_setup

 nb_setup (autoreload:bool=True, paths:List[str|pathlib.Path]=None)

Use in the first cell of a notebook to set autoreload and add system paths.

Always adds a path to the src directory if a src directory exists at the same level as the nbs directory.

When the notebook is not located in a directory tree that includes nbs, the src directory is searched for at the same level as the directory containing the notebook.

Parameter   Type              Default  Details
autoreload  bool              True     True to set autoreload in this notebook
paths       List[str | Path]  None     Paths to add to sys.path

By default, ipython.nb_setup():

- loads and sets autoreload
- adds a path to a directory named src when it exists at the same level as the notebook directory. If no such src directory exists, no path is added.

ipython.nb_setup assumes the following file structure:

    project_directory
          |--- nbs
          |     | --- current_nb.ipynb
          |     | --- ...
          |
          |--- src
          |     | --- module_to_import.py
          |     | --- ...
          |
          |--- data
          |     |
          |     | ...

For other file structures, pass the paths to add as a list of str or Path.

Before running nb_setup, sys.path does not include the path to the local source directory. After running it, the path is added, unless the directory does not exist.

sys.path
['/home/vtec/projects/ec-packages/ecutilities/nbs-dev',
 '/home/vtec/miniconda3/envs/ecutils/lib/python310.zip',
 '/home/vtec/miniconda3/envs/ecutils/lib/python3.10',
 '/home/vtec/miniconda3/envs/ecutils/lib/python3.10/lib-dynload',
 '',
 '/home/vtec/miniconda3/envs/ecutils/lib/python3.10/site-packages',
 '/home/vtec/miniconda3/envs/ecutils/lib/python3.10/site-packages/PyQt5_sip-12.11.0-py3.10-linux-x86_64.egg',
 '/home/vtec/projects/ec-packages/ecutilities']
nb_setup(autoreload=False)
Added path: /home/vtec/projects/ec-packages/ecutilities/src
sys.path
['/home/vtec/projects/ec-packages/ecutilities/src',
 '/home/vtec/projects/ec-packages/ecutilities/nbs-dev',
 '/home/vtec/miniconda3/envs/ecutils/lib/python310.zip',
 '/home/vtec/miniconda3/envs/ecutils/lib/python3.10',
 '/home/vtec/miniconda3/envs/ecutils/lib/python3.10/lib-dynload',
 '',
 '/home/vtec/miniconda3/envs/ecutils/lib/python3.10/site-packages',
 '/home/vtec/miniconda3/envs/ecutils/lib/python3.10/site-packages/PyQt5_sip-12.11.0-py3.10-linux-x86_64.egg',
 '/home/vtec/projects/ec-packages/ecutilities']

We can also add other specific paths:

path_to_add = str(Path('../nbs').resolve().absolute())
nb_setup(autoreload=False, paths=[path_to_add])
Added path: /home/vtec/projects/ec-packages/ecutilities/nbs
sys.path
['/home/vtec/projects/ec-packages/ecutilities/src',
 '/home/vtec/projects/ec-packages/ecutilities/nbs',
 '/home/vtec/projects/ec-packages/ecutilities/nbs-dev',
 '/home/vtec/miniconda3/envs/ecutils/lib/python310.zip',
 '/home/vtec/miniconda3/envs/ecutils/lib/python3.10',
 '/home/vtec/miniconda3/envs/ecutils/lib/python3.10/lib-dynload',
 '',
 '/home/vtec/miniconda3/envs/ecutils/lib/python3.10/site-packages',
 '/home/vtec/miniconda3/envs/ecutils/lib/python3.10/site-packages/PyQt5_sip-12.11.0-py3.10-linux-x86_64.egg',
 '/home/vtec/projects/ec-packages/ecutilities']
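
The path handling described in this section can be approximated as follows. This is a minimal sketch under stated assumptions, not the code of nb_setup itself: the helper name add_src_to_sys_path is hypothetical, it only covers the simple case where src sits next to the notebook directory, and it does not touch autoreload.

```python
import sys
import tempfile
from pathlib import Path
from typing import Optional


def add_src_to_sys_path(nb_dir: Path) -> Optional[Path]:
    """Prepend the `src` directory next to `nb_dir` to sys.path, if it exists."""
    src = nb_dir.resolve().parent / "src"
    if not src.is_dir():
        return None  # no src directory: nothing is added
    if str(src) not in sys.path:
        sys.path.insert(0, str(src))
        print(f"Added path: {src}")
    return src


# Build a throwaway project with the nbs/src layout shown earlier and try it out.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    (project / "nbs").mkdir()
    (project / "src").mkdir()
    added = add_src_to_sys_path(project / "nbs")
    src_in_path = str(added) in sys.path
    sys.path.remove(str(added))  # undo the change made for the demo

    # A notebook directory with no sibling src: nothing should be added.
    lonely = project / "elsewhere" / "nbs"
    lonely.mkdir(parents=True)
    nothing_added = add_src_to_sys_path(lonely)
```

Inserting at position 0 gives local modules priority over installed packages of the same name, which matches the sys.path listings above where src appears first.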

install_code_on_cloud

 install_code_on_cloud (package_name:str, quiet:bool=False)

Pip installs the project code package when the notebook is running in the cloud.

Parameter     Type  Default  Details
package_name  str            project package name, e.g. metagentools or git+https://github.com/repo.git@main
quiet         bool  False    True to install quietly

When using Colab, Kaggle or another cloud VM, the project-specific code must be installed every time, either from the Python Package Index (PyPI) or from its GitHub repo.

When running locally, the project code should be pre-installed as part of the environment.

install_code_on_cloud(package_name='metagentools');
The notebook is running locally, will not automatically install project code
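
One way to picture the local versus cloud decision is sketched below. Everything here is an assumption for illustration: the helper names are hypothetical, and the two detection signals (the google.colab module being loaded, the KAGGLE_KERNEL_RUN_TYPE environment variable being set) are common heuristics, not necessarily what install_code_on_cloud checks. The sketch only builds the pip command and does not run it.

```python
import os
import sys


def running_on_cloud() -> bool:
    """Heuristic check for Colab or Kaggle (both signals are assumptions)."""
    on_colab = "google.colab" in sys.modules
    on_kaggle = "KAGGLE_KERNEL_RUN_TYPE" in os.environ
    return on_colab or on_kaggle


def pip_install_command(package_name: str, quiet: bool = False) -> list:
    """Build the argument list for installing `package_name` with pip."""
    # Using sys.executable targets the interpreter running the notebook.
    cmd = [sys.executable, "-m", "pip", "install"]
    if quiet:
        cmd.append("--quiet")
    cmd.append(package_name)
    return cmd


if running_on_cloud():
    print("Would run:", " ".join(pip_install_command("metagentools", quiet=True)))
else:
    print("Running locally, nothing to install")
```

The returned list could be passed to subprocess.run when an actual installation is wanted.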

Improve output cell formats


display_mds

 display_mds (*strings:str|tuple[str])

Display one or several strings rendered as markdown

display_mds('**bold** and _italic_')

bold and italic

display_mds('**bold** and _italic_',
            '- bullet',
            '- bullet',
            '> Note: this is a note'
)

bold and italic

  • bullet
  • bullet

Note: this is a note
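
For readers curious how such a helper might work, here is a hedged sketch built on IPython.display.Markdown. The function name display_mds_sketch, joining the strings with newlines, and the plain-print fallback when IPython is unavailable are all assumptions, not the package's actual behavior.

```python
def display_mds_sketch(*strings: str) -> str:
    """Render the given strings as one markdown block and return the joined text."""
    # Assumption: the strings are joined with newlines before rendering.
    text = "\n".join(strings)
    try:
        from IPython.display import Markdown, display
    except ImportError:
        print(text)  # fallback outside IPython: show the raw markdown
    else:
        display(Markdown(text))
    return text


joined = display_mds_sketch("**bold** and _italic_", "- bullet")
```

Outside a notebook, display only prints the object's representation, so the rendered output is visible only in Jupyter.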


display_dfs

 display_dfs (*dfs:pandas.core.frame.DataFrame)

Display one or several pd.DataFrame in a single cell output

df1 = pd.DataFrame(data=np.random.normal(size=(10,5)))
df2 = pd.DataFrame(data=np.random.normal(size=(20,10)))

display_dfs(df1.head(3), df2.head(3))
0 1 2 3 4
0 -0.404293 -1.959411 -0.136628 0.727151 -0.595115
1 2.404413 1.958637 -0.513681 -1.408594 0.950348
2 1.684596 -2.020807 0.395364 2.704780 -1.106685
0 1 2 3 4 5 6 7 8 9
0 -2.236748 -1.984643 1.505316 -0.778855 0.461898 -1.497143 -0.533513 0.450035 0.119140 -0.567164
1 1.016743 -0.181036 -0.593569 -1.842810 1.858056 0.480069 0.266183 1.229341 0.365643 -0.231381
2 -0.567689 1.072291 0.484437 -0.224742 0.624904 -1.132879 1.338664 -0.931461 -0.035472 -0.873919

pandas_nrows_ncols

 pandas_nrows_ncols (nrows:int|None=None, ncols:int|None=None)

Context manager that sets the max number of rows and cols to apply to any output within the context

Parameter  Type        Default  Details
nrows      int | None  None     max number of rows to show; show all rows if None
ncols      int | None  None     max number of columns to show; show all columns if None

Without the context manager, pandas objects are displayed with a maximum of 60 rows and 20 columns.

df = pd.DataFrame(np.random.randint(low=0, high=100, size=(3,50)))
display(df)
0 1 2 3 4 5 6 7 8 9 ... 40 41 42 43 44 45 46 47 48 49
0 78 57 39 53 91 46 7 0 83 92 ... 92 49 11 69 56 93 51 26 76 81
1 31 25 45 99 82 52 4 97 75 59 ... 19 12 78 55 11 75 97 78 97 64
2 38 91 62 45 95 97 97 35 46 24 ... 6 64 31 59 0 83 17 21 54 47

3 rows × 50 columns

Using the context manager with no arguments, all rows and columns are displayed.

with pandas_nrows_ncols():
    display(df)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49
0 78 57 39 53 91 46 7 0 83 92 0 30 68 6 14 26 28 94 1 15 83 64 78 19 26 91 64 66 26 66 38 24 61 32 70 19 97 69 97 2 92 49 11 69 56 93 51 26 76 81
1 31 25 45 99 82 52 4 97 75 59 43 63 0 23 86 25 62 76 44 31 1 68 3 60 64 93 91 13 90 96 96 32 74 74 53 29 11 45 48 76 19 12 78 55 11 75 97 78 97 64
2 38 91 62 45 95 97 97 35 46 24 89 17 95 89 57 85 18 68 94 91 40 77 66 95 98 19 65 82 89 45 75 18 35 60 65 53 7 37 96 74 6 64 31 59 0 83 17 21 54 47

It is also possible to set the number of rows and columns to display explicitly.

with pandas_nrows_ncols(nrows=2, ncols=6):
    display(df)
0 1 2 ... 47 48 49
0 78 57 39 ... 26 76 81
... ... ... ... ... ... ... ...
2 38 91 62 ... 21 54 47

3 rows × 50 columns

with pandas_nrows_ncols(2,6):
    print(df)
    0   1   2   ...  47  48  49
0   78  57  39  ...  26  76  81
..  ..  ..  ..  ...  ..  ..  ..
2   38  91  62  ...  21  54  47

[3 rows x 50 columns]

Technical background

The context manager uses the pandas options API.

pd.options.display.max_rows, pd.options.display.max_columns
(60, 20)
pd.get_option('display.max_rows'), pd.get_option('display.max_columns')
(60, 20)
pd.describe_option('display.max_rows')
display.max_rows : int
    If max_rows is exceeded, switch to truncate view. Depending on
    `large_repr`, objects are either centrally truncated or printed as
    a summary view. 'None' value means unlimited.

    In case python/IPython is running in a terminal and `large_repr`
    equals 'truncate' this can be set to 0 and pandas will auto-detect
    the height of the terminal and print a truncated object which fits
    the screen height. The IPython notebook, IPython qtconsole, or
    IDLE do not run in a terminal and hence it is not possible to do
    correct auto-detection.
    [default: 60] [currently: 60]
pd.options.display.max_rows = 10
pd.reset_option('display.max_rows')
pd.options.display.max_rows
60
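
The option calls above suggest how a context manager of this kind could be assembled. The sketch below relies on pd.option_context, which restores the previous values on exit; the name nrows_ncols_sketch is hypothetical and the real pandas_nrows_ncols may be implemented differently.

```python
from contextlib import contextmanager

import pandas as pd


@contextmanager
def nrows_ncols_sketch(nrows=None, ncols=None):
    """Temporarily set display.max_rows and display.max_columns."""
    # pd.option_context restores the previous values on exit,
    # even if the body of the with-block raises.
    with pd.option_context(
        "display.max_rows", nrows,
        "display.max_columns", ncols,
    ):
        yield


def current_limits():
    """Return the current (max_rows, max_columns) display options."""
    return pd.get_option("display.max_rows"), pd.get_option("display.max_columns")


before = current_limits()
with nrows_ncols_sketch(nrows=2, ncols=6):
    inside = current_limits()
after = current_limits()
print(before, inside, after)
```

Passing None for either value means no limit, consistent with the description of display.max_rows shown above.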

display_full_df

 display_full_df (df:pandas.core.frame.DataFrame|pandas.core.series.Series)

Display a pandas DataFrame or Series showing all rows and columns

Parameter  Type                      Details
df         pd.DataFrame | pd.Series  DataFrame or Series to display

df = pd.DataFrame(np.random.randint(low=0, high=100, size=(3,50)))
df
0 1 2 3 4 5 6 7 8 9 ... 40 41 42 43 44 45 46 47 48 49
0 11 70 3 84 10 33 54 77 5 70 ... 12 77 41 30 70 70 73 84 27 86
1 8 93 0 34 68 74 71 36 69 33 ... 58 98 42 91 66 30 18 66 51 89
2 24 75 2 94 63 88 49 83 91 14 ... 15 17 24 74 19 64 57 92 23 70

3 rows × 50 columns

display_full_df(df)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49
0 11 70 3 84 10 33 54 77 5 70 81 24 93 56 3 74 86 92 7 49 87 33 28 53 82 43 76 52 55 28 99 0 26 1 51 67 68 87 56 64 12 77 41 30 70 70 73 84 27 86
1 8 93 0 34 68 74 71 36 69 33 79 75 78 42 72 32 50 60 91 0 86 26 13 37 23 48 74 38 52 77 71 30 3 48 22 46 2 92 38 48 58 98 42 91 66 30 18 66 51 89
2 24 75 2 94 63 88 49 83 91 14 16 65 65 16 37 92 67 90 47 40 39 50 7 61 93 18 43 86 6 25 39 39 91 75 43 13 11 88 6 91 15 17 24 74 19 64 57 92 23 70
msg = 'should raise a TypeError'
contains = 'df must me a pandas `DataFrame` or `Series`'

test_fail(display_full_df, args=['a string'], msg=msg, contains=contains)
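
The type check exercised by the test above can be pictured with the following sketch. It is only an approximation under assumptions: the name full_repr and the wording of the error message are hypothetical, and it returns the text representation instead of displaying it.

```python
import pandas as pd


def full_repr(obj) -> str:
    """Return the untruncated text representation of a DataFrame or Series."""
    if not isinstance(obj, (pd.DataFrame, pd.Series)):
        # Hypothetical message, not the one used by display_full_df.
        raise TypeError("expected a pandas DataFrame or Series")
    with pd.option_context("display.max_rows", None, "display.max_columns", None):
        return repr(obj)


wide = pd.DataFrame([[0] * 50 for _ in range(3)])
text = full_repr(wide)

try:
    full_repr("a string")
    raised = False
except TypeError:
    raised = True
```

Checking the type up front gives a clear error instead of silently printing an arbitrary object.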