- how to install new packages
- what are environments and why we need them
- the environment available on our JupyterHub
- how to customize your environment
- documenting your environment in a Jupyter notebook
Common tools used for Python package installation:
We recommend that you stick with one package manager to avoid conflicts.
Because mamba is already installed on the JupyterHub, it is the obvious choice for OpenReproLab.
An example:
Sometimes one application needs a particular version of a package but a different application needs another version. Since the requirements conflict, installing either version will leave one application unable to run. This situation can be resolved by using virtual environments. A virtual environment is a semi-isolated Python environment that allows packages to be installed for use by a particular application or for a particular project.
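To make the conflict concrete, here is a minimal Python sketch. The two applications and their pinned versions are hypothetical, chosen only for illustration. The function reports which packages two requirement sets pin to different versions, which is exactly the situation that separate environments resolve.

```python
def find_conflicts(requirements_a, requirements_b):
    """Return packages that both applications pin, but to different versions."""
    shared = requirements_a.keys() & requirements_b.keys()
    return {
        name: (requirements_a[name], requirements_b[name])
        for name in sorted(shared)
        if requirements_a[name] != requirements_b[name]
    }


# Hypothetical pins for two applications (illustrative values only).
app_a = {"numpy": "1.25.0", "pandas": "1.5.3"}
app_b = {"numpy": "1.21.6", "pandas": "1.5.3"}

conflicts = find_conflicts(app_a, app_b)
print(conflicts)  # {'numpy': ('1.25.0', '1.21.6')}
```

With one environment per application, each set of pins can be installed without touching the other.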
- fresh environment creation:
  mamba create -n nameofmyenv <list of packages>
- cloning an existing environment (for example, the base environment on the JupyterHub):
  conda create --name myclone --clone myenv
- creation from an environment file:
  conda env create -f environment.yml
with environment.yml:
name: atlaslice
channels:
  - conda-forge
  - pypi
  - defaults
dependencies:
  - python=3.11
  - xarray=2023.6.0
  - numpy=1.25.0
  - scipy=1.13.0
  - numba
  - pandas=1.5.3
  - matplotlib=3.7.1
  - netcdf4=1.6.2
  - jupyter=1.0.0
  - ipython=8.12.0
  - ipykernel=6.19.2
  - ffmpeg=4.2.2
  - dask=2023.6.0
  - cmocean=3.0.3
  - cartopy=0.21.1

Managing software environments can sometimes become complex and confusing. To avoid recurring problems, use a single dependency manager. This applies to all languages!
So don't mix pip commands with conda or pipx commands. The only exception is the pair of conda and mamba commands, which are interchangeable. Think of them as a single command; mamba is simply faster.
Important
Here we'll use Mamba with the mamba command.
Mamba (or similarly Conda) is a very complete package manager, allowing you to install not only Python packages, but also system packages and other languages such as R. That's why, in practice, Mamba is enough to handle any situation you may encounter.
By default, a Mamba environment is installed in your online JupyterLab. This is a Pangeo environment known simply as "notebook". Pangeo is a community of geoscientists who have listed the most commonly used packages in Python.
mamba list # To obtain a list of the packages in the environment you're using
Every terminal you open in JupyterLab automatically loads this "notebook" environment by default (this behavior may change if you tinker). So if you're lost, just relaunch your terminal.
mamba info # To learn more about your currently loaded environment
When you open a notebook via the JupyterLab launcher, there is a choice available: the "Python 3 (ipykernel)" kernel (see below).
A "kernel" is a Mamba environment ready to be used in a notebook. Here, the "Python 3 (ipykernel)" kernel corresponds to the Pangeo "notebook" environment.
Warning
Each terminal can load a different Mamba environment and a notebook kernel is not linked to any terminal environment. These are different contexts which must be configured separately.
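Because a kernel and a terminal are separate contexts, it can help to ask the kernel itself which interpreter it runs. The following is a minimal sketch to paste into a notebook cell; it uses only the standard library, and the function name is hypothetical. Note that the CONDA_DEFAULT_ENV variable is set by shell activation, so a kernel may legitimately report it as None.

```python
import os
import sys


def describe_kernel_environment():
    """Collect basic facts about the interpreter running this code."""
    return {
        # Path of the Python binary that executes the notebook cells.
        "python_executable": sys.executable,
        # Root directory of the environment that binary belongs to.
        "environment_prefix": sys.prefix,
        # Set when an environment is activated in a shell; may be absent in a kernel.
        "conda_default_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


for key, value in describe_kernel_environment().items():
    print(f"{key}: {value}")
```

Comparing the printed prefix with the output of mamba info in a terminal shows whether the two contexts point at the same environment.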
This default environment may become insufficient. You may need a specific Python library to manipulate your data, or a system package to complement your Bash script.
In your online JupyterLab, the default Pangeo "notebook" environment cannot be persistently modified: any package you install into it will disappear the next time you reboot. So you need a new Mamba environment.
mamba create -n my-env-name # To create a new environment that will persist across your future sessions
It is also possible to specify the Python version, which is useful for compatibility with older software that relies on an older version of Python.
mamba create -n my-env-name python=3.8 # To create a new environment with a specific version of Python, here 3.8
You then need to "activate" this new Mamba environment to be able to use its packages. Activation is not automatic when a new terminal is opened, so you need to repeat it in each new terminal where you want to use the environment.
conda activate my-env-name # To activate another environment
Warning
Note that the command conda activate ... is used instead of mamba activate ... here, on purpose. If you run mamba activate ..., the terminal will first ask you to run mamba init. However, mamba init permanently alters the behavior of JupyterLab, so I don't recommend it. This is the only exception to the use of mamba and is specific to OpenReproLab.
This "my-env-name" environment is personal and contains almost no packages. It cannot become a notebook kernel because it lacks the essential "ipykernel" package.
mamba install ipykernel # To install the package in the currently activated environment
You can then register the environment as a kernel and give it a display name, such as "My first Env":
python -m ipykernel install --name my-env-name --user --display-name "My first Env"
After a few minutes at most, you should be able to use this environment as a kernel to run your notebooks.
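If the kernel does not show up, one way to check the registration is to look for its kernel.json file. This is a minimal sketch under an assumption: on Linux, kernels installed with --user are normally written under ~/.local/share/jupyter/kernels, unless JUPYTER_DATA_DIR or XDG_DATA_HOME point elsewhere. The function name is hypothetical.

```python
import json
from pathlib import Path

# Assumption: default Linux location for kernels installed with `--user`.
DEFAULT_USER_KERNELS_DIR = Path.home() / ".local" / "share" / "jupyter" / "kernels"


def read_kernel_display_name(kernel_name, kernels_dir=DEFAULT_USER_KERNELS_DIR):
    """Return the display name stored in kernel.json, or None if not registered."""
    spec_file = Path(kernels_dir) / kernel_name / "kernel.json"
    if not spec_file.is_file():
        return None
    with spec_file.open(encoding="utf-8") as handle:
        return json.load(handle).get("display_name")


# Prints the display name if the kernel is registered in the assumed location, else None.
print(read_kernel_display_name("my-env-name"))
```

A result of None only means that nothing was found in the assumed directory; the kernel may still be registered elsewhere.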
Don't forget that this environment is almost empty. You may need to install all the necessary libraries again, such as "xarray" or "cartopy".
Once again, don't mix python -m pip, pip, or pipx commands with mamba or conda commands. There's a very good chance that what you're looking to install exists on conda-forge, the community-maintained package channel used by Conda and Mamba.
Note
Example 1: you need a Python library whose documentation recommends using pip install lib-1? There's a very good chance that mamba install lib-1 will install the same package, or perhaps mamba install lib_1 - check it out on the Web!
Note
Example 2: You need the curl system command? Then mamba install curl will let you use it in a terminal! No need to run apt install curl.
You can document your environment by hand with Markdown syntax in a cell.
The watermark library allows you to do it automatically.
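As a middle ground between the two approaches, the following sketch builds a Markdown table of versions using only the standard library. It is not watermark's API; the function name and the package list are illustrative assumptions, and the printed text can be pasted into a Markdown cell.

```python
import platform
from importlib import metadata


def environment_report(packages):
    """Build a Markdown table listing Python and the requested package versions."""
    lines = [
        "| Package | Version |",
        "| --- | --- |",
        f"| python | {platform.python_version()} |",
    ]
    for name in packages:
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in the environment running this code.
            version = "not installed"
        lines.append(f"| {name} | {version} |")
    return "\n".join(lines)


# Package names are examples; list the ones your notebook actually imports.
print(environment_report(["xarray", "numpy", "cartopy"]))
```

The watermark extension produces a comparable summary directly from a notebook cell, typically with %load_ext watermark followed by a call such as %watermark -v -p xarray,numpy,cartopy.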





