diff --git a/README.md b/README.md index ba10a91..86218d3 100644 --- a/README.md +++ b/README.md @@ -3,224 +3,267 @@ -**TL;DR**: `%%testcell` prevents your testing cells from affecting the -global namespace. +One good thing about working in Jupyter notebooks is that they make it +easy to quickly test a bit of code by evaluating it in a notebook cell. +But one bad thing is that the *definitions* resulting from that +evaluation hang around afterwards, when all you wanted was just to test +that one bit of code. -The Python cell magic `%%testcell` executes a cell without *polluting* -the notebook’s global namespace. This is useful whenever you want to -test your code without having any of the local variables escape that -cell. +`%%testcell` is a simple solution to that problem. It lets you +execute notebook cells in isolation. Test code, try snippets, and +experiment freely: no variables, functions, classes, or imports are left +behind. This helps to keep your namespace clean, so that leftover +symbols do not confuse later work. -What’s happening under the hood is that your cell code, before being -executed, is wrapped in a temporary function that will be deleted after -execution. To give you the feeling of *seamless integration* the last -statement is optionally returned like it happens in a normal cell. - -**WARNING:** this don’t protect you from *the side effects of your code* -like deleting a file or mutating the state of a global variable. +**WARNING:** this doesn’t protect you from *the side effects of your +code* like deleting a file or mutating the state of a global variable. 
[![](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/artste/testcell/blob/main/demo/testcell_demo.ipynb) - [![](https://kaggle.com/static/images/open-in-kaggle.svg)](https://www.kaggle.com/artste/introducing-testcell) +[![](https://modal-cdn.com/open-in-modal.svg)](https://modal.com/notebooks/new/https://github.com/artste/testcell/blob/main/demo/testcell_demo.ipynb) ## Install +Lightweight and reliable: `%%testcell` depends only on IPython and works +in all major notebook environments including Jupyter, Colab, Kaggle, +Modal, Solveit, and the IPython console. + ``` sh pip install testcell ``` -## How to use +## Quick Start + +First import `testcell`: + +``` python +import testcell +``` -just import it with `import testcell` and then use the `%%testcell` cell -magic. +Then use it: ``` python %%testcell -a = "'a' is not polluting global scope" -a +temp_var = "This won't pollute namespace" +temp_var ``` - "'a' is not polluting global scope" + "This won't pollute namespace" ``` python -assert 'a' not in locals() +# temp_var doesn't exist — it was only defined inside the test cell +temp_var # NameError: name 'temp_var' is not defined ``` -What is happening under the hood is that `%%testcell` wraps your cell’s -code with a function, execute it and then deletes it. Adding the -`verbose` keywork will print which code will be executed. +## How it works -NOTE: The actual cell code is enclosed within `BEGIN` and `END` comment -blocks for improved readability. +Import `testcell` and use the `%%testcell` magic at the top of any cell. +Under the hood, your code is wrapped in a temporary function that +executes and then deletes itself. 
+ +Use `verbose` to see the generated wrapper code: ``` python %%testcell verbose -a = "'a' is not polluting global scope" -a +result = "isolated execution" +result ``` - "'a' is not polluting global scope" - -If you’re just interested in seeing what will be executed, but actually -not executing it, you ca use `dryrun` option: - ``` python -%%testcell dryrun -a = "'a' is not polluting global scope" -a +### BEGIN +def _test_cell_(): + #| echo: false + result = "isolated execution" + return result # %%testcell +try: + _ = _test_cell_() +finally: + del _test_cell_ +_ # This will be added to global scope +### END ``` -If you add a semicolon `;` at the end of your last statement no `return` -statement is added and nothing is displayed like a normal jupyter cell. + 'isolated execution' + +## Suppressing output + +Like normal Jupyter cells, add a semicolon `;` to the last statement to +suppress display: ``` python -%%testcell verbose -a = "'a' is not polluting global scope" -a; +%%testcell +calculation = 42 * 2 +calculation; ``` -`testcell` works seamlessly with existing `print` or `display`statements -on last line: +No output is displayed, and `calculation` still doesn’t leak to globals. + +## Skip execution + +Skip cells without deleting code using the `skip` command. + +**IMPORTANT**: This is especially useful in notebook environments like +**Colab, Kaggle, and Modal** where you can’t use Jupyter’s “Raw cell” +type to disable execution. 
``` python -%%testcell verbose -a = "'a' is not polluting global scope" -print(a) +%%testcell skip +raise ValueError('This will not execute') ``` - 'a' is not polluting global scope + -Moreover, thanks to `ast`, it properly deals with complex situations -like comments on the last line and multi lines statements +To skip all `%%testcell` cells at once (useful for production runs), +use: `testcell.global_skip = True` + +## Visual marking + +Use `banner` to display a colored indicator at the top of cell output, +making test cells instantly recognizable: ``` python -%%testcell verbose -a = "'a' is not polluting global scope" -(a, - True) -# this is a comment on last line +%%testcell banner +"clearly marked" ``` - ("'a' is not polluting global scope", True) + + + 'clearly marked' + +**The banner adapts to your environment.** In HTML-based notebooks like +Jupyter, it displays as a full-width colored box. In console +environments like IPython, it appears as text with an emoji. -### Skip execution +Colors and emojis are fully customizable through `testcell.params`. -It is possible to skip a cell execution using `skip` command. This is -usueful when you want to keep around the code but don’t actually run it. -It’s also possible to skip **all cells markked with `%%testcell`** using -the following syntax: `testcell.global_skip=True`. +**IMPORTANT**: To enable banners for all `%%testcell` cells, use: +**`testcell.global_use_banner = True`** -
- This cell has been skipped -
- +## Run in complete isolation -### Run in isolation +`%%testcelln` is a shortcut for `%%testcell noglobals` and executes +cells with **zero access** to your notebook’s global scope. Only +Python’s `__builtins__` are available. -`%%testcelln` is a shortcut for `%%testcell noglobals` and executes the -cell in complete isolation from the global scope. This is very useful -when you want to ensure that global variables or namespaces are not -accessible within the cell. +This is powerful for: - **Detecting hidden dependencies**: catch when +your code accidentally relies on global variables - **Testing +portability**: verify functions work standalone - **Clean slate +execution**: run code exactly as it would in a fresh Python session ``` python -aaa = 'global variable' +my_global = "I'm in the global scope" ``` ``` python %%testcell -'aaa' in globals() +'my_global' in globals() ``` - True - ``` python -%%testcell noglobals -'aaa' in globals() + True # my_global is available ``` - False - ``` python %%testcelln -'aaa' in globals() +'my_global' in globals() ``` - False +``` python + False # my_global is NOT available +``` ``` python %%testcelln globals().keys() ``` +``` python dict_keys(['__builtins__']) +``` -With `%%testcelln` inside the cell, you’ll be able to access only to -`__builtins__` (aka: standard python’s functions). **It behaves like a -notebook-in-notebook**. - -``` python -%%testcell -def my_function(x): - print(aaa) # global variable - return x +## Explicit dependencies -try: - my_function(123) -except Exception as e: - print(e) -``` +The `(inputs)->(outputs)` syntax gives you precise control: you can pass +any symbol (variables, functions, classes) into the isolated context and +save only chosen ones back to globals. - global variable +This **forces explicit dependency declaration**, giving you full control +over what enters and exits the cell. It prevents accidental reliance on +symbols from the main context that would hurt you when exporting the +code. 
``` python -%%testcelln -def my_function(x): - print(aaa) # global variable - return x - -try: - my_function(123) -except Exception as e: - print(e) +data = [1, 2, 3, 4, 5] ``` - name 'aaa' is not defined +``` python +%%testcelln (data)->(calculate_stats) +# Only 'data' is available, only 'calculate_stats' is saved + +def calculate_stats(values): + return { + 'mean': sum(values) / len(values), + 'min': min(values), + 'max': max(values) + } + +# Test it works +print(calculate_stats(data)) +``` -As you can see from this last example, `%%testcelln` helps you to -identify that `my_function` refers global variable `aaa`. + {'mean': 3.0, 'min': 1, 'max': 5} -**IMPORTANT**: this is *just wrapping your cell* and so it’s still -running on your main kernel. If you modify variables that has been -created outside of this cell (aka: if you have side effects) this will -not protect you. +`calculate_stats` now **exists in globals**. No test code or +intermediate variables leaked. ``` python -aaa +calculate_stats([10, 20, 30]) ``` - 'global variable' + {'mean': 20.0, 'min': 10, 'max': 30} -``` python -%%testcell -# WARNING: this will alter the state of global variable: -globals().update({'aaa' : 'modified global variable'}); -``` +## Advanced parsing + +Thanks to Python’s `ast` module, `%%testcell` correctly handles complex +code patterns including comments on the last line and multi-line +statements: ``` python -aaa +%%testcell verbose +result = "complex parsing" +(result, + True) +# comment on last line ``` - 'modified global variable' - ``` python -del aaa +### BEGIN +def _test_cell_(): + #| echo: false + result = "complex parsing" + return (result, + True) # %%testcell +try: + _ = _test_cell_() +finally: + del _test_cell_ +_ # This will be added to global scope +### END ``` + ('complex parsing', True) + ## Links: - PROJECT PAGE: - DOCUMENTATION: - PYPI: +- DETAILED DEMO: + +- USE CASE ZOO: + +- LAUNCHING BLOG: [Introducing + 
`%%testcell`](https://artste.github.io/blog/posts/introducing-testcell) - COLAB DEMO: [testcell_demo.ipynb](https://colab.research.google.com/github/artste/testcell/blob/main/demo/testcell_demo.ipynb) - KAGGLE SAMPLE NOTEBOOK: diff --git a/demo/testcell_demo.ipynb b/demo/testcell_demo.ipynb index d34700b..d82ebb1 100644 --- a/demo/testcell_demo.ipynb +++ b/demo/testcell_demo.ipynb @@ -20,7 +20,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 1, "id": "a9d36534", "metadata": {}, "outputs": [], @@ -30,7 +30,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 2, "id": "574fdb14", "metadata": {}, "outputs": [], @@ -50,7 +50,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 3, "id": "c4de6797", "metadata": {}, "outputs": [], @@ -65,10 +65,21 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 4, "id": "545b0118", "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/plain": [ + "dict_keys(['__name__', '__doc__', '__package__', '__loader__', '__spec__', '__builtin__', '__builtins__', '_ih', '_oh', '_dh', 'In', 'Out', 'get_ipython', 'exit', 'quit', 'open', '_', '__', '___', '__session__', '_i', '_ii', '_iii', '_i1', '_i2', 'testcell', '_i3', 'os', 'sample_variable', 'sample_function', '_i4'])" + ] + }, + "execution_count": 4, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "globals().keys()" ] @@ -85,10 +96,21 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 5, "id": "30027b0f", "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/plain": [ + "\"'a' is not polluting global scope\"" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "%%testcell\n", "a = \"'a' is not polluting global scope\"\n", @@ -97,7 +119,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 6, "id": "a135198e", "metadata": {}, 
"outputs": [], @@ -108,10 +130,21 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 7, "id": "3f369a4a", "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/plain": [ + "\"'a' is not polluting global scope\"" + ] + }, + "execution_count": 7, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "%%testcell\n", "a = \"'a' is not polluting global scope\"; a # it works as inline too even if there is a comment" @@ -119,7 +152,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 8, "id": "2ea4f41f", "metadata": {}, "outputs": [], @@ -138,10 +171,79 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 9, "id": "1e70eb60", "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/html": [ + "\n", + "\t \n", + "\t SVG Logo\n", + "\t Designed for the SVG Logo Contest in 2006 by Harvey Rayner, and adopted by W3C in 2009. It is available under the Creative Commons license for those who have an SVG product or who are using SVG on their site.\n", + "\t \n", + "\t \n", + "\t \n", + "\t \n", + "\t SVG Logo\n", + "\t 14-08-2009\n", + "\t \n", + "\t W3C\n", + "\t Harvey Rayner, designer\n", + "\t \n", + "\t See document description\n", + "\t \n", + "\t image/svg+xml\n", + "\t \n", + "\t \n", + "\t \n", + "\t \n", + "\t \n", + "\t \n", + "\t \n", + "\t \n", + "\t \n", + "\t \n", + "\t \n", + "\t \n", + "\t \n", + "\t \n", + "\t \n", + "\t \n", + "\t \n", + "\t \n", + "\t \n", + "\t \n", + "\t \n", + "\t \n", + "\t \n", + "\t \n", + "\t \n", + "\t \n", + "\t \n", + "\t \n", + "\t \n", + "\t \n", + "\t \n", + "\t \n", + "\t \n", + "\t \n", + "\t \n", + "\t \n", + "\t \n", + "\t \n", + "\t " + ], + "text/plain": [ + "" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "%%testcell\n", "from IPython.display import SVG,HTML\n", @@ -225,10 +327,53 @@ }, { "cell_type": "code", - "execution_count": null, + 
"execution_count": 10, "id": "9e576724", "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/markdown": [ + "```python\n", + "### BEGIN\n", + "def _test_cell_():\n", + "\tb = 123\n", + "\treturn b # %%testcell\n", + "try:\n", + "\t_ = _test_cell_()\n", + "finally:\n", + "\tdel _test_cell_\n", + "_ # This will be added to global scope\n", + "### END\n", + "```" + ], + "text/plain": [ + "### BEGIN\n", + "def _test_cell_():\n", + "\tb = 123\n", + "\treturn b # %%testcell\n", + "try:\n", + "\t_ = _test_cell_()\n", + "finally:\n", + "\tdel _test_cell_\n", + "_ # This will be added to global scope\n", + "### END" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/plain": [ + "123" + ] + }, + "execution_count": 10, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "%%testcell verbose\n", "b = 123\n", @@ -247,10 +392,43 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 11, "id": "3065d361", "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/markdown": [ + "```python\n", + "### BEGIN\n", + "def _test_cell_():\n", + "\tb = 123\n", + "\treturn print('You should not see this message on the output') # %%testcell\n", + "try:\n", + "\t_ = _test_cell_()\n", + "finally:\n", + "\tdel _test_cell_\n", + "if _ is not None: display(_)\n", + "### END\n", + "```" + ], + "text/plain": [ + "### BEGIN\n", + "def _test_cell_():\n", + "\tb = 123\n", + "\treturn print('You should not see this message on the output') # %%testcell\n", + "try:\n", + "\t_ = _test_cell_()\n", + "finally:\n", + "\tdel _test_cell_\n", + "if _ is not None: display(_)\n", + "### END" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], "source": [ "%%testcell dryrun\n", "b = 123\n", @@ -270,10 +448,20 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 12, "id": "19ab37cc", "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/plain": [ 
+ "123" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], "source": [ "%%testcell noreturn\n", "b = 123\n", @@ -293,10 +481,18 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 13, "id": "b0a2abfd", "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "dict_keys(['__builtins__'])\n" + ] + } + ], "source": [ "%%testcelln\n", "assert list(globals().keys()) == ['__builtins__']\n", @@ -316,10 +512,27 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 14, "id": "c8fe2586-4dec-48f3-a486-5e7b6d46d10b", "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/html": [ + "\n", + "
\n", + " This cell has been skipped\n", + "
\n", + " " + ], + "text/plain": [ + "ℹ️ This cell has been skipped" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], "source": [ "%%testcell skip\n", "raise ValueError('This should not be executed')" @@ -337,10 +550,37 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 15, "id": "cedecb62-35d2-4cdd-ac28-2473e0093d8e", "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/html": [ + "\n", + "
\n", + " testcell\n", + "
\n", + " " + ], + "text/plain": [ + "🟡 testcell" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/plain": [ + "'This is a banner for regular %%testcell cell'" + ] + }, + "execution_count": 15, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "%%testcell banner\n", "'This is a banner for regular %%testcell cell'" @@ -348,15 +588,210 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 16, "id": "c9584772-e407-415e-a6bc-24ee44c66455", "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/html": [ + "\n", + "
\n", + " testcell noglobals\n", + "
\n", + " " + ], + "text/plain": [ + "🟢 testcell noglobals" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/plain": [ + "'This is a banner for %%testcell noglobals cell'" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], "source": [ "%%testcelln banner\n", "'This is a banner for %%testcell noglobals cell'" ] }, + { + "cell_type": "markdown", + "id": "451f0524-3e5e-4881-85b2-bdd7ea48c2a3", + "metadata": {}, + "source": [ + "### `%%testcelln` (sample_variable)->(new_func)\n", + "This is the most advanced use case and let you execute code on an isolated context, but adding access to `sample_variable` and returning to the global context the function `new_func`.\n", + "Additionaly we're using the \"debug\" option that shows step by step what's happening under the hood." + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "id": "24c7c686-7a9a-458c-b486-ba353c3c4b52", + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "```python\n", + "### BEGIN\n", + "def _test_cell_():\n", + "\tglobal new_func\n", + "\tdef new_func(): return sample_variable\n", + "\t\n", + "\treturn new_func() # %%testcell\n", + "try:\n", + "\t_ = _test_cell_()\n", + "finally:\n", + "\tdel _test_cell_\n", + "if _ is not None: display(_)\n", + "### END\n", + "```" + ], + "text/plain": [ + "### BEGIN\n", + "def _test_cell_():\n", + "\tglobal new_func\n", + "\tdef new_func(): return sample_variable\n", + "\t\n", + "\treturn new_func() # %%testcell\n", + "try:\n", + "\t_ = _test_cell_()\n", + "finally:\n", + "\tdel _test_cell_\n", + "if _ is not None: display(_)\n", + "### END" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/plain": [ + "'this is a sample variable'" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/markdown": [ + "```python\n", + "### GLOBALS UPDATE CODE:\n", + "global new_func; new_func=locals()[\"new_func\"]\n", + "###\n", + 
"```" + ], + "text/plain": [ + "### GLOBALS UPDATE CODE:\n", + "global new_func; new_func=locals()[\"new_func\"]\n", + "###" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "%%testcelln (sample_variable)->(new_func) debug\n", + "def new_func(): return sample_variable\n", + "\n", + "new_func()" + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "id": "12387ed0-89c9-45dd-9d2f-ab7c0673ceef", + "metadata": {}, + "outputs": [], + "source": [ + "assert new_func()==sample_variable\n", + "del new_func #cleanup state" + ] + }, + { + "cell_type": "markdown", + "id": "08394121-0f6e-4b73-b9a6-4e6a43dc4859", + "metadata": {}, + "source": [ + "This is an exmaple of **input only**" + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "id": "634d2155-545b-4aae-957e-11c3491d5e6d", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "'this is a sample variable'" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "%%testcelln (sample_variable)\n", + "def new_func(): return sample_variable\n", + "new_func()" + ] + }, + { + "cell_type": "markdown", + "id": "581c1dc9-db12-42b6-a6ca-233cda556e1b", + "metadata": {}, + "source": [ + "This is an exmaple of **output only**" + ] + }, + { + "cell_type": "code", + "execution_count": 20, + "id": "fb8ba9cb-6082-4071-89e5-9d8dbdf118fb", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "123" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "%%testcelln ->(new_func)\n", + "def new_func(): return 123\n", + "new_func()" + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "id": "ddc07336-e208-440f-969d-b88468a3da48", + "metadata": {}, + "outputs": [], + "source": [ + "assert new_func()==123\n", + "del new_func #cleanup state" + ] + }, { "cell_type": "markdown", "id": "05290ea3", diff --git a/demo/testcell_zoo.ipynb b/demo/testcell_zoo.ipynb new file mode 100644 index 
0000000..6f2f7e4 --- /dev/null +++ b/demo/testcell_zoo.ipynb @@ -0,0 +1,2539 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "f5b071d3-d836-4128-9596-ce3f2f4d21c9", + "metadata": {}, + "source": [ + "# Testcell Zoo" + ] + }, + { + "cell_type": "markdown", + "id": "40eb3af8-1dec-428d-8e22-40374f63245a", + "metadata": {}, + "source": [ + "## Introduction\n", + "\n", + "This notebook demonstrates practical use cases for `%%testcell`. Each example is self-contained and can be run independently after loading the fixtures." + ] + }, + { + "cell_type": "markdown", + "id": "d75a2d0b", + "metadata": {}, + "source": [ + "### install\n", + "\n", + "First of all, install and import `testcell`." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "92505d86-5f42-4f5f-8d3c-6008b6b113be", + "metadata": {}, + "outputs": [], + "source": [ + "!pip install testcell" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "db3d55aa-237c-4033-8ace-621202fcfa4d", + "metadata": {}, + "outputs": [], + "source": [ + "# Install dependecies used in this notebook\n", + "!pip install numpy pandas matplotlib" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "574fdb14", + "metadata": {}, + "outputs": [], + "source": [ + "import testcell" + ] + }, + { + "cell_type": "markdown", + "id": "a5cbdc78-47e9-470c-9187-f00c9c229527", + "metadata": {}, + "source": [ + "### Fixtures\n", + "\n", + "Common objects used throughout the examples:" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "9818ce17-f533-4fa3-a442-68fcf4908b2c", + "metadata": {}, + "outputs": [], + "source": [ + "# Simple variable\n", + "global_variable = \"I'm a global variable\"\n", + "\n", + "# Simple list\n", + "global_list = [1, 2, 3, 5, 7, 11, 13]\n", + "\n", + "# Simple function\n", + "def global_function(n):\n", + " return n*2\n", + "\n", + "# Simple class\n", + "class GlobalClass:\n", + " def one(self):\n", + " return 37\n", + " \n", + " def two(self):\n", 
+ " return \"method of global class\"" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "181a3890-f9df-44b7-afcb-db906d85d800", + "metadata": {}, + "outputs": [], + "source": [ + "# Fixing random state for reproducibility\n", + "import numpy as np\n", + "np.random.seed(123)" + ] + }, + { + "cell_type": "markdown", + "id": "574ff0f3-d605-4c0d-8ed4-463e0adfc3d3", + "metadata": {}, + "source": [ + "## Documentation\n", + "\n", + "Individual options and their behavior:" + ] + }, + { + "cell_type": "markdown", + "id": "f9579085-fa36-4f1d-a4b8-91aa4c50a209", + "metadata": {}, + "source": [ + "### option: `verbose`\n", + "Shows the generated wrapper function before execution." + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "1849ad22-85f6-4f7d-9ab2-25c847bfcf1f", + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "```python\n", + "### BEGIN\n", + "def _test_cell_():\n", + "\t\n", + "\treturn global_function(3) * 2 # %%testcell\n", + "try:\n", + "\t_ = _test_cell_()\n", + "finally:\n", + "\tdel _test_cell_\n", + "_ # This will be added to global scope\n", + "### END\n", + "```" + ], + "text/plain": [ + "### BEGIN\n", + "def _test_cell_():\n", + "\t\n", + "\treturn global_function(3) * 2 # %%testcell\n", + "try:\n", + "\t_ = _test_cell_()\n", + "finally:\n", + "\tdel _test_cell_\n", + "_ # This will be added to global scope\n", + "### END" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/plain": [ + "12" + ] + }, + "execution_count": 6, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "%%testcell verbose\n", + "global_function(3) * 2" + ] + }, + { + "cell_type": "markdown", + "id": "49c32487-bbf7-4d6a-bc57-50597809b3fe", + "metadata": {}, + "source": [ + "### option `dryrun`\n", + "Shows the wrapper code without executing it." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "e96b57e1-31f9-4af6-b673-fc8387e32007", + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "```python\n", + "### BEGIN\n", + "def _test_cell_():\n", + "\t\n", + "\treturn global_function(3) * 2 # %%testcell\n", + "try:\n", + "\t_ = _test_cell_()\n", + "finally:\n", + "\tdel _test_cell_\n", + "if _ is not None: display(_)\n", + "### END\n", + "```" + ], + "text/plain": [ + "### BEGIN\n", + "def _test_cell_():\n", + "\t\n", + "\treturn global_function(3) * 2 # %%testcell\n", + "try:\n", + "\t_ = _test_cell_()\n", + "finally:\n", + "\tdel _test_cell_\n", + "if _ is not None: display(_)\n", + "### END" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "%%testcell dryrun\n", + "global_function(3) * 2" + ] + }, + { + "cell_type": "markdown", + "id": "bad62496-6d10-4627-86af-2abc3ba9a752", + "metadata": {}, + "source": [ + "### option `skip`\n", + "Skips cell execution entirely." + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "id": "ff0bb19d-d263-46da-9176-d2abee049f8f", + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "\n", + "
\n", + " This cell has been skipped\n", + "
\n", + " " + ], + "text/plain": [ + "ℹ️ This cell has been skipped" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "%%testcell skip\n", + "raise ValueError(\"This won't execute\")" + ] + }, + { + "cell_type": "markdown", + "id": "762017da-2a77-458c-bb94-843239baafaf", + "metadata": {}, + "source": [ + "### option `banner`\n", + "Displays a visual marker at the top of output." + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "id": "9357a4b7-d2fa-455e-b0dd-61efc76c5b4f", + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "\n", + "
\n", + " testcell\n", + "
\n", + " " + ], + "text/plain": [ + "🟡 testcell" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/plain": [ + "6" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "%%testcell banner\n", + "global_function(3)" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "id": "2927380d-4dc1-487f-90d4-ae0ebdff18dc", + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "\n", + "
\n", + " testcell noglobals\n", + "
\n", + " " + ], + "text/plain": [ + "🟢 testcell noglobals" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "This is running in an isolated environment\n" + ] + } + ], + "source": [ + "%%testcelln banner\n", + "print('This is running in an isolated environment')" + ] + }, + { + "cell_type": "markdown", + "id": "96abea99-be60-4244-a00c-2fccd28cbd65", + "metadata": {}, + "source": [ + "### option `noglobals` (or `%%testcelln`)\n", + "Executes with zero access to notebook globals - only `__builtins__` available." + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "id": "c8f4ee3c-750c-4e84-a321-2a5be1d976a1", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "global_function not available here\n" + ] + } + ], + "source": [ + "%%testcelln\n", + "try:\n", + " global_function(3) # This will fail\n", + "except NameError:\n", + " print('global_function not available here')" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "id": "fd7f6e61-b159-4d81-9b89-910fbae42c34", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "global_function not available here\n" + ] + } + ], + "source": [ + "%%testcell noglobals\n", + "try:\n", + " global_function(3) # This will fail\n", + "except NameError:\n", + " print('global_function not available here')" + ] + }, + { + "cell_type": "markdown", + "id": "1963ed9c-026d-437e-9ca6-763a2c637e9a", + "metadata": {}, + "source": [ + "### option `(inputs)` syntax\n", + "Explicitly control what enters in the isolated context." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 13, + "id": "48412b54-b2ce-4cdb-8a2e-7dac4a3a2d98", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[2, 4, 6, 10, 14, 22, 26]" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "%%testcelln (global_list)\n", + "[x * 2 for x in global_list]" + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "id": "bffb1ef7-4c40-4d82-8b37-e5d72e93079e", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "37" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "%%testcelln (GlobalClass)\n", + "GlobalClass().one()" + ] + }, + { + "cell_type": "markdown", + "id": "77e834b2-6c40-4b61-8a43-3734656e861f", + "metadata": {}, + "source": [ + "### option `(inputs)->(outputs)` syntax\n", + "Explicitly control what enters and exits the isolated context." + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "id": "008e7a4c-86c9-46da-8ddc-cfef3e482fc2", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[2, 4, 6, 10, 14, 22, 26]" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "%%testcelln (global_list)->(double_list)\n", + "double_list = [x * 2 for x in global_list]\n", + "double_list" + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "id": "e0426272-2f12-4658-ad08-eba79b63cc43", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[2, 4, 6, 10, 14, 22, 26]\n" + ] + } + ], + "source": [ + "print(double_list) # double_list now exists in the global namespace\n", + "del double_list # cleanin it up" + ] + }, + { + "cell_type": "markdown", + "id": "e3b33f77-cdb7-4ccb-9896-edb5070726ea", + "metadata": {}, + "source": [ + "### option `debug`\n", + "Shows wrapper code AND globals update operations." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 17, + "id": "4773e185-9605-4567-bab4-1abc910f353d", + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "```python\n", + "### BEGIN\n", + "def _test_cell_():\n", + "\tglobal new_var\n", + "\tnew_var = global_variable.upper()\n", + "\treturn new_var # %%testcell\n", + "try:\n", + "\t_ = _test_cell_()\n", + "finally:\n", + "\tdel _test_cell_\n", + "if _ is not None: display(_)\n", + "### END\n", + "```" + ], + "text/plain": [ + "### BEGIN\n", + "def _test_cell_():\n", + "\tglobal new_var\n", + "\tnew_var = global_variable.upper()\n", + "\treturn new_var # %%testcell\n", + "try:\n", + "\t_ = _test_cell_()\n", + "finally:\n", + "\tdel _test_cell_\n", + "if _ is not None: display(_)\n", + "### END" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/plain": [ + "\"I'M A GLOBAL VARIABLE\"" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/markdown": [ + "```python\n", + "### GLOBALS UPDATE CODE:\n", + "global new_var; new_var=locals()[\"new_var\"]\n", + "###\n", + "```" + ], + "text/plain": [ + "### GLOBALS UPDATE CODE:\n", + "global new_var; new_var=locals()[\"new_var\"]\n", + "###" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "%%testcelln (global_variable)->(new_var) debug\n", + "new_var = global_variable.upper()\n", + "new_var" + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "id": "94a81e1e-9bc7-46a9-b5c9-3c67327daced", + "metadata": {}, + "outputs": [], + "source": [ + "del new_var" + ] + }, + { + "cell_type": "markdown", + "id": "dc485ee8-b811-497e-9baf-7627d4a1f3b1", + "metadata": {}, + "source": [ + "## Use Case Scenarios" + ] + }, + { + "cell_type": "markdown", + "id": "23d09c78-8ff9-449d-b846-2128fc5b526c", + "metadata": {}, + "source": [ + "### Basic Isolation" + ] + }, + { + "cell_type": "markdown", + "id": "4e1bfff7-a9d6-4b12-b6c0-7ad0adf1a502", + 
"metadata": {}, + "source": [ + "#### Testing without pollution\n", + "\n", + "Test a function with various inputs without cluttering globals with test values." + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "id": "1c6d0120-9c43-427b-b26d-f21f74138fb5", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Results: [2, 10, 20, 200]\n" + ] + }, + { + "data": { + "text/plain": [ + "np.float64(58.0)" + ] + }, + "execution_count": 19, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "%%testcell\n", + "import numpy as np\n", + "\n", + "# Test global_function with different inputs\n", + "test_cases = [1, 5, 10, 100]\n", + "results = [global_function(x) for x in test_cases]\n", + "print(f\"Results: {results}\")\n", + "np.mean(results) # test_cases and results don't leak to globals" + ] + }, + { + "cell_type": "markdown", + "id": "06669ac7-51fa-4187-b607-b23436058a87", + "metadata": {}, + "source": [ + "#### Quick code prototyping\n", + "\n", + "Experiment with different approaches without commitment." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 20, + "id": "a465106a-18f1-49b2-a012-a20f13504437", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Approach 1: [2.0, 3.3333333333333335, 5.0, 7.666666666666667, 10.333333333333334]\n", + "Approach 2: [2.0, 3.3333333333333335, 5.0, 7.666666666666667, 10.333333333333334]\n" + ] + } + ], + "source": [ + "%%testcell\n", + "import pandas as pd\n", + "\n", + "# Trying different ways to process global_list\n", + "approach_1 = pd.Series(global_list).rolling(3).mean().dropna()\n", + "approach_2 = [sum(global_list[i:i+3])/3 for i in range(len(global_list)-2)]\n", + "print(f\"Approach 1: {approach_1.tolist()}\")\n", + "print(f\"Approach 2: {approach_2}\")\n", + "# Nothing saved - decide later which approach to keep" + ] + }, + { + "cell_type": "markdown", + "id": "4227e7c2-8abb-43fd-8a91-724267bd18b7", + "metadata": {}, + "source": [ + "#### Stack Overflow snippet testing\n", + "\n", + "Test SO answers in isolation to catch hidden dependencies." + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "id": "e3887d3e-b47d-4abb-9a06-86a7b399f8cd", + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "%%testcelln\n", + "import matplotlib.pyplot as plt\n", + "import numpy as np\n", + "\n", + "# SO snippet for quick visualization\n", + "x = np.linspace(0, 10, 100)\n", + "y = np.sin(x)\n", + "plt.plot(x, y)\n", + "plt.title(\"Testing SO snippet\")\n", + "plt.show() # Completely isolated - no globals needed or leaked" + ] + }, + { + "cell_type": "markdown", + "id": "dc06aadf-a3e3-4036-8fbd-8b02ab7e92c2", + "metadata": {}, + "source": [ + "#### Safe LLM suggested code experimentation\n", + "\n", + "Try code from ChatGPT/Claude in complete isolation - verify it works standalone." + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "id": "cd8d15ba-ee02-43a8-90ca-7bd1a28c2ca2", + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
ABratio
count7.0000007.0000007.000000
mean6.00000054.0000006.000000
std4.58257665.7190994.582576
min1.0000001.0000001.000000
25%2.5000006.5000002.500000
50%5.00000025.0000005.000000
75%9.00000085.0000009.000000
max13.000000169.00000013.000000
\n", + "
" + ], + "text/plain": [ + " A B ratio\n", + "count 7.000000 7.000000 7.000000\n", + "mean 6.000000 54.000000 6.000000\n", + "std 4.582576 65.719099 4.582576\n", + "min 1.000000 1.000000 1.000000\n", + "25% 2.500000 6.500000 2.500000\n", + "50% 5.000000 25.000000 5.000000\n", + "75% 9.000000 85.000000 9.000000\n", + "max 13.000000 169.000000 13.000000" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "%%testcelln (global_list)\n", + "import pandas as pd\n", + "\n", + "# Suppose an LLM suggested this snippet\n", + "data = {'A': global_list, 'B': [x**2 for x in global_list]}\n", + "df = pd.DataFrame(data)\n", + "df['ratio'] = df['B'] / df['A']\n", + "df.describe() # Runs in isolation, only global_list was passed in" + ] + }, + { + "cell_type": "code", + "execution_count": 23, + "id": "5f8576c0-9a7a-40bb-8b4a-647ff205aa1e", + "metadata": {}, + "outputs": [], + "source": [ + "assert 'df' not in locals() # Ensure df is not in the local scope\n", + "assert 'df' not in globals() # Ensure df is not in the global scope" + ] + }, + { + "cell_type": "markdown", + "id": "84c2b7eb-919d-4a92-a567-71243841a53e", + "metadata": {}, + "source": [ + "### Visualization & Analysis" + ] + }, + { + "cell_type": "markdown", + "id": "dc9b82fc-c4e2-432d-a59e-33fc5794a99b", + "metadata": {}, + "source": [ + "#### Matplotlib without clutter\n", + "\n", + "Create plots without importing `plt`, `fig`, or `ax` into globals. For example, this could be a code snippet from the documentation that you want to understand and experiment with." + ] + }, + { + "cell_type": "code", + "execution_count": 24, + "id": "f3e757d9-b253-43d9-bf67-85e051b5faba", + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "%%testcelln\n", + "# FROM: https://matplotlib.org/stable/gallery/statistics/hexbin_demo.html#sphx-glr-gallery-statistics-hexbin-demo-py\n", + "import matplotlib.pyplot as plt\n", + "import numpy as np\n", + "\n", + "n = 100_000\n", + "x = np.random.default_rng(seed=19680801).standard_normal(n) # inline seeded generator\n", + "y = 2.0 + 3.0 * x + 4.0 * np.random.standard_normal(n)\n", + "xlim = x.min(), x.max()\n", + "ylim = y.min(), y.max()\n", + "\n", + "fig, (ax0, ax1) = plt.subplots(ncols=2, sharey=True, figsize=(9, 4))\n", + "\n", + "hb = ax0.hexbin(x, y, gridsize=50, cmap='inferno')\n", + "ax0.set(xlim=xlim, ylim=ylim)\n", + "ax0.set_title(\"Hexagon binning\")\n", + "cb = fig.colorbar(hb, ax=ax0, label='counts')\n", + "\n", + "hb = ax1.hexbin(x, y, gridsize=50, bins='log', cmap='inferno')\n", + "ax1.set(xlim=xlim, ylim=ylim)\n", + "ax1.set_title(\"With a log color scale\")\n", + "cb = fig.colorbar(hb, ax=ax1, label='counts')" + ] + }, + { + "cell_type": "code", + "execution_count": 25, + "id": "5da54b5c-6341-403d-9001-83565b50a187", + "metadata": {}, + "outputs": [], + "source": [ + "# Note that matplotlib has not been importad in global scope\n", + "assert 'matplotlib' not in globals()" + ] + }, + { + "cell_type": "markdown", + "id": "014bd19e-7d52-48a1-afe0-34cee14cbf0a", + "metadata": {}, + "source": [ + "#### Data transformation testing\n", + "\n", + "Test complex pandas pipelines without keeping intermediate results." + ] + }, + { + "cell_type": "code", + "execution_count": 26, + "id": "b6dca02a-7f88-4de1-9488-708166b04daf", + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
valuessquaredlognormalized
0110.000000-1.091089
1240.693147-0.872872
2391.098612-0.654654
35251.609438-0.218218
47491.9459100.218218
5111212.3978951.091089
6131692.5649491.527525
\n", + "
" + ], + "text/plain": [ + " values squared log normalized\n", + "0 1 1 0.000000 -1.091089\n", + "1 2 4 0.693147 -0.872872\n", + "2 3 9 1.098612 -0.654654\n", + "3 5 25 1.609438 -0.218218\n", + "4 7 49 1.945910 0.218218\n", + "5 11 121 2.397895 1.091089\n", + "6 13 169 2.564949 1.527525" + ] + }, + "execution_count": 26, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "%%testcell\n", + "import pandas as pd\n", + "import numpy as np\n", + "\n", + "df = pd.DataFrame({'values': global_list})\n", + "transformed = (df\n", + " .assign(squared=lambda x: x['values']**2)\n", + " .assign(log=lambda x: np.log(x['values']))\n", + " .assign(normalized=lambda x: (x['values'] - x['values'].mean()) / x['values'].std())\n", + ")\n", + "transformed # df and transformed don't leak" + ] + }, + { + "cell_type": "markdown", + "id": "e2ec685e-3140-4802-82e3-fd793b731068", + "metadata": {}, + "source": [ + "#### Statistical analysis sandbox\n", + "\n", + "Run complex analyses without polluting namespace." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 27, + "id": "06d2a37f-e1ec-485f-b925-9f89551faf1c", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "mean 49.542272\n", + "median 49.691380\n", + "std 8.812565\n", + "q25 43.523162\n", + "q75 55.185949\n", + "dtype: float64" + ] + }, + "execution_count": 27, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "%%testcell\n", + "import numpy as np\n", + "import pandas as pd\n", + "\n", + "data = np.random.normal(loc=50, scale=10, size=100)\n", + "stats = {\n", + " 'mean': np.mean(data),\n", + " 'median': np.median(data),\n", + " 'std': np.std(data),\n", + " 'q25': np.percentile(data, 25),\n", + " 'q75': np.percentile(data, 75)\n", + "}\n", + "pd.Series(stats) # data and stats don't persist" + ] + }, + { + "cell_type": "markdown", + "id": "b3cae828-2abb-48a1-ae87-08afe2bb1cd8", + "metadata": {}, + "source": [ + "### Skip feature" + ] + }, + { + "cell_type": "markdown", + "id": "7a7431c0-4644-4cdc-91ae-05c78d0e0700", + "metadata": {}, + "source": [ + "#### A/B testing approaches\n", + "\n", + "Toggle between different implementations without deleting code." + ] + }, + { + "cell_type": "code", + "execution_count": 28, + "id": "b2df1a3d-c95a-46d5-a108-5420f5466ae0", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Approach A result: 6.0\n" + ] + } + ], + "source": [ + "%%testcell\n", + "import numpy as np\n", + "# Approach A: using numpy\n", + "result = np.array(global_list).mean()\n", + "print(f\"Approach A result: {result}\")" + ] + }, + { + "cell_type": "code", + "execution_count": 29, + "id": "968a250d-6d4c-4fbe-9d9e-ebed1f63d81b", + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "\n", + "
\n", + " This cell has been skipped\n", + "
\n", + " " + ], + "text/plain": [ + "ℹ️ This cell has been skipped" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "%%testcell skip\n", + "# Approach B: pure python (currently disabled)\n", + "result = sum(global_list) / len(global_list)\n", + "print(f\"Approach B result: {result}\")" + ] + }, + { + "cell_type": "markdown", + "id": "70808ee6-6100-44b8-acbd-77bddcde12b7", + "metadata": {}, + "source": [ + "#### Disable expensive operations\n", + "\n", + "Skip time-consuming cells during development iteration." + ] + }, + { + "cell_type": "code", + "execution_count": 30, + "id": "268381af-c478-43e4-b527-2e3a31f4f948", + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "\n", + "
\n", + " This cell has been skipped\n", + "
\n", + " " + ], + "text/plain": [ + "ℹ️ This cell has been skipped" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "%%testcell skip\n", + "import numpy as np\n", + "import matplotlib.pyplot as plt\n", + "\n", + "# Expensive simulation - skip while working on other parts\n", + "data = np.random.randn(1000000)\n", + "for i in range(100):\n", + " data = data * np.random.random() + np.random.randn(1000000)\n", + "plt.hist(data, bins=50)\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "id": "78b07769-b64e-4d3b-8ffa-27d05ef7742f", + "metadata": {}, + "source": [ + "#### Conditional cell execution with global_skip\n", + "\n", + "Disable all testcell cells at once for production runs." + ] + }, + { + "cell_type": "code", + "execution_count": 31, + "id": "a80e5967-dd0f-44a6-a868-8b1ab065436f", + "metadata": {}, + "outputs": [], + "source": [ + "# Enable this to skip all %%testcell cells\n", + "testcell.global_skip = True" + ] + }, + { + "cell_type": "code", + "execution_count": 32, + "id": "5c15ee09-4f83-41c9-bfe4-07ed8d9857c6", + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "\n", + "
\n", + " This cell has been skipped\n", + "
\n", + " " + ], + "text/plain": [ + "ℹ️ This cell has been skipped" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "%%testcell\n", + "print(\"This won't execute when global_skip is True\")" + ] + }, + { + "cell_type": "code", + "execution_count": 33, + "id": "4cf1f579-9766-4c95-9b03-c8b10b4774c6", + "metadata": {}, + "outputs": [], + "source": [ + "# Reset for other examples\n", + "testcell.global_skip = False" + ] + }, + { + "cell_type": "markdown", + "id": "9d08b709-bfa4-4d0d-9ad0-7117b0b0a0aa", + "metadata": {}, + "source": [ + "#### Iterative development in Colab/Modal\n", + "\n", + "Skip cells in platforms without native \"Raw cell\" type." + ] + }, + { + "cell_type": "code", + "execution_count": 34, + "id": "27c3c0f1-74bd-47b5-8ec6-a0afc62f9180", + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "\n", + "
\n", + " This cell has been skipped\n", + "
\n", + " " + ], + "text/plain": [ + "ℹ️ This cell has been skipped" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "%%testcell skip\n", + "# Work-in-progress code that's not ready yet\n", + "# Colab/Kaggle/Modal don't have \"Raw\" cells, so this is perfect\n", + "import pandas as pd\n", + "experimental_feature = pd.DataFrame(global_list)\n", + "# TODO: finish this later" + ] + }, + { + "cell_type": "markdown", + "id": "baa724bf-83c4-4be2-a4ce-93a4d6a64e93", + "metadata": {}, + "source": [ + "### Complete isolation (noglobals/testcelln)" + ] + }, + { + "cell_type": "markdown", + "id": "624c22eb-1c65-4400-b362-4471aa80330b", + "metadata": {}, + "source": [ + "#### Pure function testing\n", + "\n", + "Verify functions work without global dependencies." + ] + }, + { + "cell_type": "code", + "execution_count": 35, + "id": "9a0acc06-e74e-433c-a0c3-2c7c68c15ff5", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "{'mean': np.float64(3.0),\n", + " 'std': np.float64(1.4142135623730951),\n", + " 'sum': np.int64(15)}" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "%%testcelln\n", + "import numpy as np\n", + "\n", + "# Define and test a pure function\n", + "def calculate_stats(values):\n", + " return {\n", + " 'mean': np.mean(values),\n", + " 'std': np.std(values),\n", + " 'sum': np.sum(values)\n", + " }\n", + "\n", + "# Test with local data\n", + "calculate_stats([1, 2, 3, 4, 5])" + ] + }, + { + "cell_type": "markdown", + "id": "0e3c4c1c-970d-4ca0-aed1-c61bde89041a", + "metadata": {}, + "source": [ + "#### Detect hidden dependencies\n", + "\n", + "Catch when code accidentally relies on globals." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 36, + "id": "8fd0452a-e7b9-442a-a6f8-3c2f144b5970", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Caught dependency: name 'global_list' is not defined\n" + ] + } + ], + "source": [ + "%%testcelln\n", + "import pandas as pd\n", + "\n", + "# This will fail - exposes dependency on global_list\n", + "def process_data():\n", + " return pd.Series(global_list).describe()\n", + "\n", + "try:\n", + " process_data()\n", + "except NameError as e:\n", + " print(f\"Caught dependency: {e}\")" + ] + }, + { + "cell_type": "markdown", + "id": "eb90f0ac-75d4-48f6-a5cc-5bf563d4360b", + "metadata": {}, + "source": [ + "#### Clean slate execution\n", + "\n", + "Run code with only `__builtins__` available." + ] + }, + { + "cell_type": "code", + "execution_count": 37, + "id": "08766664-4eed-400f-814c-15595676dc7e", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Only available: ['__builtins__']\n", + "Can use built-ins like sum: 6\n" + ] + } + ], + "source": [ + "%%testcelln\n", + "# Verify truly isolated environment\n", + "available = list(globals().keys())\n", + "print(f\"Only available: {available}\")\n", + "print(f\"Can use built-ins like sum: {sum([1,2,3])}\")" + ] + }, + { + "cell_type": "markdown", + "id": "3b98ae8c-239e-4e50-8ce3-ca54b772a1bb", + "metadata": {}, + "source": [ + "#### Reproducibility verification\n", + "\n", + "Ensure code doesn't rely on notebook state." + ] + }, + { + "cell_type": "code", + "execution_count": 38, + "id": "453047fa-960a-43ff-a0e3-ea5933c0ca2e", + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
valuesnormalized
01-1.091089
12-0.872872
23-0.654654
35-0.218218
470.218218
5111.091089
6131.527525
\n", + "
" + ], + "text/plain": [ + " values normalized\n", + "0 1 -1.091089\n", + "1 2 -0.872872\n", + "2 3 -0.654654\n", + "3 5 -0.218218\n", + "4 7 0.218218\n", + "5 11 1.091089\n", + "6 13 1.527525" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "%%testcelln\n", + "import numpy as np\n", + "import pandas as pd\n", + "\n", + "# Self-contained analysis - no hidden dependencies\n", + "data = [1, 2, 3, 5, 7, 11, 13]\n", + "df = pd.DataFrame({'values': data})\n", + "df['normalized'] = (df['values'] - df['values'].mean()) / df['values'].std()\n", + "df" + ] + }, + { + "cell_type": "markdown", + "id": "b79652f7-9bd2-424f-87a8-a5ec3dd7965f", + "metadata": {}, + "source": [ + "### (input)->(output) syntax" + ] + }, + { + "cell_type": "markdown", + "id": "b2284441-3b68-4c61-a5c2-fcd7f40340e2", + "metadata": {}, + "source": [ + "#### Controlled input to isolated cell\n", + "\n", + "Pass specific variables to noglobals context without full access." + ] + }, + { + "cell_type": "code", + "execution_count": 39, + "id": "664f9d8f-a295-4773-80d0-8a705bac4127", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "np.float64(54.0)" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "%%testcelln (global_list)\n", + "import numpy as np\n", + "\n", + "# Only global_list is available\n", + "squared = [x**2 for x in global_list]\n", + "np.mean(squared)" + ] + }, + { + "cell_type": "markdown", + "id": "847d6fc1-a04e-401b-89e1-9b84d41c62b5", + "metadata": {}, + "source": [ + "#### Safe experimentation with expensive state\n", + "\n", + "Protect important objects while experimenting - pass them in, don't leak experiments out." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 40, + "id": "23600f42-530a-4a89-aed1-896201154723", + "metadata": {}, + "outputs": [], + "source": [ + "expensive_model = np.random.randn(1000, 1000) # Pretend this took hours to compute" + ] + }, + { + "cell_type": "code", + "execution_count": 41, + "id": "714abe60-955c-4c3f-8e61-d6ef86ec54eb", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Test prediction mean: -0.6658779858466216\n" + ] + } + ], + "source": [ + "%%testcelln (expensive_model)\n", + "import numpy as np\n", + "\n", + "# Experiment safely - expensive_model is only read here, never modified\n", + "test_input = np.random.randn(1000)\n", + "prediction = expensive_model @ test_input\n", + "temp_analysis = prediction.mean()\n", + "print(f\"Test prediction mean: {temp_analysis}\")\n", + "# temp_analysis and test_input don't leak" + ] + }, + { + "cell_type": "code", + "execution_count": 42, + "id": "1b3ef669-2843-4f30-939c-e7cde0aa1ba2", + "metadata": {}, + "outputs": [], + "source": [ + "# Cleanup\n", + "del expensive_model" + ] + }, + { + "cell_type": "markdown", + "id": "0753b824-78a1-4b18-bf26-a14ac675372b", + "metadata": {}, + "source": [ + "#### Selective output saving\n", + "\n", + "Save only specific results back to globals." + ] + }, + { + "cell_type": "code", + "execution_count": 43, + "id": "b3317933-4eef-4ae2-99e8-3d2b833c2798", + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
valuessquared
011
124
239
3525
4749
511121
613169
\n", + "
" + ], + "text/plain": [ + " values squared\n", + "0 1 1\n", + "1 2 4\n", + "2 3 9\n", + "3 5 25\n", + "4 7 49\n", + "5 11 121\n", + "6 13 169" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "%%testcelln (global_list)->(final_result, summary_stats)\n", + "import pandas as pd\n", + "\n", + "# Complex processing with many intermediate variables\n", + "df = pd.DataFrame({'values': global_list})\n", + "df['squared'] = df['values'] ** 2\n", + "df['cubed'] = df['values'] ** 3\n", + "intermediate = df.describe()\n", + "\n", + "# Only these two are saved to globals\n", + "final_result = df[['values', 'squared']].copy()\n", + "summary_stats = {'mean': df['values'].mean(), 'max': df['values'].max()}\n", + "\n", + "final_result" + ] + }, + { + "cell_type": "code", + "execution_count": 44, + "id": "0edab722-39b9-45d8-a8cc-f428153d0617", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "final_result exists: True\n", + "summary_stats exists: True\n", + "df exists: False\n" + ] + } + ], + "source": [ + "# Verify only selected outputs exist\n", + "print(f\"final_result exists: {'final_result' in globals()}\")\n", + "print(f\"summary_stats exists: {'summary_stats' in globals()}\")\n", + "print(f\"df exists: {'df' in globals()}\") # Should be False\n", + "\n", + "# Cleanup\n", + "del final_result, summary_stats" + ] + }, + { + "cell_type": "markdown", + "id": "8424c0c5-7e6e-4102-a9dc-6dec90af9caf", + "metadata": {}, + "source": [ + "### Advanced scenarios" + ] + }, + { + "cell_type": "markdown", + "id": "769b2fef-43b1-42e0-abe8-a8f11a02fce4", + "metadata": {}, + "source": [ + "#### Benchmark alternatives without variable collision\n", + "\n", + "Compare implementations safely - variables from different approaches don't collide, ensuring fair comparison." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 45, + "id": "a12daff1-fbac-4fcc-b084-1eb6165e8b8c", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Approach 1 result: 10.242640687119284\n" + ] + } + ], + "source": [ + "%%testcell\n", + "import numpy as np\n", + "\n", + "# Approach 1: using numpy operations\n", + "data = np.array(global_list)\n", + "result = data.mean() + data.std()\n", + "print(f\"Approach 1 result: {result}\")" + ] + }, + { + "cell_type": "code", + "execution_count": 46, + "id": "fd32369b-8b31-4ee8-a4c7-ccfec19a59c9", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Approach 2 result: 10.582575694955839\n" + ] + } + ], + "source": [ + "%%testcell\n", + "import numpy as np\n", + "\n", + "# Approach 2: slightly different calculation\n", + "# 'data' and 'result' from previous cell don't interfere\n", + "data = np.array(global_list)\n", + "result = np.mean(data) + np.std(data, ddof=1) # Using sample std\n", + "print(f\"Approach 2 result: {result}\")" + ] + }, + { + "cell_type": "markdown", + "id": "83df0f97-6ecd-4588-bd18-cd506f2b1367", + "metadata": {}, + "source": [ + "#### Complex fixture extraction with complete isolation\n", + "\n", + "Test in complete isolation, but extract test results to main context for later use." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 47, + "id": "92f7f66f-36b0-4058-b62a-d80b17265823", + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "```python\n", + "### BEGIN\n", + "def _test_cell_():\n", + "\tglobal test_suite\n", + "\timport pandas as pd\n", + "\timport numpy as np\n", + "\t\n", + "\t# Create test fixtures in isolation - no access to any globals\n", + "\ttest_data = pd.DataFrame({\n", + "\t 'values': [1, 2, 3, 5, 7, 11, 13],\n", + "\t 'doubled': [2, 4, 6, 10, 14, 22, 26]\n", + "\t})\n", + "\t\n", + "\ttest_suite = {\n", + "\t 'data': test_data,\n", + "\t 'validators': [lambda x: x > 0, lambda x: x < 100],\n", + "\t 'summary': test_data.describe().to_dict()\n", + "\t}\n", + "\t\n", + "\treturn test_suite['data'].head() # %%testcell\n", + "try:\n", + "\t_ = _test_cell_()\n", + "finally:\n", + "\tdel _test_cell_\n", + "if _ is not None: display(_)\n", + "### END\n", + "```" + ], + "text/plain": [ + "### BEGIN\n", + "def _test_cell_():\n", + "\tglobal test_suite\n", + "\timport pandas as pd\n", + "\timport numpy as np\n", + "\t\n", + "\t# Create test fixtures in isolation - no access to any globals\n", + "\ttest_data = pd.DataFrame({\n", + "\t 'values': [1, 2, 3, 5, 7, 11, 13],\n", + "\t 'doubled': [2, 4, 6, 10, 14, 22, 26]\n", + "\t})\n", + "\t\n", + "\ttest_suite = {\n", + "\t 'data': test_data,\n", + "\t 'validators': [lambda x: x > 0, lambda x: x < 100],\n", + "\t 'summary': test_data.describe().to_dict()\n", + "\t}\n", + "\t\n", + "\treturn test_suite['data'].head() # %%testcell\n", + "try:\n", + "\t_ = _test_cell_()\n", + "finally:\n", + "\tdel _test_cell_\n", + "if _ is not None: display(_)\n", + "### END" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
valuesdoubled
012
124
236
3510
4714
\n", + "
" + ], + "text/plain": [ + " values doubled\n", + "0 1 2\n", + "1 2 4\n", + "2 3 6\n", + "3 5 10\n", + "4 7 14" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/markdown": [ + "```python\n", + "### GLOBALS UPDATE CODE:\n", + "global test_suite; test_suite=locals()[\"test_suite\"]\n", + "###\n", + "```" + ], + "text/plain": [ + "### GLOBALS UPDATE CODE:\n", + "global test_suite; test_suite=locals()[\"test_suite\"]\n", + "###" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "%%testcelln ->(test_suite) debug\n", + "import pandas as pd\n", + "import numpy as np\n", + "\n", + "# Create test fixtures in isolation - no access to any globals\n", + "test_data = pd.DataFrame({\n", + " 'values': [1, 2, 3, 5, 7, 11, 13],\n", + " 'doubled': [2, 4, 6, 10, 14, 22, 26]\n", + "})\n", + "\n", + "test_suite = {\n", + " 'data': test_data,\n", + " 'validators': [lambda x: x > 0, lambda x: x < 100],\n", + " 'summary': test_data.describe().to_dict()\n", + "}\n", + "\n", + "test_suite['data'].head()" + ] + }, + { + "cell_type": "code", + "execution_count": 48, + "id": "d8113ab2-096a-4c18-a814-a6cdd6b5f6b4", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Test suite keys: dict_keys(['data', 'validators', 'summary'])\n" + ] + } + ], + "source": [ + "# test_suite now available in main context\n", + "print(f\"Test suite keys: {test_suite.keys()}\")\n", + "del test_suite # Cleanup" + ] + }, + { + "cell_type": "markdown", + "id": "37d85e36-ead8-40b2-bfe2-2b4ec0edcc72", + "metadata": {}, + "source": [ + "#### Pipeline validation during refactoring\n", + "\n", + "Test two pipeline versions produce identical results - crucial when refactoring to ensure behavior doesn't change." + ] + }, + { + "cell_type": "code", + "execution_count": 49, + "id": "88ec22c2-27db-4f7b-be45-4218465f68c0", + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
valuesdoublednormalized
012-1.091089
124-0.872872
236-0.654654
3510-0.218218
47140.218218
511221.091089
613261.527525
\n", + "
" + ], + "text/plain": [ + " values doubled normalized\n", + "0 1 2 -1.091089\n", + "1 2 4 -0.872872\n", + "2 3 6 -0.654654\n", + "3 5 10 -0.218218\n", + "4 7 14 0.218218\n", + "5 11 22 1.091089\n", + "6 13 26 1.527525" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "%%testcelln (global_list)->(pipeline_v1)\n", + "import pandas as pd\n", + "\n", + "def pipeline_v1(data):\n", + " df = pd.DataFrame({'values': data})\n", + " df['doubled'] = df['values'] * 2\n", + " df['normalized'] = (df['doubled'] - df['doubled'].mean()) / df['doubled'].std()\n", + " return df\n", + "\n", + "pipeline_v1(global_list)" + ] + }, + { + "cell_type": "code", + "execution_count": 50, + "id": "c60e5102-e24b-4ff8-b90f-6710b744d967", + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
valuesdoublednormalized
012-1.178511
124-0.942809
236-0.707107
3510-0.235702
47140.235702
511221.178511
613261.649916
\n", + "
" + ], + "text/plain": [ + " values doubled normalized\n", + "0 1 2 -1.178511\n", + "1 2 4 -0.942809\n", + "2 3 6 -0.707107\n", + "3 5 10 -0.235702\n", + "4 7 14 0.235702\n", + "5 11 22 1.178511\n", + "6 13 26 1.649916" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "%%testcelln (global_list)->(pipeline_v2)\n", + "import pandas as pd\n", + "import numpy as np\n", + "\n", + "def pipeline_v2(data):\n", + " # Refactored version - more efficient\n", + " df = pd.DataFrame({'values': data})\n", + " doubled = np.array(df['values']) * 2\n", + " df['doubled'] = doubled\n", + " df['normalized'] = (doubled - doubled.mean()) / doubled.std()\n", + " return df\n", + "\n", + "pipeline_v2(global_list)" + ] + }, + { + "cell_type": "code", + "execution_count": 51, + "id": "ddfa72c5-f094-4013-b3dc-8efe26ead105", + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
   values  doubled  normalized
0       0        0    0.087422
1       0        0    0.069937
2       0        0    0.052453
3       0        0    0.017484
4       0        0   -0.017484
5       0        0   -0.087422
6       0        0   -0.122391
\n" + ], + "text/plain": [ + "" + ] + }, + "execution_count": 51, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "%%testcell\n", + "# Inspect differences\n", + "from IPython.display import HTML\n", + "import pandas as pd\n", + "styler = (pipeline_v1(global_list) - pipeline_v2(global_list)).style.background_gradient(cmap='viridis', subset=['normalized'])\n", + "\n", + "# Hack to avoid randomness in colored table html generation\n", + "styler.set_uuid(\"stable\") # <- freeze the table id\n", + "HTML(styler.to_html())" + ] + }, + { + "cell_type": "code", + "execution_count": 52, + "id": "b43e6de4-c4ba-4b31-ac5b-7f118fcf45b4", + "metadata": {}, + "outputs": [], + "source": [ + "# Cleanup\n", + "del pipeline_v1, pipeline_v2" + ] + }, + { + "cell_type": "markdown", + "id": "a1afcc17-689d-4661-8645-cd8bb7a0665c", + "metadata": {}, + "source": [ + "#### Debugging global pollution and dependencies\n", + "\n", + "The same code behaves differently under `%%testcell` and `%%testcelln`, which exposes a hidden dependency on the global scope."
+ ] + }, + { + "cell_type": "code", + "execution_count": 53, + "id": "a1a7be2c-3dec-4f73-88dd-24f5a9f0ba9d", + "metadata": {}, + "outputs": [], + "source": [ + "# Set up a variable that code might accidentally depend on\n", + "scale_factor = 10" + ] + }, + { + "cell_type": "code", + "execution_count": 54, + "id": "1722d9ea-9b49-4101-ab9b-477594ddf00d", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "np.float64(60.0)" + ] + }, + "execution_count": 54, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "%%testcell\n", + "import numpy as np\n", + "\n", + "# This works - has access to scale_factor\n", + "data = np.array(global_list) * scale_factor\n", + "data.mean()" + ] + }, + { + "cell_type": "code", + "execution_count": 55, + "id": "3f6bdfc1-87b8-4f72-960a-657a487290ec", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "❌ Hidden dependency detected: name 'scale_factor' is not defined\n", + "Code relies on 'scale_factor' from global scope!\n" + ] + } + ], + "source": [ + "%%testcelln\n", + "import numpy as np\n", + "\n", + "# This fails - exposes the hidden dependency\n", + "try:\n", + " data = np.array([1,2,3,5,7,11,13]) * scale_factor\n", + " data.mean()\n", + "except NameError as e:\n", + " print(f\"❌ Hidden dependency detected: {e}\")\n", + " print(\"Code relies on 'scale_factor' from global scope!\")" + ] + }, + { + "cell_type": "code", + "execution_count": 56, + "id": "b446b0cf-4670-4089-be96-cfd2073bdd18", + "metadata": {}, + "outputs": [], + "source": [ + "# Cleanup\n", + "del scale_factor" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": 
"3.11.11" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/nbs/01a_arguments.ipynb b/nbs/01a_arguments.ipynb new file mode 100644 index 0000000..73c6702 --- /dev/null +++ b/nbs/01a_arguments.ipynb @@ -0,0 +1,371 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "40730acc", + "metadata": {}, + "source": [ + "# arguments" + ] + }, + { + "cell_type": "markdown", + "id": "1e4407a1", + "metadata": {}, + "source": [ + "This module is meant to add all the support functions for *arguments* and *inout* parsing." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c927be30", + "metadata": {}, + "outputs": [], + "source": [ + "#| default_exp inout" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ea55fe32", + "metadata": {}, + "outputs": [], + "source": [ + "from fastcore.test import *" + ] + }, + { + "cell_type": "markdown", + "id": "81f5dc52", + "metadata": {}, + "source": [ + "## Find support function" + ] + }, + { + "cell_type": "markdown", + "id": "1e602995", + "metadata": {}, + "source": [ + "`optional_find` lets you search a string in forward or reverse order using one or more *templates* (aka: reference sub-strings).\n", + "\n", + "It returns the *first* occurrence: the leftmost match when searching forward, the rightmost match when searching in reverse.\n", + "The occurrence returned is a tuple `(position,template)` where:\n", + "+ `position`: is the position of the match inside the string.\n", + "+ `template`: is a copy of the template found. We need this to know where the match ends."
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "95d0ba51", + "metadata": {}, + "outputs": [], + "source": [ + "#| export\n", + "def optional_find(x,cc,reverse=False):\n", + " if isinstance(cc,str): cc = [cc] # listify\n", + " t = [(x.rfind(c) if reverse else x.find(c),c) for c in cc]\n", + " t = [(f,c) for f,c in t if f != -1]\n", + " if len(t)==0: return None\n", + " return max(t) if reverse else min(t)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "efb977d3", + "metadata": {}, + "outputs": [], + "source": [ + "# Single\n", + "test_eq( optional_find('abc','a')[0] , 0)\n", + "test_eq( optional_find('abc','c')[0] , 2)\n", + "test_eq( optional_find('abc','bc')[0] , 1)\n", + "test_eq( optional_find('abc','z'), None)\n", + "test_eq( optional_find('abcdefghiab','a',reverse=True)[0] , 9)\n", + "test_eq( optional_find('abc','bc',reverse=True)[0] , 1)\n", + "\n", + "# Multi\n", + "test_eq( optional_find('abcdefghiab',['aa','fg','ia'],reverse=False)[0] , 5)\n", + "test_eq( optional_find('abcdefghiab',['aa','fg','ia'],reverse=True)[0] , 8)\n", + "test_eq( optional_find('abcdefghiab',['aa','ia','fg'],reverse=True)[0] , 8)" + ] + }, + { + "cell_type": "markdown", + "id": "b547c7cb", + "metadata": {}, + "source": [ + "## Character level utility" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "8a6e5b12", + "metadata": {}, + "outputs": [], + "source": [ + "#| export\n", + "def count_char(x,c):\n", + " # Count how many times c appears in x\n", + " return sum(map(lambda y: y==c,x))" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f6954a2d", + "metadata": {}, + "outputs": [], + "source": [ + "test_eq( count_char('abacda','a') , 3 )\n", + "test_eq( count_char('abacda','b') , 1 )\n", + "test_eq( count_char('abacda','z') , 0 )" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e5f25890", + "metadata": {}, + "outputs": [], + "source": [ + "#| export\n", + "def 
count_delta(x,a='(',b=')'):\n", + " # Return the difference between the count of \"a\" and \"b\".\n", + " return count_char(x,a) - count_char(x,b)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "fd0e9679", + "metadata": {}, + "outputs": [], + "source": [ + "test_eq(count_delta('asd(asd)'),0)\n", + "test_eq(count_delta('asd(asd'),1)\n", + "test_eq(count_delta('asd(a(sd'),2)\n", + "test_eq(count_delta('asd(a(sd)))'),-1)" + ] + }, + { + "cell_type": "markdown", + "id": "d77e6418", + "metadata": {}, + "source": [ + "## String split utility\n", + "\n", + "This is just a bit of syntactic sugar." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "49ec1d10", + "metadata": {}, + "outputs": [], + "source": [ + "#| export\n", + "def split_and_strip(x,splitter):\n", + " t = [t.strip() for t in x.split(splitter)]\n", + " if t==['']: return []\n", + " return t" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ad718a5c", + "metadata": {}, + "outputs": [], + "source": [ + "test_eq(split_and_strip('a,b,c',','),['a','b','c'])\n", + "test_eq(split_and_strip(' a, b,c ',','),['a','b','c'])\n", + "test_eq(split_and_strip('',','),[])" + ] + }, + { + "cell_type": "markdown", + "id": "704acae9", + "metadata": {}, + "source": [ + "## Inout util\n", + "\n", + "`inout` is the *name* of the portion of the complete arguments string that refers to input and output parameters.\n", + "\n", + "For example, in this magic line: `%%testcell noglobals (aaa,bbb) ->(ccc)` \n", + "+ *raw_arguments*: `noglobals `\n", + "+ *inout*: `(aaa,bbb) ->(ccc)`" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c2463950", + "metadata": {}, + "outputs": [], + "source": [ + "#| export\n", + "def process_inout(x,splitter='->'):\n", + " if x is None: return None\n", + " t = split_and_strip(x,splitter)\n", + " for v,c,n in zip(t,map(count_delta,t),map(lambda s: count_char(s,'('),t)): \n", + " if n>1: raise ValueError(f'Too much
parenthesis on \"{v}\"')\n", + " if c>0: raise ValueError(f'Missing closing parenthesis on \"{v}\"')\n", + " t = [x[1:-1] for x in t]\n", + " if len(t)==0: raise ValueError('No groups available')\n", + " if len(t)>2: raise ValueError(f'You should have only one \"{splitter}\" symbol')\n", + " if len(t)==1: return split_and_strip(t[0],','),[]\n", + " if len(t)==2: return split_and_strip(t[0],','),split_and_strip(t[1],',')" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "277e15f2", + "metadata": {}, + "outputs": [], + "source": [ + "test_eq(process_inout(None,splitter='->'),None)\n", + "\n", + "test_eq(process_inout('(a,b,cc)->(d,ee)',splitter='->'),(['a','b','cc'],['d','ee']))\n", + "test_eq(process_inout(' (a,b,cc) -> (d,ee) ',splitter='->'),(['a','b','cc'],['d','ee']))\n", + "test_eq(process_inout(' (a,b,cc) -> () ',splitter='->'),(['a','b','cc'],[]))\n", + "test_eq(process_inout(' (a,b,cc) ',splitter='->'),(['a','b','cc'],[]))\n", + "test_eq(process_inout('()->(a,b,cc) ',splitter='->'),([],['a','b','cc']))\n", + "test_eq(process_inout('->(a,b,cc) ',splitter='->'),([],['a','b','cc']))\n", + "\n", + "test_fail(lambda: process_inout('(a,b,cc)(d,ee)'), contains='Too much parenthesis')\n", + "test_fail(lambda: process_inout('(a,b,cc) (d,ee)'), contains='Too much parenthesis')\n", + "test_fail(lambda: process_inout('(a,b,cc) (->d,ee)'), contains='Too much parenthesis')\n", + "test_fail(lambda: process_inout('(a,b,cc->)(d,ee)'), contains='Missing closing parenthesis')\n", + "test_fail(lambda: process_inout('(a,b,cc) - > (d,ee)'), contains='Too much parenthesis')\n", + "test_fail(lambda: process_inout('(a,b,cc) ? 
(d,ee)'), contains='Too much parenthesis')" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "22f7d40b", + "metadata": {}, + "outputs": [], + "source": [ + "#| export\n", + "def separate_args_and_inout(x):\n", + " if x is None: return None\n", + " if (start_t := optional_find(x,['(','->'],reverse=False)) and (end_t := optional_find(x,[')','->'],reverse=True)):\n", + " start,_ = start_t\n", + " end, c = end_t\n", + " length = len(c)\n", + " return x[:start]+x[end+length:],x[start:end+length]\n", + " return x,None" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "228f1202", + "metadata": {}, + "outputs": [], + "source": [ + "test_eq(separate_args_and_inout(None), None) # None input → None output\n", + "\n", + "test_eq(separate_args_and_inout(''), ['', None]) # Empty string → empty args, no in/out\n", + "test_eq(separate_args_and_inout('verbose'), ['verbose', None]) # Only args, no in/out\n", + "test_eq(separate_args_and_inout('dryrun verbose'), ['dryrun verbose', None]) # Multiple args, no in/out\n", + "test_eq(separate_args_and_inout('(a,b)->(c,d)'), ['', '(a,b)->(c,d)']) # Pure in/out spec, no args\n", + "test_eq(separate_args_and_inout('(a,b)'), ['', '(a,b)']) # Only input tuple, no args\n", + "test_eq(separate_args_and_inout('->(c,d)'), ['', '->(c,d)']) # Only output tuple, no args\n", + "\n", + "test_eq(separate_args_and_inout('dryrun verbose (a,b)'), ['dryrun verbose ', '(a,b)']) # Args + input tuple\n", + "test_eq(separate_args_and_inout('dryrun verbose (a,b) -> (c,d) '), ['dryrun verbose ', '(a,b) -> (c,d)']) # Args + in/out\n", + "test_eq(separate_args_and_inout('dryrun verbose () -> (c,d) '), ['dryrun verbose ', '() -> (c,d)']) # Args + empty input tuple\n", + "test_eq(separate_args_and_inout('dryrun verbose (a,b) -> ()'), ['dryrun verbose ', '(a,b) -> ()']) # Args + empty output tuple\n", + "test_eq(separate_args_and_inout('dryrun verbose (a,b) '), ['dryrun verbose ', '(a,b)']) # Args + input tuple, trailing 
spaces\n", + "\n", + "test_eq(separate_args_and_inout('dryrun (a,b) verbose '), ['dryrun verbose ', '(a,b)']) # In/out tuple between args\n", + "test_eq(separate_args_and_inout('dryrun (a,b) -> (c,d) verbose'), ['dryrun verbose', '(a,b) -> (c,d)']) # In/out in middle of args\n", + "test_eq(separate_args_and_inout('dryrun () -> (c,d) verbose'), ['dryrun verbose', '() -> (c,d)']) # Empty input tuple\n", + "test_eq(separate_args_and_inout('dryrun () -> (c,d)verbose'), ['dryrun verbose', '() -> (c,d)']) # No space before verbose\n", + "test_eq(separate_args_and_inout('dryrun() -> (c,d) verbose'), ['dryrun verbose', '() -> (c,d)']) # No space before ()\n", + "test_eq(separate_args_and_inout('dryrun (a,b)->() verbose'), ['dryrun verbose', '(a,b)->()']) # Compact arrow, spacing preserved\n", + "test_eq(separate_args_and_inout('dryrun (a, b) verbose'), ['dryrun verbose', '(a, b)']) # Tuple spacing preserved\n", + "\n", + "test_eq(separate_args_and_inout('dryrun -> (c,d)verbose'), ['dryrun verbose', '-> (c,d)']) # Output tuple glued to arg\n", + "test_eq(separate_args_and_inout('dryrun (c,d)->verbose'), ['dryrun verbose', '(c,d)->']) # Input tuple glued to arg\n", + "\n", + "test_eq(separate_args_and_inout('dryrun (incomplete'), ['dryrun (incomplete', None]) # Unclosed parenthesis → ignore as in/out" + ] + }, + { + "cell_type": "markdown", + "id": "104897cf", + "metadata": {}, + "source": [ + "## Consume inout" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "2e72903f", + "metadata": {}, + "outputs": [], + "source": [ + "#| export\n", + "def validate_and_update_inputs(inputs:list,state:dict,)->dict:\n", + " s = set(inputs)\n", + " ret = {}\n", + " missing = []\n", + " for k in inputs:\n", + " if k not in state:\n", + " missing.append(k)\n", + " else:\n", + " ret[k] = state[k]\n", + "\n", + " if missing:\n", + " # Generic error that doesn't reveal what DOES exist\n", + " raise ValueError(f'Required input variable(s) not found: {\", 
\".join(missing)}')\n", + "\n", + " return ret" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c95e8718", + "metadata": {}, + "outputs": [], + "source": [ + "test_eq( validate_and_update_inputs(['a'],{'a':1, 'b':2}) , {'a':1} )\n", + "test_fail(lambda: validate_and_update_inputs(['a'],{'b':1}) , contains='not found: a' )\n", + "test_fail(lambda: validate_and_update_inputs(['a','c'],{'b':1}) , contains='not found: a, c' )" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ba812f8b", + "metadata": {}, + "outputs": [], + "source": [ + "#| hide\n", + "import nbdev; nbdev.nbdev_export()" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "python3", + "language": "python", + "name": "python3" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/nbs/02_testcell.ipynb b/nbs/02_testcell.ipynb index 90989da..52ee1e1 100644 --- a/nbs/02_testcell.ipynb +++ b/nbs/02_testcell.ipynb @@ -41,8 +41,15 @@ "#| export\n", "import ast\n", "from IPython.core.magic import register_cell_magic, needs_local_scope\n", - "from IPython import get_ipython # needed for quarto\n", - "from IPython.display import Code" + "from IPython import get_ipython # needed for quarto" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "WARNING: the official `IPython.display.Code` syntax highlighter doesn't seem to work. We're creating a \"drop-in\" replacement that forces `full=True` in HtmlFormatter.
This seems to work properly and gives us more control over code display.\n", + "For details see: https://github.com/ipython/ipython/blob/72bb67ee8f57cb347ba358cce786c3fa87c470b9/IPython/lib/display.py#L667" ] }, { @@ -52,7 +59,8 @@ "outputs": [], "source": [ "#| export\n", - "from testcell.core import auto_return" + "from testcell.core import auto_return\n", + "from testcell.inout import separate_args_and_inout, process_inout, split_and_strip, validate_and_update_inputs" ] }, { @@ -80,6 +88,16 @@ "## Support classes " ] }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Used only in tests\n", + "from fastcore.test import *" + ] + }, { "cell_type": "code", "execution_count": null, @@ -87,20 +105,26 @@ "outputs": [], "source": [ "#| export\n", - "# Temporary class to deal with different display options (jupyter or console)\n", + "import html\n", + "\n", "class MessageBox:\n", " def __init__(self,data,*,background_color,text_color,emoji=None):\n", " self.data = data\n", " self.background_color = background_color\n", " self.text_color = text_color\n", " self.emoji = emoji\n", - "\n", + " \n", " def _repr_html_(self):\n", + " # Escape all user-controllable values\n", + " safe_bg = html.escape(str(self.background_color), quote=True)\n", + " safe_text = html.escape(str(self.text_color), quote=True)\n", + " safe_data = html.escape(str(self.data))\n", + " \n", " return f\"\"\"\n", - "
\n", - " {self.data}\n", - "
\n", - " \"\"\"\n", + "
\n", + " {safe_data}\n", + "
\n", + " \"\"\"\n", "\n", " def __repr__(self): \n", " text = self.data\n", @@ -108,6 +132,19 @@ " return text" ] }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# test against html injection\n", + "mb = MessageBox('', background_color='red', text_color='black')\n", + "test_eq('<script>' in mb._repr_html_(), True) # Escaped = GOOD\n", + "test_eq('