diff --git a/examples/notebooks/tools.ipynb b/examples/notebooks/tools.ipynb
index 1341f6c..815fd5c 100644
--- a/examples/notebooks/tools.ipynb
+++ b/examples/notebooks/tools.ipynb
@@ -1,8 +1,32 @@
{
"cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Exploring Impresso Tools (NER, NEL, article embeddings) \n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "This notebook gives an overview of the core tools integrated into the Impresso application and demonstrates the components that power text understanding, entity recognition, and retrieval across the corpus.\n",
+ "\n",
+ "The tools documented here are production-level and a permanent part of the Impresso infrastructure.\n",
+ "\n",
+ "Specifically, we cover three major components:\n",
+ "\n",
+ "* **Named Entity Recognition (NER)** – identifying and classifying named entities (e.g., people, places, organizations) in historical newspaper text using the [impresso-project/ner-stacked-bert-multilingual](https://huggingface.co/impresso-project/ner-stacked-bert-multilingual) model.\n",
+ "* **Named Entity Linking (NEL)** – resolving recognized entities to canonical entries in knowledge bases such as Wikidata, using the [impresso-project/nel-mgenre-multilingual](https://huggingface.co/impresso-project/nel-mgenre-multilingual) model.\n",
+ "* **Article Embeddings** – generating embeddings of full articles using [gte-multilingual-base](https://huggingface.co/Alibaba-NLP/gte-multilingual-base) to enable semantic search with:\n",
+ " - **In-corpus queries** – selecting a query directly from the *Impresso* corpus. \n",
+ " - **Out-of-corpus queries** – embedding an external query (e.g., manually formulated or from another source). \n"
+ ]
+ },
{
"cell_type": "code",
- "execution_count": 8,
+ "execution_count": 14,
"metadata": {},
"outputs": [
{
@@ -10,26 +34,26 @@
"output_type": "stream",
"text": [
"🎉 You are now connected to the Impresso API! 🎉\n",
- "🔗 Using API: http://localhost:3030\n"
+ "🔗 Using API: https://dev.impresso-project.ch/public-api/v1\n"
]
}
],
"source": [
"from impresso import connect\n",
"\n",
- "impresso = connect()"
+ "impresso = connect('https://dev.impresso-project.ch/public-api/v1')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
- "# Named entity recognition\n"
+ "### Named entity recognition\n"
]
},
{
"cell_type": "code",
- "execution_count": 9,
+ "execution_count": 15,
"metadata": {},
"outputs": [
{
@@ -86,38 +110,38 @@
" \n",
"
\n",
" \n",
- " | 1:37:pers:ner-stacked-2-bert-medium-historic-multilingual|ner-mgenre-multilingual | \n",
+ " 2:41:pers:ner-stacked-2-bert-medium-historic-multilingual|ner-mgenre-multilingual | \n",
" pers | \n",
" Jean-Baptiste Nicolas Robert Schuman | \n",
" N/A | \n",
" Baptiste Nicolas Robert Schuman | \n",
" 91.25 | \n",
- " 1 | \n",
- " 37 | \n",
+ " 2 | \n",
+ " 41 | \n",
" N/A | \n",
" N/A | \n",
"
\n",
" \n",
- " | 41:72:time:ner-stacked-2-bert-medium-historic-multilingual|ner-mgenre-multilingual | \n",
+ " 46:80:time:ner-stacked-2-bert-medium-historic-multilingual|ner-mgenre-multilingual | \n",
" time | \n",
" 29 June 1886 – 4 September 1963 | \n",
" N/A | \n",
" N/A | \n",
" 77.90 | \n",
- " 41 | \n",
- " 72 | \n",
+ " 46 | \n",
+ " 80 | \n",
" N/A | \n",
" N/A | \n",
"
\n",
" \n",
- " | 80:90:org:ner-stacked-2-bert-medium-historic-multilingual|ner-mgenre-multilingual | \n",
+ " 88:98:org:ner-stacked-2-bert-medium-historic-multilingual|ner-mgenre-multilingual | \n",
" org | \n",
" Luxembourg | \n",
" N/A | \n",
" N/A | \n",
" 25.12 | \n",
- " 80 | \n",
- " 90 | \n",
+ " 88 | \n",
+ " 98 | \n",
" N/A | \n",
" N/A | \n",
"
\n",
@@ -126,30 +150,39 @@
""
],
"text/plain": [
- ""
+ ""
]
},
- "execution_count": 9,
+ "execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"text = \"\"\"\n",
- "Jean-Baptiste Nicolas Robert Schuman ( \n",
- "29 June 1886 – 4 September 1963) was a Luxembourg-born French \n",
- "statesman. Schuman was a Christian democratic (Popular \n",
- "Republican Movement) political thinker and activist. \n",
- "\"\"\"\n",
+ " Jean-Baptiste Nicolas Robert Schuman ( \n",
+ " 29 June 1886 – 4 September 1963) was a Luxembourg-born French \n",
+ " statesman. Schuman was a Christian democratic (Popular \n",
+ " Republican Movement) political thinker and activist. \n",
+ " \"\"\"\n",
"result = impresso.tools.ner(\n",
" text=text\n",
")\n",
"result"
]
},
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Named entity linking\n",
+ "\n",
+ "To tell the system which entity to link, surround the mention with the markers `[START]` and `[END]`, leaving a space between the mention and each marker."
+ ]
+ },
{
"cell_type": "code",
- "execution_count": 10,
+ "execution_count": 17,
"metadata": {},
"outputs": [
{
@@ -158,7 +191,7 @@
"\n",
"
\n",
"
Ner result
\n",
- "
Contains 6 items of 6 total items.
\n",
+ "
Contains 1 items of 1 total items.
\n",
"
\n",
"
\n",
"
\n",
@@ -182,13 +215,7 @@
" \n",
" | \n",
" type | \n",
- " surfaceForm | \n",
- " function | \n",
- " name | \n",
- " confidence.ner | \n",
" confidence.nel | \n",
- " offset.start | \n",
- " offset.end | \n",
" wikidata.id | \n",
" wikidata.wikipediaPageName | \n",
" wikidata.wikipediaPageUrl | \n",
@@ -200,86 +227,61 @@
" | \n",
" | \n",
" | \n",
- " | \n",
- " | \n",
- " | \n",
- " | \n",
- " | \n",
- " | \n",
"
\n",
" \n",
" \n",
" \n",
- " | 1:37:pers:nel-mgenre-multilingual | \n",
- " pers | \n",
- " Jean-Baptiste Nicolas Robert Schuman | \n",
- " N/A | \n",
- " Baptiste Nicolas Robert Schuman | \n",
- " 91.25 | \n",
- " 96.76 | \n",
- " 1 | \n",
- " 37 | \n",
- " Q15981 | \n",
- " Robert Schuman | \n",
- " https://en.wikipedia.org/wiki/Robert_Schuman | \n",
- "
\n",
- " \n",
- " | 41:72:time:nel-mgenre-multilingual | \n",
- " time | \n",
- " 29 June 1886 – 4 September 1963 | \n",
- " N/A | \n",
- " N/A | \n",
- " 77.90 | \n",
- " 86.46 | \n",
- " 41 | \n",
- " 72 | \n",
+ " | \n",
+ " unk | \n",
+ " 99.93 | \n",
" Q15981 | \n",
" Robert Schuman | \n",
" https://en.wikipedia.org/wiki/Robert_Schuman | \n",
"
\n",
- " \n",
- " | 80:90:org:nel-mgenre-multilingual | \n",
- " org | \n",
- " Luxembourg | \n",
- " N/A | \n",
- " N/A | \n",
- " 25.12 | \n",
- " 100.00 | \n",
- " 80 | \n",
- " 90 | \n",
- " Q32 | \n",
- " Luxembourg | \n",
- " https://en.wikipedia.org/wiki/Luxembourg | \n",
- "
\n",
" \n",
"\n",
""
],
"text/plain": [
- ""
+ ""
]
},
- "execution_count": 10,
+ "execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"text = \"\"\"\n",
- "Jean-Baptiste Nicolas Robert Schuman ( \n",
- "29 June 1886 – 4 September 1963) was a Luxembourg-born French \n",
- "statesman. Schuman was a Christian democratic (Popular \n",
- "Republican Movement) political thinker and activist. \n",
- "\"\"\"\n",
- "result = impresso.tools.ner_nel(\n",
+ " [START] Jean-Baptiste Nicolas Robert Schuman [END] ( \n",
+ " 29 June 1886 – 4 September 1963) was a Luxembourg-born French \n",
+ " statesman. Schuman was a Christian democratic (Popular \n",
+ " Republican Movement) political thinker and activist. \n",
+ " \"\"\"\n",
+ "\n",
+ "result = impresso.tools.nel(\n",
" text=text,\n",
")\n",
"result"
]
},
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Named entity recognition and linking"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The `ner_nel` method performs entity recognition and entity linking in a single call."
+ ]
+ },
{
"cell_type": "code",
- "execution_count": 11,
+ "execution_count": 18,
"metadata": {},
"outputs": [
{
@@ -288,7 +290,7 @@
"\n",
"
\n",
"
Ner result
\n",
- "
Contains 1 items of 1 total items.
\n",
+ "
Contains 6 items of 6 total items.
\n",
"
\n",
"
\n",
"
\n",
@@ -312,7 +314,13 @@
" \n",
" | \n",
" type | \n",
+ " surfaceForm | \n",
+ " function | \n",
+ " name | \n",
+ " confidence.ner | \n",
" confidence.nel | \n",
+ " offset.start | \n",
+ " offset.end | \n",
" wikidata.id | \n",
" wikidata.wikipediaPageName | \n",
" wikidata.wikipediaPageUrl | \n",
@@ -324,38 +332,78 @@
" | \n",
" | \n",
" | \n",
+ " | \n",
+ " | \n",
+ " | \n",
+ " | \n",
+ " | \n",
+ " | \n",
"
\n",
" \n",
" \n",
" \n",
- " | \n",
- " unk | \n",
- " 99.94 | \n",
+ " 2:41:pers:nel-mgenre-multilingual | \n",
+ " pers | \n",
+ " Jean-Baptiste Nicolas Robert Schuman | \n",
+ " N/A | \n",
+ " Baptiste Nicolas Robert Schuman | \n",
+ " 91.25 | \n",
+ " 96.76 | \n",
+ " 2 | \n",
+ " 41 | \n",
+ " Q15981 | \n",
+ " Robert Schuman | \n",
+ " https://en.wikipedia.org/wiki/Robert_Schuman | \n",
+ "
\n",
+ " \n",
+ " | 46:80:time:nel-mgenre-multilingual | \n",
+ " time | \n",
+ " 29 June 1886 – 4 September 1963 | \n",
+ " N/A | \n",
+ " N/A | \n",
+ " 77.90 | \n",
+ " 86.46 | \n",
+ " 46 | \n",
+ " 80 | \n",
" Q15981 | \n",
" Robert Schuman | \n",
" https://en.wikipedia.org/wiki/Robert_Schuman | \n",
"
\n",
+ " \n",
+ " | 88:98:org:nel-mgenre-multilingual | \n",
+ " org | \n",
+ " Luxembourg | \n",
+ " N/A | \n",
+ " N/A | \n",
+ " 25.12 | \n",
+ " 100.00 | \n",
+ " 88 | \n",
+ " 98 | \n",
+ " Q32 | \n",
+ " Luxembourg | \n",
+ " https://en.wikipedia.org/wiki/Luxembourg | \n",
+ "
\n",
" \n",
"\n",
""
],
"text/plain": [
- ""
+ ""
]
},
- "execution_count": 11,
+ "execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"text = \"\"\"\n",
- "[START]Jean-Baptiste Nicolas Robert Schuman[END] ( \n",
- "29 June 1886 – 4 September 1963) was a Luxembourg-born French \n",
- "statesman. Schuman was a Christian democratic (Popular \n",
- "Republican Movement) political thinker and activist. \n",
- "\"\"\"\n",
- "result = impresso.tools.nel(\n",
+ " Jean-Baptiste Nicolas Robert Schuman ( \n",
+ " 29 June 1886 – 4 September 1963) was a Luxembourg-born French \n",
+ " statesman. Schuman was a Christian democratic (Popular \n",
+ " Republican Movement) political thinker and activist. \n",
+ " \"\"\"\n",
+ "result = impresso.tools.ner_nel(\n",
" text=text,\n",
")\n",
"result"
@@ -365,33 +413,54 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "# Embeddings"
+ "### Article embeddings"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "All content items of type _article_ in the Impresso data that are longer than 800 characters were embedded with [gte-multilingual-base](https://huggingface.co/Alibaba-NLP/gte-multilingual-base), the latest model in the GTE (General Text Embedding) family, which has strong multilingual capability."
]
},
{
"cell_type": "code",
- "execution_count": 12,
+ "execution_count": 28,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
- "'gte-768:F3aPvJPbqT332Y69y38JvPAlUr2RnVK9uckvPXyMYb0nLh49jhUHvhVmKjwStr46sK+iPCuGAD1oXue9GNxoPW84gT3p2G08dvuMPZNJdjti5ro80ensPBfWkTyaN3+9zvXvvFhZGT0LYHk97qbnveq9Gr1CWpo8LgvyvSakaDzff587Zqk1vBfOZz15yF29P2RpPVfeLDz8Mh293P25u9YigrvzICE9Gpc3PE+NGzuUlji9mND7PLakiLxq9kI98JqNPF52KT0mwRO8caoZvJffnDx0Low9zHkTvcNWfTxCAJa9g4EOvVbt/T3Vuv+8F6CJuxpdJ7vgiks7KWKxPHXmgr2C/j29DsQYPVhZOr1Ry7o9Df9cO+dZhL1xfBA9uvM8POHIkTxSdei8aBe0PVCqi73PwQe9gJzbugLUXDpDyBe8ndALPK1SAb21ZMA6IGb0vNFxtbyy0Ua9ETa0PBu9eLuDoMa6MHNkPTRHUTt36fA8T2iCPC4Edjx5zTQ9RsHAvBuD8rygvj88zfJLvIoJmzyuqoo8BXGpPMt4urxmMns97KEGvRSvHT2NAQG9iAVDPACFp7x2aGW5rL0SvaOJKb3M2M88SYumvat0ij33NsC7iFRxPCIKpz2hCT69SbZyujOAn7zInYq9VqxGvf9T4r0ytTi9CVqSvGVucL3oKiQ8LeowvMfYiz3dkfG8JkGLPDlIBr2cGDk8s2pWvc0zTznjl2M97h12PLEoLb1+hAa6tDPqPCZMd72pB/Y6byhVvJ78VjuzWbG8lNTmvPc8yjwi+389fTeDOzwC3zs9coo9H/+1vCU7mDxQdwg7T3PNvKWVczyIPGK9ANifvAhhfD2Kb/c8o2XzPKfoSr2iDw+8xSEiPGQp87pXCp88NByHPVC1nLugYyA8nCefPEnY1jzcuMu7f8HMvIqcWDyWU9U8Q/cbPbWhiruVetY9ddvKPPHOfb2vSJm9/fBavTuiND0xOGe8fNQAPZ4B27s8xFS9dkA0PbOgo7zErbi8x/WNvVukiDxpk2O8JfHRvV71b7sIl2Y9ielKPdxTID1VkvW7r71DPHFO7DyWhoQ8LV3yvA0XsDsL/4e7ryQ/vRscuzzYDfu8wcVYPVRAH7zFr8s8+5q9OQwDC7zNdYK802CnvK7Rjbk998O8VZaqvPNoVbwGEvG717i7vC8NdjyEek68De6dPMMU2ryjgsM8aNSxvCpwqLy026685BvvvIeCVDqhkA+9Mp4EvTT6dj0BqAa7CWwHPbvCDT2napY9C2FFPATzXLrepWc9Yz9yPctK87xv1Mo7DSF0PJuIQb3ss6y8Ft27OzgnJbwHRJi7NAt6O6+5Pz3oqQG9Xdubu+ifk7tMaxg9wzezPBSVM7zyWwS8WrfpO9Rs37hghUs8crU3PIcC5r3Spgg9y14XvZg3Eb1Emb+83QWkO7VYrj3eUHG9Qb62PIrUHb1g7127PuOJPLeUHL02RPg7uwu0PJR4jDxkiwk7pgq7PA7RlbxSwKc6/PWYPOVB+DxUGIi87jjJPCNw5jyv3hw9MvN2vQ2NJryTKec811pHPRMsBL2fyVO9KGtGOxTQATx/la67d/N+uh/Sj7uyg9Y54utLPfK6HL33qYO8iJjpvH6SZr2G+gm9/3h7PZbLJT3Uo6K8HLBvvNNyD71hco09HyopPdWxxzwSoSE8RfHCvAm+mroQ9OI7SIUKPvjPG70SK4A8oYuqOglxl7wfv9q8z/s3OyAXer3jfrc8nrzpPGbVaTxFHjM8IL1dvU+NsTzAkJ08zLIvPamDG7vTC948882ePAYqoD3nksg8xb9xvZRmGLuDouE8VXHzvNJzaryYMJG8KH8fPCD7obsUxtE8uDmAPGkBXTx/9ig9iw4DvccnDTxrq5o7KzdHPYYdijmN8lU9yLUHPaSz8LuZbIE8IljaPNnqeD1MrPg8sGg6vaOrArw4b+A8zF0OvQTtKzt3y7U8T/RP
vQpPQT3ZSfK8z8AWOwHY2rhYWnA9JIgAPQsQPr0Ww0Y8q3d4PdpfqTzz9n28nTYKPOTK+rsy1Gi8bb+mPA9ptbxU1Ze8cx+2PPd2o7zbZoK6dUkxPGjXHD0owJ67PNrWPPFHuT2R8fc7/qDpuxFchTy36+m81BsVvQAbWT27QhU9XlykuydKjD3PV5S8uFmcOwITibz86qU8vsjiPB9mBrx5oC68Snc0PaOl1TzE5AM8uU7iPN8DIj4p84+99BXgu4R4HDwJmW0884B/vTC+ybw+6Q682fM0O+Pbwjxu2p+9+kkVPd+1Lr2GxP66BERivcX4cjzXeTO9vbE+vMbSIjvXjIi7t6fePDnwvDzkLfm8nzeLvY2fRTwZPh69LyWZvRPLn7xM3Di87S4hPTCRybxxV0U8ImdCvZW7J7zlpJ+8zK1bPN0sJz03jQC9Dm6Pu5qX7rxcGXw8p3b9PJu5QjxT37S85JgxPO2hEj0ZV789DudvPMAQQzx324A9JWmFu3HngLzbOIg876hQPFlW9zqf7KQ7VqD/uyzGqzxseOe8gMflPHf1zrtTcPS7AHCHPX7OpzxdG6K7Iuc/vSUojrxmxYE8HWyxu2IOt73aPZI8tqn9PBMTgT06/6g837RjvdDIFzrXBcQ8mB/ovA+j47xnEEW95AEJvVMb5DzUaEW9RhSpvZE37ry9//A8+aFGvPYY/7y+BFY9HaWcO/AvALuPqbg8JjNnvPyY+7xBRO67N3SyvOKNoL2mTgC84W3GPIeqgDwfaIS9TNXCOsgAjD3Mh988wC41vQRzhDy/7K07qKY3vRRLiLxo51y8uSKZvLxl6Lz6Xz88AXmoPNRwnjwv4vE8cADbOzUO/TzPOvs81PONvXTw9zsJ6+U8r4jzPBj2lDzZroc8QXQZPeR6Br3X2b08s4vwPGyLar3tr7G8QqbqO5M2NjxnCGM9onPYPNvRej2TpYA7mJqMvJzpvTr6PDy9zWuFPDBfIj1Q5lg8afl6vYL/s7znoYg9FsT3PGV0lrw2Y5C8jApRPM2hjTsgP4M8jcbmPPgcYjxkAb48TG9jOrwhkbwJEBQ8HojOvCPzg70MRbC7wgjvO3qAir0xHB+8bziEvCHttTzWe2+8v/0fPbEEl7oxVwA9uPwQver20bwCDAG721bmPFKvhrxI07O8gQLgPBJ4lTuS6vm7SjETPW0/zrwL9YE9qLeDPD0cA72XwZ049MDSvHSZIbxaE1g7MaAZO9q65bw3dRG9GTmSPP54Ur3xkNw7PMrJPDdHrjxMdHO8WnvEPLXEijvfDOe89RpTvSOfUj3ywaq8qf66Ox6MnLxs8vm7rXc+PQ8kPzxnOQS7/4k/vSzaYr3dzvi8zHYVOYwEyjswe5m8DPAFPUoJ9bp1Fas8wTutu3SsT7040dY8ZKvdvIFs8rzvsQc9IQrhPDsioD0P8AE8dmQXvcYeCzwbf2E7ITQCvFjiyTr42F09qiViPB9G1jsk6+E69fHqvP+Ktzq7GgU8SQE7PWopkjyFTjA9xX1aPMwJpruvkyQ97dg5vW/aNr0xjWc8MIX8PFeeC73kXYG87HEpPdueMbut8Rm910LQvO1Dy7tthAu9S59Ku1Cw8LnRdq+8HEbcPLwrmTsmIYA9IVRwvMIpAr2RIp88NoMzvbpPs7wFZ4A93rYxPftj1Dsb60y7+/GDvMPQizxKS5o8miSOvPZH6jyqrJo7bSBfPbyLLTwtXm86iF1lvB+OuTuPl8+7deKcvNgfBT3ffR09kmVlPBDhJrxCFcO8FXeXvDpNfj0GGUW7rlU2vR6tnbvxjP875+XfvKZKTjspVrw7bbYMPGrAg7v81bE9ecotPeZpRD0rkeO87zk2PLF0hLsPbS+8tYoOvecewTwxqHO8J8uXPOEmgz2mrqg8NO1QOKgVc7wMfoY8TMaoO/yVgDzwUNO9CyLJOipQ2ryP+QK8rQ2BvLgAjT17adg8EMxNvTeIdLzi4f88zTGc
PAu0BjxlfB49jzqTO9wQkzwdG708Kd9YPc2gHLwTCEs7zGqdPI9Dir0b2q69r0iWPHOE8bt7nI08QQt1PYIWPT2cAe68CxnZPKDPFzz5Gzk9'"
+ "'gte-768:Fao1vWPioj3Msv+7LwqfPLicTb2kPgO+pkpAPVhbkL1cdi891oY9vYv1dDzSHTC9dJ0MvYGTZT3d87C90kt6PZ81gT0pg4c7x6XRPdfOC70xllU9WJrxO/pBwDwDI1C8YveXvZ3gyrzDjek9FkkhvXFAoL2LjYo9W0XwvKLdFLsg10C7QGhjPcAUHD1Z4YC8yFZhPbxhnToLYza9cFl8vMkdYTw0DZG8Mr10vX8zIjo/ZTs8AYC1PGyAn70CFaS7PAyNvNqxPT0AUyC9eiAyPXVJo7y1OGM9r/8TPdFwez1HIaS9lXUHvGb4Aj4N9oK9wTlNvQBmOrtQG9E7WqmPvUjIYLx6jZa8hOEFPV6vkLzIEoM9latiPej5i7s72xI8NmLBurlZuDzdEyw8gD7mPcUoQLytEUk6CfLAPfnsOD1UKy29vVunvBrRozzb8jU9op9OPJnZqTyJeE+8bchbvT3uhLxL2WU9toKuPfffBrwuxZk8H2PLvG6TWby+ei09OKuLvAUJTL0mvh08oOuDvYTpiTsBNQ47CNTAPB3frbzvPi88CUQ9vcza/LwKvRs91lUCPPlTCjx/bH+9dEkovK43K70N5bC8+BMxvcgJBz1vxoo9JQygvHvAkz3VDry8nWWzuwuSJ72RL+a8lJP5vJPTur1O3uG8Ph27vGo/LL0q7ew8mvK9vBMQmD3kaIU93Lg1PejcIb1ffAg9y4wuvDBo2LxlEzc9SWfWunvsKb228fy8F7iKPBLTSL2OFiO9T4+JPOr5jzuas5K8grgMuy6EQzzGCF69W2DZPKX7JT3C/908EIxYvaKCMrwC3e28KIJJvLBgGj0P+Im8LMJuvGClrT2C4W67DOx6u/g0rbummEI8UquLPKSzvrwj1nk8OyIoPXR3ILxpzxy98WElPW2ejDxP2jK8vsGBvKCdBD1QD/K8sieVPc6Cjr0mm6k7U8SCvEEc9LyS4Iq9STFGvGc7uLsB0bg8kPsMvf71OL1OeyE8xRFePMzPlLwlezA8mk6FvZepnTygYVm8lpG/vSrQE70AAZo9dW9tvKEhMDw3tzM9JxoBPQQrfz1oTuA8PLAyPTZcT7zgAJ27is46vJwGMrzvySK9aSHsPKMgBzx3DUU8LM9OPSKZLr2bfkC8NAxEvUJnyrwaLyk8kDEHvdzT8by4SLe8ihZfvc4Bn71960I92S0sPYRbJTpyUBw9kjFBPM6NsjyVPf+8xJrePDDPbrxxoXm9TomuOp9iw7vczX28Cwy5vNWlFrxusiE95cLcPKQBdrzeF/A8oUoNPCiY5LySdkO9exmBPEWYAbu0dzg8ZbvUOuFCHL25xvk8CGmPvPW+ELwGwC89tsSCPNrVND1qxde8wMZBPdphLDtXLgO9ltB3O0HdljvoDCS76XEvPWikL730pcU8rzN6PEWfEL0nL4U8RkJSuz9WGT1B9Vc6bO4aO8ZaNr0bEbw8kx6ovSSNdbz79707pqSJOz+2Nz26XpE8rrukvC6WvzkEPiy9haWUPSanjDw7xIo7iyCLO3pH57v+h3Q7AYOiu6m7r7zROxW75Dm+PJsavb3b49S9xarFuxI6mTsIBE+860WOPCQhuLuGPg49SIv8PF0f7Trvn4y8A+xZvYJWlLzP/XG6VdBdvVypPjkCnrm85P5/PO3zS73+Adg6S7cTvJEkk7zksFQ9WJANvR9u9LzuLDS9/6lOPbK26bxHOCc9Jhe9vOutYrxDnWu6fV7zOjkbGL1G4d28M76NPO6FgTy6dXA8bidnvcTxqDzBSjS9/ykjPMqsRjvicHM8/dRsO4ODubw6nQM9Aa3JvYEvYD1Lqq+8zqH3vPIaMjxxD868qD45vCfHFzqe84S6bNcWPeMPvDtdVJE9NUAlvEWahTw8OMW7l9UePTSgeb1fym09T8ZAPenUuLz8wZU92C8GPMWIoj0D2Y+8X82HvUBCuDwwSJW8g9PlO66ndjzMbi89yL8U
vNg2Xj1ahaa8CPdmO6NkFD3si2s9/MGdPODKHb3wY5k7dq54Pagplj32DUk8nZsdPLiM/7yMrMo8DXxQPUKaWj2G+oG8ARQXu59o07yzynE8jSu2PN1QUzt14Ju8baxvPKdDzzzvh968hA5EPRlkIT3s/Lw8qi6iPMu7wDz1Tmm9AJRJPN6sPD0ysUe8qOWBOgnuf7wy7Ss9wXNUva8HoD1WWKu8A02wvDoVYj25MK26lVSEPPRYEz7054m88q3Wu+/HyDwdstU8fPgmvUca97zGigI7HH4JPRhFKD1RtSO97Z0hPSUjmr3L3wI9XarYPG18CL3+1wi985qOO1XFBTxMC4y8RFryPH62ID1Foh48L2qGOjWQtzyzxou9/8IPPBpMPTxwqXc9AI6LPas9U7yD9jS9bgpQvSYFML1F6DS847GHPM/Z4zwxSyq82X5/usUL4DzAyo48HReVvN7ZfbwE7pS8smoxPfbUBr1njLA86H4wPbXUirukqWk9x9tROh8TWbx4QDm9eshOPEXhprx+oBK9EJkHPVYNAD2rYFm90V6BvENfjjwiI4m4O+1aPC7oQzwed2s856AhvKBkE7008Dk8cUTaOxr6rr21I408yucmPf5ixbzQmN08B0VhPHgWn7zW0Bc9Tp8PPWcNWTwrzsa7sYUsPO0hKbyrSwW9AVRRvTVcFbwpevs8nPqHveffgrwJ/3Y9jsVnPcSSUzzpLT+9UuoxvXH2jDyVhxg9JDGEvXIWIb1+EEo8IDXXO7z7Jb3Sb1q8hlUsveisQz16Ex69Wy71vJYkLD0Skb88QswPu4B8xTwNQZa8Y3WTu0gAYDxHedo6JFIPPVoQgDssKZg7slNMvAi6ij3EoSY975P5u6pe/DvrIcw8yK+BuzGHhDxnldS7SnwJvM914zx4WQo80+KRPXi5lb1TdO6862kdvJnqUDwEGKC8VSQTPVvIED3RpbS6K2LLu5Lkxzy42ke7NJe+vNAeeT3oAla4apEDvflT7jySHUK8XkD9vGnjyLwF/ze9wwQgveJbbTwQqIE8KQCLPafGVDy7why9PYh6u5F3GbxKID29MTKLvP0oSD0pj9W84Uwou8UW4bxTfWu9dodZvTFvS73kC/a8dlGkPNkkQb3gpIY8zPqZO+iI1jtHxvY6XTg7PRXa/bxGlt67WI4EPewo/7o9bQu8lA5yPIfSbDyl5eI7fvPvPH4kObxE7AU94mALvR2mjjyRb5m7jTgLu2XBjr2975m7jd8KPeNqbL2aPwS7YZ9yPBIuJT0SmpI896JOPNBsgby6gIm8xfqmvMTVPD3KjB+9hoxkuxCJGz0fPUM8NThhPcOC6zt+xR29BvuSPL5LqbxVyuK6ymV1O9SvzLwuV4Q8tOQ9PTqIfTxdFi09alKKuwl6XTuQE2i8D3okO49v0bxSfHE6VeFvPdbmFj3UTnK9/YfZPCTYVzzN4AU8m4ndvBYsCj1urWU8ABAKuXfmXb0ppoo8beOtvHhRWr2XiOa89ynYOyML3byIbIk9HHLSPO1MBT1FQBk9DfyfvCqsNb2IWYk8oCUQPbhLL71lLsU8B5IjO1fgOL0ycMK82sYwvZUSMb0UGh69P8pjPaFck7wnxT68lU0zPRrqCz3kkPG65XKdvEXE0jw4pcI8cSIHvFiOcb0l3Ek9VUOkPChwLzynODe8aQiZvDW5J73IfLQ6hWssvFg2XjyDOKu8606sPCB9ojzrkDE82OjXvDW7AT1QVvY8IYFCPEX8ej2G27w7Ru7iOg+vZzwiSQi72enwPDArtjxJbMk8uwbjvE0aVTxpfOW8aH5RvI9N4jxA90w81ykfvJGJTb3jDPk8dWaduxEHJT2tKYo7pRSsPGrOFbvmSyE9Oea/Ox/Gjz3g/oQ98aR6vNpF8bwjhYq9oabBPHp0I711Tl48yUFhvCTTID37KX48sUpEPNDPEL37nWO8b+TLvA5o6rqLJYA9V2uCu1QCgDwAYRU92wJ7
PaXYkT2yzWi9PKNjvEUb8zxqDEI8ZVDlPPM2dzyZ46a8+lpOPYGWy7y0FTS9DJShPADtP72T3o+7dCmdPPZ5qDwGSBO9zBAEPOryVrpgZiE9'"
]
},
- "execution_count": 12,
+ "execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
- "embedding = impresso.tools.embed_text(text=\"Le congrès à Paris de l'Union cycliste internationale\", target=\"text\")\n",
+ "text = \"\"\"\n",
+ " Jean-Baptiste Nicolas Robert Schuman ( \n",
+ " 29 June 1886 – 4 September 1963) was a Luxembourg-born French \n",
+ " statesman. Schuman was a Christian democratic (Popular \n",
+ " Republican Movement) political thinker and activist. \n",
+ " \"\"\"\n",
+ "\n",
+ "embedding = impresso.tools.embed_text(text=text, target=\"text\")\n",
"embedding"
]
},
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### Search with an out-of-corpus embedding created by `tools.embed_text`"
+ ]
+ },
{
"cell_type": "code",
- "execution_count": 13,
+ "execution_count": 30,
"metadata": {},
"outputs": [
{
@@ -402,7 +471,7 @@
"Search result
\n",
"Contains 10 items (0 - 10) of 80 total items.
\n",
"
\n",
- "See this result in the Impresso App.\n",
+ "See this result in the Impresso App.\n",
"\n",
"\n",
"Data preview:
\n",
@@ -469,66 +538,66 @@
" \n",
" \n",
" \n",
- " | EXP-1948-09-02-a-i0003 | \n",
+ " EXP-1951-11-23-a-i0001 | \n",
" in_cpy | \n",
" ar | \n",
" print | \n",
- " Jamais président du conseil n'est parti à la c... | \n",
+ " M. ADENAUER A PARIS | \n",
+ " [{'uid': '2-54-Paris', 'count': 2}, {'uid': '2... | \n",
+ " [{'uid': '2-50-Aristide_Briand', 'count': 1}, ... | \n",
" [] | \n",
" [] | \n",
- " [] | \n",
- " [] | \n",
- " [{'uid': 'tm-fr-all-v2.0_tp98_fr', 'relevance'... | \n",
- " 512 | \n",
+ " [{'uid': 'tm-fr-all-v2.0_tp87_fr', 'relevance'... | \n",
+ " 621 | \n",
" 1 | \n",
" fr | \n",
" True | \n",
- " 1948-09-02T00:00:00+00:00 | \n",
- " EXP-1948-09-02-a | \n",
+ " 1951-11-23T00:00:00+00:00 | \n",
+ " EXP-1951-11-23-a | \n",
" CH | \n",
" SNL | \n",
" EXP | \n",
" newspaper | \n",
"
\n",
" \n",
- " | IMP-1950-05-19-a-i0001 | \n",
+ " GDL-1998-02-11-a-i0031 | \n",
" in_cpy | \n",
" ar | \n",
" print | \n",
- " Le «combinat» franco-allemand charbon et acier | \n",
- " [] | \n",
+ " Une voix s'est tue: celle de Maurice Schumann | \n",
+ " [{'uid': '2-54-Londres', 'count': 1}, {'uid': ... | \n",
+ " [{'uid': '2-50-René_Payot', 'count': 1}] | \n",
" [] | \n",
" [] | \n",
- " [] | \n",
- " [{'uid': 'tm-fr-all-v2.0_tp87_fr', 'relevance'... | \n",
- " 443 | \n",
+ " [{'uid': 'tm-fr-all-v2.0_tp55_fr', 'relevance'... | \n",
+ " 283 | \n",
" 1 | \n",
" fr | \n",
- " True | \n",
- " 1950-05-19T00:00:00+00:00 | \n",
- " IMP-1950-05-19-a | \n",
+ " False | \n",
+ " 1998-02-11T00:00:00+00:00 | \n",
+ " GDL-1998-02-11-a | \n",
" CH | \n",
" SNL | \n",
- " IMP | \n",
+ " GDL | \n",
" newspaper | \n",
"
\n",
" \n",
- " | EXP-1947-11-25-a-i0011 | \n",
+ " EXP-1958-03-11-a-i0148 | \n",
" in_cpy | \n",
" ar | \n",
" print | \n",
- " Le gouvernement Schuman accueilli sans enthous... | \n",
- " [] | \n",
- " [{'uid': '2-50-René_Mayer', 'count': 2}, {'uid... | \n",
- " [{'uid': '2-53-Parti_républicain_de_la_liberté... | \n",
+ " Robert Schuman parle de la zone de libre-échange | \n",
+ " [{'uid': '2-54-Berne', 'count': 3}, {'uid': '2... | \n",
+ " [{'uid': '2-50-Robert_Schuman', 'count': 5}] | \n",
+ " [{'uid': '2-53-États_pontificaux', 'count': 2}] | \n",
" [] | \n",
- " [{'uid': 'tm-fr-all-v2.0_tp98_fr', 'relevance'... | \n",
- " 251 | \n",
+ " [{'uid': 'tm-fr-all-v2.0_tp00_fr', 'relevance'... | \n",
+ " 657 | \n",
" 1 | \n",
" fr | \n",
- " True | \n",
- " 1947-11-25T00:00:00+00:00 | \n",
- " EXP-1947-11-25-a | \n",
+ " False | \n",
+ " 1958-03-11T00:00:00+00:00 | \n",
+ " EXP-1958-03-11-a | \n",
" CH | \n",
" SNL | \n",
" EXP | \n",
@@ -539,44 +608,92 @@
""
],
"text/plain": [
- ""
+ ""
]
},
- "execution_count": 13,
+ "execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
- "impresso.search.find(\n",
- " embedding=impresso.tools.embed_text(text=\"Schumann politicien\", target=\"text\"),\n",
+ "result = impresso.search.find(\n",
+ " embedding=impresso.tools.embed_text(text=text, target=\"text\"),\n",
" limit=10\n",
- ")"
+ ")\n",
+ "\n",
+ "result"
]
},
{
"cell_type": "code",
- "execution_count": 14,
+ "execution_count": 47,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
- "'gte-768:27ZuvEPAhD39Epu84zQ1PXUHd73baWq9SjyDPa8LML0MNNc8H5Y+vRDkgzyc7wo8erVbvfTRDD4z4Ja9+v4+PclYlz19Frw7plwEPtcDkLxCkzU9R2whPdNVKz3HtYQ8G8ZcvQ4CcbytaJ09CSLDvZDC+71TdAc9ACzxOmeg+Dy0Bpk76kKWPdA6jT0eSTq8/29rPTL4JbwYHAi9iUCfPPraeTxh4se7SAYqvZrCu7zJpRs8FUwmPNA6Db19fLO8QBnivDku4jx2KXS9+WJuPIZQCDmQMuE89GsVujSdgDxz7qC9Lm6GPWCIJj6nhie9ZUMovRHrxboOajC9geeBvdcsz7zdmAG8M98yPWSHIrxqHJS6DYK/PdtrMrzyEXc85y+evK1E2Dz6RAG813BJPdo+47wgfUs8ZUSMPbZDND0t/wS9LWp2uleW5zwi9x49qZ/9u6eGpzw5MCq9cltavdwFu7yn9Sg95dfHPYC7FrxtyDA8DaWgvIVyhTzgj1o8KNmUvE5Yhb1FrtO7s3NSvTL/Z7xEngc9w5geOz1tRbz6lv+8uhOWvKR0EzvZgt06BNydvDeU2boJkqi9uQoMvVfja73l+4y8Fz4FvY+8HT2G6pA9JJoxu2COgT04JVg9KLeXO29APL0MFCK9PEB2vT1uqb2OIM07fRY8vc4xg73Hb0I9Waj7OhroWT2aBja8XxvwPLZB7Ltbs009a0cbvWeZtrwd2HA9rYoaPfYnm7zoOCi9Z7uzPIRE0r01WYa8R5eoPA0/KTyVwJC8CSJDPEAbKj0XqXa9k0T1PNmkWj3G1p09HNGuvKgn8jsnisi7yvBXvAKkfDwWgLc5hW51vDzUoD3b+ug76RTjvNBBT7xkhyI6/aMZPXM7JT0UkCA8cLhHPJcWnzwul8W8OHLcPCyFsTxdvh+8HPOrvOqM7jsWgDe6LwiPPdOgZ70JIsM8Nj5LvYSQ8rwWy3O9fJ6wu/OmBT27iyE8kMWnOU1wFL1hl4s8bTXqPPKCQLzaNyE86aiNvYzz/btEnL+8d1R7vcoUHb054yU9HGBluxMYlbxQ+rM8MkPiPLwaWD26M0s9UgUGPDYVjLzmJMy7LiqMvX+5Tr3UgZa9izA2PPfq4jzrBPq77qnUPBbt8LxppIg8SjwDvA7ZsbxF8k09MkPivAEOBL2zc9K8PgkWvWX/Lb3Rspg7dPcqPUz/yjy7q1Y9BI1RvCi3lzvdBwO9E2PROzl9rrxoy3+9VIRTPOqMbrsvMc683CkAvYC7FrvyEfc8sIO7PNIqJLxeFK48cUuOPKCi6bzmt5K9fRY8vI+Y2Lz1ttG4lQtNvW+xBb26ExY9UlDCu//XKj1a+Y890WUUvSBUjDz1+ks88/HBPbqATz1DUQO9f/8QvE4zXLkzTrQ8+SA8Pf/XKr3wkik8amdQO6rFir2pmDu87qnUOwX+Gj3FGTS9F807vVkSg7uyjg29IvVWvfFOL7yGLos9Ojm0OwLsBj31ttE7TQhVvUVjF7uMN3i8qlTBParFCj1idY488sY6PXk9ULxKFva7jiDNu13F4buaDXi8rF+TPHZ2+LzSAIG9pTAZvMDzQ7tAXdy8zg2+Oy9TS7wr8mo967m9u1897bxf0LO8M060vLGnAL3qQTI8rbUhvIP26btx5Ra8uebGuznjpb0wD1G8j5jYO9uNr7zTeIw9BjBkvQCdOr1X3Y29ACxxPb4DrTqWx9I8aaSIvZwYyrzBZA28Shb2OzMBML3nL568y6xdvJcWHzz3n6Y8sUEJvUxFDT3rbgG8RC8GPce1hDs/Vpo8fvS+POqMbrrQQxc9skfnvRlwzjwCho88HpT2vBbLczu5d0W9FoA3ukxDxbuy0ge8ceNOPWZK6ruFIzk9hGezPJEQZLuqVEG7XoHnPCXFOL1H3IY90zFmvEXyzbzTU2M9yHhMOlViVj3ViwS8OEkdvUfbojzF7iy9rPmbOrXJYDzj7nI9ohp1
u8NLmj3OwoG88JIpPQAscTmPvJ09Fzy9PGPLnLyxFoI7fh1+Pa9YtD0LVtS6a9/bPAOojDwQWP88nbJSPFJQwjzA80M8stlJO0ZqWb1WuOQ7Sss5PMgtELr9Epu8U32RO/6zZTyhXu+8M701PfRpTT0Z1sU87AiKvG9H/jp+qQI9YdsFPEkPND3QOg098P/iPLmbCrzN4O48ub2HvBRuozwDr868s+SbvH7SQT2KuCo8EVyPPK4HHT7wtCa9M7ttPCmVmjw/Vho8/aHRvOCP2rx/Sk28YlMRPLvxmDyLe3K8vSWqPLLZyb1SAz49NVkGu6bsHrzGj3e9RmrZvDbRET1ENki8m+v6PBhHjzz7S8M6cuzYvGHbhTxjyVS92jchuwreyLvVi4Q8os84PfoihLxf8jA8jtxSvbqAz7y29q87YIy5PFBnbbxuhLY767m9vOG8qTx9fLM8EC9APDYczrumOSO9hptEum+xhbt1IOo8OQaHPaKLPjwzI609ey1nPMdt+jxaREy9BnamPKz5G7wLVtS8EFE9PVRbFD0Rwoa9lAB7POecVzt7wC29j02cO5Utyjwr8mo8xo93vE+r57uo3LU8ITlRvB8uf719WjY8SMDnPFa4ZDuyjo25msK7vJZambo+Kcs80PRKPb97uLsCpHy98Qq1PLiX+jz/iia9AuwGvcGvybzdMEI9QBuqvXOG4Tx/sgw9ZFwbPZPXuzz+rCO9KPsRPBSXYjzPhck7d6F/vVnsdb0rp647vgMtPEScv7zwkim8D+DzvI1kRz04BSO9zNksvCovIz3z84k7o/yHu2EExTw6ObS7nO+Kuo73jbymN1u8imsmPTZgSDzzWQE9W4qOu1VA2TsKk4w7yC2Qu+UdCjwHF/E8mgY2PL4KbzyA3ZO8Pt4Ou+nQaLxbs808U32RPZbpz71u8e+8eIHKu5w6Rz0Cyok8x7zGvKZchD33n6a8Q768OhklkjsF/po8vr8yvOtuAT1f8jA7a5JXvfNZAT3uXpi8MA/Ru84Nvjr0HMm8l2HbvHsmJbvhIqE8TN1NPTrsLz1eNiu8bJthvFHYtrzpySa9rbUhvDtmAz2gVy28zlj6Oj68kby7iyG9bC4ovdhXVr2YHeG8UnK/PNDUFb0gn8g8GEP/Oyl6X7wBWUC8G3yEPWC1+LyAd5y8ZUOoPH1h+DsxPKC74QdmuwDhNDu0Bpk7VR7cPED35Lx9YXg7AE7uvP7OoDz4Wyw8+iIEvbvxmL3HS328Z7uzPGnPj71OM9y8iWnePATTEz0sH7o6PIgAPfG76Lz0QA69YB8AvLvPGz2NhsS8KB2PPCL11ro3lFk6mUhoPdG5WjxKyzm8Q3MAuuBEnrzsKoe8+0tDO8CohzvhB+a7wKgHPbduuzxLsP48stKHPGPtmbzqQTI6nO8KvK8tLb34W6w80W6ePCKx3DznC1m97HXDvBtZIzw5yOo8ZKmfvN/10Twa4Zc8eyalOxw+6Lyp43c8V+UzPJc5gL1xMNO84I/aO8WItTsxX4E9U30RPAyDI7xndzk8QBniOy/tU70PSDM9W9VKPKBXrbxD4gE9DvsuOy1q9jnljAu9bvHvvO8aHr0+Kcu7+7j8PCL1Vrp/Q4u8cqjePB2vsTwM6Zq8eBSRvMZGA72M7Ds8Wl0/u4EIm72wpbg8vBrYPPtLQzwO+y67EKCJvMo00ryY0qS85NCFPUz4CDyPmFi8T6vnO5P5OLz98J27zB2nPBBRvTz1jRK9Gp2dPBTbXLyMqEG8s5XPPL+C+jw2YEi8pTAZPBYP7jxxBUw979Yju6SWkDws0jW9sWG+u7KODbtp78Q70NJNvQ1hprwCnwI9PQfOPAZ9aLz0a5W6ZwqAPErLuTuBM6K762w5PdZphz0V5GY9pTAZvPcOKL3XA5C8Hms3PePwOjyfcCA91Ro7vNx0PL03lNm8hi6LPI6RFryc7wq6cqGcvDGH3Lxpic08+pb/vFmo+7x0h8U93HJ0
PTvKsj1gH4C9PuVQPabsnjxWuOQ7otb6PMK6m7wYQ/87DfIkPQAscbuQKVe9t7l3PPopRr1k0t68widVPPCZaz2srBe9Q3rCPIjP1Tw28448'"
+ "copyrightStatus in_cpy\n",
+ "type ar\n",
+ "sourceMedium print\n",
+ "title M. ADENAUER A PARIS\n",
+ "locationEntities [{'uid': '2-54-Paris', 'count': 2}, {'uid': '2...\n",
+ "personEntities [{'uid': '2-50-Aristide_Briand', 'count': 1}, ...\n",
+ "organisationEntities []\n",
+ "newsAgenciesEntities []\n",
+ "topics [{'uid': 'tm-fr-all-v2.0_tp87_fr', 'relevance'...\n",
+ "transcriptLength 621\n",
+ "totalPages 1\n",
+ "languageCode fr\n",
+ "isOnFrontPage True\n",
+ "publicationDate 1951-11-23T00:00:00+00:00\n",
+ "issueUid EXP-1951-11-23-a\n",
+ "countryCode CH\n",
+ "providerCode SNL\n",
+ "mediaUid EXP\n",
+ "mediaType newspaper\n",
+ "Name: EXP-1951-11-23-a-i0001, dtype: object"
]
},
- "execution_count": 14,
+ "execution_count": 47,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
- "impresso.content_items.get_embeddings(\"JDG-1963-09-06-a-i0004\")[0]"
+ "result.df.loc['EXP-1951-11-23-a-i0001']"
]
},
{
"cell_type": "code",
- "execution_count": 15,
+ "execution_count": 48,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "'gte-768:pRhKvXXfWz0f+gq9dFx+PZVbvb04PCC9mMWnOj6kQrwie6A8JTVevWjFYbxStA48ECQLvZTBtD2Qece8ramIPSPfTz2Pm8Q6GAS5PTq0q7umh0s9hTmdPRPpmjyLlAK8mz0zvJFJKb2Ra6Y9drPwvdbBnb1stXg9cP1lvVgzXLuFOLk8YzDwPN7XpD3gqXG8XJJ0PeEK0jwSC5g6TGOXPIP51TuAIGq8HGPRu3p0TTxk1co8fjaOu6fJfbyBgUq8LW1iPARMQ7uD+dW7ExdxPbT5m7rYnHQ8jwpGPSW6Azzpnfu9tqpPPRh9KD6/fFy9hIeFvHqhHLtONsi8igZTO1X/SrxB+lC9EF0zPebY67y75Js9468sPd4au7yx8+A8DxuBO67497zAeQ29XzQAPeOCXbyrxOY8LpnNPaY6RzyNe4+924uEPG7/Lb2UR+E8BzxaPK0a9byFcWG7aJeLOycyDzzN+Jo9/9hOPc5xCj197Ng7Ky5/vR7Pgzz+zfw8CH0FvSpwjr337U48AEUBvbiOjbpHS0g9q5YQPa0a9TyA3O+8mnbbvK6/z7yX3ho93JfdvPlZgbxTRKm9nEY9vSUeM73ucxi9JqTfvMuk9zwteDQ93tbAvG+kiLsPfzA97Admu5pqgrxKDQm9zdeBvK1Rsr2nslK8ySCTu6jHkr0JnTo9fxSRPMdFXz04avY8+XqaPUv35Lzcl908KZR2vfJBMr10tFQ9O1kGO+nTVLz0Sjy9ZXqlu/3PRL3i0am8OsvWvN4DED1dyM28VegfvGBAWT1Mvrw7KcFFvDW7Cj2/hy49j5tEvHz3qjxJc4C8xcoEvT9V9jzmzBI6H9gNvTFznT0kFSk7a/eHPVVDRTvGvzI98kEyPMCbCjxOHx08T+d7PWASg7zHtOC8DwVdPNpBzzyYxSe5ml+wvGWoezmwN1s8Twn5PIQ6Ab1ASR06PL01PbHFCr0MsLK9x3u4vDMhgj0SIsO8qs84PKQaEr3Pvg49s1+TPMqYnrxOS4i9C7uEveNUhzt0Lqi8rJ+avS/lbTu7jEU953J0u3knST2FceE6erjHPJppnj1awQs9k8wGvcPPm7y0+Rs7x501vV6EU7x139u72DhFPZXVELwCChE9+vOJu32+AjzJ80O7Y3RqPL72r7yk7UI9Ir8avS14tLzjdgS9U8DnvBgmNr3uc5g8jeoQPfoh4LuD+VU87lEbOwBcrDve1kA8xgMtuzHAIT01CA+9luHpPG/SXjrpvCm8tAX1PItyhTx8n1Q9zy0QuqszaL0ZL0A8ni3Ku+s/B729zIy9Wn2RvDq0qzuvZKo7s41pvTfQ7bwd8QA9PSy3O316iD1jwW49vaz6u83XgTzRpRs84v9/PZpIhTwsxwC90aWbO9huHrx7R/68wWPpPArIwb2WhsQ84JLGO0/yzbzTHac7HUwmPFkwjT246bK85tjrPGOTGL1Z44i8yuaGveOkWr0h+EI8Po0XPGnkD73Kr0m8pUWZvEWPwrwIn4K9qWGbPewH5jsZRms7GuvFPCTzKz2eLcq8+2sVvAc82rwOwyo9gfsdPcrocb2o0QC9FLF5vCRD/7uLlAK9RAxlvO+eH71qJkI8Cb+3vOY7FL39PsY7u9nJOvhFpb29fqQ7czF3vSwrML3ddWC8Pei8vMXvUL1LArc8j4QZvJBu9btzzcc8rvj3vBSlIL14/ME6yEIQPfOONj395m89tZ/9u8E1k7wUx508PAq6PKh5Kr1P5/u8DxuBO+NUhzyt7J48heBivcrRxjyzX5M8j+hIPU1Nczw1Qbc8KjlRPIMmJbwCOGc9HZmqvZHxUj0M6Ha8DCxxvYDc7zyhMwW9Po0XPIuUgrwegLe7Z/IwvMlOaTr8Kmo9zy2QutWVsjtywnU72lj6O0sCN73xeto8lEfhPCsAqbyWbxk9GutFPProtzzVlbK86U0ovb5R1TvgGHO9DCxxPFPfFT3DrZ486Ax9
PYRyxT2j74o7wIVmvF2xojzRsfQ8JqRfPB4zM71StA48PdGRu+l7fj2sXm+8NTblPMnzQ7tu/608uFg0PQqS6DxdyE07TL68u5wkQL3miJi8SOXQPCW6Azsmdgm8tmbVPHXqrTzoOcy8ZW/TPBh+DD3cMy69wROWu2RxGz0uP4y7oKXVPHj8wbphuGQ93u3rOWMCGju+34Q8QcGou2EQOz2Y3NK87i+evFImXz1sh6K5ScCEPRpgIj6BgUq9fb6Cu3ams7zyuwU9Hq2Gu/r/4rxL64u8YEBZvKFVAj1Xu9C8bQoAPcjIPL2qJw89ySCTPAaXf7uazjG9hXFhvF8e3LzvDaG8fcrbPAEBhzoKqAy9iI7HvKTWFz2VkRa9fdUtvG/SXjuvyFk9AS/dPA2NUTyKHX68PRUMvAVqjb37xjq9D+6xuy/l7TwQwWI7tXEnvIYWvDwmpF+8bUMoPBoYFbygAPu8Iu3wvCToWb21UI49Xh/APWiutjww4h49ldUQvFCMVj2IsMS8hkMLPFqrZzz37U68Vt3NPJQwNrtbiWq9mMWnO/JBMjxASZ07lm+ZvMCbCjzW4ja9Po0XvcXKhLzTVs88nVqZOsWh6L2gjqq8V6QlPBFmvbsaAnE8VHCUO08Jebp6f5882b5xPZVbPb2Lqy29ygcgPZjnpDyrrTu9tPmbO4Vx4bt2prM8Ml35vFrN5DyIVoM9pJXsuzICVDyVA+e7hXHhOxCqN7yJM6K76er/OynYcL00p648kfFSPTPg1jzZmom953J0vLrFbT3uf/G8S+uLPAEN4DyRE1C8Po0XvMq6Gz31Xhi8zy2Qu8uNzLzIQhC9M1qqPcdkjTwv+xE9lNhfPcnzw7w3gJo8ms4xPRkNwzyzMsQ8mOekvOEKUryOcD29Rx75vMEqQTyq2go9V/EpPQc9vr2hMwW9XchNu4fFhD1vu7M85NozPS3yBz39Pka9GtQavLpKEzygpdU6X/CFPKPCuzyQbvW7AFwsu4S1Wz2oNbC88xRjPIVxYTti7j29vSZOvS/OQjw8Q2I8c7W4PfUdbT1odY68l00cPNQ9XL3zNmC9NGO0vLHRYz1KJDS92b7xPLQn8rxP8s28oiizvJyUpb15SUa9gK4ZPWMwcL0mjTQ8teP3vIoG0ztEDOW7ydyYPF3IzbxFj8K8XIYbPYlKzbxS4uQ8Y0YUPMReUrsP7jE8bIeiPFrvYby6lxc9VRb2ux3xgDzHkuM8hLVbPAxjrr17R368l0JKPZQZC700et88OdaoPAqoDD2NIzm9oj/ePJ96Tr2Xe3K8CnBrPKLkuDx3ug+9PvwYPYe7ljvWXIo8SBsqPY72aTuTf4K8gMXEvF3Izbzl7g+92ZCbPNZcCr3DiyG8DxsBPJsmCD1KO988WDPcOUjlULyquI28MIpIPCLtcL10XP68mxDkPOb6aLxRatm8ms4xvV9LqzwEH3Q8flgLvZ1amTy4AF48Kaqau2lTkbyxo408ZOCcPLArAr2K7ye9s1RBvOb66Lzja7I9AHNXPNTwV7zdXrU8djcyvQ7aVb3Y4O47KlvOPC/l7bzEl/o81w0+PJpIhb07h1y8Bq0jvdoqpDsDVxW9QvcBPVbSe7xXd1a8WAWGPADAWz14/ME5cgZwvGlqvDsyLyM8XfUcPL5RVb260D89Qe/+O6DSpDx7Gai7xGkkvcEqwbx6oZy8c7acvIiOxzuIpXK8Mi8jPWApLjws3qs8o+8KO/K7BT2Vv+w8ozwPPRILmLn5Q908zePaOp1amTxIdk+9A6foO5rZAz37rw+9g1R7vCzHALxUnuq7KlB8vJJ0sLyrdBO7Htvcu/rot7zPszw9/EmYvCUHCL0pwUW8pvZMPEUJFj1Ralm80nhMPAJlNj15a0M9oWytPEAczjxQuaW83u1ru4fSQby0+Zs8+UNdPSjjwrw6+KW8TL68uc6qMr1dI3M7YYqOvKg1ML0cevw8Fa4qvQYzUD3a6Xg9ISSu
PeHzpj0YXA+8nmbyPH/+7Dy75Js8JwVAOwEjhDwluoO7yW0XPVekJTwkm1W9V2CrPOLRqTyaaoI7FriYPftrFbrn9xm942uyPJsEi7x6dE08'"
+ ]
+ },
+ "execution_count": 48,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "impresso.content_items.get_embeddings(\"EXP-1951-11-23-a-i0001\")[0]"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### Search by embedding, using an in-corpus embedding from `content_items.get_embeddings`"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 49,
"metadata": {},
"outputs": [
{
@@ -585,9 +702,9 @@
"\n",
"
\n",
"
Search result
\n",
- "
Contains 4 items (0 - 4) of 32 total items.
\n",
+ "
Contains 2 items (0 - 2) of 16 total items.
\n",
"
\n",
- "See this result in the
Impresso App.\n",
+ "See this result in the
Impresso App.\n",
"
\n",
"
\n",
"Data preview:
\n",
@@ -654,69 +771,47 @@
" \n",
" \n",
" \n",
- " | jdpl-1944-06-28-a-i0006 | \n",
+ " jdpl-1931-07-18-a-i0012 | \n",
" in_cpy | \n",
" ar | \n",
" print | \n",
- " LES CONFÉRENCES | \n",
- " [{'uid': '2-54-Marseille', 'count': 1}, {'uid'... | \n",
- " [{'uid': '2-50-Robert_Grimm', 'count': 1}, {'u... | \n",
- " [] | \n",
+ " Les heures décisives | \n",
+ " [{'uid': '2-54-France', 'count': 1}, {'uid': '... | \n",
+ " [{'uid': '2-50-Charles_Dawes', 'count': 1}, {'... | \n",
+ " [{'uid': '2-53-Société_des_Nations', 'count': ... | \n",
" [] | \n",
- " [{'uid': 'tm-fr-all-v2.0_tp29_fr', 'relevance'... | \n",
- " 200 | \n",
+ " [{'uid': 'tm-fr-all-v2.0_tp87_fr', 'relevance'... | \n",
+ " 705 | \n",
" 1 | \n",
" fr | \n",
- " False | \n",
- " 1944-06-28T00:00:00+00:00 | \n",
- " jdpl-1944-06-28-a | \n",
+ " True | \n",
+ " 1931-07-18T00:00:00+00:00 | \n",
+ " jdpl-1931-07-18-a | \n",
" FR | \n",
" BNF | \n",
" jdpl | \n",
" newspaper | \n",
"
\n",
" \n",
- " | jdpl-1937-10-31-a-i0055 | \n",
+ " lepetitparisien-1928-05-07-a-i0047 | \n",
" in_cpy | \n",
" ar | \n",
" print | \n",
- " NaN | \n",
- " [{'uid': '2-54-Europe_centrale', 'count': 1}] | \n",
- " [{'uid': '2-50-Paul_Bourget', 'count': 1}] | \n",
+ " UNE ÉDITION EN ALLEMAND DES DISCOURS DE M. BRI... | \n",
+ " [{'uid': '2-54-Allemagne', 'count': 2}, {'uid'... | \n",
+ " [{'uid': '2-50-Gustav_Stresemann', 'count': 3}... | \n",
" [] | \n",
" [] | \n",
- " [{'uid': 'tm-fr-all-v2.0_tp55_fr', 'relevance'... | \n",
- " 265 | \n",
+ " [{'uid': 'tm-fr-all-v2.0_tp29_fr', 'relevance'... | \n",
+ " 246 | \n",
" 1 | \n",
" fr | \n",
" False | \n",
- " 1937-10-31T00:00:00+00:00 | \n",
- " jdpl-1937-10-31-a | \n",
+ " 1928-05-07T00:00:00+00:00 | \n",
+ " lepetitparisien-1928-05-07-a | \n",
" FR | \n",
" BNF | \n",
- " jdpl | \n",
- " newspaper | \n",
- "
\n",
- " \n",
- " | oecaen-1941-10-21-a-i0056 | \n",
- " in_cpy | \n",
- " ar | \n",
- " print | \n",
- " LA FRANCE européenne | \n",
- " [{'uid': '2-54-France', 'count': 1}, {'uid': '... | \n",
- " [{'uid': '2-50-Armand_Jean_du_Plessis_de_Riche... | \n",
- " [{'uid': '2-53-Europe', 'count': 1}] | \n",
- " [] | \n",
- " [{'uid': 'tm-fr-all-v2.0_tp10_fr', 'relevance'... | \n",
- " 168 | \n",
- " 1 | \n",
- " fr | \n",
- " True | \n",
- " 1941-10-21T00:00:00+00:00 | \n",
- " oecaen-1941-10-21-a | \n",
- " FR | \n",
- " BNF | \n",
- " oecaen | \n",
+ " lepetitparisien | \n",
" newspaper | \n",
"
\n",
" \n",
@@ -724,71 +819,139 @@
""
],
"text/plain": [
- ""
+ ""
]
},
- "execution_count": 15,
+ "execution_count": 49,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
- "impresso.search.find(\n",
+ "result = impresso.search.find(\n",
" country=\"FR\",\n",
- " embedding=impresso.content_items.get_embeddings(\"JDG-1963-09-06-a-i0004\")[0],\n",
- " limit=4\n",
- ")\n"
+ " embedding=impresso.content_items.get_embeddings(\"EXP-1951-11-23-a-i0001\")[0],\n",
+ " limit=2\n",
+ ")\n",
+ "\n",
+ "result"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 50,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "copyrightStatus in_cpy\n",
+ "type ar\n",
+ "sourceMedium print\n",
+ "title UNE ÉDITION EN ALLEMAND DES DISCOURS DE M. BRI...\n",
+ "locationEntities [{'uid': '2-54-Allemagne', 'count': 2}, {'uid'...\n",
+ "personEntities [{'uid': '2-50-Gustav_Stresemann', 'count': 3}...\n",
+ "organisationEntities []\n",
+ "newsAgenciesEntities []\n",
+ "topics [{'uid': 'tm-fr-all-v2.0_tp29_fr', 'relevance'...\n",
+ "transcriptLength 246\n",
+ "totalPages 1\n",
+ "languageCode fr\n",
+ "isOnFrontPage False\n",
+ "publicationDate 1928-05-07T00:00:00+00:00\n",
+ "issueUid lepetitparisien-1928-05-07-a\n",
+ "countryCode FR\n",
+ "providerCode BNF\n",
+ "mediaUid lepetitparisien\n",
+ "mediaType newspaper\n",
+ "Name: lepetitparisien-1928-05-07-a-i0047, dtype: object"
+ ]
+ },
+ "execution_count": 50,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "result.df.loc['lepetitparisien-1928-05-07-a-i0047']"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 51,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "'UNE ÉDITION EN ALLEMAND DES DISCOURS DE M. BRIAND PRÉFACÉE PAR M. STRE[...]'"
+ ]
+ },
+ "execution_count": 51,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "result.df.loc['lepetitparisien-1928-05-07-a-i0047']['title']"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
- "## Convert embedding to an array of floats and back"
+ "### Other tools"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### Convert an embedding to an array of floats and back"
]
},
{
"cell_type": "code",
- "execution_count": 16,
+ "execution_count": 54,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
- "'gte-768:27ZuvEPAhD39Epu84zQ1PXUHd73baWq9SjyDPa8LML0MNNc8H5Y+vRDkgzyc7wo8erVbvfTRDD4z4Ja9+v4+PclYlz19Frw7plwEPtcDkLxCkzU9R2whPdNVKz3HtYQ8G8ZcvQ4CcbytaJ09CSLDvZDC+71TdAc9ACzxOmeg+Dy0Bpk76kKWPdA6jT0eSTq8/29rPTL4JbwYHAi9iUCfPPraeTxh4se7SAYqvZrCu7zJpRs8FUwmPNA6Db19fLO8QBnivDku4jx2KXS9+WJuPIZQCDmQMuE89GsVujSdgDxz7qC9Lm6GPWCIJj6nhie9ZUMovRHrxboOajC9geeBvdcsz7zdmAG8M98yPWSHIrxqHJS6DYK/PdtrMrzyEXc85y+evK1E2Dz6RAG813BJPdo+47wgfUs8ZUSMPbZDND0t/wS9LWp2uleW5zwi9x49qZ/9u6eGpzw5MCq9cltavdwFu7yn9Sg95dfHPYC7FrxtyDA8DaWgvIVyhTzgj1o8KNmUvE5Yhb1FrtO7s3NSvTL/Z7xEngc9w5geOz1tRbz6lv+8uhOWvKR0EzvZgt06BNydvDeU2boJkqi9uQoMvVfja73l+4y8Fz4FvY+8HT2G6pA9JJoxu2COgT04JVg9KLeXO29APL0MFCK9PEB2vT1uqb2OIM07fRY8vc4xg73Hb0I9Waj7OhroWT2aBja8XxvwPLZB7Ltbs009a0cbvWeZtrwd2HA9rYoaPfYnm7zoOCi9Z7uzPIRE0r01WYa8R5eoPA0/KTyVwJC8CSJDPEAbKj0XqXa9k0T1PNmkWj3G1p09HNGuvKgn8jsnisi7yvBXvAKkfDwWgLc5hW51vDzUoD3b+ug76RTjvNBBT7xkhyI6/aMZPXM7JT0UkCA8cLhHPJcWnzwul8W8OHLcPCyFsTxdvh+8HPOrvOqM7jsWgDe6LwiPPdOgZ70JIsM8Nj5LvYSQ8rwWy3O9fJ6wu/OmBT27iyE8kMWnOU1wFL1hl4s8bTXqPPKCQLzaNyE86aiNvYzz/btEnL+8d1R7vcoUHb054yU9HGBluxMYlbxQ+rM8MkPiPLwaWD26M0s9UgUGPDYVjLzmJMy7LiqMvX+5Tr3UgZa9izA2PPfq4jzrBPq77qnUPBbt8LxppIg8SjwDvA7ZsbxF8k09MkPivAEOBL2zc9K8PgkWvWX/Lb3Rspg7dPcqPUz/yjy7q1Y9BI1RvCi3lzvdBwO9E2PROzl9rrxoy3+9VIRTPOqMbrsvMc683CkAvYC7FrvyEfc8sIO7PNIqJLxeFK48cUuOPKCi6bzmt5K9fRY8vI+Y2Lz1ttG4lQtNvW+xBb26ExY9UlDCu//XKj1a+Y890WUUvSBUjDz1+ks88/HBPbqATz1DUQO9f/8QvE4zXLkzTrQ8+SA8Pf/XKr3wkik8amdQO6rFir2pmDu87qnUOwX+Gj3FGTS9F807vVkSg7uyjg29IvVWvfFOL7yGLos9Ojm0OwLsBj31ttE7TQhVvUVjF7uMN3i8qlTBParFCj1idY488sY6PXk9ULxKFva7jiDNu13F4buaDXi8rF+TPHZ2+LzSAIG9pTAZvMDzQ7tAXdy8zg2+Oy9TS7wr8mo967m9u1897bxf0LO8M060vLGnAL3qQTI8rbUhvIP26btx5Ra8uebGuznjpb0wD1G8j5jYO9uNr7zTeIw9BjBkvQCdOr1X3Y29ACxxPb4DrTqWx9I8aaSIvZwYyrzBZA28Shb2OzMBML3nL568y6xdvJcWHzz3n6Y8sUEJvUxFDT3rbgG8RC8GPce1hDs/Vpo8fvS+POqMbrrQQxc9skfnvRlwzjwCho88HpT2vBbLczu5d0W9FoA3ukxDxbuy0ge8ceNOPWZK6ruFIzk9hGezPJEQZLuqVEG7XoHnPCXFOL1H3IY90zFmvEXyzbzTU2M9yHhMOlViVj3ViwS8OEkdvUfbojzF7iy9rPmbOrXJYDzj7nI9ohp1
u8NLmj3OwoG88JIpPQAscTmPvJ09Fzy9PGPLnLyxFoI7fh1+Pa9YtD0LVtS6a9/bPAOojDwQWP88nbJSPFJQwjzA80M8stlJO0ZqWb1WuOQ7Sss5PMgtELr9Epu8U32RO/6zZTyhXu+8M701PfRpTT0Z1sU87AiKvG9H/jp+qQI9YdsFPEkPND3QOg098P/iPLmbCrzN4O48ub2HvBRuozwDr868s+SbvH7SQT2KuCo8EVyPPK4HHT7wtCa9M7ttPCmVmjw/Vho8/aHRvOCP2rx/Sk28YlMRPLvxmDyLe3K8vSWqPLLZyb1SAz49NVkGu6bsHrzGj3e9RmrZvDbRET1ENki8m+v6PBhHjzz7S8M6cuzYvGHbhTxjyVS92jchuwreyLvVi4Q8os84PfoihLxf8jA8jtxSvbqAz7y29q87YIy5PFBnbbxuhLY767m9vOG8qTx9fLM8EC9APDYczrumOSO9hptEum+xhbt1IOo8OQaHPaKLPjwzI609ey1nPMdt+jxaREy9BnamPKz5G7wLVtS8EFE9PVRbFD0Rwoa9lAB7POecVzt7wC29j02cO5Utyjwr8mo8xo93vE+r57uo3LU8ITlRvB8uf719WjY8SMDnPFa4ZDuyjo25msK7vJZambo+Kcs80PRKPb97uLsCpHy98Qq1PLiX+jz/iia9AuwGvcGvybzdMEI9QBuqvXOG4Tx/sgw9ZFwbPZPXuzz+rCO9KPsRPBSXYjzPhck7d6F/vVnsdb0rp647vgMtPEScv7zwkim8D+DzvI1kRz04BSO9zNksvCovIz3z84k7o/yHu2EExTw6ObS7nO+Kuo73jbymN1u8imsmPTZgSDzzWQE9W4qOu1VA2TsKk4w7yC2Qu+UdCjwHF/E8mgY2PL4KbzyA3ZO8Pt4Ou+nQaLxbs808U32RPZbpz71u8e+8eIHKu5w6Rz0Cyok8x7zGvKZchD33n6a8Q768OhklkjsF/po8vr8yvOtuAT1f8jA7a5JXvfNZAT3uXpi8MA/Ru84Nvjr0HMm8l2HbvHsmJbvhIqE8TN1NPTrsLz1eNiu8bJthvFHYtrzpySa9rbUhvDtmAz2gVy28zlj6Oj68kby7iyG9bC4ovdhXVr2YHeG8UnK/PNDUFb0gn8g8GEP/Oyl6X7wBWUC8G3yEPWC1+LyAd5y8ZUOoPH1h+DsxPKC74QdmuwDhNDu0Bpk7VR7cPED35Lx9YXg7AE7uvP7OoDz4Wyw8+iIEvbvxmL3HS328Z7uzPGnPj71OM9y8iWnePATTEz0sH7o6PIgAPfG76Lz0QA69YB8AvLvPGz2NhsS8KB2PPCL11ro3lFk6mUhoPdG5WjxKyzm8Q3MAuuBEnrzsKoe8+0tDO8CohzvhB+a7wKgHPbduuzxLsP48stKHPGPtmbzqQTI6nO8KvK8tLb34W6w80W6ePCKx3DznC1m97HXDvBtZIzw5yOo8ZKmfvN/10Twa4Zc8eyalOxw+6Lyp43c8V+UzPJc5gL1xMNO84I/aO8WItTsxX4E9U30RPAyDI7xndzk8QBniOy/tU70PSDM9W9VKPKBXrbxD4gE9DvsuOy1q9jnljAu9bvHvvO8aHr0+Kcu7+7j8PCL1Vrp/Q4u8cqjePB2vsTwM6Zq8eBSRvMZGA72M7Ds8Wl0/u4EIm72wpbg8vBrYPPtLQzwO+y67EKCJvMo00ryY0qS85NCFPUz4CDyPmFi8T6vnO5P5OLz98J27zB2nPBBRvTz1jRK9Gp2dPBTbXLyMqEG8s5XPPL+C+jw2YEi8pTAZPBYP7jxxBUw979Yju6SWkDws0jW9sWG+u7KODbtp78Q70NJNvQ1hprwCnwI9PQfOPAZ9aLz0a5W6ZwqAPErLuTuBM6K762w5PdZphz0V5GY9pTAZvPcOKL3XA5C8Hms3PePwOjyfcCA91Ro7vNx0PL03lNm8hi6LPI6RFryc7wq6cqGcvDGH3Lxpic08+pb/vFmo+7x0h8U93HJ0
PTvKsj1gH4C9PuVQPabsnjxWuOQ7otb6PMK6m7wYQ/87DfIkPQAscbuQKVe9t7l3PPopRr1k0t68widVPPCZaz2srBe9Q3rCPIjP1Tw28448'"
+ "'gte-768:8gU7vdWAjz3oWoK9o8o+PUHElL3dkT+9zky+vFKAfT34Ejg9S8Y/vSqUE71SgSE98ArSvMs/AT4x0a68X4zWPXoJnD2/+Rs7OJ/IPS4awLz4VjI9nRS0PedVKz3oMy69kP8Qve/ujzyU9L49KpSTvVwUS73ilTI9rkowvTgcaz11fhg9Q++bPUEzlj3ZLRC8UXdzPQKYwDw1x4C7CEn0PBl2pju5K4K886TavJmmFr2Zg/U8IMFiOy66g7zq/dS8QUG3PFI0Hb3fHSe8PIE+PYrQljy7ZKq7piDNPNNeUjwkbuO9Byf3PFWAPT7+w2u9pQNnvHniR7tFl0W8Qh86PDd9S7wGjW482+mVPCM6krw0AY09J6xiPSQXcb1gatm8fbX4O6JNnLtH0ZE7IT8pPccBAr0wFSm8x5KAPUfy6jyCKRE8ep5NPIPCdTwWsHI9EkzDOu4x5rySWja8q1RevCF9aDzkX1k9uSpePNbW3btIQBM9ez1tu+t7GzypaDo993ivvLvEZr37rEC8jUf+u0lq9rrQheY8MXBOu07ejjq+G5k8fracvACdV7wuWP88zKQUvQ230TgY6r69iz8Yvc7dPL0YyEG9AtejPJBRrD22msM9f3F+vN3CgbwTRyw9HO6xvPSfQ7wQSJC91Ll3vJaOx73uMWa8VKyoPPD8sL3Mc9I8ndonvanXOz12+zo9es8PPTSqmrs7NLo83IL6vIi4x7w7lZo9tXSTPCwf1zywO6u9S0mdPAsFur2/GvU6GJd/vALXIz3BVMG8C95lvDzAIT0o60U8oAqGvRgbATzG43c9hMOZvDc/jDrCjg27o8q+vOm/FTz0YQQ7CYPAvDFwTj3K2ck6g8L1PFTrC73zgt06L7nfPF7fFT0zjbS7OfuRPBJMwzu6R8S8PY/fPH702zw2Q/+7cDvCvDyBPj0W4TS7TKTCPGjSe70+6yg85+apvPWHtL37bV297hANPVw2yD0g8iS97DehOo/iqjxNwSg9EkxDvPCNr7wj2bG8wUBlvWIFBj0jOW680EenvRfcnbxmWvA8t1HyvHo+ET17HBQ8skCCPNlrTzz8mQg961j6O86Lobwu21w785EiveNROLzqeve83UmSPYowU7wXPFo79GGEOksFozxc+Ai88yKhvJerrbxaSiQ9ROlgvW3lM72r0iS8AhuevB9EwLy0Osc8sET1PJ72abxCgBo9OthwPLV0k7uAbOe83cKBOoVPgT1qbai76pz0PCP7LjyJlsq8H3UCPQ23UTyaYfg7K4C3OXFYKL0bT5K8rkowPGqrZzzqPDi9xkRYvNbW3bvr29c8U18kvf0IirzSQWw8l0pNvOKtwT1DJJE9GPmDPIVdojytPA89jialu2qsi7yG/EG85yRpvL/5mzvPyeC7OE0tPVhBmr1CMnI9Bo3uu3J1jrx6/628RyMtPQVAaj2WUAi5pIZEPVXI6rxoM1w5SR1yvUv2Xb2e9428j+IqPLWVbLzU/XG8glkvvYi4RzwyTtG8CJecPfnwujwBuj08U84lu1LiAT1Lh9y7rF4MvXao+zq+G5k8q6FiPWQ+rrwJImC9n1dKvMlOBj3TjnC9vxr1OmHoHzt8ONY8JdSavG7f+Lz9JMy76v3UPEGFMb0bcY88ldLBvKq1vrsE0ei8lr8JvYIRgr1knuo7jgSovC66gzwjOW49B6pUvHUPl7xoM9w8x6ChPfzJpjtDqyE96R/SvM9ppLyeuCq82BAqPLcNeL1WpxG9vdw1vD5L5TzDbJA8YZVgvWQ+rjzaiDU9QEbOPBoVxjyRPVA8EirGPLay0js/10w9AZSNvbnKoTzJ+8Y8WhnivDONtDwdqje9S4fcO/OCXbyWUIg80gOtOTdg5TvpETE9V2MXPZcMDrxgSQC8u2QqOkNd+bymIM08llAIPeiir7znJOm8hX57PYQjVj0fg6O7as1kvRYgmDyJUtC863sbvFU4ED3ESpM87bRD
PeDxuz1gSQC8i89yvPusQDzuYig9/UbJPP1Gybzkbh48bGiRPFUHTj3AFgK9SSy3PAmDwLy0Vy28YAl5O5jIEz2lZEe8DbdROY/iKr0ALla99bypPGAJeTyPQwu8ZJ7qPNUaWDxEewO9334HPVqrhD0bcGu85voFvWnv4Tz3F089/VWOPWYW9juuGhI9Bo6SvEfy6jvxiJg80ebGu83w9DxlWxS98sbXPO/taz3vryw8GIoCPYS+Ij688BG9BTJJPPODgbwhoAk9vxr1vHQwcLw+Kgy9uA6cvECnrju5KwK9XERpPUfy6rys74o8cF0/OqHP1byVQUO9+m6BvEdiEL1h6B88gQyrPK/H0jzqLhe9Wl3cvMo6qjwO9jS9WsOTvXTysLtaGWK7DXmSPJpheLxiloQ9q0EmveMgdr1xlwu9g8J1vXt80DxK6Lw8usWKvCaQILsl1Bq8SSw3PJxAHztxl4u8kZ4wvVNQX73CQQk9mNKBPRn4XzwIxzo8/qISPEXHYz3i03G8+NNUPJK7FjxJy1a8dgoAvfDMEruwg9i8Ouc1Pc6s+rv76v87ThxOPGHonzrgOWm9vhuZvIGJzTzkbfo8vZ3SPEKKiL2aYfi7+fA6PBtPErzRZI09V4TwuzYiprtXhZQ8ympIPUxEhr2GXP68DhgyPKh8ljwfpSC99D5jvGvITbuBuWs9PHMdvJRxYTrzgt096MSsvBVB8Tzjj3e8geotPUWXRbz8uwW91/PDvOQNPr33F087ynhpvG9AWbzyBTu9jQk/PF2zaj0mIR+9haEcO8JBCT1+ZAG9wPNgPKAwNjx6t4C9QzygPQgFerxkPq68Snk7PVih1jy3z7i8ZRwxPWHoHzukxSe8maVyvCGgiTthBOI8H3UCvZHc7zoDFWO8z2mkvL/5m7rIvOM8nECfPH8RQr25aUG9B1g5vXTBbj3HMSA9gihtPYnVLT1bWMW8UAkWvWgz3Ln/QbK6ApjAPGIm3zsCWoG86xo7PPI1WT0yr7G8w39IPTSqmrreDwa9FUyDvXTyMDsmkKA78xQAPekRMT2oWpk7CUUBPKh78rwA8Ba9QcSUvFJzADp5Qyi95SqkPOOymLw/aEu9zS/YPLslx71RlFm7jpUmPX9ak71Zvjw9HSxxOYb8QbwvKYW9JaNYPXN0arzdwgG8NGagPHLVyrs/iki7MBUpvFQvBrylA2c80SWqPGHFfrz35zA9I/suPEGTUj1H0ZE8BTJJO9jgC72lA+e7JJU3Pc5MPr1erlM8wHa+PFI0nTxgSQC8BmwVPJrkVb3e8h+93WEhvc1uuzwtnR29exwUvMWluDySOLk8gQwrPdeSYzw6Vje8qPk4vJRx4bxEnNy81/NDu/tMBL2b3768t1FyPB0scTwfBd08LXugvNzHGL3dwoG7bXYyvYJn0Lw3wUU93LM8PdIDLbvxJzg7YhMnvQ5WcTxLh1w8whDHvKk3+DyWUIg8oJEWvKCRFjxDqyE98MwSOpfNqrzG43e9Ed1BvYFK6jy0SGg9vrq4PC+YhryDwvU7rb5IvVZ2T73eAMG89P//PO/t6zpAX4E9GPkDPQ81mLx1Tda7xZcXvX+OZDzgK0i9EM+gPBoCDr3WZ1y9/MkmvDalAz0wFak8x5/9u9P+Fb3QRyc8Kkbruypfnr0teyA9w8xMPMwTljx8+pY8UDm0vN/tCL3maGM7MBWpO6TFpzmKMFO8NQVAPPPjvbw/CI88a1lMvXCbfjw8UHy7NId5vGkSg7nYcQo8oQAYvTdg5boPEve8FuE0uiIdrDy7omk8HC0VvfTBQL1jYCs8piBNvO6SRrxwm/67uagkvSP7rrunX7A9wdHjvIabYbzR5sa8O5WaPOZoYztjIci8DtQ3PY37HT3iNFI8j3MpveB4zLoVQhW85mkHvVr9H71LZoM8UZTZu52bRLoEhQi9RHpfOwRUxrwX3J08QKeuvHODL70Nt9E8ae/hvBVClTztC7Y9jRdg
PRfcHT1ID9E7dQ+XO+enxjy4Dpw8AXvaPNR7OLzCjo26hd/buxEv3TthVn29Fp26PJ3MBj26xQq8sZIdPbxCrTxPmhS93x2nO23lszwS7IY8'"
]
},
- "execution_count": 16,
+ "execution_count": 54,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
- "original_embedding = impresso.content_items.get_embeddings(\"JDG-1963-09-06-a-i0004\")[0]\n",
+ "original_embedding = impresso.content_items.get_embeddings(\"lepetitparisien-1928-05-07-a-i0047\")[0]\n",
"original_embedding"
]
},
{
"cell_type": "code",
- "execution_count": 18,
+ "execution_count": 55,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
- "[-0.014569963328540325,\n",
- " 0.0648198351264,\n",
- " -0.01892995275557041,\n",
- " 0.044239889830350876,\n",
- " -0.060309845954179764,\n",
- " -0.05722985789179802,\n",
- " 0.06407983601093292,\n",
- " -0.0429798923432827,\n",
- " 0.026269935071468353,\n",
- " -0.04652988538146019]"
+ "[-0.04565996676683426,\n",
+ " 0.07006994634866714,\n",
+ " -0.06364995241165161,\n",
+ " 0.04657996818423271,\n",
+ " -0.07263994961977005,\n",
+ " -0.046769965440034866,\n",
+ " -0.02322998270392418,\n",
+ " 0.061889953911304474,\n",
+ " 0.04493996500968933,\n",
+ " -0.04681996628642082]"
]
},
- "execution_count": 18,
+ "execution_count": 55,
"metadata": {},
"output_type": "execute_result"
}
@@ -802,16 +965,16 @@
},
{
"cell_type": "code",
- "execution_count": 20,
+ "execution_count": 57,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
- "'gte-768:27ZuvEPAhD39Epu84zQ1PXUHd73baWq9SjyDPa8LML0MNNc8H5Y+vRDkgzyc7wo8erVbvfTRDD4z4Ja9+v4+PclYlz19Frw7plwEPtcDkLxCkzU9R2whPdNVKz3HtYQ8G8ZcvQ4CcbytaJ09CSLDvZDC+71TdAc9ACzxOmeg+Dy0Bpk76kKWPdA6jT0eSTq8/29rPTL4JbwYHAi9iUCfPPraeTxh4se7SAYqvZrCu7zJpRs8FUwmPNA6Db19fLO8QBnivDku4jx2KXS9+WJuPIZQCDmQMuE89GsVujSdgDxz7qC9Lm6GPWCIJj6nhie9ZUMovRHrxboOajC9geeBvdcsz7zdmAG8M98yPWSHIrxqHJS6DYK/PdtrMrzyEXc85y+evK1E2Dz6RAG813BJPdo+47wgfUs8ZUSMPbZDND0t/wS9LWp2uleW5zwi9x49qZ/9u6eGpzw5MCq9cltavdwFu7yn9Sg95dfHPYC7FrxtyDA8DaWgvIVyhTzgj1o8KNmUvE5Yhb1FrtO7s3NSvTL/Z7xEngc9w5geOz1tRbz6lv+8uhOWvKR0EzvZgt06BNydvDeU2boJkqi9uQoMvVfja73l+4y8Fz4FvY+8HT2G6pA9JJoxu2COgT04JVg9KLeXO29APL0MFCK9PEB2vT1uqb2OIM07fRY8vc4xg73Hb0I9Waj7OhroWT2aBja8XxvwPLZB7Ltbs009a0cbvWeZtrwd2HA9rYoaPfYnm7zoOCi9Z7uzPIRE0r01WYa8R5eoPA0/KTyVwJC8CSJDPEAbKj0XqXa9k0T1PNmkWj3G1p09HNGuvKgn8jsnisi7yvBXvAKkfDwWgLc5hW51vDzUoD3b+ug76RTjvNBBT7xkhyI6/aMZPXM7JT0UkCA8cLhHPJcWnzwul8W8OHLcPCyFsTxdvh+8HPOrvOqM7jsWgDe6LwiPPdOgZ70JIsM8Nj5LvYSQ8rwWy3O9fJ6wu/OmBT27iyE8kMWnOU1wFL1hl4s8bTXqPPKCQLzaNyE86aiNvYzz/btEnL+8d1R7vcoUHb054yU9HGBluxMYlbxQ+rM8MkPiPLwaWD26M0s9UgUGPDYVjLzmJMy7LiqMvX+5Tr3UgZa9izA2PPfq4jzrBPq77qnUPBbt8LxppIg8SjwDvA7ZsbxF8k09MkPivAEOBL2zc9K8PgkWvWX/Lb3Rspg7dPcqPUz/yjy7q1Y9BI1RvCi3lzvdBwO9E2PROzl9rrxoy3+9VIRTPOqMbrsvMc683CkAvYC7FrvyEfc8sIO7PNIqJLxeFK48cUuOPKCi6bzmt5K9fRY8vI+Y2Lz1ttG4lQtNvW+xBb26ExY9UlDCu//XKj1a+Y890WUUvSBUjDz1+ks88/HBPbqATz1DUQO9f/8QvE4zXLkzTrQ8+SA8Pf/XKr3wkik8amdQO6rFir2pmDu87qnUOwX+Gj3FGTS9F807vVkSg7uyjg29IvVWvfFOL7yGLos9Ojm0OwLsBj31ttE7TQhVvUVjF7uMN3i8qlTBParFCj1idY488sY6PXk9ULxKFva7jiDNu13F4buaDXi8rF+TPHZ2+LzSAIG9pTAZvMDzQ7tAXdy8zg2+Oy9TS7wr8mo967m9u1897bxf0LO8M060vLGnAL3qQTI8rbUhvIP26btx5Ra8uebGuznjpb0wD1G8j5jYO9uNr7zTeIw9BjBkvQCdOr1X3Y29ACxxPb4DrTqWx9I8aaSIvZwYyrzBZA28Shb2OzMBML3nL568y6xdvJcWHzz3n6Y8sUEJvUxFDT3rbgG8RC8GPce1hDs/Vpo8fvS+POqMbrrQQxc9skfnvRlwzjwCho88HpT2vBbLczu5d0W9FoA3ukxDxbuy0ge8ceNOPWZK6ruFIzk9hGezPJEQZLuqVEG7XoHnPCXFOL1H3IY90zFmvEXyzbzTU2M9yHhMOlViVj3ViwS8OEkdvUfbojzF7iy9rPmbOrXJYDzj7nI9ohp1
u8NLmj3OwoG88JIpPQAscTmPvJ09Fzy9PGPLnLyxFoI7fh1+Pa9YtD0LVtS6a9/bPAOojDwQWP88nbJSPFJQwjzA80M8stlJO0ZqWb1WuOQ7Sss5PMgtELr9Epu8U32RO/6zZTyhXu+8M701PfRpTT0Z1sU87AiKvG9H/jp+qQI9YdsFPEkPND3QOg098P/iPLmbCrzN4O48ub2HvBRuozwDr868s+SbvH7SQT2KuCo8EVyPPK4HHT7wtCa9M7ttPCmVmjw/Vho8/aHRvOCP2rx/Sk28YlMRPLvxmDyLe3K8vSWqPLLZyb1SAz49NVkGu6bsHrzGj3e9RmrZvDbRET1ENki8m+v6PBhHjzz7S8M6cuzYvGHbhTxjyVS92jchuwreyLvVi4Q8os84PfoihLxf8jA8jtxSvbqAz7y29q87YIy5PFBnbbxuhLY767m9vOG8qTx9fLM8EC9APDYczrumOSO9hptEum+xhbt1IOo8OQaHPaKLPjwzI609ey1nPMdt+jxaREy9BnamPKz5G7wLVtS8EFE9PVRbFD0Rwoa9lAB7POecVzt7wC29j02cO5Utyjwr8mo8xo93vE+r57uo3LU8ITlRvB8uf719WjY8SMDnPFa4ZDuyjo25msK7vJZambo+Kcs80PRKPb97uLsCpHy98Qq1PLiX+jz/iia9AuwGvcGvybzdMEI9QBuqvXOG4Tx/sgw9ZFwbPZPXuzz+rCO9KPsRPBSXYjzPhck7d6F/vVnsdb0rp647vgMtPEScv7zwkim8D+DzvI1kRz04BSO9zNksvCovIz3z84k7o/yHu2EExTw6ObS7nO+Kuo73jbymN1u8imsmPTZgSDzzWQE9W4qOu1VA2TsKk4w7yC2Qu+UdCjwHF/E8mgY2PL4KbzyA3ZO8Pt4Ou+nQaLxbs808U32RPZbpz71u8e+8eIHKu5w6Rz0Cyok8x7zGvKZchD33n6a8Q768OhklkjsF/po8vr8yvOtuAT1f8jA7a5JXvfNZAT3uXpi8MA/Ru84Nvjr0HMm8l2HbvHsmJbvhIqE8TN1NPTrsLz1eNiu8bJthvFHYtrzpySa9rbUhvDtmAz2gVy28zlj6Oj68kby7iyG9bC4ovdhXVr2YHeG8UnK/PNDUFb0gn8g8GEP/Oyl6X7wBWUC8G3yEPWC1+LyAd5y8ZUOoPH1h+DsxPKC74QdmuwDhNDu0Bpk7VR7cPED35Lx9YXg7AE7uvP7OoDz4Wyw8+iIEvbvxmL3HS328Z7uzPGnPj71OM9y8iWnePATTEz0sH7o6PIgAPfG76Lz0QA69YB8AvLvPGz2NhsS8KB2PPCL11ro3lFk6mUhoPdG5WjxKyzm8Q3MAuuBEnrzsKoe8+0tDO8CohzvhB+a7wKgHPbduuzxLsP48stKHPGPtmbzqQTI6nO8KvK8tLb34W6w80W6ePCKx3DznC1m97HXDvBtZIzw5yOo8ZKmfvN/10Twa4Zc8eyalOxw+6Lyp43c8V+UzPJc5gL1xMNO84I/aO8WItTsxX4E9U30RPAyDI7xndzk8QBniOy/tU70PSDM9W9VKPKBXrbxD4gE9DvsuOy1q9jnljAu9bvHvvO8aHr0+Kcu7+7j8PCL1Vrp/Q4u8cqjePB2vsTwM6Zq8eBSRvMZGA72M7Ds8Wl0/u4EIm72wpbg8vBrYPPtLQzwO+y67EKCJvMo00ryY0qS85NCFPUz4CDyPmFi8T6vnO5P5OLz98J27zB2nPBBRvTz1jRK9Gp2dPBTbXLyMqEG8s5XPPL+C+jw2YEi8pTAZPBYP7jxxBUw979Yju6SWkDws0jW9sWG+u7KODbtp78Q70NJNvQ1hprwCnwI9PQfOPAZ9aLz0a5W6ZwqAPErLuTuBM6K762w5PdZphz0V5GY9pTAZvPcOKL3XA5C8Hms3PePwOjyfcCA91Ro7vNx0PL03lNm8hi6LPI6RFryc7wq6cqGcvDGH3Lxpic08+pb/vFmo+7x0h8U93HJ0
PTvKsj1gH4C9PuVQPabsnjxWuOQ7otb6PMK6m7wYQ/87DfIkPQAscbuQKVe9t7l3PPopRr1k0t68widVPPCZaz2srBe9Q3rCPIjP1Tw28448'"
+ "'gte-768:8gU7vdWAjz3oWoK9o8o+PUHElL3dkT+9zky+vFKAfT34Ejg9S8Y/vSqUE71SgSE98ArSvMs/AT4x0a68X4zWPXoJnD2/+Rs7OJ/IPS4awLz4VjI9nRS0PedVKz3oMy69kP8Qve/ujzyU9L49KpSTvVwUS73ilTI9rkowvTgcaz11fhg9Q++bPUEzlj3ZLRC8UXdzPQKYwDw1x4C7CEn0PBl2pju5K4K886TavJmmFr2Zg/U8IMFiOy66g7zq/dS8QUG3PFI0Hb3fHSe8PIE+PYrQljy7ZKq7piDNPNNeUjwkbuO9Byf3PFWAPT7+w2u9pQNnvHniR7tFl0W8Qh86PDd9S7wGjW482+mVPCM6krw0AY09J6xiPSQXcb1gatm8fbX4O6JNnLtH0ZE7IT8pPccBAr0wFSm8x5KAPUfy6jyCKRE8ep5NPIPCdTwWsHI9EkzDOu4x5rySWja8q1RevCF9aDzkX1k9uSpePNbW3btIQBM9ez1tu+t7GzypaDo993ivvLvEZr37rEC8jUf+u0lq9rrQheY8MXBOu07ejjq+G5k8fracvACdV7wuWP88zKQUvQ230TgY6r69iz8Yvc7dPL0YyEG9AtejPJBRrD22msM9f3F+vN3CgbwTRyw9HO6xvPSfQ7wQSJC91Ll3vJaOx73uMWa8VKyoPPD8sL3Mc9I8ndonvanXOz12+zo9es8PPTSqmrs7NLo83IL6vIi4x7w7lZo9tXSTPCwf1zywO6u9S0mdPAsFur2/GvU6GJd/vALXIz3BVMG8C95lvDzAIT0o60U8oAqGvRgbATzG43c9hMOZvDc/jDrCjg27o8q+vOm/FTz0YQQ7CYPAvDFwTj3K2ck6g8L1PFTrC73zgt06L7nfPF7fFT0zjbS7OfuRPBJMwzu6R8S8PY/fPH702zw2Q/+7cDvCvDyBPj0W4TS7TKTCPGjSe70+6yg85+apvPWHtL37bV297hANPVw2yD0g8iS97DehOo/iqjxNwSg9EkxDvPCNr7wj2bG8wUBlvWIFBj0jOW680EenvRfcnbxmWvA8t1HyvHo+ET17HBQ8skCCPNlrTzz8mQg961j6O86Lobwu21w785EiveNROLzqeve83UmSPYowU7wXPFo79GGEOksFozxc+Ai88yKhvJerrbxaSiQ9ROlgvW3lM72r0iS8AhuevB9EwLy0Osc8sET1PJ72abxCgBo9OthwPLV0k7uAbOe83cKBOoVPgT1qbai76pz0PCP7LjyJlsq8H3UCPQ23UTyaYfg7K4C3OXFYKL0bT5K8rkowPGqrZzzqPDi9xkRYvNbW3bvr29c8U18kvf0IirzSQWw8l0pNvOKtwT1DJJE9GPmDPIVdojytPA89jialu2qsi7yG/EG85yRpvL/5mzvPyeC7OE0tPVhBmr1CMnI9Bo3uu3J1jrx6/628RyMtPQVAaj2WUAi5pIZEPVXI6rxoM1w5SR1yvUv2Xb2e9428j+IqPLWVbLzU/XG8glkvvYi4RzwyTtG8CJecPfnwujwBuj08U84lu1LiAT1Lh9y7rF4MvXao+zq+G5k8q6FiPWQ+rrwJImC9n1dKvMlOBj3TjnC9vxr1OmHoHzt8ONY8JdSavG7f+Lz9JMy76v3UPEGFMb0bcY88ldLBvKq1vrsE0ei8lr8JvYIRgr1knuo7jgSovC66gzwjOW49B6pUvHUPl7xoM9w8x6ChPfzJpjtDqyE96R/SvM9ppLyeuCq82BAqPLcNeL1WpxG9vdw1vD5L5TzDbJA8YZVgvWQ+rjzaiDU9QEbOPBoVxjyRPVA8EirGPLay0js/10w9AZSNvbnKoTzJ+8Y8WhnivDONtDwdqje9S4fcO/OCXbyWUIg80gOtOTdg5TvpETE9V2MXPZcMDrxgSQC8u2QqOkNd+bymIM08llAIPeiir7znJOm8hX57PYQjVj0fg6O7as1kvRYgmDyJUtC863sbvFU4ED3ESpM87bRD
PeDxuz1gSQC8i89yvPusQDzuYig9/UbJPP1Gybzkbh48bGiRPFUHTj3AFgK9SSy3PAmDwLy0Vy28YAl5O5jIEz2lZEe8DbdROY/iKr0ALla99bypPGAJeTyPQwu8ZJ7qPNUaWDxEewO9334HPVqrhD0bcGu85voFvWnv4Tz3F089/VWOPWYW9juuGhI9Bo6SvEfy6jvxiJg80ebGu83w9DxlWxS98sbXPO/taz3vryw8GIoCPYS+Ij688BG9BTJJPPODgbwhoAk9vxr1vHQwcLw+Kgy9uA6cvECnrju5KwK9XERpPUfy6rys74o8cF0/OqHP1byVQUO9+m6BvEdiEL1h6B88gQyrPK/H0jzqLhe9Wl3cvMo6qjwO9jS9WsOTvXTysLtaGWK7DXmSPJpheLxiloQ9q0EmveMgdr1xlwu9g8J1vXt80DxK6Lw8usWKvCaQILsl1Bq8SSw3PJxAHztxl4u8kZ4wvVNQX73CQQk9mNKBPRn4XzwIxzo8/qISPEXHYz3i03G8+NNUPJK7FjxJy1a8dgoAvfDMEruwg9i8Ouc1Pc6s+rv76v87ThxOPGHonzrgOWm9vhuZvIGJzTzkbfo8vZ3SPEKKiL2aYfi7+fA6PBtPErzRZI09V4TwuzYiprtXhZQ8ympIPUxEhr2GXP68DhgyPKh8ljwfpSC99D5jvGvITbuBuWs9PHMdvJRxYTrzgt096MSsvBVB8Tzjj3e8geotPUWXRbz8uwW91/PDvOQNPr33F087ynhpvG9AWbzyBTu9jQk/PF2zaj0mIR+9haEcO8JBCT1+ZAG9wPNgPKAwNjx6t4C9QzygPQgFerxkPq68Snk7PVih1jy3z7i8ZRwxPWHoHzukxSe8maVyvCGgiTthBOI8H3UCvZHc7zoDFWO8z2mkvL/5m7rIvOM8nECfPH8RQr25aUG9B1g5vXTBbj3HMSA9gihtPYnVLT1bWMW8UAkWvWgz3Ln/QbK6ApjAPGIm3zsCWoG86xo7PPI1WT0yr7G8w39IPTSqmrreDwa9FUyDvXTyMDsmkKA78xQAPekRMT2oWpk7CUUBPKh78rwA8Ba9QcSUvFJzADp5Qyi95SqkPOOymLw/aEu9zS/YPLslx71RlFm7jpUmPX9ak71Zvjw9HSxxOYb8QbwvKYW9JaNYPXN0arzdwgG8NGagPHLVyrs/iki7MBUpvFQvBrylA2c80SWqPGHFfrz35zA9I/suPEGTUj1H0ZE8BTJJO9jgC72lA+e7JJU3Pc5MPr1erlM8wHa+PFI0nTxgSQC8BmwVPJrkVb3e8h+93WEhvc1uuzwtnR29exwUvMWluDySOLk8gQwrPdeSYzw6Vje8qPk4vJRx4bxEnNy81/NDu/tMBL2b3768t1FyPB0scTwfBd08LXugvNzHGL3dwoG7bXYyvYJn0Lw3wUU93LM8PdIDLbvxJzg7YhMnvQ5WcTxLh1w8whDHvKk3+DyWUIg8oJEWvKCRFjxDqyE98MwSOpfNqrzG43e9Ed1BvYFK6jy0SGg9vrq4PC+YhryDwvU7rb5IvVZ2T73eAMG89P//PO/t6zpAX4E9GPkDPQ81mLx1Tda7xZcXvX+OZDzgK0i9EM+gPBoCDr3WZ1y9/MkmvDalAz0wFak8x5/9u9P+Fb3QRyc8Kkbruypfnr0teyA9w8xMPMwTljx8+pY8UDm0vN/tCL3maGM7MBWpO6TFpzmKMFO8NQVAPPPjvbw/CI88a1lMvXCbfjw8UHy7NId5vGkSg7nYcQo8oQAYvTdg5boPEve8FuE0uiIdrDy7omk8HC0VvfTBQL1jYCs8piBNvO6SRrxwm/67uagkvSP7rrunX7A9wdHjvIabYbzR5sa8O5WaPOZoYztjIci8DtQ3PY37HT3iNFI8j3MpveB4zLoVQhW85mkHvVr9H71LZoM8UZTZu52bRLoEhQi9RHpfOwRUxrwX3J08QKeuvHODL70Nt9E8ae/hvBVClTztC7Y9jRdg
PRfcHT1ID9E7dQ+XO+enxjy4Dpw8AXvaPNR7OLzCjo26hd/buxEv3TthVn29Fp26PJ3MBj26xQq8sZIdPbxCrTxPmhS93x2nO23lszwS7IY8'"
]
},
- "execution_count": 20,
+ "execution_count": 57,
"metadata": {},
"output_type": "execute_result"
}
@@ -823,7 +986,7 @@
},
{
"cell_type": "code",
- "execution_count": 21,
+ "execution_count": 58,
"metadata": {},
"outputs": [
{
@@ -832,7 +995,7 @@
"'Original and recreated embeddings are the same: True'"
]
},
- "execution_count": 21,
+ "execution_count": 58,
"metadata": {},
"output_type": "execute_result"
}
@@ -844,7 +1007,7 @@
],
"metadata": {
"kernelspec": {
- "display_name": "impresso-py3.13 (3.13.7)",
+ "display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@@ -858,9 +1021,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
- "version": "3.13.7"
+ "version": "3.11.10"
}
},
"nbformat": 4,
- "nbformat_minor": 2
+ "nbformat_minor": 4
}
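
The "Convert embedding to an array of floats and back" cells above show an embedding as a string of the form `gte-768:<base64>` next to a list of floats. The following is a minimal sketch of that round trip, assuming the payload is base64-encoded little-endian float32 values (this is inferred from the cell outputs, not confirmed by the diff). The names `embedding_to_floats` and `floats_to_embedding` are hypothetical helpers for illustration and are not the Impresso library's own API.

```python
import base64
import struct


def embedding_to_floats(embedding: str) -> list[float]:
    """Decode a '<model>:<base64>' string into a list of floats.

    Assumption: the payload holds little-endian 32-bit floats.
    """
    _model, _sep, payload = embedding.partition(":")
    raw = base64.b64decode(payload)
    if len(raw) % 4 != 0:
        raise ValueError("payload length is not a multiple of 4 bytes")
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}f", raw))


def floats_to_embedding(values: list[float], model: str = "gte-768") -> str:
    """Encode floats back into the '<model>:<base64>' string form."""
    raw = struct.pack(f"<{len(values)}f", *values)
    return f"{model}:{base64.b64encode(raw).decode('ascii')}"


if __name__ == "__main__":
    # First 8 bytes (two floats) of the embedding shown in the notebook output,
    # re-padded so that it is valid base64 on its own.
    sample = "gte-768:8gU7vdWAjz0="
    values = embedding_to_floats(sample)
    print(values)
    print(floats_to_embedding(values) == sample)
```

Under that assumption, the two decoded values agree with the first two floats in the notebook's output list, and re-encoding them reproduces the input string. A full 768-dimensional vector would decode the same way, since the helpers do not hard-code the dimension.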