diff --git a/docs/index.md b/docs/index.md
index 9076076..191fe0f 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -16,11 +16,74 @@ pip install impresso
 
 The library requires Python version `3.10` or higher. It also depends on several packages commonly found in Jupyter environments, such as `matplotlib` and `pandas`.
 
+## At a glance
+
+### Create a session
+
+```python
+from impresso import connect
+client = connect()
+```
+
+### Search
+
+```python
+results = client.search.find(term="moon landing")
+results
+```
+
+`results` displays a summary of the result, including a preview of the pandas data frame that holds the result data. Use the `df` property to access the full data frame:
+
+```python
+results.df
+```
+### Pagination
+
+!!! warning "Monthly Quota"
+    Every Impresso user has a monthly quota of content items they can access.
+    The quota is currently set at 200,000 content items. Paginating through a
+    large result set can exhaust the quota quickly.
+    Check the size of the full result set before fetching all pages.
+
+By default, every result object holds only the first page of the full result set. Use the following code to iterate through the remaining pages:
+
+```python
+import pandas as pd
+# Get the first page, with 100 items per page
+results = client.search.find(term="landing", limit=100)
+print(f"Full result contains {results.total} items.")
+
+full_df = results.df
+
+# Iterate through all pages
+for page in results.pages():
+    full_df = pd.concat([full_df, page.df])
+
+full_df
+```
+
+### Accessing transcripts
+
+Content item transcripts can be large and are not returned by default.
+To access a transcript, request it by content item ID:
+
+```python
+result = client.content_items.get("NZG-1877-10-20-a-i0024")
+result.df['text.content'][0]
+```
+
+### See a content item in the Web App (shortcut)
+To see a specific content item in the Web App, look for the link "See this result in the Impresso App" in the rendered result summary:
+
+```python
+result = client.content_items.get("NZG-1877-10-20-a-i0024")
+result
+```
+
 ## Create a session
 
 ::: impresso.connect
 
-
 ## About Impresso
 
 ### Impresso project
diff --git a/docs/resources.md b/docs/resources.md
index e914fd8..74ba8f1 100644
--- a/docs/resources.md
+++ b/docs/resources.md
@@ -29,6 +29,9 @@ impresso.search.facet(facet='newspaper', term='war')
 ::: impresso.resources.search.SearchResource
 
 ::: impresso.api_client.models.search_order_by.SearchOrderByLiteral
+::: impresso.api_client.models.content_item_access_rights_copyright.ContentItemAccessRightsCopyrightLiteral
+::: impresso.resources.tools.Embedding
+
 ::: impresso.resources.search.SearchDataContainer
 
 ## Entities
@@ -72,6 +75,7 @@ impresso.media_sources.find(
 
 ::: impresso.resources.media_sources.MediaSourcesResource
 
+::: impresso.api_client.models.find_media_sources_type.FindMediaSourcesTypeLiteral
 ::: impresso.api_client.models.find_media_sources_order_by.FindMediaSourcesOrderByLiteral
 
 ::: impresso.resources.media_sources.FindMediaSourcesContainer