tylanderson/cdse-dl

CDSE Downloader

Clients for searching and downloading data from the Copernicus Data Space Ecosystem (CDSE).

The structure of this client takes inspiration from many clients I have used over the years. I designed its patterns around what I found helpful and powerful in them.

TODO

  • Auth
    • create tokens
    • refresh tokens
    • s3 auth
  • OData
    • query products
    • query deleted products
    • query by list
    • query nodes
  • OpenSearch
    • Query products
  • Download products
    • download single product
    • download multiple products in parallel
    • download by id or name
  • Subscriptions
  • CLI?

Usage

OData

Product Search

To search OData, use ProductSearch to query the API. You can specify a collection, sensing date, publication date, and area, and filter further with a list of AttributeFilter objects.

from cdse_dl.odata.filter import AttributeFilter
from cdse_dl.odata.search import ProductSearch

filters = [
    AttributeFilter.eq("productType","S2MSI1C")
]
area = "POINT (12.4577739 41.9077492)"

search = ProductSearch(
    collection="SENTINEL-2",
    area=area,
    date="2020-01-01/2020-01-02",
    filters=filters,
)
search.get(10)
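
For orientation, a search like the one above ends up as an OData `$filter` expression. The sketch below shows roughly how a `start/end` date string and a WKT geometry could be turned into filter clauses. It is not the cdse_dl implementation: the helper names `interval_to_filter` and `area_to_filter` are hypothetical, and both the mapping of `date` to `ContentDate/Start` and the half-open end bound are assumptions.

```python
from datetime import datetime, timezone


def _to_odata_timestamp(day: str) -> str:
    """Convert 'YYYY-MM-DD' to a UTC timestamp literal for OData."""
    parsed = datetime.strptime(day, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return parsed.strftime("%Y-%m-%dT%H:%M:%S.000Z")


def interval_to_filter(interval: str, field: str = "ContentDate/Start") -> str:
    """Turn 'start/end' into a range clause (end bound exclusive by assumption)."""
    start, end = interval.split("/")
    return (
        f"{field} ge {_to_odata_timestamp(start)} and "
        f"{field} lt {_to_odata_timestamp(end)}"
    )


def area_to_filter(wkt: str) -> str:
    """Wrap a WKT geometry in the OData.CSC.Intersects function."""
    return f"OData.CSC.Intersects(area=geography'SRID=4326;{wkt}')"


clauses = [
    "Collection/Name eq 'SENTINEL-2'",
    interval_to_filter("2020-01-01/2020-01-02"),
    area_to_filter("POINT (12.4577739 41.9077492)"),
]
odata_filter = " and ".join(clauses)
print(odata_filter)
```

Printing the expression this way can help when comparing a search against the CDSE catalogue documentation.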

To see which collections, attributes, and attribute types are available for searching, use the helpers in the following example.

from cdse_dl.odata import get_attribute_type, get_collection_attributes, get_collections

# get all collections that can be searched with
collections = get_collections()

# get all attributes for a collection
attributes = get_collection_attributes("SENTINEL-1")

# get the type the attribute is expected to be
attr_type = get_attribute_type("SENTINEL-1", "sliceNumber")

OData search supports complex queries by filtering on product Attributes. Use or_, and_, or not_ to combine or invert filters.

Any filters passed in the list are and-ed together to build the final filter.

Filter Methods:

  • Greater than: gt
  • Less than: lt
  • Greater than or equal: gte
  • Less than or equal: lte
  • Equal to: eq
  • Not equal to: neq
  • String contains: contains
  • String starts with: startswith
  • String ends with: endswith

from cdse_dl.odata.filter import AttributeFilter, Filter
from cdse_dl.odata.search import ProductSearch

filters = [
    AttributeFilter.eq("productType","S2MSI1C"),
    Filter.or_([
        AttributeFilter.eq("tileId","32TPN"),
        AttributeFilter.eq("tileId","33TUH"),
    ]),
    Filter.and_([
        AttributeFilter.gt("cloudCover", 10),
        AttributeFilter.lt("cloudCover", 50),
    ]),
    AttributeFilter.eq("processorVersion","05.00").not_()
]
area = "POINT (12.4577739 41.9077492)"

search = ProductSearch(
    collection="SENTINEL-2",
    date="2020-01-01/2020-02-01",
    filters=filters,
    expand="Attributes"
)
print(search.hits())
products = search.get(20)
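
To make the combination rules concrete, here is a toy model of how a list of filters could be and-ed into one expression. Everything in it (`Expr`, `compare`, `and_`, `or_`) is a hypothetical stand-in rather than the cdse_dl classes, and the rendered syntax is simplified: real CDSE attribute filters are more verbose than `name op value`.

```python
from dataclasses import dataclass
from typing import Iterable, Union


@dataclass(frozen=True)
class Expr:
    """A rendered filter fragment (toy model, not the cdse_dl classes)."""

    text: str

    def not_(self) -> "Expr":
        # Invert a filter by wrapping it in a negation.
        return Expr(f"not ({self.text})")


def compare(name: str, op: str, value: Union[str, int, float]) -> Expr:
    # Strings are quoted, numbers are written bare.
    literal = f"'{value}'" if isinstance(value, str) else str(value)
    return Expr(f"{name} {op} {literal}")


def _join(op: str, exprs: Iterable[Expr]) -> Expr:
    # Parenthesise the group so precedence survives further combination.
    return Expr("(" + f" {op} ".join(e.text for e in exprs) + ")")


def or_(exprs: Iterable[Expr]) -> Expr:
    return _join("or", exprs)


def and_(exprs: Iterable[Expr]) -> Expr:
    return _join("and", exprs)


filters = [
    compare("productType", "eq", "S2MSI1C"),
    or_([compare("tileId", "eq", "32TPN"), compare("tileId", "eq", "33TUH")]),
    and_([compare("cloudCover", "gt", 10), compare("cloudCover", "lt", 50)]),
    compare("processorVersion", "eq", "05.00").not_(),
]

# Top-level list entries are and-ed together, as described above.
final_filter = " and ".join(e.text for e in filters)
print(final_filter)
```

The parentheses around each group are what keep the or_ block from being absorbed into the surrounding and chain.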

You can use other parameters such as order_by, expand, skip, and top to modify your search. skip and top are set during .get() and .get_all(), so you can ignore them unless you need manual control.

expand adds the full metadata to each returned result. You can expand Attributes or Assets.

from cdse_dl.odata.filter import AttributeFilter
from cdse_dl.odata.search import ProductSearch

filters = [
    AttributeFilter.eq("productType","S2MSI1C")
]
area = "POINT (12.4577739 41.9077492)"

search = ProductSearch(
    collection="SENTINEL-2",
    area=area,
    date="2020-01-01/2020-01-02",
    filters=filters,
    order_by="ContentDate/Start",
    expand="Attributes"
)
search.get(1)
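
As a rough illustration of what these parameters correspond to, the sketch below assembles OData query options into a URL and splits a result limit into skip/top pages. The function names `build_url` and `page_offsets` are hypothetical, the base URL is the public CDSE catalogue endpoint as I understand it, and the paging logic is an assumption about how a client might do it, not a description of cdse_dl internals.

```python
from typing import Iterator, Optional, Tuple
from urllib.parse import quote, urlencode

# Assumed CDSE catalogue OData endpoint.
BASE_URL = "https://catalogue.dataspace.copernicus.eu/odata/v1/Products"


def build_url(
    filter_expr: str,
    order_by: Optional[str] = None,
    expand: Optional[str] = None,
    top: int = 20,
    skip: int = 0,
) -> str:
    """Assemble standard OData system query options into a request URL."""
    params = {"$filter": filter_expr, "$top": top, "$skip": skip}
    if order_by:
        params["$orderby"] = order_by
    if expand:
        params["$expand"] = expand
    # Keep '$' and '/' readable; spaces become %20.
    return f"{BASE_URL}?{urlencode(params, quote_via=quote, safe='$/')}"


def page_offsets(limit: int, page_size: int) -> Iterator[Tuple[int, int]]:
    """Yield (skip, top) pairs that together cover `limit` results."""
    skip = 0
    while skip < limit:
        top = min(page_size, limit - skip)
        yield skip, top
        skip += top


url = build_url(
    "Collection/Name eq 'SENTINEL-2'",
    order_by="ContentDate/Start",
    expand="Attributes",
    top=1,
)
print(url)
print(list(page_offsets(45, 20)))
```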

Deleted Product Search

To search for a specific deleted product, you can use OData's deleted product API with DeletedProductSearch.

from cdse_dl.odata.search import DeletedProductSearch

search = DeletedProductSearch(
    collection="SENTINEL-2",
    name="S2A_MSIL1C_20210331T100021_N0500_R122_T32TQM_20230218T121926.SAFE",
)
search.get(1)

To find products deleted during a specified date range, use the deletion_date filter.

from cdse_dl.odata.filter import AttributeFilter
from cdse_dl.odata.search import DeletedProductSearch

filters = [
    AttributeFilter.eq("productType","S2MSI1C")
]

search = DeletedProductSearch(
    collection="SENTINEL-2",
    deletion_date="2024-01-31/2024-02-01",
    filters=filters,
)
search.hits()

To find deleted products that were originally published in a specified date range, use the origin_date filter.

from cdse_dl.odata.filter import AttributeFilter
from cdse_dl.odata.search import DeletedProductSearch

filters = [
    AttributeFilter.eq("productType","S2MSI1C")
]

search = DeletedProductSearch(
    collection="SENTINEL-2",
    origin_date="2022-02-01/2022-02-10",
    filters=filters,
)
search.hits()

OpenSearch

Product Search

from cdse_dl.opensearch.search import ProductSearch

search = ProductSearch(
    collection="Sentinel2",
    point=(12.4577739,41.9077492),
    product_type="S2MSI1C",
    date="2000-01-01/2024-05-01",
)
items = list(search.get(10))
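
For comparison with the OData examples, this sketch builds the kind of request URL the CDSE OpenSearch (resto) catalogue accepts. The helper `opensearch_url` is hypothetical, and so is the mapping from the arguments above to query parameters: I am assuming `point` is (longitude, latitude) and that `date` splits into `startDate` and `completionDate`.

```python
from typing import Tuple
from urllib.parse import urlencode

# Assumed CDSE OpenSearch (resto) base URL.
RESTO_URL = "https://catalogue.dataspace.copernicus.eu/resto/api/collections"


def opensearch_url(
    collection: str,
    point: Tuple[float, float],
    product_type: str,
    date: str,
    max_records: int,
) -> str:
    """Build a search.json URL from arguments like those shown above."""
    lon, lat = point  # assumption: (longitude, latitude) order
    start, end = date.split("/")
    params = {
        "productType": product_type,
        "lon": lon,
        "lat": lat,
        "startDate": f"{start}T00:00:00Z",
        "completionDate": f"{end}T00:00:00Z",
        "maxRecords": max_records,
    }
    return f"{RESTO_URL}/{collection}/search.json?{urlencode(params)}"


url = opensearch_url(
    collection="Sentinel2",
    point=(12.4577739, 41.9077492),
    product_type="S2MSI1C",
    date="2000-01-01/2024-05-01",
    max_records=10,
)
print(url)
```

Note that the OpenSearch collection name ("Sentinel2") differs from the OData one ("SENTINEL-2"), as the examples in this README already show.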

Download

To download a product, use Downloader.

from cdse_dl.odata.search import ProductSearch
from cdse_dl.download import Downloader

name = "S2A_MSIL1C_20200116T100341_N0208_R122_T33TUH_20200116T103621.SAFE"

product = ProductSearch(name=name).get(1)[0]

downloader = Downloader()
downloader.download(product, "tmp")

You can authenticate from environment variables, from netrc, or by passing your own credentials.

from cdse_dl.download import Downloader
from cdse_dl.auth import Credentials

creds = Credentials.from_login("username", "password")
downloader = Downloader(credentials=creds)
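
The lookup across those three sources could work along the lines of the sketch below. It is only an illustration under stated assumptions: the environment variable names, the netrc machine name, the precedence order (explicit, then environment, then netrc), and the `resolve_login` helper are all hypothetical and may not match what cdse_dl actually does.

```python
import netrc
import os
from typing import Mapping, Optional, Tuple

ENV_USER = "CDSE_USERNAME"  # hypothetical variable name
ENV_PASSWORD = "CDSE_PASSWORD"  # hypothetical variable name
NETRC_HOST = "dataspace.copernicus.eu"  # hypothetical netrc machine entry

Login = Tuple[str, str]


def resolve_login(
    explicit: Optional[Login] = None,
    env: Mapping[str, str] = os.environ,
    netrc_file: Optional[str] = None,
) -> Optional[Login]:
    """Return (username, password) from the first source that provides one."""
    # 1. Credentials passed directly by the caller.
    if explicit is not None:
        return explicit

    # 2. Environment variables.
    user, password = env.get(ENV_USER), env.get(ENV_PASSWORD)
    if user and password:
        return user, password

    # 3. netrc file (defaults to ~/.netrc when netrc_file is None).
    try:
        entry = netrc.netrc(netrc_file).authenticators(NETRC_HOST)
    except (OSError, netrc.NetrcParseError):
        entry = None
    if entry is not None:
        login, _account, secret = entry
        return login, secret

    return None


print(resolve_login(explicit=("username", "password")))
```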

To download multiple products, use download_all. The downloader keeps the session within CDSE's limit of 4 concurrent product downloads.

from cdse_dl.download import Downloader
from cdse_dl.odata.filter import AttributeFilter
from cdse_dl.odata.search import ProductSearch

filters = [
    AttributeFilter.eq("productType","S2MSI1C")
]

products = ProductSearch(collection="SENTINEL-2",filters=filters).get(5)

downloader = Downloader()
downloader.download_all(products, "tmp")

If you want to interact with files over the S3 API, you can use the s3fs session returned by get_s3fs_session, which authorizes you against the CDSE S3 API.

In my testing, this endpoint can be faster.

from cdse_dl.auth import get_s3fs_session
from fsspec.callbacks import TqdmCallback

fs = get_s3fs_session()
tqdm_kwargs = {"unit":"files"}

remote_path = "eodata/Sentinel-2/MSI/L1C/2021/07/11/S2B_MSIL1C_20210711T095029_N0301_R079_T34UEC_20210711T110140.SAFE"
local_path = "S2B_MSIL1C_20210711T095029_N0301_R079_T34UEC_20210711T110140.SAFE"

_ = fs.get(
    remote_path,
    local_path,
    recursive=True,
    callback=TqdmCallback(tqdm_kwargs=tqdm_kwargs),
)

Subscriptions

Tooling for the CDSE subscriptions endpoint, covering creating, reading, and acknowledging subscriptions, among other operations.

Example Usage:

from cdse_dl.odata.filter import AttributeFilter, Filter
from cdse_dl.subscriptions import SubscriptionClient

# Subscriptions client (with credentialed session)
client = SubscriptionClient()

# OData filter (named product_filter to avoid shadowing the built-in filter)
product_filter = Filter.and_([
    Filter.eq("Collection/Name", "SENTINEL-2"),
    AttributeFilter.eq("productType","S2MSI1C")
])

# create a subscription
sub = client.create_subscription(product_filter)
print(sub)

# list current subscriptions
subs = client.list_subscriptions()
print(subs)

# delete subscription
client.delete_subscription(sub['Id'])
