Draft
3 changes: 3 additions & 0 deletions src/extension/mod.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
mod registry;

pub use registry::{ExtensionRegistry, TiffExtension, TiffExtensionFactory};
149 changes: 149 additions & 0 deletions src/extension/registry.rs
@@ -0,0 +1,149 @@
use std::collections::{HashMap, HashSet};
use std::sync::{Arc, LazyLock};

use crate::geo::GeoKeyDirectory;
use crate::tags::Tag;
use crate::{tags, TagValue};

/// A registry for extensions that extend the set of tags able to be parsed from the TIFF
/// [`ImageFileDirectory`].
#[derive(Debug)]
pub struct ExtensionRegistry(HashMap<String, Box<dyn TiffExtensionFactory>>);
Member Author

The idea is for the user to define one or more TiffExtensionFactory implementations to be used when parsing a TIFF. Each one defines how to create a new struct to hold parsed TIFF tag data.
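
A minimal sketch of that idea, using stand-in types rather than the crate's real ones. `ParsedTags`, `NoDataFactory`, and the `create`/`len` methods are hypothetical names introduced only for illustration; the real factory trait in this diff has a different shape.

```rust
use std::collections::HashMap;
use std::fmt::Debug;

/// Stand-in for the struct an extension uses to hold parsed tag data (hypothetical).
#[derive(Debug, Default, PartialEq)]
pub struct ParsedTags {
    pub values: HashMap<u16, String>,
}

/// Simplified factory: it knows its name and how to create an empty holder.
pub trait TiffExtensionFactory: Debug + Send + Sync {
    fn name(&self) -> &str;
    fn create(&self) -> ParsedTags;
}

/// Registry keyed by extension name, mirroring the newtype in this diff.
#[derive(Debug)]
pub struct ExtensionRegistry(HashMap<String, Box<dyn TiffExtensionFactory>>);

impl ExtensionRegistry {
    pub fn new() -> Self {
        Self(HashMap::new())
    }

    /// Registers a factory under its own name; a later factory with the
    /// same name replaces the earlier one.
    pub fn add(&mut self, factory: Box<dyn TiffExtensionFactory>) {
        self.0.insert(factory.name().to_string(), factory);
    }

    /// Creates a fresh holder from the named factory, if it is registered.
    pub fn create(&self, name: &str) -> Option<ParsedTags> {
        self.0.get(name).map(|factory| factory.create())
    }

    pub fn len(&self) -> usize {
        self.0.len()
    }
}

/// A user-defined factory (hypothetical).
#[derive(Debug)]
pub struct NoDataFactory;

impl TiffExtensionFactory for NoDataFactory {
    fn name(&self) -> &str {
        "NoData"
    }

    fn create(&self) -> ParsedTags {
        ParsedTags::default()
    }
}
```

Keying by name means registering the same extension twice is harmless, but it does not detect two differently named extensions claiming the same tag.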


impl ExtensionRegistry {
/// Create a new extension registry with no extensions registered
pub fn new() -> Self {
Self(HashMap::new())
}

pub fn add(&mut self, extension: Box<dyn TiffExtensionFactory>) {
// TODO: assert that no two extensions register the same tag values?
// Or, you could allow multiple extensions to register the same tag, and call all
// known extensions for every tag id they register.
self.0.insert(extension.name().to_string(), extension);
}

pub(crate) fn inner(&self) -> &HashMap<String, Box<dyn TiffExtensionFactory>> {
&self.0
}
}

/// Something that knows how to create a TIFF extension.
pub trait TiffExtensionFactory: std::fmt::Debug + Send + Sync {
/// The name of the extension.
fn name(&self) -> &str;

fn from_tags(&self, tag_data: HashMap<Tag, TagValue>) -> Box<dyn TiffExtension>;
Member Author

Defines how the factory creates a new TiffExtension instance from tags of

One difficult thing is that the tags here are provided by value,

Really we want an API where the TiffExtensionFactory builds up extension information iteratively: it declares the tags it claims to support, and when a tag with one of those keys is found, the IFD parser delegates to this extension builder.

So perhaps we need three things: TiffExtensionFactory, TiffExtensionBuilder, and TiffExtension. A factory defines how to create a builder. A builder receives tag data one-by-one. Then after the last tag a builder is "finished" into a standalone metadata instance.

Contributor

feefladder Jan 27, 2026

So perhaps we need three things: TiffExtensionFactory, TiffExtensionBuilder, and TiffExtension. A factory defines how to create a builder. A builder receives tag data one-by-one. Then after the last tag a builder is "finished" into a standalone metadata instance.

Hi Kyle, this seems great, also nice to see a lot is happening again. Just to summarize, that would look something like this:

pub trait TiffExtensionFactory: std::fmt::Debug + Send + Sync {
    fn name(&self) -> &str;
    fn create_builder(&self) -> Box<dyn TiffExtensionBuilder>;
}

pub trait TiffExtensionBuilder: std::fmt::Debug + Send + Sync {
    fn supported_tags(&self) -> &HashSet<u16>;
    /// Insert the tag into self.
    fn insert(&mut self, tag: u16, value: TagValue);
    /// Finish parsing self, returning the TiffExtension on success.
    fn finish(&mut self) -> Result<Box<dyn TiffExtension>, AsyncTiffError>;
}

/// The main thing is that it can be put in a HashMap. The `std::any::Any` supertrait
/// lets callers get at the underlying data structure through `downcast_ref()`.
pub trait TiffExtension: std::any::Any {
    fn as_any(&self) -> &dyn std::any::Any;
}
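
To check that the three traits fit together, here is a self-contained sketch under stated assumptions: `TagValue` is a simplified stand-in, `String` stands in for `AsyncTiffError`, and `PixelScale`, `PixelScaleBuilder`, `PixelScaleFactory`, and `parse_with` are hypothetical names. It shows a factory creating a builder, the builder receiving tags one by one, and the finished extension being recovered through `downcast_ref()`.

```rust
use std::any::Any;
use std::collections::HashSet;
use std::fmt::Debug;

/// Stand-in for the crate's `TagValue` (assumption: the real type is richer).
#[derive(Debug, Clone, PartialEq)]
pub enum TagValue {
    Ascii(String),
    Doubles(Vec<f64>),
}

pub trait TiffExtension: Any + Debug {
    fn as_any(&self) -> &dyn Any;
}

pub trait TiffExtensionBuilder: Debug {
    fn supported_tags(&self) -> &HashSet<u16>;
    fn insert(&mut self, tag: u16, value: TagValue);
    /// `String` stands in for `AsyncTiffError` here.
    fn finish(&mut self) -> Result<Box<dyn TiffExtension>, String>;
}

pub trait TiffExtensionFactory: Debug {
    fn name(&self) -> &str;
    fn create_builder(&self) -> Box<dyn TiffExtensionBuilder>;
}

/// Hypothetical extension holding only the pixel scale (tag 33550).
#[derive(Debug, PartialEq)]
pub struct PixelScale(pub Vec<f64>);

impl TiffExtension for PixelScale {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

#[derive(Debug)]
pub struct PixelScaleBuilder {
    tags: HashSet<u16>,
    scale: Option<Vec<f64>>,
}

impl TiffExtensionBuilder for PixelScaleBuilder {
    fn supported_tags(&self) -> &HashSet<u16> {
        &self.tags
    }

    fn insert(&mut self, _tag: u16, value: TagValue) {
        // Only one tag is supported, so the tag id needs no further matching.
        if let TagValue::Doubles(v) = value {
            self.scale = Some(v);
        }
    }

    fn finish(&mut self) -> Result<Box<dyn TiffExtension>, String> {
        let scale = self.scale.take().ok_or("ModelPixelScale tag missing")?;
        Ok(Box::new(PixelScale(scale)))
    }
}

#[derive(Debug)]
pub struct PixelScaleFactory;

impl TiffExtensionFactory for PixelScaleFactory {
    fn name(&self) -> &str {
        "PixelScale"
    }

    fn create_builder(&self) -> Box<dyn TiffExtensionBuilder> {
        Box::new(PixelScaleBuilder { tags: HashSet::from([33550]), scale: None })
    }
}

/// Feeds every claimed tag to the builder, then finishes it.
pub fn parse_with(
    factory: &dyn TiffExtensionFactory,
    tags: Vec<(u16, TagValue)>,
) -> Result<Box<dyn TiffExtension>, String> {
    let mut builder = factory.create_builder();
    for (tag, value) in tags {
        if builder.supported_tags().contains(&tag) {
            builder.insert(tag, value);
        }
    }
    builder.finish()
}
```

A caller would then write something like `ext.as_any().downcast_ref::<PixelScale>()` to get the concrete type back.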

Member Author

Yeah, pretty much

Contributor

One thought: it may very well happen that an IFD does not contain any supported tags. In a COG, for example, only the first IFD contains geo data (right now I can only find this section). So maybe `finish() -> Option<>` should handle that (quite common?) case?
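
One possible reading of that suggestion, sketched with hypothetical names (`NoData`, `NoDataBuilder`) and `String` standing in for the crate's error type: `finish` returns `Result<Option<_>, _>`, so "no supported tags in this IFD" stays distinct from "a tag was present but invalid".

```rust
/// Hypothetical parsed metadata: only the GDAL nodata string (tag 42113).
#[derive(Debug, PartialEq)]
pub struct NoData(pub String);

#[derive(Debug, Default)]
pub struct NoDataBuilder {
    nodata: Option<String>,
}

impl NoDataBuilder {
    pub const GDAL_NODATA: u16 = 42113;

    /// Stores the value if the tag is the one this builder supports;
    /// any other tag is ignored.
    pub fn insert(&mut self, tag: u16, value: String) {
        if tag == Self::GDAL_NODATA {
            self.nodata = Some(value);
        }
    }

    /// Returns `Ok(None)` when the IFD held none of the supported tags,
    /// `Err` when the tag was present but blank, and `Ok(Some(..))` otherwise.
    pub fn finish(&mut self) -> Result<Option<NoData>, String> {
        match self.nodata.take() {
            None => Ok(None),
            Some(s) if s.trim().is_empty() => Err("empty GDAL_NODATA value".to_string()),
            Some(s) => Ok(Some(NoData(s))),
        }
    }
}
```

With this shape, an overview IFD that carries no geo tags simply yields `Ok(None)`, and the IFD parser can skip storing an extension entry for it.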

}

/// Something that holds parsed IFD extension data.
pub trait TiffExtension: std::fmt::Debug + Send + Sync {
/// The name of the extension.
fn name(&self) -> &str;

/// The u16 tag values this extension supports.
fn supported_tags(&self) -> &HashSet<u16>;

fn insert(&mut self, tag: u16, value: TagValue);

fn finish(&mut self);

fn parse_tag(&self, tag: u16) -> Option<Self::Tags>;
}

pub trait IfdExtension: std::fmt::Debug + Send + Sync {}

#[derive(Debug)]
pub struct GeoTIFFExtensionFactory;

impl TiffExtensionFactory for GeoTIFFExtensionFactory {
fn name(&self) -> &str {
"GeoTIFF"
}

fn from_tags(&self, tag_data: HashMap<Tag, TagValue>) -> Box<dyn TiffExtension> {
let mut geo_key_directory_data = None;
let mut model_pixel_scale = None;
let mut model_tiepoint = None;
let mut model_transformation = None;
let mut geo_ascii_params: Option<String> = None;
let mut geo_double_params: Option<Vec<f64>> = None;
let mut gdal_nodata = None;
let mut gdal_metadata = None;

for (k, v) in tag_data {}
}
}

#[derive(Debug)]
pub struct GeoTIFFExtension {
Member Author

An example of a parsed GeoTIFF extension. This will then expose the parsed information to the end user.

// Geospatial tags
pub(crate) geo_key_directory: Option<GeoKeyDirectory>,
pub(crate) model_pixel_scale: Option<Vec<f64>>,
pub(crate) model_tiepoint: Option<Vec<f64>>,
pub(crate) model_transformation: Option<Vec<f64>>,

// GDAL tags
pub(crate) gdal_nodata: Option<String>,
pub(crate) gdal_metadata: Option<String>,
}

tags! {
enum GeoTIFFTag(u16) {
ModelPixelScale = 33550,
ModelTransformation = 34264,
ModelTiepoint = 33922,
GeoKeyDirectory = 34735,
GeoDoubleParams = 34736,
GeoAsciiParams = 34737,
GdalNodata = 42113,
GdalMetadata = 42112,
}
}

static GEOTIFF_TAGS: LazyLock<HashSet<u16>> = LazyLock::new(|| {
[
GeoTIFFTag::ModelPixelScale,
GeoTIFFTag::ModelTransformation,
GeoTIFFTag::ModelTiepoint,
GeoTIFFTag::GeoKeyDirectory,
GeoTIFFTag::GeoDoubleParams,
GeoTIFFTag::GeoAsciiParams,
GeoTIFFTag::GdalNodata,
GeoTIFFTag::GdalMetadata,
]
.map(|x| x.to_u16())
.iter()
.copied()
.collect()
});

impl TiffExtension for GeoTIFFExtension {
fn name(&self) -> &str {
"GeoTIFF"
}

fn supported_tags(&self) -> &HashSet<u16> {
&GEOTIFF_TAGS
}

fn insert(&mut self, tag: u16, value: TagValue) {
let geotiff_tag =
GeoTIFFTag::from_u16(tag).expect("tag should be supported by this extension");

match geotiff_tag {
// Geospatial tags
// http://geotiff.maptools.org/spec/geotiff2.4.html
GeoTIFFTag::GeoKeyDirectory => {
self.geo_key_directory_data = Some(value.into_u16_vec()?)
}
GeoTIFFTag::ModelPixelScale => model_pixel_scale = Some(value.into_f64_vec()?),
GeoTIFFTag::ModelTiepoint => model_tiepoint = Some(value.into_f64_vec()?),
GeoTIFFTag::ModelTransformation => model_transformation = Some(value.into_f64_vec()?),
GeoTIFFTag::GeoAsciiParams => geo_ascii_params = Some(value.into_string()?),
GeoTIFFTag::GeoDoubleParams => geo_double_params = Some(value.into_f64_vec()?),
GeoTIFFTag::GdalNodata => gdal_nodata = Some(value.into_string()?),
GeoTIFFTag::GdalMetadata => gdal_metadata = Some(value.into_string()?),
}
}
}
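
The `insert` body above uses `?` while the trait signature returns `()`, and several arms assign to locals rather than to `self`. One way to reconcile that, sketched here with stand-in types (`TagValue`, the `String` error, and `GeoFields` are assumptions, not the crate's definitions), is to have `insert` return a `Result` and write through `self`:

```rust
/// Stand-in for the crate's `TagValue` (assumption).
#[derive(Debug, Clone)]
pub enum TagValue {
    Ascii(String),
    Doubles(Vec<f64>),
}

impl TagValue {
    pub fn into_f64_vec(self) -> Result<Vec<f64>, String> {
        match self {
            TagValue::Doubles(v) => Ok(v),
            other => Err(format!("expected doubles, got {other:?}")),
        }
    }

    pub fn into_string(self) -> Result<String, String> {
        match self {
            TagValue::Ascii(s) => Ok(s),
            other => Err(format!("expected ASCII, got {other:?}")),
        }
    }
}

/// Hypothetical subset of the GeoTIFF extension fields.
#[derive(Debug, Default)]
pub struct GeoFields {
    pub model_pixel_scale: Option<Vec<f64>>,
    pub gdal_nodata: Option<String>,
}

impl GeoFields {
    /// Returns an error for an unsupported tag or a value of the wrong type,
    /// instead of panicking, and stores successful conversions on `self`.
    pub fn insert(&mut self, tag: u16, value: TagValue) -> Result<(), String> {
        match tag {
            33550 => self.model_pixel_scale = Some(value.into_f64_vec()?),
            42113 => self.gdal_nodata = Some(value.into_string()?),
            other => return Err(format!("tag {other} is not supported by this extension")),
        }
        Ok(())
    }
}
```

Returning `Result` also lets the IFD parser propagate conversion failures with `?` from inside its `try_for_each` closure.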
117 changes: 51 additions & 66 deletions src/ifd.rs
@@ -5,6 +5,7 @@ use bytes::Bytes;
use num_enum::TryFromPrimitive;

use crate::error::{AsyncTiffError, AsyncTiffResult, TiffError};
use crate::extension::{ExtensionRegistry, TiffExtension};
use crate::geo::{GeoKeyDirectory, GeoKeyTag};
use crate::predictor::PredictorInfo;
use crate::reader::{AsyncFileReader, Endianness};
@@ -133,15 +134,6 @@ pub struct ImageFileDirectory {

pub(crate) copyright: Option<String>,

// Geospatial tags
pub(crate) geo_key_directory: Option<GeoKeyDirectory>,
pub(crate) model_pixel_scale: Option<Vec<f64>>,
pub(crate) model_tiepoint: Option<Vec<f64>>,
pub(crate) model_transformation: Option<Vec<f64>>,

// GDAL tags
pub(crate) gdal_nodata: Option<String>,
pub(crate) gdal_metadata: Option<String>,
pub(crate) other_tags: HashMap<Tag, TagValue>,
}

@@ -150,6 +142,7 @@ impl ImageFileDirectory {
pub fn from_tags(
tag_data: HashMap<Tag, TagValue>,
endianness: Endianness,
extension_registry: &ExtensionRegistry,
) -> AsyncTiffResult<Self> {
let mut new_subfile_type = None;
let mut image_width = None;
@@ -184,16 +177,13 @@ impl ImageFileDirectory {
let mut sample_format = None;
let mut jpeg_tables = None;
let mut copyright = None;
let mut geo_key_directory_data = None;
let mut model_pixel_scale = None;
let mut model_tiepoint = None;
let mut model_transformation = None;
let mut geo_ascii_params: Option<String> = None;
let mut geo_double_params: Option<Vec<f64>> = None;
let mut gdal_nodata = None;
let mut gdal_metadata = None;

let mut other_tags = HashMap::new();
let mut extension_instances: HashMap<String, Box<dyn TiffExtension>> = extension_registry
.inner()
.iter()
.map(|(name, factory)| (name.clone(), factory.create()))
.collect();

tag_data.into_iter().try_for_each(|(tag, value)| {
match tag {
@@ -253,20 +243,15 @@ impl ImageFileDirectory {
Tag::JPEGTables => jpeg_tables = Some(value.into_u8_vec()?.into()),
Tag::Copyright => copyright = Some(value.into_string()?),

// Geospatial tags
// http://geotiff.maptools.org/spec/geotiff2.4.html
Tag::GeoKeyDirectory => geo_key_directory_data = Some(value.into_u16_vec()?),
Tag::ModelPixelScale => model_pixel_scale = Some(value.into_f64_vec()?),
Tag::ModelTiepoint => model_tiepoint = Some(value.into_f64_vec()?),
Tag::ModelTransformation => model_transformation = Some(value.into_f64_vec()?),
Tag::GeoAsciiParams => geo_ascii_params = Some(value.into_string()?),
Tag::GeoDoubleParams => geo_double_params = Some(value.into_f64_vec()?),
Tag::GdalNodata => gdal_nodata = Some(value.into_string()?),
Tag::GdalMetadata => gdal_metadata = Some(value.into_string()?),
// Tags for which the tiff crate doesn't have a hard-coded enum variant
Tag::Unknown(DOCUMENT_NAME) => document_name = Some(value.into_string()?),
_ => {
other_tags.insert(tag, value);
Tag::Unknown(extension_tag_name) => {
for extension in extension_instances.values_mut() {
if extension.supported_tags().contains(&extension_tag_name) {
extension.insert(extension_tag_name, value);
return Ok(());
}
}
}
};
Ok::<_, TiffError>(())
@@ -629,43 +614,43 @@ impl ImageFileDirectory {
self.copyright.as_deref()
}

/// Geospatial tags
/// <https://web.archive.org/web/20240329145313/https://www.awaresystems.be/imaging/tiff/tifftags/geokeydirectorytag.html>
pub fn geo_key_directory(&self) -> Option<&GeoKeyDirectory> {
self.geo_key_directory.as_ref()
}

/// Used in interchangeable GeoTIFF files.
/// <https://web.archive.org/web/20240329145238/https://www.awaresystems.be/imaging/tiff/tifftags/modelpixelscaletag.html>
pub fn model_pixel_scale(&self) -> Option<&[f64]> {
self.model_pixel_scale.as_deref()
}

/// Used in interchangeable GeoTIFF files.
/// <https://web.archive.org/web/20240329145303/https://www.awaresystems.be/imaging/tiff/tifftags/modeltiepointtag.html>
pub fn model_tiepoint(&self) -> Option<&[f64]> {
self.model_tiepoint.as_deref()
}

/// Stores a full 4×4 affine transformation matrix that maps pixel/line coordinates directly
/// into model (map) coordinates.
pub fn model_transformation(&self) -> Option<&[f64]> {
self.model_transformation.as_deref()
}

/// GDAL NoData value
/// <https://gdal.org/en/stable/drivers/raster/gtiff.html#nodata-value>
pub fn gdal_nodata(&self) -> Option<&str> {
self.gdal_nodata.as_deref()
}

/// GDAL Metadata XML information
///
/// Non standard metadata items are grouped together into a XML string stored in the non
/// standard `TIFFTAG_GDAL_METADATA` ASCII tag (code `42112`).
pub fn gdal_metadata(&self) -> Option<&str> {
self.gdal_metadata.as_deref()
}
// /// Geospatial tags
// /// <https://web.archive.org/web/20240329145313/https://www.awaresystems.be/imaging/tiff/tifftags/geokeydirectorytag.html>
// pub fn geo_key_directory(&self) -> Option<&GeoKeyDirectory> {
// self.geo_key_directory.as_ref()
// }

// /// Used in interchangeable GeoTIFF files.
// /// <https://web.archive.org/web/20240329145238/https://www.awaresystems.be/imaging/tiff/tifftags/modelpixelscaletag.html>
// pub fn model_pixel_scale(&self) -> Option<&[f64]> {
// self.model_pixel_scale.as_deref()
// }

// /// Used in interchangeable GeoTIFF files.
// /// <https://web.archive.org/web/20240329145303/https://www.awaresystems.be/imaging/tiff/tifftags/modeltiepointtag.html>
// pub fn model_tiepoint(&self) -> Option<&[f64]> {
// self.model_tiepoint.as_deref()
// }

// /// Stores a full 4×4 affine transformation matrix that maps pixel/line coordinates directly
// /// into model (map) coordinates.
// pub fn model_transformation(&self) -> Option<&[f64]> {
// self.model_transformation.as_deref()
// }

// /// GDAL NoData value
// /// <https://gdal.org/en/stable/drivers/raster/gtiff.html#nodata-value>
// pub fn gdal_nodata(&self) -> Option<&str> {
// self.gdal_nodata.as_deref()
// }

// /// GDAL Metadata XML information
// ///
// /// Non standard metadata items are grouped together into a XML string stored in the non
// /// standard `TIFFTAG_GDAL_METADATA` ASCII tag (code `42112`).
// pub fn gdal_metadata(&self) -> Option<&str> {
// self.gdal_metadata.as_deref()
// }

/// Tags for which the tiff crate doesn't have a hard-coded enum variant.
pub fn other_tags(&self) -> &HashMap<Tag, TagValue> {
1 change: 1 addition & 0 deletions src/lib.rs
@@ -3,6 +3,7 @@

mod array;
mod data_type;
mod extension;
#[cfg(feature = "ndarray")]
pub mod ndarray;
pub mod reader;
10 changes: 1 addition & 9 deletions src/tags.rs
@@ -2,6 +2,7 @@
#![allow(missing_docs)]
#![allow(clippy::upper_case_acronyms)]

#[macro_export]
macro_rules! tags {
{
// Permit arbitrary meta items, which include documentation.
@@ -127,15 +128,6 @@ pub enum Tag(u16) unknown("A private or extension tag") {
SMaxSampleValue = 341, // TODO add support
// JPEG
JPEGTables = 347,
// GeoTIFF
ModelPixelScale = 33550, // (SoftDesk)
ModelTransformation = 34264, // (JPL Carto Group)
ModelTiepoint = 33922, // (Intergraph)
GeoKeyDirectory = 34735, // (SPOT)
GeoDoubleParams = 34736, // (SPOT)
GeoAsciiParams = 34737, // (SPOT)
GdalNodata = 42113, // Contains areas with missing data
GdalMetadata = 42112, // XML metadata string
}
}
