exiftool "ColorMode" and "ColorSpaceData" are passed through un-normalized, despite being invalid values #83

@LeoniePhiline

exiftool -j may extract "ColorMode" and "ColorSpaceData" values which are not fit to be placed into sys_file_metadata.color_space unaltered.

Example:

exiftool -j "exampe-sRGB-black-and-white-file.jpg" \
    | jq '[.[] | {ColorMode, ColorSpaceData, ColorSpace}]' 
[
  {
    "ColorMode": "Grayscale",
    "ColorSpaceData": "GRAY",
    "ColorSpace": "sRGB"
  }
]

The extractor extension defines the color_space mapping as follows:

jq '.[] | select(.FAL == "color_space")' "extractor/Configuration/Services/ExifTool/default.json"
{
  "FAL": "color_space",
  "DATA": [
    "ColorMode",
    "ColorSpaceData",
    "ColorSpace->Causal\\Extractor\\Utility\\ColorSpace::normalize"
  ]
}

Since \Causal\Extractor\Service\Extraction\AbstractExtractionService::remapServiceOutput stops at the first entry that yields a non-null $value, the exiftool value "ColorMode" = "Grayscale" is picked up before the ColorSpace entry, the only one with a normalizer attached, is ever considered. The raw value is passed back to the TYPO3 metadata extraction service, where it is used as a parameter for an INSERT INTO sys_file_metadata.

However, the sys_file_metadata.color_space field is a VARCHAR(4), and the 9-character "Grayscale" does not fit, which causes an error when the database runs in strict mode.
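
To make the failure mode concrete, here is a minimal sketch of the "first non-null value wins" behaviour described above. It is written in Python purely for illustration; the extension itself is PHP, and `pick_first_non_null`, `current_mapping` and the `str.strip` placeholder are hypothetical stand-ins, not the extension's actual API.

```python
from typing import Callable, Optional

# Hypothetical stand-in for the "DATA" list: (exiftool key, optional post-processor).
Mapping = list[tuple[str, Optional[Callable[[str], str]]]]


def pick_first_non_null(exif: dict, mapping: Mapping) -> Optional[str]:
    """Return the first non-null value; a post-processor runs only if that entry has one."""
    for key, processor in mapping:
        value = exif.get(key)
        if value is not None:
            return processor(value) if processor else value
    return None


# Values taken from the exiftool output shown in this issue.
exif = {"ColorMode": "Grayscale", "ColorSpaceData": "GRAY", "ColorSpace": "sRGB"}

# Mirrors the current default.json: only the last entry has a post-processor.
current_mapping = [
    ("ColorMode", None),
    ("ColorSpaceData", None),
    ("ColorSpace", str.strip),  # placeholder for ColorSpace::normalize
]

color_space = pick_first_non_null(exif, current_mapping)
column_width = 4  # sys_file_metadata.color_space is VARCHAR(4)

# "Grayscale" is selected, and it is too long for the column.
print(color_space, len(color_space) > column_width)
```

With this mapping, the ColorSpace entry is never reached for files that carry a ColorMode value, so its normalizer has no chance to run.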

Furthermore, "Grayscale" is an invalid value. According to SYSEXT:filemetadata/Configuration/TCA/Overrides/sys_file_metadata.php, the correct grayscale color_space value would be "grey".

Thus, ColorMode = "Grayscale" and ColorSpaceData = "GRAY" must be normalized to the value "grey".

To my mind, this should be handled by the ColorSpace utility, using a configuration like this:

{
  "FAL": "color_space",
  "DATA": [
    "ColorMode->Causal\\Extractor\\Utility\\ColorSpace::normalize",
    "ColorSpaceData->Causal\\Extractor\\Utility\\ColorSpace::normalize",
    "ColorSpace->Causal\\Extractor\\Utility\\ColorSpace::normalize"
  ]
}

... and adjusting Causal\Extractor\Utility\ColorSpace::normalize to match on strings starting (lowercased) with "gray" or "grey" and replacing them with the canonical "grey" value.
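
The proposed matching rule could look roughly like the sketch below. It is written in Python only to illustrate the logic; the real change would go into the PHP method Causal\Extractor\Utility\ColorSpace::normalize. The function name `normalize_color_space` is hypothetical, and the pass-through for other values is an assumption, since the existing normalizer may already treat them differently.

```python
def normalize_color_space(value: str) -> str:
    """Hypothetical sketch of the proposed rule, not the extension's actual code."""
    trimmed = value.strip()
    # Match "Grayscale", "GRAY", "grey", ... case-insensitively by prefix.
    if trimmed.lower().startswith(("gray", "grey")):
        return "grey"  # canonical value expected by sys_file_metadata.color_space
    # Assumption: every other value is left for the existing logic to handle.
    return trimmed


for raw in ("Grayscale", "GRAY", "sRGB"):
    print(raw, "->", normalize_color_space(raw))
```

Applied to the example file, both ColorMode = "Grayscale" and ColorSpaceData = "GRAY" would map to "grey", which fits the VARCHAR(4) column.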
