Handling larger clusters #13

Description

@rjpower

When we have arrays with more than a few hundred tiles, I've noticed that our performance drops significantly; this is almost certainly due to the various extent operations needed to compute tiles. We can move the extent code to Cython, which should give us a big speedup.
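
Before porting anything, it is probably worth confirming where the time actually goes. Below is a minimal profiling sketch using the standard library's `cProfile` and `pstats`. The names `intersects`, `find_overlapping`, `make_tiles`, and `profile_lookup` are hypothetical stand-ins, not functions from this codebase; swap in a real workload that touches the extent code.

```python
import cProfile
import io
import pstats


def intersects(a, b):
    """Return True if two (ul, lr) extents overlap on every axis.

    Hypothetical stand-in for an extent operation; lr is exclusive.
    """
    (a_ul, a_lr), (b_ul, b_lr) = a, b
    return all(au < bl and bu < al
               for au, al, bu, bl in zip(a_ul, a_lr, b_ul, b_lr))


def find_overlapping(tiles, region):
    """Linear scan over the tile list: the pattern suspected to be slow."""
    return [i for i, tile in enumerate(tiles) if intersects(tile, region)]


def make_tiles(n, size):
    """Build an n x n grid of square 2-D tiles as (ul, lr) extents."""
    return [((i * size, j * size), ((i + 1) * size, (j + 1) * size))
            for i in range(n) for j in range(n)]


def profile_lookup(tiles_per_side=30, n_queries=50):
    """Profile repeated region lookups and return (hits, report text)."""
    tiles = make_tiles(tiles_per_side, 10)
    region = ((5, 5), (25, 25))
    hits = []

    profiler = cProfile.Profile()
    profiler.enable()
    for _ in range(n_queries):
        hits = find_overlapping(tiles, region)
    profiler.disable()

    # Collect the five most expensive entries by cumulative time.
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(5)
    return hits, out.getvalue()


if __name__ == "__main__":
    hits, report = profile_lookup()
    print(report)
```

If the extent helpers dominate the cumulative column as the tile count grows, that would support moving them to Cython first.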

Also, the vast majority of arrays have tiles that are all the same shape; we can leverage this to avoid scanning the tile list, and instead compute the target tile directly from the tile shape, e.g.:

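def pos_to_tile(pos, tile_shape, array_shape):
  # Tile coordinates along each axis; integer division, not true division.
  tx = pos[0] // tile_shape[0]
  ty = pos[1] // tile_shape[1]
  ...
  # Number of tiles along axis 0, rounded up to count a partial edge tile.
  num_tiles_x = (array_shape[0] + tile_shape[0] - 1) // tile_shape[0]
  return ty * num_tiles_x + tx
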
  • Run profiles to find bottlenecks for arrays with many tiles
  • Migrate extent.py to Cython
  • Special handling for regular tile shapes
