Paged EntityVersions #25

Description

@elliotcmorris

Design Overview

Implements entityVersions using a Pager object (cursor) approach.

The Pager object is constructed as the outcome of a Manager/ManagerInterface function. The construction method (entityVersions) follows the same interface practices as existing methods, i.e., batch-first and callback-based with singular conveniences. A distinction is made between expected errors, dispatched to BatchElementError error callbacks, and exceptional errors, handled via language exceptions (see Error Handling).

Pagers are distinctive in that they provide a second-order interface to the manager that can be called after the lifetime of the initial method has ended. This raises some minor technical considerations concerning argument mutability (see Copying Mutable References).

Design Sketches

ManagerInterface (Cpp)
// Base Callback Formulation
using PagerInitSuccessCallback = std::function<void(Pager<EntityVersion>)>;
using PagerInitFailureCallback = std::function<void(size_t, BatchElementError)>;

/*
 * Construct a set of Pager objects capable of iterating in pages over
 * the EntityVersions of the provided entityReferences
 *
 * It is expected that the manager perform validation in this method.
 * You should initialize/assert the valid state of your backend,
 * throwing exceptions on errors.
 *
 * If the host provides you input that is malformed or in any
 * other way non-acceptable, you must call the error callback for the
 * specifically relevant Pager.
 *
 * Otherwise, for each input EntityReference, the provided success
 * callback should be called, provided with a constructed Pager object
 * ready to be iterated over.
 *
 * @throws std::exception. Any standard exception may be emitted from
 * this function to handle failure to initialize the backend.
 */
void ManagerInterface::entityVersions(
    const std::vector<EntityReference> &entityReferences,
    const TraitsDataPtr& queryPredicate,
    const ContextConstPtr &context, const HostSessionPtr& hostSession,
    size_t pageSize, PagerInitSuccessCallback successCb,
    PagerInitFailureCallback failureCb);
Manager (Cpp)
// Base Callback Formulation
using PagerInitSuccessCallback = std::function<void(Pager<EntityVersion>)>;
using PagerInitFailureCallback = std::function<void(size_t, BatchElementError)>;

/*
 * Construct a set of Pager objects capable of iterating in pages over
 * the EntityVersions of the provided entityReferences
 *
 * It is expected that the manager perform validation in this method.
 * The host can expect, if the input provided is malformed or in any
 * other way non-acceptable to the manager, to receive an error callback.
 * Otherwise, for each input EntityReference, the host will receive
 * a constructed Pager object in the success callback.
 *
 * This method may perform a network query as part of validation,
 * so do not assume a quick return.
 *
 * @throws std::exception. Any standard exception may be emitted from
 * this function to handle failure to initialize the backend.
 */
void Manager::entityVersions(
    const std::vector<EntityReference> &entityReferences,
    const TraitsDataPtr& queryPredicate,
    const ContextConstPtr &context, const HostSessionPtr& hostSession,
    size_t pageSize, PagerInitSuccessCallback successCb,
    PagerInitFailureCallback failureCb);

/*
 * Singular Variant Convenience
 * Uses the batch version as its underlying implementation
 */
std::variant<Pager<EntityVersion>, BatchElementError> entityVersions(
   const EntityReference &entityReference,
   const TraitsDataPtr& queryPredicate,
   const ContextConstPtr &context, const HostSessionPtr& hostSession,
   size_t pageSize);

/*
 * Singular Exceptional Convenience
 * Uses the batch version as its underlying implementation
 */
Pager<EntityVersion> entityVersions(
   const EntityReference &entityReference,
   const TraitsDataPtr& queryPredicate,
   const ContextConstPtr &context, const HostSessionPtr& hostSession,
   size_t pageSize);
  
/*
 * Multi Variant Convenience
 * Uses the batch version as its underlying implementation
 */
std::vector<std::variant<Pager<EntityVersion>, BatchElementError>> entityVersions(
   const EntityReferences &entityReferences,
   const TraitsDataPtr& queryPredicate,
   const ContextConstPtr &context, const HostSessionPtr& hostSession,
   size_t pageSize);

/*
 * Multi Exceptional Convenience
 * Uses the batch version as its underlying implementation
 */
std::vector<Pager<EntityVersion>> entityVersions(
   const EntityReferences &entityReferences,
   const TraitsDataPtr& queryPredicate,
   const ContextConstPtr &context, const HostSessionPtr& hostSession,
   size_t pageSize);
PagerInterface (Cpp)
/*
 * PagerInterface, implemented by the manager.
 * Deals with the retrieval of paginated data from the backend at the
 * behest of the host.
 *
 * To support as wide an array of backends as possible, OpenAssetIO
 * places no constraints on the performance of this type. However,
 * it is considered friendly to document the performance
 * characteristics of your Pager implementation.
 */
template <class Elem>
class PagerInterface {
 public:
    // Still need shared_ptr because Python.
    using Ptr = std::shared_ptr<PagerInterface<Elem>>;
    using Page = std::vector<Elem>;

    // Note: constructors cannot be virtual (or pure) in C++.
    explicit PagerInterface(const HostSessionPtr&);

    // Explicitly disallow copying.
    PagerInterface(const PagerInterface&) = delete;
    PagerInterface<Elem>& operator=(const PagerInterface<Elem>&) = delete;
    // Allow moving.
    PagerInterface(PagerInterface&&) noexcept = default;
    PagerInterface<Elem>& operator=(PagerInterface<Elem>&&) noexcept = default;

    /* Manager should override destructor to be notified when query has
     * finished.
     */
    virtual ~PagerInterface() = default;

    /*
     * Whether or not there is more data accessible by advancing the
     * page. The mechanism to acquire this information is variable, and
     * left up to the specifics of the backend implementation.
     */
    virtual bool hasMore(const HostSessionPtr&) = 0;

    /*
     * get and next are separated to allow traversal through a list
     * without necessarily fetching all the data.
     *
     * Whilst it is not a set rule that `next` must be a cheap operation
     * as some backends cannot support this, it is considered friendly
     * for `next` to be a cheap call that allows traversal, whilst `get`
     * performs the actual work.
     * Documenting the performance characteristics of these methods is
     * highly encouraged.
     */
    virtual Page get(const HostSessionPtr&) = 0;
    virtual void next(const HostSessionPtr&) = 0;
};
Pager (Cpp)
/*
 * The OpenAssetIO PagerInterface wrapper object. Following a similar
 * pattern to Manager and ManagerInterface
 *
 * This is the object the host interacts with, allowing for the
 * addition of convenience functions, as well as the insertion of
 * middle-layer functionality (such as the imminent arrival of an
 * auditing system).
 *
 * Due to the variance of backends, construction, `hasMore`, `get` and
 * `next` may all reasonably need to perform non-trivial, networked
 * operations, and thus performance should not be assumed.
 */
template <class Elem>
class Pager {
 public:
    using Ptr = std::shared_ptr<Pager<Elem>>;
    using Page = typename PagerInterface<Elem>::Page;

    /*
     * Pager cannot be copied, as each object represents a single
     * paginated Query.
     * Destruction of this object is tantamount to closing the query.
     */
    Pager(typename PagerInterface<Elem>::Ptr pagerInterface, const HostSessionPtr&);
    Pager(const Pager&) = delete;
    Pager<Elem>& operator=(const Pager<Elem>&) = delete;
    Pager(Pager&&) noexcept = default;
    Pager<Elem>& operator=(Pager<Elem>&&) noexcept = default;
    ~Pager() = default;

    /*
     * Whether or not there is more data accessible by advancing the
     * page. This method may be slow depending on backend implementation.
     */
    bool hasMore();

    /*
     * get and next are separated to allow traversal through a list
     * without necessarily fetching all the data.
     * Neither of these methods however, are guaranteed to be cheap,
     * as some backends have no choice but to retrieve the data when
     * they advance the page.
     */
    Page get();
    void next();


 private:
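    typename PagerInterface<Elem>::Ptr pagerInterface_;
    // Retained so it can be forwarded to the PagerInterface methods.
    HostSessionPtr hostSession_;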
};

Implementation Notes

Error Handling

Error handling will follow the same style as previous methods in Manager: errors in the input data yield to an error callback with an index and a BatchElementError (see note), whilst errors in the backend (i.e., unexpected, "the backend isn't working" sorts of errors) will be thrown as exceptions.

Note

Speccing out this work made the inappropriateness of BatchElementError as a name for this error type clearer, as really it's more of an InputArgumentsAreWrongSomehowError. This is already showing a bit of dissonance for the singular, non-batch conveniences. We plan to rename this at some point.

This error handling philosophy applies to entityVersions. However, an important concept to note is that once the Pager has been constructed, non-exceptional errors are no longer expected. This is why the Pager maintains a more convenient return-based interface, and has no capability to communicate via error types.
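As a rough illustration of this split, here is a minimal sketch using stand-in types. `EntityVersionsPager`, the `backendAvailable` flag, and the function names `entityVersionsBatch` and `entityVersionsSingle` are all assumptions made for the example, not the proposed API; only the callback shapes follow the design sketch above.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Stand-in types for illustration only; the real OpenAssetIO types differ.
using EntityReference = std::string;
struct BatchElementError {
  std::string message;
};
struct EntityVersionsPager {
  EntityReference ref;  // The entity whose versions would be paged.
};

using PagerInitSuccessCallback = std::function<void(EntityVersionsPager)>;
using PagerInitFailureCallback = std::function<void(std::size_t, BatchElementError)>;

// Hypothetical batch-first entry point. Expected, per-element errors go
// to failureCb; a broken backend is reported by throwing.
void entityVersionsBatch(const std::vector<EntityReference>& refs, bool backendAvailable,
                         const PagerInitSuccessCallback& successCb,
                         const PagerInitFailureCallback& failureCb) {
  assert(successCb && failureCb);  // Both callbacks must be callable.
  if (!backendAvailable) {
    throw std::runtime_error("backend unavailable");  // Exceptional error.
  }
  for (std::size_t idx = 0; idx < refs.size(); ++idx) {
    if (refs[idx].empty()) {
      failureCb(idx, BatchElementError{"malformed entity reference"});
    } else {
      successCb(EntityVersionsPager{refs[idx]});
    }
  }
}

// Hypothetical singular variant convenience, built on the batch version.
std::variant<EntityVersionsPager, BatchElementError> entityVersionsSingle(
    const EntityReference& ref, bool backendAvailable) {
  std::variant<EntityVersionsPager, BatchElementError> result{
      BatchElementError{"no callback invoked"}};
  entityVersionsBatch(
      {ref}, backendAvailable,
      [&result](EntityVersionsPager pager) { result = std::move(pager); },
      [&result](std::size_t, BatchElementError error) { result = std::move(error); });
  return result;
}
```

A host calling `entityVersionsSingle` would inspect the variant for input errors, and would only need a `try`/`catch` for the "backend isn't working" case. Once it holds a pager, neither mechanism is expected to be needed for non-exceptional conditions.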

Copying Mutable References

Due to Pagers having a lifetime that extends beyond the method call that provides the initial arguments, we are presented with a consideration concerning arguments passed by pointer (which, thanks to our Python integrations, is almost everything).

It would be a poorly behaved manager that changed the data pointed to by these arguments, as it would affect parts of the system outside its scope.

Two choices present themselves immediately:

  • Document that the manager must not mutate the data in these pointers, and that if they wish to work with it, they should themselves copy it.
  • Use the Manager/ManagerInterface dependency inversion to only pass copied objects to ManagerInterface::entityVersions, thus removing the ability for managers to create problems. (The copy mechanism may vary, a prime example being that Context likely wants to call createChildContext and use that.)

This is relatively minor, as yet undecided, and is left to be an implementation detail.
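For illustration, the second option might look roughly like the sketch below. All types here are simplified stand-ins, and `managerEntityVersions` / `interfaceEntityVersions` are hypothetical free functions standing in for the Manager and ManagerInterface methods; a plain copy constructor is assumed as the copy mechanism.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Stand-in for TraitsData: a mutable, pointer-held bag of properties.
struct TraitsData {
  std::map<std::string, std::string> properties;
};
using TraitsDataPtr = std::shared_ptr<TraitsData>;

// Stand-in pager that keeps the predicate alive beyond the creating call.
class SketchPager {
 public:
  explicit SketchPager(TraitsDataPtr predicate) : predicate_(std::move(predicate)) {}
  const TraitsData& predicate() const { return *predicate_; }

 private:
  TraitsDataPtr predicate_;
};

// Stand-in for ManagerInterface::entityVersions: the manager is free to
// retain whatever pointer it is handed.
SketchPager interfaceEntityVersions(TraitsDataPtr predicate) {
  return SketchPager{std::move(predicate)};
}

// Stand-in for Manager::entityVersions: the middle layer copies the
// host-owned data, so later mutations on either side stay isolated.
SketchPager managerEntityVersions(const TraitsDataPtr& hostPredicate) {
  assert(hostPredicate != nullptr);
  auto isolatedCopy = std::make_shared<TraitsData>(*hostPredicate);
  return interfaceEntityVersions(std::move(isolatedCopy));
}
```

With this arrangement, a host that changes its predicate after obtaining the pager does not alter what the pager sees, and a manager that mutates its copy cannot affect the host.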

Performance Philosophy

To enable the widest variety of backends, we have not placed hard performance limitations on either the method to construct a Pager or the members of the Pager itself.

However, it is unlikely that any manager backend will need to be slow in all of these cases. We should put a strong focus in the documentation on communicating the potentially slow performance of these methods to the host, whilst also communicating to the manager what a friendly manager would do (particularly concerning next and get), and encouraging managers to document their performance characteristics well.
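To make the "friendly" behaviour concrete, here is a small sketch of a pager whose next only moves a cursor while get does the (simulated) fetch. `InMemoryVersionsPager`, its `fetchCount` counter, and the `lastPage` helper are hypothetical names for this example, and an in-memory vector stands in for a real backend.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Stand-in element type; the real EntityVersion type is not yet finalized.
using EntityVersion = std::string;

// Hypothetical "friendly" pager: next() only moves the cursor, get() is
// where the (simulated) fetch from the backend happens.
class InMemoryVersionsPager {
 public:
  using Page = std::vector<EntityVersion>;

  InMemoryVersionsPager(std::vector<EntityVersion> backend, std::size_t pageSize)
      : backend_(std::move(backend)), pageSize_(pageSize) {
    assert(pageSize_ > 0);
  }

  // True if advancing would land on a non-empty page.
  bool hasMore() const { return offset_ + pageSize_ < backend_.size(); }

  // Simulated expensive call: copies the current page out of the backend.
  Page get() {
    ++fetchCount_;
    const std::size_t begin = std::min(offset_, backend_.size());
    const std::size_t end = std::min(begin + pageSize_, backend_.size());
    return Page(backend_.begin() + static_cast<std::ptrdiff_t>(begin),
                backend_.begin() + static_cast<std::ptrdiff_t>(end));
  }

  // Cheap call: no data is touched.
  void next() { offset_ += pageSize_; }

  // Number of times get() has been called, to show how often we "fetched".
  std::size_t fetchCount() const { return fetchCount_; }

 private:
  std::vector<EntityVersion> backend_;
  std::size_t pageSize_;
  std::size_t offset_ = 0;
  std::size_t fetchCount_ = 0;
};

// Host-side helper: skip straight to the last page, fetching only once.
InMemoryVersionsPager::Page lastPage(InMemoryVersionsPager& pager) {
  while (pager.hasMore()) {
    pager.next();
  }
  return pager.get();
}
```

A backend that cannot advance without fetching would be unable to offer this separation, which is why the proposal asks managers to document their behaviour rather than mandating it.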

Template implementation and Python

The above C++ implementations have been written as generic templates, as multiple paged methods will want to share the same implementation. However, in Python we won't have that capability, and will end up with bound aliases, i.e., EntityVersionsPager. We want to do the same in C++ and expose the same aliases, so that to the consumer they look like concrete types.
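On the C++ side this could be as simple as a type alias. The sketch below uses a trimmed-down `Pager` template and a placeholder `EntityVersion` struct, both assumptions for illustration; `firstVersionName` is a hypothetical consumer function.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Placeholder element type; the real EntityVersion is still to be finalized.
struct EntityVersion {
  std::string name;
};

// Trimmed-down generic pager, standing in for the Pager<Elem> template.
template <class Elem>
class Pager {
 public:
  using Page = std::vector<Elem>;
  explicit Pager(Page page) : page_(std::move(page)) {}
  Page get() const { return page_; }

 private:
  Page page_;
};

// The alias a consumer would see, mirroring the name bound in Python.
using EntityVersionsPager = Pager<EntityVersion>;

// An alias names the same type; it does not create a new one.
static_assert(std::is_same_v<EntityVersionsPager, Pager<EntityVersion>>);

// Consumers can then write signatures without any template syntax.
std::string firstVersionName(const EntityVersionsPager& pager) {
  const auto page = pager.get();
  assert(!page.empty());
  return page.front().name;
}
```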

Implementation Strategy

Two options have been proposed.

Prototype first.

  • Rough Python-only entityVersions using pager strategy, only one signature, no conveniences.
  • Update BAL to have entityVersions support, with trait(s) to test TraitsData filter.
  • Build real test suite in Python against Python-only entityVersions, using BAL.
  • C++ implementation of entityVersions, to full standard with all conveniences and documentation.
  • Retarget Python implementation to be based on C++ implementation.
  • Add additional Python tests for conveniences/batching/whatever else got turned green when moving over to full C++.
  • API compliance suite.

Ground-up

  • Port entityVersions to C++ in Pager formulation to full standard with all conveniences and documentation.
  • Update BAL to have entityVersions support, with trait(s) to test TraitsData filter.
  • Bind Python methods/objects from C++ implementation.
  • Build full, real test suite in Python, using BAL.
  • API compliance suite.

Blocked by

  • Finalizing EntityVersion type.
