4 changes: 4 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,10 @@ Please mark backwards incompatible changes with an exclamation mark at the start

## [Unreleased]

### Added
- The `Elasticsearch::Indexes` class. A class which allows multiple indexes to
be used (fed or queried) at the same time.

## [28.3.0] - 2025-06-05

### Added
Expand Down
Original file line number Diff line number Diff line change
@@ -1,62 +1,22 @@
Index
=====
Indexable
=========

This class represents an Index inside an Elasticsearch cluster. It provides a
set of methods that allow the user to query the index and add new data.
The following methods are common to the following classes, which include the
``Indexable`` mixin:

The class also keeps a buffer of documents waiting to be pushed to the index,
the user can add documents to the buffer and the class will push them as soon as
the buffer is full. The user can also force the push of the records by flushing
the buffer.
.. toctree::
:maxdepth: 2
:glob:

To initialize an index:

.. code-block:: ruby

client = JayAPI::Elasticsearch::ClientFactory.new(
cluster_url: 'https://my-cluster.elastic.io'
).create(max_attempts: 3, wait_strategy: :constant, wait_interval: 2)

index = JayAPI::Elasticsearch::Index.new(
client: client,
index_name: 'my_index'
)

The ``cluster_url`` and the ``index_name`` are the only required parameters. If
the cluster is configured to use Elasticsearch's default port (``9200``) and has
no authentication in place this is all you need. However in most cases that
would not be enough, so you can also provide the following extra parameters:

* ``port``: The port number where the Elasticsearch cluster is listening for
connections.
* ``username``: The username to use when authenticating against the cluster.
* ``password``: The user's password.
* ``batch_size``: The amount of documents the ``Index`` will store in its buffer
before triggering an automatic flush.
* ``logger``: If you want the messages to be logged to a particular logger. If
you don't pass a logger then the class will create one.

The ``create`` method, that returns the client object, also takes optional arguments,
which define connection re-try behaviour:

* ``max_attempts``: Sets the maximum number of reconnection attempts in
response to server errors.
* ``wait_strategy``: Determines the strategy for wait intervals between
reconnection attempts. Options are:

* ``:constant`` - Maintains a consistent wait time specified by ``wait_time``.
* ``:geometric`` - Increases the wait time geometrically based on ``wait_time``.

* ``wait_time``: Specifies the base wait time (in seconds) for the chosen
``wait_strategy``.
indexable/*

#push
-----

The ``push`` method stores a document in the ``Index``'s buffer. If the buffer
The ``push`` method stores a document in the ``Indexable``'s buffer. If the buffer
reaches the maximum number of records the buffer will be flushed automatically.

``push`` takes a single ``Hash``, the document you want to send to the index.
``push`` takes a single ``Hash``, the document you want to send to the index(es).

.. warning::

Expand All @@ -70,38 +30,38 @@ Example:

documents.each do |document|
# do something with your document, then push it
index.push(document)
indexable.push(document)
end

index.flush # Do not forget to flush the index at the end.
indexable.flush # Do not forget to flush the indexable at the end.
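
The buffering behaviour described for ``push`` and ``flush`` can be sketched with a small stand-in class. This is only an illustration under assumptions: ``BufferedSink`` and ``sent_batches`` are hypothetical names, not part of jay_api, and the real classes send the batch to Elasticsearch instead of recording it in memory.

```ruby
# Hypothetical stand-in for an Indexable; NOT the jay_api implementation.
class BufferedSink
  attr_reader :sent_batches

  def initialize(batch_size: 100)
    @batch_size = batch_size
    @buffer = []
    @sent_batches = []
  end

  # Adds a document to the buffer and flushes once the buffer is full.
  def push(document)
    @buffer << document
    flush if @buffer.size >= @batch_size
  end

  # "Sends" the buffered documents (here: records them) and empties the buffer.
  def flush
    return if @buffer.empty?

    @sent_batches << @buffer.dup
    @buffer.clear
  end

  # Number of documents still waiting in the buffer.
  def queue_size
    @buffer.size
  end
end

sink = BufferedSink.new(batch_size: 2)
5.times { |i| sink.push({ id: i }) }
leftover = sink.queue_size # one document is still waiting in the buffer
sink.flush                 # without this call the last document is never sent
```

With a batch size of 2 and five documents, two batches go out automatically and the fifth document only leaves the buffer on the final ``flush``, which is why the example above ends with an explicit call.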

#index
------

``index`` pushes a document directly to the Elasticsearch cluster without adding
it to the buffer first. So you don't need to call ``flush``:

``index`` takes a single ``Hash``, the document you want to send to the index.
``index`` takes a single ``Hash``, the document you want to send to the index(es).

Example:

.. code-block:: ruby

index.index(my_document)
indexable.index(my_document)

.. note::

Pushing documents one at a time is very inefficient because the ``Index``
Pushing documents one at a time is very inefficient because the ``Indexable``
needs to perform an HTTP Request for each one. If you want to send many
documents use ``push`` instead.

.. _`Index#search`:
.. _`Indexable#search`:

#search
-------

The ``search`` method allows you to search the Elasticsearch index for documents
matching the provided query. This method takes two arguments:
The ``search`` method allows you to search the Elasticsearch index(es) for
documents matching the provided query. This method takes two arguments:

* ``query`` A ``Hash`` with the query you want to execute, this Hash will be
converted to JSON before being sent to Elasticsearch. It must follow
Expand All @@ -121,7 +81,7 @@ Example:

.. code-block:: ruby

index.search(
indexable.search(
query: {
match_all: { }
},
Expand All @@ -144,10 +104,10 @@ Example:
.. code-block:: ruby

documents.each do |document|
index.push(document)
indexable.push(document)
end

index.flush
indexable.flush

#queue_size
-----------
Expand All @@ -159,17 +119,17 @@ Example

.. code-block:: ruby

index.queue_size # => 16
indexable.queue_size # => 16

#delete_by_query
----------------

This method allows you to remove the documents that match the given query from
the index. The method has a single parameter:
the index(es). The method has a single parameter:

* ``query``: A ``Hash`` with the query you want to use to match documents for
deletion. For more information on this parameter or how to create queries see
the :ref:`Index#search` method documentation.
the :ref:`Indexable#search` method documentation.

On success the method will return a ``Hash`` with information about the executed
command, for example:
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,55 @@
Index
=====

.. note::

This class includes the :doc:`../indexable` mixin. It exposes all its methods.

This class represents an Index inside an Elasticsearch cluster. It provides a
set of methods that allow the user to query the index and add new data.

The class also keeps a buffer of documents waiting to be pushed to the index.
The user can add documents to the buffer, and the class pushes them as soon as
the buffer is full. The user can also force a push by flushing the buffer.

To initialize an index:

.. code-block:: ruby

client = JayAPI::Elasticsearch::ClientFactory.new(
cluster_url: 'https://my-cluster.elastic.io'
).create(max_attempts: 3, wait_strategy: :constant, wait_interval: 2)

index = JayAPI::Elasticsearch::Index.new(
client: client,
index_name: 'my_index'
)

The ``cluster_url`` and the ``index_name`` are the only required parameters. If
the cluster is configured to use Elasticsearch's default port (``9200``) and has
no authentication in place, this is all you need. However, in most cases that
is not enough, so you can also provide the following extra parameters:

* ``port``: The port number where the Elasticsearch cluster is listening for
connections.
* ``username``: The username to use when authenticating against the cluster.
* ``password``: The user's password.
* ``batch_size``: The amount of documents the ``Index`` will store in its buffer
before triggering an automatic flush.
* ``logger``: If you want the messages to be logged to a particular logger. If
you don't pass a logger then the class will create one.

The ``create`` method, which returns the client object, also takes optional
arguments that define the connection retry behaviour:

* ``max_attempts``: Sets the maximum number of reconnection attempts in
response to server errors.
* ``wait_strategy``: Determines the strategy for wait intervals between
reconnection attempts. Options are:

* ``:constant`` - Maintains a consistent wait time specified by ``wait_time``.
* ``:geometric`` - Increases the wait time geometrically based on ``wait_time``.

* ``wait_time``: Specifies the base wait time (in seconds) for the chosen
``wait_strategy``.
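
To make the difference between the two wait strategies concrete, here is a minimal sketch of the wait times each one could produce. ``wait_times`` is a hypothetical helper, not part of jay_api, and the doubling factor used for ``:geometric`` is an assumption; the library may use a different ratio.

```ruby
# Hypothetical helper, NOT part of jay_api. The factor of 2 is an assumption.
def wait_times(strategy, base, attempts)
  case strategy
  when :constant
    # Every attempt waits the same base time.
    Array.new(attempts) { base }
  when :geometric
    # Each attempt waits twice as long as the previous one.
    Array.new(attempts) { |attempt| base * (2**attempt) }
  else
    raise ArgumentError, "unknown strategy: #{strategy}"
  end
end

constant  = wait_times(:constant, 2, 3)  # => [2, 2, 2]
geometric = wait_times(:geometric, 2, 3) # => [2, 4, 8]
```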
Original file line number Diff line number Diff line change
@@ -0,0 +1,67 @@
Indexes
=======

.. note::

This class includes the :doc:`../indexable` mixin. It exposes all its methods.

This class represents a set of indexes in an Elasticsearch cluster. It provides
a set of methods that allow the user to query the indexes or add new data to
all of them at the same time.

The class works exactly like :doc:`index`. The only difference is that it can
be initialized with multiple ``index_names`` instead of a single one.

Initializing
------------

Just like with ``Index`` you need an instance of ``Elasticsearch::Client``. You
can use the ``ClientFactory`` to get one:

.. code-block:: ruby

require 'jay_api/elasticsearch/client_factory'

client = JayAPI::Elasticsearch::ClientFactory.new(
cluster_url: 'https://my-cluster.elastic.io'
).create

Then you can use the client to initialize the ``Indexes`` class:

.. code-block:: ruby

require 'jay_api/elasticsearch/indexes'

indexes = JayAPI::Elasticsearch::Indexes.new(
client: client, index_names: %w[my_index my_other_index not_my_index]
)

The following arguments are available for the ``#initialize`` method:

* ``client``: An instance of ``Elasticsearch::Client``. You can get one using
the ``Elasticsearch::ClientFactory`` class.
* ``index_names``: An ``Array`` of ``String``. The names of the indexes you
want to work with.
* ``batch_size``: The number of documents the ``Indexes`` class will store in
its buffer before triggering a ``#flush`` call when the ``#push`` method is
used to add data. The default is: 100.
* ``logger``: A ``Logger`` object used to log messages. If none is given the
``Indexes`` object will create one of its own.

.. warning::

When the ``batch_size`` isn't a multiple of the number of elements in the
``index_names`` array, the batches sent to Elasticsearch can be bigger than
``batch_size``. This can be avoided by choosing a ``batch_size`` that is an
integer multiple of the array's size.
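
The arithmetic behind this warning can be sketched as follows. The model is an assumption made for illustration: it supposes that every ``#push`` adds one buffer entry per index and that the flush is triggered once the buffer holds at least ``batch_size`` entries. ``first_flush_size`` is a hypothetical helper, not part of jay_api.

```ruby
# Illustrative model only; assumes each push adds one buffer entry per index.
def first_flush_size(batch_size:, index_count:)
  buffered = 0
  # Keep "pushing" documents until the buffer reaches the batch size.
  buffered += index_count while buffered < batch_size
  buffered
end

overshoot = first_flush_size(batch_size: 100, index_count: 3) # => 102
exact     = first_flush_size(batch_size: 99, index_count: 3)  # => 99
```

Under that assumption, three indexes with the default ``batch_size`` of 100 send 102 entries in the first batch, while a ``batch_size`` of 99 is hit exactly.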

#index_names
------------

This method returns the array of index names used to initialize the ``Indexes``
object.

.. warning::

Unlike ``Index``, ``Indexes`` objects **DO NOT** respond to the
``#index_name`` message.
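
Code that has to accept either kind of object can check which message is available before sending it. The sketch below uses ``Struct`` stand-ins rather than the real classes, and ``names_of`` is a hypothetical helper, not part of jay_api.

```ruby
# Stand-ins for illustration only; the real classes live in JayAPI::Elasticsearch.
SingleIndex = Struct.new(:index_name)
ManyIndexes = Struct.new(:index_names)

# Hypothetical helper: returns an Array of index names for either kind of object.
def names_of(indexable)
  return indexable.index_names if indexable.respond_to?(:index_names)

  [indexable.index_name]
end

single = names_of(SingleIndex.new('my_index'))
many   = names_of(ManyIndexes.new(%w[my_index my_other_index]))
```

Here ``single`` is a one-element array and ``many`` is the array that was passed in, so callers can treat both cases uniformly.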
Original file line number Diff line number Diff line change
Expand Up @@ -76,7 +76,7 @@ be returned if there aren't enough documents matching the query).

By using ``from`` and ``size`` you can only scroll through a maximum of 10,000
documents. If you have more than that in your index, you'll have to use
:ref:`Index#search` method with ``type: :search_after``.
the :ref:`Indexable#search` method with ``type: :search_after``.

#sort
-----
Expand Down Expand Up @@ -171,7 +171,7 @@ And the use of Hashes to include or exclude parts of the document, for example:

Once you have added all the clauses you want on your queries you can call
``to_h`` or ``to_query`` to get the corresponding Hash. The class converts the
query to a Hash representation that can then be passed to :ref:`Index#search` to
query to a Hash representation that can then be passed to :ref:`Indexable#search` to
perform the actual search.

.. note::
Expand Down
1 change: 1 addition & 0 deletions lib/jay_api/elasticsearch.rb
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@
require_relative 'elasticsearch/client_factory'
require_relative 'elasticsearch/errors'
require_relative 'elasticsearch/index'
require_relative 'elasticsearch/indexes'
require_relative 'elasticsearch/query_builder'
require_relative 'elasticsearch/query_results'
require_relative 'elasticsearch/response'
Expand Down
4 changes: 2 additions & 2 deletions lib/jay_api/elasticsearch/async.rb
Original file line number Diff line number Diff line change
Expand Up @@ -19,8 +19,8 @@ class Async

def_delegators :index, :index_name

# @param [JayAPI::Elasticsearch::Index] index The elasticsearch index on
# which to execute asynchronous operations
# @param [JayAPI::Elasticsearch::Indexable] index The elasticsearch
# index or indexes on which to execute asynchronous operations
def initialize(index)
@index = index
end
Expand Down