[JAY-625] Add Elasticsearch's composite aggregation#32
Conversation
The class represents a "Terms" value source for Elasticsearch's composite aggregations. How this value source works is explained here: https://www.elastic.co/docs/reference/aggregations/search-aggregations-bucket-composite-aggregation#_terms The class will be used in an upcoming commit by the, yet to be added, composite aggregation class.
| @field = field | ||
| @order = order | ||
| @missing_bucket = missing_bucket | ||
| @missing_order = missing_order |
There was a problem hiding this comment.
No check on this string (same thing for order actually)? What if I pass "in the middle"? :D
There was a problem hiding this comment.
We are not usually doing any checking in these classes. If you pass "in the middle" you'll get an error when you try to use the produced query to search the index. I think it is better not to add too much validation here since these classes are facades for Elasticsearch.
Let me know if you disagree.
There was a problem hiding this comment.
If you think the Elasticsearch error is explicative enough then sure 👍
| module Sources | ||
| # Represents the collection of sources for a Composite aggregation in | ||
| # Elasticsearch | ||
| class Sources |
There was a problem hiding this comment.
Sources::Sources? I don't know enough about this codebase to suggest a name change for the class, but is there no other name that is different from the module name the class is in?
There was a problem hiding this comment.
At first I had the class also as namespace, so instead of module Sources I had just class Sources, similar to what we have for QueryBuilder, however I ran into an issue because unlike QueryBuilder, Sources is a superclass.
So this is why at the end I decided to just add a namespace module. I have no issue with changing the name, but.... I couldn't think of a better option, do you have a suggestion?
| def clone | ||
| # rubocop:disable Lint/EmptyBlock (The sources will be assigned later) | ||
| copy = self.class.new(name, size: size) {} | ||
| # rubocop:enable Lint/EmptyBlock | ||
|
|
||
| copy.aggregations = aggregations.clone | ||
| copy.sources = sources.clone | ||
| copy | ||
| end | ||
|
|
||
| # @return [Hash] The Hash representation of the +Aggregation+. | ||
| # Properly formatted for Elasticsearch. | ||
| def to_h | ||
| super do | ||
| { | ||
| composite: { | ||
| sources: sources.to_a, | ||
| size: size | ||
| }.compact | ||
| } | ||
| end | ||
| end |
There was a problem hiding this comment.
Similar question for the classes introduced in the 2 previous commits:
I'm seeing that these #clone, #to_h, #to_a methods are not shown to be used in the docs you add in the comings commits. Are these methods then only used to test these classes? No problem for me, but this makes it so that around 50% of the logic in the class is used only to test the class itself, which is a bit strange IMO (although as I said, no problem)
There was a problem hiding this comment.
These methods are normally not used directly, hence they are not listed in the documentation.
#to_h and #to_a are used when you call #to_query on the QueryBuilder instance, there, all these classes are converted into a Hash that is then sent to Elasticsearch. So, basically these are serialization methods. #to_h and #to_a methods are called recursively until everything is either a hash or an array.
Regarding #clone they are used in two cases
- When you clone a
QueryBuilderobject, then all the internal classes are cloned recursively, this prevents changes to clone from affecting the originalQueryBuilderobject. - When you merge two
QueryBuilderobjects together. The resulting merged object also gets a clone of all the underlying classes. Again, this prevents changes to the merged object from affecting the original queries.
This class holds a collection of sources for a Composite aggregation for Elasticsearch. The class will be used in an upcoming commit by said aggregation to hold its sources and to allow the user to add sources to a composite aggregation by passing a block to its constructor. For the moment the class can only handle a Terms source. Additional sources might be added in the future.
The class model a composite Elasticsearch aggregation. Tests for the proper handling of nested aggregations were added as integration tests because creating said tests as unit tests would require a great deal of mocking logic.
The method allows the user to add composite aggregations. It relies on the Aggregations::Composite class for that.
Adds documentation on how to use the newly added composite aggregation.
e781ab0 to
e187774
Compare
This type of aggregation allows documents to be aggregated using composite values made out of all the existing combinations of values coming from the specified sources. For the task that triggered the creation of these changes only the
termssource is needed, hence that is the only source being added here.When used with the
termssource, the composite aggregation creates a bucket for each existing combination of values in the given fields.