From 6dcaa318978680ad5bdccbeb0617c08286d25942 Mon Sep 17 00:00:00 2001 From: Thomas Darimont Date: Wed, 31 Jul 2013 15:11:25 +0200 Subject: [PATCH] DATAMONGO-586 - Added chapter for Aggregation Framework support to the reference documentation. Added chapter to mongodb.xml. Added myself to the authors list in index.xml --- src/docbkx/index.xml | 4 + src/docbkx/reference/mongodb.xml | 404 ++++++++++++++++++++++++++++++- 2 files changed, 405 insertions(+), 3 deletions(-) diff --git a/src/docbkx/index.xml b/src/docbkx/index.xml index 6c31ed463..8b87fd28a 100644 --- a/src/docbkx/index.xml +++ b/src/docbkx/index.xml @@ -30,6 +30,10 @@ Jon Brisbin + + Thomas + Darimont + diff --git a/src/docbkx/reference/mongodb.xml b/src/docbkx/reference/mongodb.xml index ad1f5f123..cd4b5b9ae 100644 --- a/src/docbkx/reference/mongodb.xml +++ b/src/docbkx/reference/mongodb.xml @@ -2140,6 +2140,403 @@ GroupByResults<XObject> results = mongoTemplate.group(where("x").gt(0), +
+ Aggregation Framework Support + + Spring Data MongoDB provides support for the Aggregation Framework + introduced to MongoDB in version 2.2. + + The MongoDB Documentation describes the Aggregation + Framework as follows: "The MongoDB aggregation framework + provides a means to calculate aggregated values without having to use + map-reduce. While map-reduce is powerful, it is often more difficult than + necessary for many simple aggregation tasks, such as totaling or averaging + field values." + + For further information see the full reference + documentation of the aggregation framework and other data + aggregation tools for MongoDB. + +
+ Basic Concepts + + The Aggregation Framework support in Spring Data MongoDB is based + on the following key abstractions: Aggregation, + AggregationOperation and + AggregationResults. + + + + Aggregation + + An Aggregation represents a MongoDB + aggregate operation and holds the + description of the aggregation pipeline instructions. Aggregations + are created by invoking the appropriate + newAggregation(…) static factory method of the + Aggregation class, which takes the list of + AggregationOperation instances as a parameter next to the + optional input class. + + The actual aggregate operation is executed by the + aggregate method of the + MongoTemplate, which also takes the desired + output class as a parameter. + + + + AggregationOperation + + An AggregationOperation represents a + MongoDB aggregation pipeline operation and describes the processing + that should be performed in this aggregation step. Although one + could manually create an AggregationOperation, + the recommended way to construct an + AggregationOperation is to use the static + factory methods provided by the Aggregation + class. + + + + AggregationResults + + AggregationResults is the container for + the result of an aggregate operation. It provides access to the raw + aggregation result in the form of a DBObject, + to the mapped objects, and to information about the aggregation + that was performed.
+ + + + The canonical example for using the Spring Data MongoDB support + for the MongoDB Aggregation Framework looks as follows: + + import static org.springframework.data.mongodb.core.aggregation.Aggregation.*; +… +Aggregation agg = newAggregation( + pipelineOP1(), + pipelineOP2(), +… + pipelineOPn() +); + +AggregationResults<OutputType> results = mongoTemplate.aggregate(agg, "INPUT_COLLECTION_NAME", OutputType.class); +List<OutputType> mappedResult = results.getMappedResults(); +… + + Note that if you provide an input class as the first parameter to + the newAggregation method the + MongoTemplate will derive the name of the input + collection from this class. Otherwise, if you do not specify an input + class, you must provide the name of the input collection explicitly. If + both an input class and an explicit input collection are provided, the latter + takes precedence. +
+ +
+ Supported Aggregation Operations + + The MongoDB Aggregation Framework provides the following types of + Aggregation Operations: + + + + Pipeline Aggregation Operators + + + + Group Aggregation Operators + + + + Boolean Aggregation Operators + + + + Comparison Aggregation Operators + + + + Arithmetic Aggregation Operators + + + + String Aggregation Operators + + + + Date Aggregation Operators + + + + Conditional Aggregation Operators + + + + At the time of this writing we provide support for the following + Aggregation Operations in Spring Data MongoDB. + + + Aggregation Operations currently supported by Spring Data + MongoDB + + + + + Pipeline Aggregation Operators + + project, skip, limit, unwind, group, sort, match, + geoNear + + + + Group Aggregation Operators + + addToSet, first, last, max, min, avg, push, sum, + (*count) + + + + Comparison Aggregation Operators + + eq (*via: is), gt, gte, lt, lte, ne + + + +
+ + Note that the aggregation operations not listed here are currently + not supported by Spring Data MongoDB. + + *) The operation is mapped or added by Spring Data MongoDB. +
+ +
+ Aggregation Framework Example 1 + + In this introductory example we want to aggregate a list of tags + to get the occurrence count of each tag from a MongoDB collection + called "tags", sorted by the occurrence count in descending + order. This example demonstrates the usage of grouping, sorting, + projections (selection) and unwinding (result splitting). + + In order to do this we first create a new aggregation via the + newAggregation static factory method to which + we pass a list of aggregation operations. These aggregation operations + form the aggregation pipeline of our Aggregation. + + + As a first step we select the "tags" field (which is + an array of strings) from the input collection with the + project operation. + + In a second step we use the unwind + operation to generate a new document for each tag within the + "tags" array. + + In the third step we use the group + operation to define a group for each "tags"-value for which + we aggregate the occurrence count via the count + aggregation operator and collect the result in a new field called + "n". + + As a fourth step we select the field "n" and create an + alias for the id-field generated from the previous group operation with + the name "tag". + + Finally as the fifth step we sort the resulting list of tags by + their occurrence count in descending order via the + sort operation. + + In order to let MongoDB perform the actual aggregation operation + we call the aggregate method on the + MongoTemplate with the created Aggregation as an + argument. 
+ + class TagCount { + private String tag; + private int n; +… +} + +import static org.springframework.data.mongodb.core.aggregation.Aggregation.*; +… +Aggregation agg = newAggregation( + project("tags"), + unwind("tags"), + group("tags").and("n").count(), + project("n").and("tag").previousOperation(), + sort(DESC, "n") +); + +AggregationResults<TagCount> results = mongoTemplate.aggregate(agg, "tags", TagCount.class); +List<TagCount> tagCount = results.getMappedResults(); +… + + Note that the input collection is explicitly specified as the + "tags" parameter to the aggregate + method. If the name of the input collection is not specified explicitly, + it is derived from the input class passed as first parameter to the + newAggregation method. +
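To make the effect of the five pipeline steps above easier to follow, here is a minimal plain-Java sketch of the same computation on in-memory data. It deliberately does not use Spring Data MongoDB or a database; the class name TagCountSketch, the method countTags and the sample data are assumptions made for illustration, and the tie-break on the tag name is an addition of the sketch, not part of the pipeline in the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class TagCountSketch {

    // Hypothetical stand-in for the TagCount output class of the example.
    static final class TagCount {
        final String tag;
        final long n;

        TagCount(String tag, long n) {
            this.tag = tag;
            this.n = n;
        }

        @Override
        public String toString() {
            return tag + "=" + n;
        }
    }

    // Each inner list plays the role of the "tags" array of one document.
    static List<TagCount> countTags(List<List<String>> tagsPerDocument) {
        // unwind("tags") followed by group("tags") with a count:
        // one map entry per distinct tag.
        Map<String, Long> counts = tagsPerDocument.stream()
                .flatMap(List::stream)
                .collect(Collectors.groupingBy(tag -> tag, Collectors.counting()));

        // project: expose the group id as "tag" and the count as "n".
        List<TagCount> result = new ArrayList<>();
        for (Map.Entry<String, Long> entry : counts.entrySet()) {
            result.add(new TagCount(entry.getKey(), entry.getValue()));
        }

        // sort(DESC, "n"); the tie-break on the tag name is an assumption of
        // this sketch so that the output order is deterministic.
        result.sort(Comparator.comparingLong((TagCount c) -> c.n).reversed()
                .thenComparing((TagCount c) -> c.tag));
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> docs = Arrays.asList(
                Arrays.asList("spring", "mongodb"),
                Arrays.asList("spring", "java"),
                Arrays.asList("spring", "mongodb", "nosql"));
        System.out.println(countTags(docs));
    }
}
```

With the sample data, the tag "spring" occurs in all three documents and therefore ends up first in the result, which is what the sort step of the pipeline is meant to achieve.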
+ +
+ Aggregation Framework Example 2 + + This example is based on the Largest + and Smallest Cities by State example from the MongoDB + Aggregation Framework documentation. We added additional sorting to + produce stable results with different MongoDB versions. Here we want to + return the smallest and largest cities by population for each state, + using the aggregation framework. This example demonstrates the usage of + grouping, sorting and projections (selection). + + The class ZipInfo maps the structure of the + given input-collection. The class ZipInfoStats + defines the structure in the desired output format. + + As a first step we use the group + operation to define a group from the input-collection. The grouping + criterion is the combination of the fields "state" and + "city" which forms the id structure of the group. We + aggregate the value of the "population" field from the + grouped elements by using the sum operator, + saving the result in the field "pop". + + In a second step we use the sort + operation to sort the intermediate-result by the fields + "pop", "state" and "city" in + ascending order, such that the smallest city is at the top and the + biggest city is at the bottom of the result. Note that the sorting on + "state" and "city" is implicitly performed against the + group id fields, which Spring Data MongoDB takes care of. + + In the third step we use a group + operation again to group the intermediate result by + "state". Note that "state" again implicitly + references a group-id field. We select the name and the population + count of the biggest and smallest city with calls to the + last(…) and first(…) operators respectively + within this group operation. + + As the fourth step we select the "state" field from + the previous group operation. Note that + "state" again implicitly references a group-id field.
As + we do not want an implicitly generated id to appear, we exclude the id from + the previous operation via + and(previousOperation()).exclude(). As we want to populate + the nested City structures in our output-class + accordingly, we have to emit appropriate sub-documents with the nested + method. + + Finally as the fifth step we sort the resulting list of + ZipInfoStats by their state name in ascending order + via the sort operation. + + class ZipInfo { + String id; + String city; + String state; + @Field("pop") int population; + @Field("loc") double[] location; +… +} + +class City { + String name; + int population; +… +} + +class ZipInfoStats { + String id; + String state; + City biggestCity; + City smallestCity; +… +} + +import static org.springframework.data.mongodb.core.aggregation.Aggregation.*; +… +TypedAggregation<ZipInfo> aggregation = newAggregation(ZipInfo.class, + group("state", "city").and("pop").sum("population"), + sort(ASC, "pop", "state", "city"), + group("state") + .and("biggestCity").last("city") + .and("biggestPop").last("pop") + .and("smallestCity").first("city") + .and("smallestPop").first("pop"), + project() + .and(previousOperation()).exclude() + .and("state").previousOperation() + .and("biggestCity").nested(bind("name", "biggestCity").and("population", "biggestPop")) + .and("smallestCity").nested(bind("name", "smallestCity").and("population", "smallestPop")), + sort(ASC, "state") +); + +AggregationResults<ZipInfoStats> result = mongoTemplate.aggregate(aggregation, ZipInfoStats.class); +ZipInfoStats firstZipInfoStats = result.getMappedResults().get(0); +… + + + Note that we derive the name of the input-collection from the + ZipInfo class passed as first parameter to the + newAggregation method. +
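The reason the sort step has to come before the second group step is that first and last simply pick the first and last element of each group in pipeline order. The following plain-Java sketch mirrors the steps on in-memory data to show this. It does not use Spring Data MongoDB; the names CityStatsSketch, Zip, CityPop, StateStats and smallestAndBiggest are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class CityStatsSketch {

    // Hypothetical, simplified stand-in for one zip code document.
    static final class Zip {
        final String state;
        final String city;
        final int population;

        Zip(String state, String city, int population) {
            this.state = state;
            this.city = city;
            this.population = population;
        }
    }

    // Intermediate result of the first group step: one entry per (state, city).
    static final class CityPop {
        final String state;
        final String city;
        final int pop;

        CityPop(String state, String city, int pop) {
            this.state = state;
            this.city = city;
            this.pop = pop;
        }
    }

    // Output shape with nested city entries, similar in spirit to ZipInfoStats.
    static final class StateStats {
        final String state;
        final CityPop smallestCity;
        CityPop biggestCity;

        StateStats(String state, CityPop first) {
            this.state = state;
            this.smallestCity = first;
            this.biggestCity = first;
        }
    }

    static List<StateStats> smallestAndBiggest(List<Zip> zips) {
        // Step 1: group by (state, city) and sum the population into "pop".
        Map<List<String>, Integer> popByCity = zips.stream().collect(
                Collectors.groupingBy(z -> Arrays.asList(z.state, z.city),
                        Collectors.summingInt(z -> z.population)));

        // Step 2: sort ascending by pop, then state, then city.
        List<CityPop> cities = new ArrayList<>();
        for (Map.Entry<List<String>, Integer> e : popByCity.entrySet()) {
            cities.add(new CityPop(e.getKey().get(0), e.getKey().get(1), e.getValue()));
        }
        cities.sort(Comparator.comparingInt((CityPop c) -> c.pop)
                .thenComparing((CityPop c) -> c.state)
                .thenComparing((CityPop c) -> c.city));

        // Steps 3 to 5: group by state, keeping the first (smallest) and the
        // last (biggest) city seen. The TreeMap keeps states in ascending order.
        Map<String, StateStats> byState = new TreeMap<>();
        for (CityPop c : cities) {
            StateStats stats = byState.get(c.state);
            if (stats == null) {
                byState.put(c.state, new StateStats(c.state, c));
            } else {
                stats.biggestCity = c;
            }
        }
        return new ArrayList<>(byState.values());
    }

    public static void main(String[] args) {
        List<Zip> zips = Arrays.asList(
                new Zip("NY", "New York", 5), new Zip("NY", "New York", 3),
                new Zip("NY", "Albany", 2), new Zip("CA", "Fresno", 1));
        for (StateStats s : smallestAndBiggest(zips)) {
            System.out.println(s.state + ": " + s.smallestCity.city + " / " + s.biggestCity.city);
        }
    }
}
```

If the list were not sorted by population before the second grouping, the first and last city per state would depend on the arbitrary iteration order of the intermediate result.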
+ +
+ Aggregation Framework Example 3 + + This example is based on the States + with Populations Over 10 Million example from the MongoDB + Aggregation Framework documentation. We added additional sorting to + produce stable results with different MongoDB versions. Here we want to + return all states with a population of at least 10 million, using the + aggregation framework. This example demonstrates the usage of grouping, + sorting and matching (filtering). + + In the first step we group the input collection by the + "state" field, calculate the sum of the + "population" field and store the result in the new field + "totalPop". + + As a second step we sort the intermediate result by the + id-reference of the previous group operation in addition to the + "totalPop" field in ascending order. + + Finally in the third step we filter the intermediate result by + using a match operation which accepts a + Criteria query as an argument. + + class StateStats { + @Id String id; + String state; + @Field("totalPop") int totalPopulation; +… +} + +import static org.springframework.data.mongodb.core.aggregation.Aggregation.*; +… +TypedAggregation<ZipInfo> agg = newAggregation(ZipInfo.class, + group("state").and("totalPop").sum("population"), + sort(ASC, previousOperation(), "totalPop"), + match(where("totalPop").gte(10 * 1000 * 1000)) +); + +AggregationResults<StateStats> result = mongoTemplate.aggregate(agg, StateStats.class); +List<StateStats> stateStatsList = result.getMappedResults(); +… + + Note that we derive the name of the input-collection from the + ZipInfo class passed as first parameter to the + newAggregation method. +
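As a rough illustration of what the group, sort and match steps above compute, the following plain-Java sketch performs the same three steps on an in-memory list. It does not use Spring Data MongoDB; the names StatePopulationSketch, Zip, StateTotal and statesWithAtLeast are hypothetical, and the threshold is a parameter here (an assumption of the sketch) so that it can be tried with small numbers instead of 10 million.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class StatePopulationSketch {

    // Hypothetical, simplified stand-in for one zip code document.
    static final class Zip {
        final String state;
        final int population;

        Zip(String state, int population) {
            this.state = state;
            this.population = population;
        }
    }

    // Hypothetical output shape: the group id ("state") plus "totalPop".
    static final class StateTotal {
        final String state;
        final long totalPop;

        StateTotal(String state, long totalPop) {
            this.state = state;
            this.totalPop = totalPop;
        }
    }

    static List<StateTotal> statesWithAtLeast(List<Zip> zips, long threshold) {
        // Step 1: group by "state" and sum "population" into "totalPop".
        Map<String, Long> totals = zips.stream().collect(
                Collectors.groupingBy(z -> z.state,
                        Collectors.summingLong(z -> z.population)));

        return totals.entrySet().stream()
                .map(e -> new StateTotal(e.getKey(), e.getValue()))
                // Step 2: sort ascending by the group id, then by totalPop.
                .sorted(Comparator.comparing((StateTotal s) -> s.state)
                        .thenComparingLong(s -> s.totalPop))
                // Step 3: match, keeping totals greater than or equal to the threshold.
                .filter(s -> s.totalPop >= threshold)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Zip> zips = Arrays.asList(
                new Zip("TX", 6), new Zip("TX", 5), new Zip("VT", 1), new Zip("CA", 12));
        for (StateTotal s : statesWithAtLeast(zips, 10)) {
            System.out.println(s.state + " " + s.totalPop);
        }
    }
}
```

Filtering after grouping is essential here: the condition applies to the summed value per state, not to the population of an individual input document.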
+
+
Overriding default mapping with custom converters @@ -2574,9 +2971,10 @@ mongoTemplate.dropCollection("MyNewCollection"); <T> T execute - (DbCallback<T> action) - Executes a DbCallback translating any exceptions as - necessary. + (DbCallback<T> action) + Executes a DbCallback translating + any exceptions as + necessary.