DATAMONGO-586 - Added chapter for Aggregation Framework support to the reference documentation.

Added chapter to mongodb.xml. Added myself to the authors list in index.xml.
13 years ago · 6dcaa31897
2 changed files with 405 additions and 3 deletions
--- a/src/docbkx/index.xml
+++ b/src/docbkx/index.xml
@@ -30,6 +30,10 @@
        <firstname>Jon</firstname>
        <surname>Brisbin</surname>
      </author>
+      <author>
+        <firstname>Thomas</firstname>
+        <surname>Darimont</surname>
+      </author>
    </authorgroup>

    <legalnotice>
--- a/src/docbkx/reference/mongodb.xml
+++ b/src/docbkx/reference/mongodb.xml
@@ -2140,6 +2140,403 @@ GroupByResults&lt;XObject&gt; results = mongoTemplate.group(where("x").gt(0),
    </section>
  </section>

+  <section id="mongo.aggregation">
+    <title>Aggregation Framework Support</title>
+
+    <para>Spring Data MongoDB provides support for the Aggregation Framework
+    introduced to MongoDB in version 2.2.</para>
+
+    <para>The MongoDB documentation describes the <ulink
+    url="http://docs.mongodb.org/manual/core/aggregation/">Aggregation
+    Framework</ulink> as follows: <quote>The MongoDB aggregation framework
+    provides a means to calculate aggregated values without having to use
+    map-reduce. While map-reduce is powerful, it is often more difficult than
+    necessary for many simple aggregation tasks, such as totaling or averaging
+    field values.</quote></para>
+
+    <para>For further information see the full <ulink
+    url="http://docs.mongodb.org/manual/aggregation/">reference
+    documentation</ulink> of the aggregation framework and other data
+    aggregation tools for MongoDB.</para>
+
+    <section id="mongo.aggregation.basic-concepts">
+      <title>Basic Concepts</title>
+
+      <para>The Aggregation Framework support in Spring Data MongoDB is based
+      on the following key abstractions: <classname>Aggregation</classname>,
+      <classname>AggregationOperation</classname> and
+      <classname>AggregationResults</classname>.</para>
+
+      <itemizedlist>
+        <listitem>
+          <para><classname>Aggregation</classname></para>
+
+          <para>An <classname>Aggregation</classname> represents a MongoDB
+          <methodname>aggregate</methodname> operation and holds the
+          description of the aggregation pipeline instructions. Aggregations
+          are created by invoking the appropriate
+          <code>newAggregation(…)</code> static factory method of the
+          <classname>Aggregation</classname> class, which takes a list of
+          <classname>AggregationOperation</classname>s as a parameter, next to
+          the optional input class.</para>
+
+          <para>The actual aggregate operation is executed by the
+          <methodname>aggregate</methodname> method of the
+          <classname>MongoTemplate</classname>, which also takes the desired
+          output class as a parameter.</para>
+        </listitem>
+
+        <listitem>
+          <para><classname>AggregationOperation</classname></para>
+
+          <para>An <classname>AggregationOperation</classname> represents a
+          MongoDB aggregation pipeline operation and describes the processing
+          that should be performed in this aggregation step. Although one
+          could manually create an <classname>AggregationOperation</classname>,
+          the recommended way to construct an
+          <classname>AggregationOperation</classname> is to use the static
+          factory methods provided by the <classname>Aggregation</classname>
+          class.</para>
+        </listitem>
+
+        <listitem>
+          <para><classname>AggregationResults</classname></para>
+
+          <para><classname>AggregationResults</classname> is the container for
+          the result of an aggregate operation. It provides access to the raw
+          aggregation result in the form of a <classname>DBObject</classname>,
+          to the mapped objects and to information about the server that
+          performed the aggregation.</para>
+        </listitem>
+      </itemizedlist>
+
+      <para>The canonical example for using the Spring Data MongoDB support
+      for the MongoDB Aggregation Framework looks as follows:</para>
+
+      <programlisting language="java">import static org.springframework.data.mongodb.core.aggregation.Aggregation.*;
+…
+Aggregation agg = newAggregation(
+    pipelineOP1(),
+    pipelineOP2(),
+…
+    pipelineOPn()
+);
+
+AggregationResults&lt;OutputType&gt; results = mongoTemplate.aggregate(agg, "INPUT_COLLECTION_NAME", OutputType.class);
+List&lt;OutputType&gt; mappedResult = results.getMappedResults();
+…</programlisting>
+
+      <para>Note that if you provide an input class as the first parameter to
+      the <methodname>newAggregation</methodname> method, the
+      <classname>MongoTemplate</classname> will derive the name of the input
+      collection from this class. Otherwise, if you do not specify an input
+      class, you must provide the name of the input collection explicitly. If
+      both an input class and an explicit input collection are provided, the
+      latter takes precedence.</para>
+    </section>
+
+    <section id="mongo.aggregation.supported-aggregation-operations">
+      <title>Supported Aggregation Operations</title>
+
+      <para>The MongoDB Aggregation Framework provides the following types of
+      Aggregation Operations:</para>
+
+      <itemizedlist>
+        <listitem>
+          <para>Pipeline Aggregation Operators</para>
+        </listitem>
+
+        <listitem>
+          <para>Group Aggregation Operators</para>
+        </listitem>
+
+        <listitem>
+          <para>Boolean Aggregation Operators</para>
+        </listitem>
+
+        <listitem>
+          <para>Comparison Aggregation Operators</para>
+        </listitem>
+
+        <listitem>
+          <para>Arithmetic Aggregation Operators</para>
+        </listitem>
+
+        <listitem>
+          <para>String Aggregation Operators</para>
+        </listitem>
+
+        <listitem>
+          <para>Date Aggregation Operators</para>
+        </listitem>
+
+        <listitem>
+          <para>Conditional Aggregation Operators</para>
+        </listitem>
+      </itemizedlist>
+
+      <para>At the time of this writing we provide support for the following
+      Aggregation Operations in Spring Data MongoDB.</para>
+
+      <table>
+        <title>Aggregation Operations currently supported by Spring Data
+        MongoDB</title>
+
+        <tgroup cols="2">
+          <tbody>
+            <row>
+              <entry>Pipeline Aggregation Operators</entry>
+
+              <entry>project, skip, limit, unwind, group, sort,
+              geoNear</entry>
+            </row>
+
+            <row>
+              <entry>Group Aggregation Operators</entry>
+
+              <entry>addToSet, first, last, max, min, avg, push, sum,
+              (*count)</entry>
+            </row>
+
+            <row>
+              <entry>Comparison Aggregation Operators</entry>
+
+              <entry>eq (*via: is), gt, gte, lt, lte, ne</entry>
+            </row>
+          </tbody>
+        </tgroup>
+      </table>
+
+      <para>Note that the aggregation operations not listed here are currently
+      not supported by Spring Data MongoDB. </para>
+
+      <para>*) The operation is mapped or added by Spring Data MongoDB.</para>
+    </section>
+
+    <section id="mongo.aggregation.introductory-example">
+      <title>Aggregation Framework Example 1</title>
+
+      <para>In this introductory example we want to aggregate a list of tags
+      to get the occurrence count of a particular tag from a MongoDB
+      collection called <code>"tags"</code>, sorted by the occurrence count in
+      descending order. This example demonstrates the usage of grouping,
+      sorting, projections (selection) and unwinding (result splitting).</para>
+
+      <para>In order to do this we first create a new aggregation via the
+      <methodname>newAggregation</methodname> static factory method to which
+      we pass a list of aggregation operations. These aggregation operations
+      form the aggregation pipeline of our <classname>Aggregation</classname>.
+      </para>
+
+      <para>As a first step we select the <code>"tags"</code> field (which is
+      an array of strings) from the input collection with the
+      <methodname>project</methodname> operation. </para>
+
+      <para>In a second step we use the <methodname>unwind</methodname>
+      operation to generate a new document for each tag within the
+      <code>"tags"</code> array. </para>
+
+      <para>In the third step we use the <methodname>group</methodname>
+      operation to define a group for each <code>"tags"</code> value for which
+      we aggregate the occurrence count via the <methodname>count</methodname>
+      aggregation operator and collect the result in a new field called
+      <code>"n"</code>.</para>
+
+      <para>As a fourth step we select the field <code>"n"</code> and create
+      an alias for the id field generated from the previous group operation
+      with the name <code>"tag"</code>.</para>
+
+      <para>Finally, as the fifth step, we sort the resulting list of tags by
+      their occurrence count in descending order via the
+      <methodname>sort</methodname> operation.</para>
+
+      <para>In order to let MongoDB perform the actual aggregation operation
+      we call the <methodname>aggregate</methodname> method on the
+      <classname>MongoTemplate</classname> with the created
+      <classname>Aggregation</classname> as an argument.</para>
+
+      <programlisting language="java">class TagCount {
+ private String tag;
+ private int n;
+…
+}
+
+import static org.springframework.data.mongodb.core.aggregation.Aggregation.*;
+…
+Aggregation agg = newAggregation(
+    project("tags"),
+    unwind("tags"),
+    group("tags").and("n").count(),
+    project("n").and("tag").previousOperation(),
+    sort(DESC, "n")
+);
+
+AggregationResults&lt;TagCount&gt; results = mongoTemplate.aggregate(agg, "tags", TagCount.class);
+List&lt;TagCount&gt; tagCount = results.getMappedResults();
+…</programlisting>
+
+      <para>Note that the input collection is explicitly specified as the
+      <code>"tags"</code> parameter to the <methodname>aggregate</methodname>
+      method. If the name of the input collection is not specified explicitly,
+      it is derived from the input class passed as first parameter to the
+      <methodname>newAggregation</methodname> method.</para>
+    </section>
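To make the effect of the five pipeline steps in Example 1 more concrete, here is a minimal plain-Java sketch of what the pipeline computes. It is not the Spring Data MongoDB API and does not talk to MongoDB. The class `TagCountSketch` and the method `countTags` are hypothetical names introduced only for illustration, and the in-memory list of tag arrays stands in for the `"tags"` collection.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of Spring Data MongoDB. */
class TagCountSketch {

    /** Returns (tag, n) pairs ordered by n, highest count first. */
    static List<Map.Entry<String, Integer>> countTags(List<List<String>> tagArrays) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (List<String> tags : tagArrays) {      // project("tags"): only the tags array is used
            for (String tag : tags) {              // unwind("tags"): one element per tag
                counts.merge(tag, 1, Integer::sum); // group("tags") ... count(): n per distinct tag
            }
        }
        // project("n").and("tag").previousOperation(): each entry is the pair (tag, n)
        List<Map.Entry<String, Integer>> result = new ArrayList<>(counts.entrySet());
        // sort(DESC, "n"): order by count in descending order (ties keep encounter order)
        result.sort(Map.Entry.<String, Integer>comparingByValue().reversed());
        return result;
    }
}
```

In the real pipeline all of this work happens inside MongoDB; the sketch only mirrors the order of the steps so that the shape of the `TagCount` results is easier to picture.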
+
+    <section id="mongo.aggregation.example-2">
+      <title>Aggregation Framework Example 2</title>
+
+      <para>This example is based on the <ulink
+      url="http://docs.mongodb.org/manual/tutorial/aggregation-examples/#largest-and-smallest-cities-by-state">Largest
+      and Smallest Cities by State</ulink> example from the MongoDB
+      Aggregation Framework documentation. We added additional sorting to
+      produce stable results with different MongoDB versions. Here we want to
+      return the smallest and largest cities by population for each state,
+      using the aggregation framework. This example demonstrates the usage of
+      grouping, sorting and projections (selection).</para>
+
+      <para>The class <classname>ZipInfo</classname> maps the structure of the
+      given input-collection. The class <classname>ZipInfoStats</classname>
+      defines the structure in the desired output format.</para>
+
+      <para>As a first step we use the <methodname>group</methodname>
+      operation to define a group from the input collection. The grouping
+      criterion is the combination of the fields <code>"state"</code> and
+      <code>"city"</code>, which forms the id structure of the group. We
+      aggregate the value of the <code>"population"</code> field from the
+      grouped elements by using the <methodname>sum</methodname> operator,
+      saving the result in the field <code>"pop"</code>.</para>
+
+      <para>In a second step we use the <methodname>sort</methodname>
+      operation to sort the intermediate result by the fields
+      <code>"pop"</code>, <code>"state"</code> and <code>"city"</code> in
+      ascending order, such that the smallest city is at the top and the
+      biggest city is at the bottom of the result. Note that the sorting on
+      <code>"state"</code> and <code>"city"</code> is implicitly performed
+      against the group id fields, which Spring Data MongoDB takes care of.</para>
+
+      <para>In the third step we use a <methodname>group</methodname>
+      operation again to group the intermediate result by
+      <code>"state"</code>. Note that <code>"state"</code> again implicitly
+      references a group-id field. We select the name and the population
+      count of the biggest and smallest city with calls to the
+      <code>last(…)</code> and <code>first(…)</code> operators respectively
+      within this <methodname>group</methodname> operation.</para>
+
+      <para>As the fourth step we select the <code>"state"</code> field from
+      the previous <methodname>group</methodname> operation. Note that
+      <code>"state"</code> again implicitly references a group-id field. As
+      we do not want an implicitly generated id to appear, we exclude the id
+      from the previous operation via
+      <code>and(previousOperation()).exclude()</code>. As we want to populate
+      the nested <classname>City</classname> structures in our output class
+      accordingly, we have to emit appropriate sub-documents with the
+      <methodname>nested</methodname> method.</para>
+
+      <para>Finally, as the fifth step, we sort the resulting list of
+      <classname>ZipInfoStats</classname> by their state name in ascending
+      order via the <methodname>sort</methodname> operation.</para>
+
+      <programlisting language="java">class ZipInfo {
+   String id;
+   String city;
+   String state;
+   @Field("pop") int population;
+   @Field("loc") double[] location;
+…
+}
+
+class City {
+   String name;
+   int population;
+…
+}
+
+class ZipInfoStats {
+   String id;
+   String state;
+   City biggestCity;
+   City smallestCity;
+…
+}
+
+import static org.springframework.data.mongodb.core.aggregation.Aggregation.*;
+…
+TypedAggregation&lt;ZipInfo&gt; aggregation = newAggregation(ZipInfo.class,
+    group("state", "city").and("pop").sum("population"),
+    sort(ASC, "pop", "state", "city"),
+    group("state")
+        .and("biggestCity").last("city")
+        .and("biggestPop").last("pop")
+        .and("smallestCity").first("city")
+        .and("smallestPop").first("pop"),
+    project()
+        .and(previousOperation()).exclude()
+        .and("state").previousOperation()
+        .and("biggestCity").nested(bind("name", "biggestCity").and("population", "biggestPop"))
+        .and("smallestCity").nested(bind("name", "smallestCity").and("population", "smallestPop")),
+    sort(ASC, "state")
+);
+
+AggregationResults&lt;ZipInfoStats&gt; result = mongoTemplate.aggregate(aggregation, ZipInfoStats.class);
+ZipInfoStats firstZipInfoStats = result.getMappedResults().get(0);
+…
+</programlisting>
+
+      <para>Note that we derive the name of the input collection from the
+      <classname>ZipInfo</classname> class passed as first parameter to the
+      <methodname>newAggregation</methodname> method.</para>
+    </section>
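The interplay of the two `group` steps and the intermediate `sort` in Example 2 can be hard to follow from the pipeline alone. The following plain-Java sketch mirrors the same steps in memory, under the assumption that each input element carries a city, a state and a population. It does not use Spring Data MongoDB, and `CityStatsSketch`, `Zip`, `StateStats` and `stats` are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper, not part of Spring Data MongoDB. */
class CityStatsSketch {

    /** One input element: a zip code area with city, state and population. */
    static final class Zip {
        final String city; final String state; final long pop;
        Zip(String city, String state, long pop) { this.city = city; this.state = state; this.pop = pop; }
    }

    /** One output element: smallest and biggest city of a state. */
    static final class StateStats {
        final String state;
        String smallestCity; long smallestPop;
        String biggestCity; long biggestPop;
        StateStats(String state) { this.state = state; }
    }

    static List<StateStats> stats(List<Zip> zips) {
        // Step 1: group by (state, city) and sum the population of each city.
        Map<List<String>, Long> popByCity = new HashMap<>();
        for (Zip z : zips) {
            popByCity.merge(Arrays.asList(z.state, z.city), z.pop, Long::sum);
        }
        // Step 2: sort by population, then state, then city, all ascending.
        List<Map.Entry<List<String>, Long>> sorted = new ArrayList<>(popByCity.entrySet());
        sorted.sort(Comparator
            .comparing((Map.Entry<List<String>, Long> e) -> e.getValue())
            .thenComparing(e -> e.getKey().get(0))
            .thenComparing(e -> e.getKey().get(1)));
        // Steps 3 and 4: group by state; the first city seen is the smallest,
        // the last one seen is the biggest. The TreeMap keeps the states in
        // ascending order, which corresponds to step 5.
        Map<String, StateStats> byState = new TreeMap<>();
        for (Map.Entry<List<String>, Long> e : sorted) {
            String state = e.getKey().get(0);
            String city = e.getKey().get(1);
            StateStats s = byState.get(state);
            if (s == null) {
                s = new StateStats(state);
                s.smallestCity = city;
                s.smallestPop = e.getValue();
                byState.put(state, s);
            }
            s.biggestCity = city;
            s.biggestPop = e.getValue();
        }
        return new ArrayList<>(byState.values());
    }
}
```

The sketch shows why the intermediate sort matters: `first` and `last` only yield the smallest and biggest city because the elements of each state arrive ordered by population.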
+
+    <section id="mongo.aggregation.example-3">
+      <title>Aggregation Framework Example 3</title>
+
+      <para>This example is based on the <ulink
+      url="http://docs.mongodb.org/manual/tutorial/aggregation-examples/#states-with-populations-over-10-million">States
+      with Populations Over 10 Million</ulink> example from the MongoDB
+      Aggregation Framework documentation. We added additional sorting to
+      produce stable results with different MongoDB versions. Here we want to
+      return all states with a population greater than 10 million, using the
+      aggregation framework. This example demonstrates the usage of grouping,
+      sorting and matching (filtering).</para>
+
+      <para>In the first step we group the input collection by the
+      <code>"state"</code> field and calculate the sum of the
+      <code>"population"</code> field and store the result in the new field
+      <code>"totalPop"</code>.</para>
+
+      <para>As a second step we sort the intermediate result by the
+      id-reference of the previous group operation in addition to the
+      <code>"totalPop"</code> field in ascending order.</para>
+
+      <para>Finally in the third step we filter the intermediate result by
+      using a <methodname>match</methodname> operation which accepts a
+      <classname>Criteria</classname> query as an argument.</para>
+
+      <programlisting language="java">class StateStats {
+   @Id String id;
+   String state;
+   @Field("totalPop") int totalPopulation;
+…
+}
+
+import static org.springframework.data.mongodb.core.aggregation.Aggregation.*;
+…
+TypedAggregation&lt;ZipInfo&gt; agg = newAggregation(ZipInfo.class,
+   group("state").and("totalPop").sum("population"),
+   sort(ASC, previousOperation(), "totalPop"),
+   match(where("totalPop").gte(10 * 1000 * 1000))
+);
+
+AggregationResults&lt;StateStats&gt; result = mongoTemplate.aggregate(agg, StateStats.class);
+List&lt;StateStats&gt; stateStatsList = result.getMappedResults();
+…</programlisting>
+
+      <para>Note that we derive the name of the input collection from the
+      <classname>ZipInfo</classname> class passed as first parameter to the
+      <methodname>newAggregation</methodname> method.</para>
+    </section>
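As a rough illustration of what the three steps in Example 3 compute, the following plain-Java sketch performs the same grouping, ordering and filtering in memory. It is only a sketch under stated assumptions: `StatePopulationSketch` and `largeStates` are hypothetical names, the parallel arrays stand in for the input collection, and nothing here uses the Spring Data MongoDB API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper, not part of Spring Data MongoDB. */
class StatePopulationSketch {

    /**
     * Sums the population per state and keeps the states whose total is at
     * least minTotal, ordered by state name.
     */
    static Map<String, Long> largeStates(String[] states, long[] populations, long minTotal) {
        // Step 1: group by state and sum the population into "totalPop".
        // Step 2: the TreeMap keeps the groups ordered by their id (the state).
        Map<String, Long> totals = new TreeMap<>();
        for (int i = 0; i < states.length; i++) {
            totals.merge(states[i], populations[i], Long::sum);
        }
        // Step 3: match(where("totalPop").gte(minTotal)) keeps only large states.
        Map<String, Long> result = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : totals.entrySet()) {
            if (e.getValue() >= minTotal) {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }
}
```

Because the group id is unique per state, ordering by the id alone already fixes the order of the results; the additional `"totalPop"` sort key in the pipeline only acts as a tie-breaker.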
+  </section>
+
  <section id="mongo.custom-converters">
    <title>Overriding default mapping with custom converters</title>

@@ -2574,9 +2971,10 @@ mongoTemplate.dropCollection("MyNewCollection");        </programlisting>

        <listitem>
          <para><literal>&lt;T&gt; T</literal> <emphasis role="bold">execute
-          </emphasis> <literal>(DbCallback&lt;T&gt; action) </literal>
-          Executes a DbCallback translating any exceptions as
-          necessary.</para>
+          </emphasis> <literal>(DbCallback&lt;T&gt; action)</literal>
+          Executes a DbCallback,
+          translating any exceptions
+          as necessary.</para>
        </listitem>

        <listitem>