$group (aggregation)
Definition
$group
Groups documents by some specified expression and outputs to the next stage a document for each distinct grouping. The output documents contain an _id field which contains the distinct group by key. The output documents can also contain computed fields that hold the values of some accumulator expression grouped by the $group's _id field. $group does not order its output documents.
The $group stage has the following prototype form:
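```javascript
{ $group: { _id: <expression>, <field1>: { <accumulator1> : <expression1> }, ... } }
```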
The _id field is mandatory; however, you can specify an _id value of null, or any other constant value, to calculate accumulated values for all the input documents as a whole.
The remaining computed fields are optional and computed using the <accumulator> operators.
The _id and the <accumulator> expressions can accept any valid expression. For more information on expressions, see Expressions.
Considerations
Accumulator Operator
The <accumulator> operator must be one of the following accumulator
operators:
| Name | Description |
|---|---|
| $sum | Returns a sum of numerical values. Ignores non-numeric values. |
| $avg | Returns an average of numerical values. Ignores non-numeric values. |
| $first | Returns a value from the first document for each group. Order is only defined if the documents are in a defined order. Available in the $group stage only. |
| $last | Returns a value from the last document for each group. Order is only defined if the documents are in a defined order. Available in the $group stage only. |
| $max | Returns the highest expression value for each group. |
| $min | Returns the lowest expression value for each group. |
| $push | Returns an array of expression values for each group. Available in the $group stage only. |
| $addToSet | Returns an array of unique expression values for each group. Order of the array elements is undefined. Available in the $group stage only. |
| $stdDevPop | Returns the population standard deviation of the input values. |
| $stdDevSamp | Returns the sample standard deviation of the input values. |
$group Operator and Memory
The $group stage has a limit of 100 megabytes of RAM. By default, if the stage exceeds this limit, $group produces an error. However, to allow for the handling of large datasets, set the allowDiskUse option to true to enable $group operations to write to temporary files. See the db.collection.aggregate() method and the aggregate command for details.
Changed in version 2.6: MongoDB introduces a limit of 100 megabytes of RAM for the
$group stage as well as the allowDiskUse option to handle operations for large
datasets.
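For example, the option can be passed to db.collection.aggregate() as in the following sketch (the pipeline itself is illustrative):
```javascript
db.sales.aggregate(
   [
      { $group: { _id: "$item", totalQuantity: { $sum: "$quantity" } } }
   ],
   { allowDiskUse: true }   // allow $group to spill to temporary files if it exceeds the 100 megabyte limit
)
```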
Examples
Calculate Count, Sum, and Average
Given a collection sales with documents like the following (the values shown are illustrative):
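```
// Illustrative documents; the exact values are placeholders.
{ "_id" : 1, "item" : "abc", "price" : 10, "quantity" : 2,  "date" : ISODate("2014-03-01T08:00:00Z") }
{ "_id" : 2, "item" : "jkl", "price" : 20, "quantity" : 1,  "date" : ISODate("2014-03-01T09:00:00Z") }
{ "_id" : 3, "item" : "xyz", "price" : 5,  "quantity" : 10, "date" : ISODate("2014-03-15T09:00:00Z") }
{ "_id" : 4, "item" : "xyz", "price" : 5,  "quantity" : 20, "date" : ISODate("2014-04-04T11:21:39Z") }
{ "_id" : 5, "item" : "abc", "price" : 10, "quantity" : 10, "date" : ISODate("2014-04-04T21:23:13Z") }
```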
Group by Month, Day, and Year
The following aggregation operation uses the $group stage to group the documents by month, day, and year, and calculates the total price and the average quantity as well as a count of the documents in each group:
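A pipeline of the following form can express this; the output field names totalPrice, averageQuantity, and count are illustrative:
```javascript
db.sales.aggregate(
   [
      {
        $group : {
           _id : { month: { $month: "$date" }, day: { $dayOfMonth: "$date" }, year: { $year: "$date" } },
           totalPrice: { $sum: { $multiply: [ "$price", "$quantity" ] } },   // total price per group
           averageQuantity: { $avg: "$quantity" },                           // average quantity per group
           count: { $sum: 1 }                                                // number of documents per group
        }
      }
   ]
)
```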
The operation returns the following results:
Group by null
The following aggregation operation specifies a group _id of null, calculating the total price, the average quantity, and a count of all documents in the collection:
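A sketch of such a pipeline, with illustrative output field names:
```javascript
db.sales.aggregate(
   [
      {
        $group : {
           _id : null,                                                       // a single group for all input documents
           totalPrice: { $sum: { $multiply: [ "$price", "$quantity" ] } },
           averageQuantity: { $avg: "$quantity" },
           count: { $sum: 1 }
        }
      }
   ]
)
```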
The operation returns the following result:
Retrieve Distinct Values
Given a collection sales with documents like the following (the values shown are illustrative):
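```
// Illustrative documents; note the repeated item values.
{ "_id" : 1, "item" : "abc", "price" : 10, "quantity" : 2 }
{ "_id" : 2, "item" : "jkl", "price" : 20, "quantity" : 1 }
{ "_id" : 3, "item" : "xyz", "price" : 5,  "quantity" : 10 }
{ "_id" : 4, "item" : "abc", "price" : 10, "quantity" : 10 }
```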
The following aggregation operation uses the $group stage
to group the documents by the item to retrieve the distinct item values:
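A minimal form of this operation:
```javascript
db.sales.aggregate( [ { $group : { _id : "$item" } } ] )
```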
The operation returns the following result:
Pivot Data
A collection books contains documents like the following (the values shown are illustrative):
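```
// Illustrative documents.
{ "_id" : 8751, "title" : "The Banquet",   "author" : "Dante", "copies" : 2 }
{ "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 }
{ "_id" : 8645, "title" : "Eclogues",      "author" : "Dante", "copies" : 2 }
{ "_id" : 7000, "title" : "The Odyssey",   "author" : "Homer", "copies" : 10 }
{ "_id" : 7020, "title" : "Iliad",         "author" : "Homer", "copies" : 10 }
```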
Group title by author
The following aggregation operation pivots the data in the books
collection to have titles grouped by authors.
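A pipeline along these lines uses $push to collect each author's titles into an array (the output field name books is illustrative):
```javascript
db.books.aggregate(
   [
      { $group : { _id : "$author", books: { $push: "$title" } } }   // collect each author's titles into an array
   ]
)
```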
The operation returns the following documents:
Group Documents by author
The following aggregation operation uses the $$ROOT system variable to group the documents by author. The resulting documents must not exceed the BSON Document Size limit.
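A sketch of this operation, again with an illustrative output field name:
```javascript
db.books.aggregate(
   [
      { $group : { _id : "$author", books: { $push: "$$ROOT" } } }   // collect each author's full documents
   ]
)
```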
The operation returns the following documents:
See also
The Aggregation with the Zip Code Data Set
tutorial provides an extensive example of the $group
operator in a common use case.