MongoDB Aggregation Framework: A Beginner’s Guide

  • June 05, 2025
Table of Contents

  • Aggregation pipeline
  • Aggregation stages
  • Combining stages
  • Wrapping up

Finding exactly the data we need isn’t always a simple task.

You’ve probably faced situations where you needed to filter information, group it, and even perform calculations to produce a final result.

And often, delivering this processed data to the client is essential for the application’s success. MongoDB offers two main ways to fetch data:

  • find()
  • aggregate()

While find() is great for basic queries, it doesn’t cover more advanced scenarios such as transformations and complex data processing. That’s where the MongoDB Aggregation Framework comes in.

The MongoDB Aggregation Framework works like a pipeline—a series of stages where each step processes the data in some way. When we use the aggregate() method, we’re building this sequence of operations.

Before diving into MongoDB, here’s a simple example of a pipeline in Java:

List<String> names = Arrays.asList("Alice", "Aloisio", "alice", "andre", "Ricardo", "Jose", "Maria");
var count = names.stream()
      .map(String::toLowerCase)             // normalize case
      .filter(name -> name.startsWith("a")) // keep names starting with "a"
      .distinct()                           // drop duplicates
      .count();                             // count what is left
System.out.println(count);

// output = 3

If you look closely, it uses functions like map, filter, distinct, and count.
In other words, it:

  1. Converts each name to lowercase.
  2. Keeps only the names that start with "a".
  3. Removes duplicate names.
  4. Counts the total unique names.

This is the essence of a pipeline: chaining operations that refine data step by step until you get the final result.

In MongoDB, we do something very similar.

Aggregation pipeline

An aggregation pipeline consists of one or more stages. Each stage represents a step that will be executed.

For example, consider a transactions collection where we want to count how many transactions contain errors. We could filter by status and then count:

[  
  {  
    $match: {  
      status: "error"  
    }  
  },  
  {  
    $count: "total errors"  
  }  
]

Here, the $match stage selects only the documents whose status equals "error", and $count then returns the number of documents that reached it, in a field named "total errors".

Each stage in the pipeline is executed in order, and each one only processes the results from the previous stage. So in the example above, even if there are 1,000 transactions in total, the $count stage only counts the transactions that matched the "error" status from the $match stage.
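To make the ordering concrete, here is a minimal in-memory sketch in plain Java streams. It is only an analogy, not MongoDB driver code: the class name, the sample transactions, and the "ok" and "pending" status values are assumptions made up for illustration.

```java
import java.util.List;
import java.util.Map;

class MatchThenCount {
    // Hypothetical in-memory stand-in for the transactions collection.
    static final List<Map<String, String>> TRANSACTIONS = List.of(
            Map.of("_id", "t1", "status", "error"),
            Map.of("_id", "t2", "status", "ok"),
            Map.of("_id", "t3", "status", "error"),
            Map.of("_id", "t4", "status", "pending"));

    // Rough analogue of [{ $match: { status: "error" } }, { $count: ... }].
    static long countErrors(List<Map<String, String>> docs) {
        return docs.stream()
                .filter(doc -> "error".equals(doc.get("status"))) // plays the role of $match
                .count();                                          // plays the role of $count
    }

    public static void main(String[] args) {
        // Four transactions go in, but only the two that passed the filter are counted.
        System.out.println(countErrors(TRANSACTIONS)); // prints 2
    }
}
```

The count step never sees the two transactions that the filter dropped, which is the same behavior the pipeline has.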

Aggregation stages

As mentioned earlier, stages are used to build a pipeline. In this section, let’s take a look at some stages that are useful for our day-to-day work.

To explore their capabilities, we’ll create a collection called articles that will contain the following documents:

db.articles.insertMany(  
   [  
       {  
           _id: 1,  
           title: "Spring Data Unlocked: Getting Started With Java and MongoDB",  
           tags: ["Java", "MongoDB", "Spring"],  
           publishedAt: ISODate("2024-11-11T00:00:00Z"),  
           authors: ["Ricardo Mello"],  
           url: "https://www.mongodb.com/developer/products/mongodb/springdata-getting-started-with-java-mongodb/"  
       },  

       {  
           _id: 2,  
           title: "Java Meets Queryable Encryption: Developing a Secure Bank Account Application",  
           tags: ["Java", "Security", "MongoDB"],  
           publishedAt: ISODate("2024-10-08T00:00:00Z"),  
           authors: ["Ricardo Mello"],  
           url: "https://www.mongodb.com/developer/products/atlas/java-queryable-encryption/"  
       },      
       {  
           _id: 3,  
           title: "Beyond Basics: Enhancing Kotlin Ktor API With Vector Search",  
           tags: ["Kotlin", "Vector Search", "MongoDB"],  
           publishedAt: ISODate("2024-09-18T00:00:00Z"),  
           authors: ["Ricardo Mello"],  
           url: "https://www.mongodb.com/developer/products/atlas/beyond-basics-enhancing-kotlin-ktor-api-vector-search/"  
       },  

   ]  
)

$match

This is one of the most common stages you’ll use. It basically serves to filter documents based on a specific query. For example, if you only want to return the document with _id: 3, you can use:

db.articles.aggregate([  
   { $match: { _id: 3 } }  
])  
// Returns the "Beyond Basics" article (_id: 3)

$project

We use this stage to specify which fields we’d like to include in our results.

Suppose we want to return all documents and project only the title and authors fields.

db.articles.aggregate([  
   { $project: { _id: 0, title: 1, authors: 1 } }  
])

The result would look like this:

{
   "title": "Beyond Basics: Enhancing Kotlin Ktor API With Vector Search",
   "authors": ["Ricardo Mello"]
},
// other documents...

$unwind

The $unwind stage is used to deconstruct an array into multiple documents. For example:

db.articles.aggregate([
   { $unwind: "$tags" }
])

For each tag in the `tags` array, the document will be repeated in the query results.
This way, you can analyze or process each tag individually:

{
  "_id": 1,
  "title": "Spring Data Unlocked: Getting Started With Java and MongoDB",
  "tags": "Java",
  // other fields...
},
{
  "_id": 1,
  "title": "Spring Data Unlocked: Getting Started With Java and MongoDB",
  "tags": "MongoDB",
  // other fields...
},
{
  "_id": 1,
  "title": "Spring Data Unlocked: Getting Started With Java and MongoDB",
  "tags": "Spring",
  // other fields...
},
// other documents...
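For readers who think in Java streams, $unwind behaves much like flatMap: one input element fans out into one output element per array entry. The sketch below is an in-memory analogy under that assumption, not driver code. The class and method names are hypothetical, and each article is reduced to an (_id, tags) pair taken from the sample data.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class UnwindSketch {
    // _id -> tags, copied from the three sample articles.
    static final List<Map.Entry<Integer, List<String>>> ARTICLES = List.of(
            Map.entry(1, List.of("Java", "MongoDB", "Spring")),
            Map.entry(2, List.of("Java", "Security", "MongoDB")),
            Map.entry(3, List.of("Kotlin", "Vector Search", "MongoDB")));

    // Emits one (_id, tag) pair per element of each tags list.
    static List<Map.Entry<Integer, String>> unwindTags(
            List<Map.Entry<Integer, List<String>>> articles) {
        return articles.stream()
                .flatMap(a -> a.getValue().stream()
                        .map(tag -> Map.entry(a.getKey(), tag)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Three articles with three tags each produce nine pairs.
        unwindTags(ARTICLES).forEach(e ->
                System.out.println(e.getKey() + " -> " + e.getValue()));
    }
}
```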

$group

As the name suggests, we use this stage to group our results. This time, we’ll use the `$unwind` stage we saw earlier to deconstruct the array of tags and find out how many articles exist for each tag:

db.articles.aggregate([  
   { $unwind: "$tags" },  
   {  
       $group: {  
           _id: "$tags",  
           totalArticles: { $sum: 1 }  
       }  
   }  
])

The result would look like this:

[
   {
       "_id": "Security",
       "totalArticles": 1
   },
   {
       "_id": "MongoDB",
       "totalArticles": 3
   },
   {
       "_id": "Java",
       "totalArticles": 2
   }
   // other tags (Spring, Kotlin, Vector Search)
]
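The same counts can be cross-checked with a small in-memory sketch that uses Java's groupingBy collector as a stand-in for $unwind followed by $group. This is an analogy only: the class and method names are hypothetical, and a TreeMap is used just to get a predictable print order, whereas $group itself makes no ordering promise.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

class GroupSketch {
    // The tags arrays of the three sample articles.
    static final List<List<String>> TAGS_PER_ARTICLE = List.of(
            List.of("Java", "MongoDB", "Spring"),
            List.of("Java", "Security", "MongoDB"),
            List.of("Kotlin", "Vector Search", "MongoDB"));

    static Map<String, Long> countByTag(List<List<String>> tagsPerArticle) {
        return tagsPerArticle.stream()
                .flatMap(List::stream)              // like $unwind: "$tags"
                .collect(Collectors.groupingBy(
                        Function.identity(),        // like _id: "$tags"
                        TreeMap::new,               // sorted keys, for readable output only
                        Collectors.counting()));    // like { $sum: 1 }
    }

    public static void main(String[] args) {
        System.out.println(countByTag(TAGS_PER_ARTICLE));
        // prints {Java=2, Kotlin=1, MongoDB=3, Security=1, Spring=1, Vector Search=1}
    }
}
```

Every sample article carries the MongoDB tag, so that group has three members, while Java appears in two articles.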

$sort

Continuing with our example—what if we want to query all articles and sort them by publication date, from newest to oldest?

db.articles.aggregate([  
   { $sort: { publishedAt: -1 } }  
])

And if we want to reverse the order—showing the oldest articles first—we just use `1` instead of `-1`.
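In Java terms, the `-1` and `1` directions correspond roughly to a reversed and a natural comparator. The sketch below illustrates that idea in memory, without MongoDB. The class and method names are hypothetical, and the sample dates are listed out of order on purpose so the sort has something to do.

```java
import java.time.Instant;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class SortSketch {
    // _id -> publishedAt from the sample articles, deliberately shuffled.
    static final List<Map.Entry<Integer, Instant>> ARTICLES = List.of(
            Map.entry(2, Instant.parse("2024-10-08T00:00:00Z")),
            Map.entry(3, Instant.parse("2024-09-18T00:00:00Z")),
            Map.entry(1, Instant.parse("2024-11-11T00:00:00Z")));

    // newestFirst = true mirrors { publishedAt: -1 }; false mirrors { publishedAt: 1 }.
    static List<Integer> idsByDate(List<Map.Entry<Integer, Instant>> docs, boolean newestFirst) {
        Comparator<Map.Entry<Integer, Instant>> byDate = Map.Entry.comparingByValue();
        return docs.stream()
                .sorted(newestFirst ? byDate.reversed() : byDate)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(idsByDate(ARTICLES, true));  // prints [1, 2, 3]
        System.out.println(idsByDate(ARTICLES, false)); // prints [3, 2, 1]
    }
}
```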

$addFields

This stage is useful when we want to add a new field to our results.

Let’s say our client requested that we display a field called `publishedYear` containing only the year:

db.articles.aggregate([  
   {  
     $addFields: {  
       publishedYear: { $year: "$publishedAt" }  
     }  
   }  
 ])

Our result would look something like this:

{
  "_id": 2,
  // other fields...
  "publishedYear": 2024 // field added
}

Here, you can see that we’re using an operator called $year to extract the year from our publishedAt field. To learn about other operators, check out our official documentation page on aggregation operators.
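As a rough in-memory analogy, the sketch below copies a document and adds a computed publishedYear next to the existing fields, leaving the input untouched. It assumes the date is read in UTC, which is what $year uses when no timezone is given. The class name, method name, and SAMPLE document are hypothetical.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.util.LinkedHashMap;
import java.util.Map;

class AddFieldsSketch {
    // Hypothetical stand-in for the article with _id 2, reduced to two fields.
    static final Map<String, Object> SAMPLE = Map.of(
            "_id", 2,
            "publishedAt", Instant.parse("2024-10-08T00:00:00Z"));

    // Returns a copy of the document with publishedYear added.
    static Map<String, Object> addPublishedYear(Map<String, Object> doc) {
        Map<String, Object> result = new LinkedHashMap<>(doc); // existing fields are kept
        Instant publishedAt = (Instant) doc.get("publishedAt");
        result.put("publishedYear", publishedAt.atZone(ZoneOffset.UTC).getYear());
        return result;
    }

    public static void main(String[] args) {
        System.out.println(addPublishedYear(SAMPLE).get("publishedYear")); // prints 2024
    }
}
```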

Combining stages

As we explored earlier, a pipeline can combine multiple stages. Let’s say we want to know the total number of articles published in 2025 and beyond. We can combine the $match and $count stages for this:

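db.articles.aggregate(
   [
       {
           $match: {
             // on or after midnight UTC on January 1, 2025
             publishedAt: { $gte: ISODate("2025-01-01T00:00:00Z") }
           }
       },
       {
           $count: 'total'
       }
   ]
)

Notice that we’re using the $gte operator to keep only the documents whose publishedAt is on or after January 1, 2025. With the three sample articles, all published in 2024, no document reaches $count, so the pipeline returns no result.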

Wrapping up

The aggregation pipeline is a powerful alternative to find() that lets you combine stages to filter, reshape, and summarize data on the server. There’s a whole world of stages and operators for you to explore.

Always turn to the MongoDB community for your questions. I hope this article has been useful to you all.
