Your Complete Guide to Diagnose Slow Queries in MongoDB
- June 17, 2025
- 24 min read
- What we’ll need
- What is MongoDB’s Profiler?
- MongoDB Atlas Query Profiler in the Atlas UI
- A few important considerations
- MongoDB Database Profiler
- Let's make something slow on purpose
- What we'll need
- What is explain()?
- Verbosity modes
- Let’s use explain() on a write
- How to read execution stages
- How about when we need to compare index usage?
- What we'll need
- What is db.currentOp()?
- Killing a long-running query
- What to look for in db.currentOp()
- A few notes
- What we'll need
- Look for query patterns that don’t scale
- Measuring indexes that aren’t helping with $indexStats
- Watch for document growth and update pain
- When to watch for sharding symptoms
- Recap:
- Diagnosing poor stage ordering
- Improving pipeline efficiency through reordering
- Profiling aggregations in Atlas
- Diagnosing $lookup inefficiencies
- Sorting before filtering: A hidden cost
- Aggregations with MongoDB Atlas Search
- Aggregation diagnostics summary
MongoDB is built to be fast. The real win comes from knowing how to keep it that way, even as your app grows and your data gets more complex.
This tutorial is for those developers and engineers who want to understand how MongoDB behaves when queries or writes slow down. We’re not here to hand you indexing tips or premature advice about schema design. We’re here to walk through the actual tools MongoDB gives you to observe, measure, and reason about query performance, before you start optimizing anything.
We’ll cover the full stack of diagnostics available to you:
- The Query Profiler (via the Atlas UI and `system.profile`)
- The `explain()` method to understand how queries are planned and executed
- The `db.currentOp()` command for watching operations in real time
- Deep dives into detecting scaling issues and aggregation pipeline inefficiencies
In order to follow along, we'll be interacting with MongoDB Atlas and MongoDB Shell (`mongosh`). I recommend you have both set up before starting. Some tools shown in this tutorial are restricted to M10+ clusters, but I'll highlight where that's relevant, and there should be something to take away for everyone, whether you're on a free-forever M0 cluster and just getting started, or a seasoned MongoDB veteran running enterprise workloads. We'll run the commands, see what the output looks like, and learn how to read it.
1. MongoDB’s Query Profiler
We’ve all done it. We run a query. It hangs. We check the app logs. We stare. Maybe we slap an index on something and hope it helps. Or maybe, just maybe, we start guessing at what our database is actually doing behind the scenes.
But MongoDB doesn’t want us to guess. It wants to show us. And it gives us the tools to do just that:
- The Atlas Query Profiler: A beautiful, scatterplot-filled experience in the MongoDB Atlas UI
- The Database Profiler: A raw, firehose-level stream of operations you can tap into via `mongosh`
We’re gonna use both.
What we’ll need
Before we get started, make sure you have:
- A MongoDB Atlas cluster (M10 or higher).
- A database with the sample data loaded (Atlas makes this easy—click the “Load Sample Dataset” button).
- Access to mongosh (or your preferred shell).
- Permissions to set profiler levels (`atlasAdmin` or `readWriteAnyDatabase` is fine).
What is MongoDB’s Profiler?
MongoDB can keep track of which operations are running slowly and log them for us. We just need to tell it what we care about.
The MongoDB Atlas Query Profiler helps diagnose and monitor slow-running queries using log data from your cluster. It aggregates, filters, and visualizes performance statistics via the Query Insights interface.
The Database Profiler is more granular, capturing all operations that match a profiling threshold and storing them in the `system.profile` collection.
> Note: Changing the profiling level via `db.setProfilingLevel()` does not affect what is shown in the MongoDB Atlas UI. Atlas uses its own internal logic to determine what is "slow" unless configured otherwise.
MongoDB Atlas Query Profiler in the Atlas UI
Let’s check the Atlas Query Profiler now. It is enabled by default.
- Go to your cluster.
- Click Monitoring.
- Head to Query Insights -> Query Profiler.
> Note: The Atlas Query Profiler is available on M10+ clusters.
Now, ours will be pretty empty, so let's run a few expensive queries and give ourselves something to look at. Let's head over to our terminal and open `mongosh`. First, we'll select our `sample_mflix` database with the command `use sample_mflix`. We're going to run a few intentionally slow queries to give the profiler something to log:
```javascript
db.movies.find({ title: /The/ }).hint({ $natural: 1 });

db.movies.find(
  { plot: /love/i },
  { title: 1, runtime: 1, complexCalc: { $multiply: ["$runtime", 42] } }
).hint({ $natural: 1 });

db.movies.find({
  $or: [{ directors: { $size: 2 } }, { year: { $lt: 1980 } }],
  $where: function () {
    return this.cast && this.cast.includes("Robert De Niro") && this.runtime > 90;
  }
}).hint({ $natural: 1 });
```
These queries are intentionally inefficient. They ignore indexes and force full collection scans. Perfect for generating profiler activity.
If we go back to our UI, we should see some values populating our scatterplot. Each dot represents one or more slow operations grouped by similar characteristics. Click one and we’ll see:
- The operation type (`find`, `update`, etc.).
- The execution time.
- Whether an index was used.
- The number of docs scanned.
And if we click “View more details,” we can have a deep dive into more metrics to give us a clear picture of how the query executed and why it might be slow.
Here’s what a slow query might look like:
If `docsExamined` is huge and `keysExamined` is zero (or a low number), you're scanning the whole collection. That's your signal to add an index, or rewrite the query.
You’ll also see the full raw query object, which includes the query shape, execution plan summary, and any JavaScript used.
Start with these three:

- `millis`: How long did it take?
- `docsExamined`: How much work was done?
- `keysExamined`: Did the query use an index?
```javascript
"planSummary": "COLLSCAN",
"docsExamined": 21349,
"nreturned": 8,
"keysExamined": 0,
"cpuNanos": 236256326,
"durationMillis": 656,
"queryFramework": "classic",
"command": {
  "find": "movies",
  "filter": {
    "$or": [
      { "directors": { "$size": 2 } },
      { "year": { "$lt": 1980 } }
    ],
    "$where": {
      "$code": "function () { return this.cast && this.cast.includes(\"Robert De Niro\") && this.runtime > 90; }"
    }
  },
  "hint": { "$natural": 1 }
}
```
This is textbook collection scan territory:
- COLLSCAN means the query didn't use an index
- 21K docs scanned, only eight returned
- 656 ms runtime, mostly wasted doing extra work
You should consider taking action when the profiler reveals signs of inefficiency. For example, if the `Examined:Returned` ratio is high, it often means MongoDB is doing a lot of work to return very little, suggesting it's time to add an index or restructure the query. If the query didn't use an index at all and it's one that runs frequently, that's a strong signal that an index should be created. And if the execution time (shown as `millis`) is consistently high, it may be worth optimizing the query logic, batching results, or even rethinking how the data is being accessed altogether.
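If you export profiler entries (e.g., with `mongosh --eval` or from the Atlas logs), the same checks can be scripted. A minimal sketch: the field names (`planSummary`, `docsExamined`, `nreturned`) match the log output shown above, but the 100:1 ratio threshold is an assumed heuristic, not a MongoDB default:

```javascript
// Flag a profiler entry as inefficient when it collection-scans, or when the
// examined:returned ratio is poor. The maxRatio of 100 is an assumption.
function isInefficient(entry, maxRatio = 100) {
  const returned = Math.max(entry.nreturned || 0, 1);
  const ratio = (entry.docsExamined || 0) / returned;
  return entry.planSummary === "COLLSCAN" || ratio > maxRatio;
}

// The slow query above: a COLLSCAN that examined 21,349 docs to return 8.
isInefficient({ planSummary: "COLLSCAN", docsExamined: 21349, nreturned: 8 }); // true
```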
Did you know? The Performance Advisor (next tab over from the Query Profiler in Atlas) can suggest indexes based on these logs. If you're seeing a lot of full scans, this is a great next stop.
A few important considerations
- Security: The Query Profiler may surface query parameters, which can include PII or sensitive business logic. Ensure this visibility aligns with your security policies.
- Performance: Changing the profiling level on a production cluster (`db.setProfilingLevel`) can affect performance and logging volume. Avoid overly broad settings in high-throughput environments.
- Limitations:
- Up to 100,000 operations shown per selected time range
- Up to five-minute delay between log ingestion and display
- Heavy spikes in log volume may temporarily halt log collection
- Logs are sampled and binned, so zooming is required for granularity
You’ve now got full visibility into what MongoDB is doing under the hood when it runs your queries.
MongoDB Database Profiler
Atlas gives you a macro view: what's noisy overall. The shell gives you a micro view: what's happening right now. The profiler collects statistics for operations like `find()`, `update()`, `delete()`, and `aggregate()`:

- Time taken to execute.
- Whether an index was used.
- How many documents were scanned vs. returned.
- Command size and response length.
Let's launch `mongosh` and connect to our cluster. Now, let's tell MongoDB to start tracking slow queries. First, we'll select the `sample_airbnb` database (`use sample_airbnb`):

```javascript
db.setProfilingLevel(2);
```
This will log everything. It is excessive, but for our examples, we want to see exactly what’s going on. Make sure not to leave this on for your production database, though. You will feel it!
It's important to select the database we wish to profile before we turn on the profiler, as profiling is a per-database operation.
We can adjust the threshold based on our use case:

- `0`: Off
- `1`: Log slow operations only (`db.setProfilingLevel(1, 100)` will log anything slower than 100 ms)
- `2`: Log all operations
Check that it worked:
```javascript
db.getProfilingStatus();
```
We should see something like this:
```javascript
{ "was": 2, "slowms": 100, "sampleRate": 1, "filter": { } }
```
Let's make something slow on purpose
We're using the `sample_airbnb` database now (`use sample_airbnb`). This one has a `listingsAndReviews` collection, great for some slow fun.
```javascript
db.listingsAndReviews.find({
  $or: [
    { "address.market": "Berlin" },
    { "review_scores.review_scores_cleanliness": { $lt: 5 } }
  ],
  $where: function () {
    return this.amenities && this.amenities.length > 15;
  }
}).sort({ description: 1 }).hint({ $natural: 1 });
```
The `.hint({ $natural: 1 })` tells MongoDB to read documents in the order they're stored on disk, bypassing all indexes. It's a guaranteed collection scan, perfect for generating profiler logs, terrible for performance.
Let’s do another one with writes:
```javascript
db.listingsAndReviews.updateMany(
  {
    $or: [
      { "address.market": "Berlin" },
      { "review_scores.review_scores_cleanliness": { $lt: 5 } }
    ],
    $where: function () {
      return this.amenities && this.amenities.length > 15;
    }
  },
  { $set: { needs_review: true } },
  { hint: { $natural: 1 } }
);
```
Now, we can go check what just got logged:
```javascript
db.system.profile.find().sort({ ts: -1 }).limit(2).pretty();
```
And we'll see something like this (abbreviated) output:

```javascript
{
  op: 'query',
  ns: 'sample_airbnb.listingsAndReviews',
  command: {
    find: 'listingsAndReviews',
    filter: {
      '$or': [
        { 'address.market': 'Berlin' },
        { 'review_scores.review_scores_cleanliness': { '$lt': 5 } }
      ],
      '$where': Code('function () {\n  return this.amenities && this.amenities.length > 15;\n}')
    },
    sort: { description: 1 },
    hint: { '$natural': 1 },
    '$db': 'sample_airbnb'
  },
  keysExamined: 0,
  docsExamined: 5555,
  hasSortStage: true,
  nBatches: 1,
  cursorExhausted: true,
  numYield: 1,
  nreturned: 12,
  responseLength: 73402,
  millis: 46,
  planSummary: 'COLLSCAN',
  planningTimeMicros: 22775,
  execStats: {
    isCached: false,
    stage: 'SORT',
    nReturned: 12,
    executionTimeMillisEstimate: 23,
    works: 5569,
    advanced: 12,
    needTime: 5556,
    needYield: 0,
    saveState: 1,
    restoreState: 1,
    isEOF: 1,
    sortPattern: { description: 1 },
    type: 'simple',
    totalDataSizeSorted: 83047,
    inputStage: {
      stage: 'COLLSCAN',
      filter: {
        '$and': [
          {
            '$or': [
              { 'address.market': { '$eq': 'Berlin' } },
              { 'review_scores.review_scores_cleanliness': { '$lt': 5 } }
            ]
          },
          {
            '$where': Code('function () {\n  return this.amenities && this.amenities.length > 15;\n}')
          }
        ]
      },
      nReturned: 12,
      executionTimeMillisEstimate: 23,
      docsExamined: 5555
    }
  },
  ...
}
```
We’re looking for these fields:
Field | What it tells you |
---|---|
millis |
The time the operation took to execute, in milliseconds. This is our go-to number for, “Was it fast or slow?” Keep in mind it includes everything: query parsing, planning, execution, and response serialization. |
docsExamined |
The number of documents MongoDB had to read from disk or memory to complete the operation. This is key for spotting collection scans or poor index use. The lower this is, the better it should match or be close to the number of results returned. |
keysExamined |
The number of index keys MongoDB had to scan. This helps us understand how selective your query is. A high number here (with a low docsExamined) often means the query used an index efficiently. A value of 0 means no index was used—i.e., we’re likely doing a full collection scan. |
That's step one. You've got the x-ray. Next, let's grab the scalpel: We'll dissect how MongoDB is actually executing these queries using `explain()`.
2. Understanding execution plans with explain()
MongoDB's profiler told us what ran slow. Now, we want to know why. That's where `explain()` comes in.

When you run a query with `explain()`, MongoDB walks you through how it planned and executed it. You get details about indexes, scan strategies, how many documents were looked at, and how much time was spent. It helps you detect when you're scanning too many documents, spot missing or inefficient indexes, and validate whether your query changes actually help.
What we'll need
Before we get started, make sure you have:
- `mongosh` connected to your Atlas cluster.
- A sample dataset loaded (especially the `sample_mflix` database).
- Permission to run the underlying command to use `explain()` (e.g., `find.explain()` requires read access only).
What is `explain()`?

`explain()` gives you insight into how MongoDB processes a query: the winning plan it selected, whether indexes were used, and how many documents were scanned or returned. You can use `explain()` on:

- `find()`
- `update()`
- `delete()`
- `aggregate()`
The method comes in two syntactic forms:
```javascript
// Most common usage
db.collection.find({ ... }).explain("executionStats");

// Alternative chaining syntax
db.collection.explain("executionStats").find({ ... });
```
Both syntaxes are valid, but the first is more common in shell and scripts. The second (`db.collection.explain().find(...)`) is helpful when chaining query modifiers like:

- `.sort()`
- `.limit()`
- `.projection()`
- `.hint()`
For example, let's switch back to our `sample_mflix` database:

```javascript
db.movies.explain("executionStats")
  .find({ year: { $gt: 2000 } })
  .sort({ year: -1 })
  .limit(5);
```
This version lets you see how all those combined modifiers affect the query plan, without needing to nest multiple `.explain()` calls or wrap in extra parentheses.
This returns a ton of useful info. Here’s what to look for:
| Field | What it tells you |
|---|---|
| `executionTimeMillis` | How long the query actually took to run |
| `totalDocsExamined` | How many documents were scanned |
| `totalKeysExamined` | How many index entries were scanned |
| `nReturned` | How many documents matched the query |
| `stage` | What kind of scan was performed (`COLLSCAN`, `IXSCAN`, etc.) |
Here’s an abbreviated version of the output:
```javascript
{
  queryPlanner: {
    winningPlan: {
      stage: "SORT",
      sortPattern: { year: -1 },
      limitAmount: 5,
      inputStage: {
        stage: "COLLSCAN",
        filter: { year: { $gt: 2000 } }
      }
    }
  },
  executionStats: {
    nReturned: 5,
    executionTimeMillis: 96,
    totalDocsExamined: 21349,
    totalKeysExamined: 0,
    executionStages: {
      stage: "SORT",
      inputStage: { stage: "COLLSCAN", docsExamined: 21349 }
    }
  }
}
```
How to read this:
- `stage: "COLLSCAN"`: MongoDB scanned the entire collection (no index used).
- `docsExamined: 21349`: It had to read 21K documents to return just five.
- `totalKeysExamined: 0`: Confirms no index was touched.
- `stage: "SORT"`: MongoDB sorted results in memory.
- `executionTimeMillis: 96`: The whole thing took just under 100 ms.
Even though it ran fast on this small dataset, this plan doesn't scale. A filter + sort like this should ideally be served by an index on the sorted field, e.g., `{ year: -1 }`.
Tip: When using `.explain().find()`, the result is a document (not a cursor), so you get back the full plan immediately without needing `.next()` or `.toArray()`.
Verbosity modes
The `explain()` method accepts a verbosity mode that controls how much detail you get:
| Mode | What it shows |
|---|---|
| `"queryPlanner"` (default) | The query shape and chosen plan—no execution |
| `"executionStats"` | Runs the query and shows real metrics for the winning plan |
| `"allPlansExecution"` | Shows execution stats for all candidate plans—great for debugging plan selection |
Important behavior notes:
- `explain()` ignores the plan cache—this is intentional so you can inspect the plan fresh.
- It also prevents creating a new plan cache entry.
- `explain()` is supported in all MongoDB Atlas, Enterprise, and Community environments.
Now, let’s add an index:
```javascript
db.movies.createIndex({ year: -1 });
```
And once this is created, we can run our query again:
```javascript
db.movies.explain("executionStats")
  .find({ year: { $gt: 2000 } })
  .sort({ year: -1 })
  .limit(5);
```
In our new (abbreviated) output:
```javascript
{
  queryPlanner: {
    winningPlan: {
      stage: "LIMIT",
      inputStage: {
        stage: "FETCH",
        inputStage: {
          stage: "IXSCAN",
          indexName: "year_-1",
          indexBounds: { year: ["[inf.0, 2000)"] }
        }
      }
    }
  },
  executionStats: {
    nReturned: 5,
    executionTimeMillis: 1,
    totalKeysExamined: 5,
    totalDocsExamined: 5
  }
}
```
What we're seeing in the new `explain("executionStats")` output is a radically more efficient query plan, thanks to the `year_-1` index we added. Instead of scanning the entire collection and then sorting the results in memory like before, MongoDB now uses an index scan (`IXSCAN`) to jump directly to the relevant documents, already ordered by `year` in descending order. The `FETCH` stage pulls only the actual documents for the top five matches, and the `LIMIT` stage stops execution as soon as it has what it needs.
As a result, MongoDB examines just five index keys and five documents, returning the desired results in a single millisecond. No in-memory sorting is needed, and there's no wasted effort. This is the kind of plan we want to see: tight, precise, and powered by the right index.
Let's use `explain()` on a write
Unlike `find()` or `aggregate()`, you can't call `.explain()` directly on a write operation like `updateOne()`. Instead, you'll need to use the lower-level `runCommand()` method to get execution stats for a write.
Here’s how to do it:
```javascript
db.runCommand({
  explain: {
    update: "movies",
    updates: [
      {
        q: { title: "The Matrix" },
        u: { $set: { status: "classic" } },
        multi: false
      }
    ]
  },
  verbosity: "executionStats"
});
```
This runs an `updateOne()` under the hood, but wraps it with `explain` so you can analyze its performance—just like with reads.
And here’s what the (shortened) output looks like:
```javascript
{
  queryPlanner: {
    winningPlan: {
      stage: "UPDATE",
      inputStage: {
        stage: "COLLSCAN",
        filter: { title: "The Matrix" }
      }
    }
  },
  executionStats: {
    nMatched: 1,
    nWouldModify: 1,
    totalDocsExamined: 7675,
    totalKeysExamined: 0,
    executionTimeMillis: 7
  }
}
```
This tells us that MongoDB had to scan 7,675 documents just to find one to update. That's a full collection scan—confirmed by the `COLLSCAN` stage and `totalKeysExamined: 0`. It matched one document (`nMatched: 1`) and would've updated it (`nWouldModify: 1`), but the operation wasn't actually run since this was an `explain`, not a real write.
If this is a write pattern you rely on often (like updating by `title`, `customerId`, or some other frequent lookup), you should absolutely consider creating an index to reduce the scan cost.
How to read execution stages
MongoDB breaks your query into stages—like steps on an assembly line. Each stage does a specific job, and together, they describe how your query was executed.
Here are some common stage types you might see in an `explain()` plan:
| Stage | What it does |
|---|---|
| `COLLSCAN` | Scans all documents—slow, no index |
| `IXSCAN` | Scans an index—fast |
| `FETCH` | Fetches full documents after matching keys |
| `SORT` | Sorts results in-memory (unless covered by index) |
| `PROJECTION` | Trims returned fields |
Not every query will hit all of these, and some may include others (like `LIMIT`, `SKIP`, or `SHARD_MERGE`). But if you understand the building blocks, you can read almost any query plan like a flowchart.

You'll often see nested stages: `IXSCAN` → `FETCH` means MongoDB scanned the index, then fetched full documents.
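Walking those nested stages can be done mechanically. A small sketch, assuming the single-chain `inputStage` shape shown in the explain outputs above (it doesn't handle multi-child stages like `SHARD_MERGE`):

```javascript
// Flatten a winningPlan's nested inputStage chain into an ordered list,
// innermost (first-executed) stage first, e.g. ["COLLSCAN", "SORT"].
function stageChain(plan) {
  const stages = [];
  for (let s = plan; s; s = s.inputStage) stages.unshift(s.stage);
  return stages;
}

// The pre-index plan from earlier: a SORT fed by a COLLSCAN.
stageChain({ stage: "SORT", inputStage: { stage: "COLLSCAN" } });
// → ["COLLSCAN", "SORT"]
```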
Quick check:
- `stage: "COLLSCAN"`: No index used
- `docsExamined ≫ nReturned`: Inefficient filtering
- `executionTimeMillis` high? Consider indexing or reworking the query
How about when we need to compare index usage?
If you're curious how MongoDB chooses one query plan over others, try this:
Before we run the query, let's make sure we have indexes on both `year` and `rated`:

```javascript
db.movies.createIndex({ year: 1 });
db.movies.createIndex({ rated: 1 });
```
Now, we can run the query with full plan evaluation:
```javascript
db.movies.find({
  year: { $gt: 1990 },
  rated: { $in: ["PG", "PG-13"] }
}).explain("allPlansExecution");
```
This query references two indexed fields: `year` and `rated`. Since both are eligible, MongoDB considers multiple plans and tests each one before choosing a winner.
Here’s a simplified excerpt of what you'll see:
```javascript
{
  queryPlanner: {
    winningPlan: {
      stage: "FETCH",
      inputStage: { stage: "IXSCAN", indexName: "year_1" }
    },
    rejectedPlans: [
      { inputStage: { stage: "IXSCAN", indexName: "rated_1" } },
      { inputStage: { stage: "IXSCAN", indexName: "year_-1" } }
    ]
  },
  executionStats: {
    totalKeysExamined: 15902,
    totalDocsExamined: 15902,
    nReturned: 3091
  }
}
```
MongoDB tested three potential strategies:
- Use the `year_1` index and filter by `rated`.
- Use the `rated_1` index and filter by `year`.
- Try the `year_-1` index, which didn't match the filter direction.
Even though all three plans returned results, the one using `year_1` was the most efficient, so it won. You can see the others listed in `rejectedPlans` and `allPlansExecution`, along with stats like documents scanned, returned, and estimated cost.
This is incredibly useful when you're wondering why your index wasn’t used. Often, it was just not chosen. Now, you can prove it.
Now that you know how to dissect a query, let's move from diagnosis to live observation. In Section 3, we'll track what's running right now using `db.currentOp()`, and how to kill it if needed.
3. Watching queries in real time with db.currentOp()
So far, we've looked at what was slow (`system.profile`) and why it was slow (`explain()`). But what if something is running slowly right now?

MongoDB gives us a built-in way to monitor live operations—both queries and writes—using `db.currentOp()`. This is like opening a terminal into the brain of your database.
You can see what's currently running, how long it's been running, what operation it’s doing, and even kill it if needed.
What we'll need
Before we get started, make sure you have:
- `mongosh` connected to your cluster.
- `read` access to admin-level commands.
- `killOp` permission if you want to terminate operations (e.g., `atlasAdmin`, `dbAdminAnyDatabase`, or custom roles).
What is `db.currentOp()`?

`db.currentOp()` returns a list of all operations currently in progress on the node (`mongod` process) you're connected to—including queries, updates, aggregations, and internal commands.
You can filter it to just your operations or long-running ones.
```javascript
db.currentOp({ "secs_running": { $gt: 2 } });
```
This will return all operations running longer than two seconds.
Here’s what a live operation might look like:
```javascript
{
  "opid": "12345",
  "secs_running": 5,
  "active": true,
  "ns": "sample_mflix.movies",
  "command": {
    "find": "movies",
    "filter": { "title": { "$regex": "The" } }
  },
  "planSummary": "COLLSCAN",
  "locks": { ... },
  "client": "203.0.113.25:43210"
}
```
`opid` is the operation ID you can use to track or kill the operation.
If we run this on our collection now, we’ll likely see a bunch of admin operations that run in the background on our MongoDB database.
Want to find all active queries on any namespace?
```javascript
db.currentOp({ "active": true, "op": "query" });
```
Killing a long-running query
If you need to stop an operation—like an accidental full collection scan—use `db.killOp()`:

```javascript
db.killOp("12345");
```
Just replace `12345` with the `opid` of the operation you want to stop. You need the `killOp` privilege on the database to run this command.
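You can also script this triage. A minimal sketch that filters the `inprog` array returned by `db.currentOp()` and collects the `opid`s you would then pass to `db.killOp()` in `mongosh`; the 30-second threshold is an arbitrary assumption:

```javascript
// Collect opids of long-running, active queries from a db.currentOp() result.
// In mongosh you would then call db.killOp(opid) for each returned id.
function opsToKill(currentOpResult, maxSecs = 30) {
  return currentOpResult.inprog
    .filter((op) => op.active && op.op === "query" && op.secs_running > maxSecs)
    .map((op) => op.opid);
}

opsToKill({
  inprog: [
    { opid: "12345", active: true, op: "query", secs_running: 95 },
    { opid: "12346", active: true, op: "update", secs_running: 40 },
    { opid: "12347", active: false, op: "query", secs_running: 500 },
  ],
}); // → ["12345"]
```

Be deliberate here: killing writes mid-flight can leave application-level work half done, so this kind of helper is safest when restricted to read operations, as above.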
What to look for in db.currentOp()
| Field | What it tells you |
|---|---|
| `opid` | Unique operation ID |
| `ns` | Namespace (e.g., `sample_mflix.movies`) |
| `secs_running` | How long the operation has been running |
| `command` | The exact query or update being run |
| `planSummary` | The planner's quick summary (`COLLSCAN`, `IXSCAN`, etc.) |
| `active` | Whether the operation is still in progress |
| `waitingForLock` | If the op is blocked on a lock |
A few notes
- This only shows operations on the node you're connected to. For sharded clusters or replica sets, you'll need to query each `mongod`.
- Use `maxTimeMS()` to cap long-running queries so they don't clog up your cluster.
- Combine filters to reduce noise:

```javascript
db.currentOp({ "active": true, "secs_running": { $gt: 2 }, "ns": /movies/ });
```
Now that we can observe slow queries as they happen, let’s look at why they happen as data scales. In Section 4, we’ll explore inefficient patterns, unused indexes, and how things change when collections grow.
4. Identifying inefficient query and write patterns in large data sets
Some queries feel fast… until they don’t. Performance issues in MongoDB often creep in as data grows, silently at first, until something starts to lag.
This section is our early-warning radar: We’re not fixing slow queries yet. We’re spotting them before they hurt us. We’ll look at key patterns that don’t scale, and how to recognize trouble using both the shell and the Atlas UI.
What we'll need
- Sample dataset loaded (e.g., `sample_mflix` or `sample_supplies`)
- `mongosh` connected to your cluster
connected to your cluster- Access to a MongoDB Atlas M10+ cluster
- Atlas role: Project Data Access Read Only or higher
Look for query patterns that don’t scale
MongoDB gives us visibility into how queries behave—use it early and often.
Use `.explain()` to diagnose inefficiencies even when queries "feel fast." A fast query can just be lucky rather than well structured. And luck runs out. Be intentional and proactive, and you can spot these scaling problems before they get a chance to slow you down.
| Metric | What to watch |
|---|---|
| `docsExamined ≫ nReturned` | Inefficient filter |
| `stage: "COLLSCAN"` | No index used |
| `executionTimeMillis` grows with `collection.count()` | Performance degrades with scale |
| `keysExamined` is high with few matches | Index exists but is not selective enough |
Measuring indexes that aren’t helping with $indexStats
MongoDB can show you how often each index on a collection is being used. This is helpful when deciding which indexes to keep, which are going unused, and whether your query patterns align with what you’ve optimized for.
Let's try it on the `movies` collection:

```javascript
db.movies.aggregate([{ $indexStats: {} }]);
```
This returns a document per index, with stats about how frequently each one has been accessed since the server started tracking.
Here’s a sample of the output:
```javascript
{ name: "year_1", accesses: { ops: 0, since: ISODate("2025-04-11T12:31:22.682Z") } }
```
The `ops` field tells us how many operations have used this index. In this case, we can see that several indexes, including `year_1`, `year_-1`, and `rated_1`, have `ops: 0`. This means no queries have used them since tracking began.
On the other hand:
```javascript
{ name: "_id_", accesses: { ops: 1, since: ISODate("2025-04-09T14:50:50.878Z") } }
```
The `_id_` index has been used once. That's expected; MongoDB always creates a unique index on `_id` by default.
If we’re seeing a lot of unused indexes, it may be time to clean them up. Indexes that aren't used take up memory and can slow down writes. But don’t drop them just yet. Always double-check query patterns and workloads across environments before making changes.
This also works great after running queries in development. We can immediately see which ones actually triggered index access, and which ones just triggered full scans.
Tip: If all your indexes say `ops: 0`, and you know your queries should be using them, it might just be that the server was restarted recently. `accesses.since` will show you when tracking began.
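A small sketch of how you might post-process the `$indexStats` documents (e.g., collected with `.toArray()` in `mongosh`) into a list of drop candidates. Skipping `_id_` is deliberate, since that index is mandatory and can't be dropped:

```javascript
// Return names of indexes with zero recorded accesses, excluding the
// mandatory _id_ index. Input shape mirrors the $indexStats output above.
function unusedIndexes(indexStats) {
  return indexStats
    .filter((s) => s.name !== "_id_" && s.accesses.ops === 0)
    .map((s) => s.name);
}

unusedIndexes([
  { name: "_id_", accesses: { ops: 1 } },
  { name: "year_1", accesses: { ops: 0 } },
  { name: "rated_1", accesses: { ops: 42 } },
]); // → ["year_1"]
```

As the text above stresses, treat the output as candidates to investigate, not an automatic drop list; usage can differ across environments and restart windows.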
Some things to check for:
| Symptom | What it suggests |
|---|---|
| `accesses.ops = 0` | Index isn't being used at all |
| Similar compound indexes with different key orders | Redundant indexes |
| Many indexes with little usage | Possible memory bloat and write slowdown |
MongoDB Atlas tracks index usage in the Indexes tab for each collection.
Watch for document growth and update pain
MongoDB documents can be large, and that’s often a good thing. But…
- Updates to large or nested documents may rewrite the entire document.
- Frequently growing arrays cause fragmentation.
- Large docs mean fewer fit in memory.
Check this in the shell:
```javascript
db.movies.stats().avgObjSize;
```

And we can have a look at the output:

```
1598
```
The average document size in the `movies` collection is 1,598 bytes. This number helps estimate:
- Memory usage when scanning N documents (e.g., in a COLLSCAN).
- Working set size and whether it fits in RAM.
- The I/O cost of operations that don’t use indexes.
If `totalDocsExamined` is 7,675 and the average doc size is 1,598 bytes:

```
7,675 docs × 1,598 bytes ≈ 12.26 MB
```
So, our full collection scan is reading about 12 MB from disk or memory for that single operation.
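That back-of-the-envelope math is easy to reuse. A sketch, using the decimal-megabyte convention from the estimate above:

```javascript
// Estimate the bytes a non-indexed operation touches: documents examined
// (from explain/profiler output) times avgObjSize (from db.<coll>.stats()).
function scanBytes(docsExamined, avgObjSize) {
  return docsExamined * avgObjSize;
}

const bytes = scanBytes(7675, 1598); // 12264650
const mb = bytes / 1_000_000;        // ≈ 12.26 MB
```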
If our working set exceeds RAM, this can lead to page faults and slow performance, which will definitely show up in our Atlas Profiler.
In the Atlas UI, we can see how the memory of our MongoDB instance is affected as documents bloat.
- In Atlas, go to Clusters.
- Click on the name of your cluster.
- From the cluster dashboard, click the "Metrics" tab at the top.
- At the bottom of the screen, you can toggle what you want to inspect.
We’ll want to toggle:
- Cache Usage (shows cache dirty and used bytes—how much memory MongoDB is using and how much of it hasn’t been flushed to disk).
- Cache Activity (tracks how much data is being read into or written from cache, which reflects memory churn).
- Cache Ratio (shows fill ratio and dirty fill ratio, which help you understand how full the memory is and how many changes are waiting to be written).
- Memory (gives you memory_resident and memory_virtual to gauge working set pressure).
- System Memory (available and used—so you can correlate MongoDB’s memory usage with overall RAM limits).
These charts let you visualize how much of your instance’s memory is in active use, and how much of that is getting clogged by updates or large documents. If cache dirty steadily rises or you see frequent spikes in eviction activity, it’s a sign that your working set is too big for your RAM. Document growth, overly large updates, and inefficient schemas all make this worse.
This is where performance starts to degrade: MongoDB has to flush dirty pages more frequently or pull data from disk more often, which slows queries and increases I/O. If we see high Cache Usage, low System Memory Available, and rising Disk IOPS or Latency, it’s a strong signal to review our schema, indexing strategy, or scale up the instance.
This is our live view into how MongoDB balances memory and disk as our documents grow.
All of these are signs that MongoDB is under memory pressure and spending more time loading documents from disk rather than serving them directly from memory.
> Tip: Combine this with `$indexStats` and `avgObjSize` to understand why: Is it bloated documents? Unused indexes? Or just too much data for the current instance size?
When to watch for sharding symptoms
We might not need sharding yet, but MongoDB will start telling us when vertical scaling is no longer enough.
Check for:
| Symptom | What it suggests |
|---|---|
| Queries slow down as data grows | Indexes can't keep up—look at `executionTimeMillis` trends |
| Working set no longer fits in memory | High disk I/O, cache evictions in Atlas |
| Insert-heavy workloads bottlenecking writes | CPU/memory pressure on primaries |
| Collection is approaching storage limits | Time to scale out |
If we see regular performance degradation due to growth, especially on a single hot collection, it’s time to evaluate sharding as a next step.
Recap:
What to monitor | How |
---|---|
Query inefficiencies | .explain("executionStats") + Atlas Query Profiler |
Index effectiveness | $indexStats + Atlas Indexes tab |
Document bloat | avgObjSize , storageSize , update patterns |
Memory pressure | Cache + disk I/O metrics |
Scaling pressure | Query performance vs. dataset growth |
You're now equipped to spot performance concerns before they explode, whether it’s slow queries hiding in plain sight or documents quietly growing out of control.
In Section 5, we’ll turn our attention to aggregation pipelines and how to diagnose performance issues stage by stage using explain()
.
5. Analyzing aggregation query performance
Aggregation pipelines are often where performance issues hide in MongoDB. On the surface, everything seems clean: an elegant series of stages reshaping documents, filtering out noise, grouping and sorting. But underneath, inefficiencies can quietly stack up. Poor pipeline ordering, unindexed lookups, broad scans, or early memory-intensive operations like $group
or $sort
may not be obvious until they degrade performance significantly at scale.
This section is about teaching you how to read a pipeline, profile its performance, and understand where execution cost is being incurred, before it becomes a real problem. We’ll use .explain("executionStats")
to trace each stage and observe where time and resources are spent. As always, the goal here is not to solve the performance problem, but to identify the parts of the pipeline that need attention.
Diagnosing poor stage ordering
To illustrate how pipeline structure affects performance, let’s start with a deliberately inefficient example. In the sample_supplies
database, let's run the following:
```javascript
db.sales.aggregate([
  { $group: { _id: "$item", totalSales: { $sum: "$price" } } },
  { $match: { totalSales: { $gt: 100 } } }
]).explain("executionStats");
```
This pipeline groups all documents in the collection before applying a filter on the computed totalSales
value. From a functional standpoint, it’s valid. From a performance perspective, it’s costly. We’re forcing MongoDB to load every document into memory to compute the groupings, regardless of whether that document will later be filtered out.
The explain
output will confirm this. We should expect to see a COLLSCAN
stage in the cursor source, along with high values for totalDocsExamined
and executionTimeMillis
. For example:
```javascript
{
  "stages": [
    {
      "$cursor": {
        "executionStats": {
          "stage": "COLLSCAN",
          "nReturned": 220,
          "totalDocsExamined": 250000,
          "executionTimeMillisEstimate": 92
        }
      }
    },
    { "$group": { "executionTimeMillisEstimate": 31 } },
    { "$match": { "executionTimeMillisEstimate": 1 } }
  ]
}
```
This output shows that most of the time is spent scanning and grouping all 250,000 documents, even though only a small number are returned. There’s no filtering until the final stage, meaning the $group
stage is carrying a much heavier load than necessary.
Improving pipeline efficiency through reordering
Now, let’s reverse the order of the pipeline so that the filter occurs before the grouping:
```javascript
db.sales.aggregate([
  { $match: { price: { $gt: 1 } } },
  { $group: { _id: "$item", totalSales: { $sum: "$price" } } }
]).explain("executionStats");
```
In this case, the explain output should show dramatically fewer documents examined, a lower execution time, and, if we’ve indexed the price
field, an IXSCAN
instead of a COLLSCAN
. We’ve offloaded unnecessary work from $group
by letting $match
reduce the input set early.
This is one of the most important structural concepts when reading or profiling aggregations: place filtering stages as early as possible. $match
, $project
, and $limit
reduce the document set and should appear toward the top of your pipeline. Stages like $group
, $sort
, and $lookup
that grow memory usage or execution time should appear later, after the document set has already been narrowed.
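Because an aggregation pipeline is just an array of stage objects, this ordering rule can even be linted mechanically. The sketch below is a simplified illustration, not a substitute for MongoDB's own optimizer (which already reorders some stages automatically):

```javascript
// Warn when a reducing stage appears after a memory-heavy stage.
const REDUCING = new Set(["$match", "$limit", "$project"]);
const HEAVY = new Set(["$group", "$sort", "$lookup"]);

function lintPipeline(pipeline) {
  const warnings = [];
  let heavySeen = null;
  for (const stage of pipeline) {
    const name = Object.keys(stage)[0];
    if (HEAVY.has(name)) heavySeen = name;
    else if (REDUCING.has(name) && heavySeen) {
      warnings.push(`${name} appears after ${heavySeen}; check whether it can move earlier`);
    }
  }
  return warnings;
}

// The group-first pipeline from earlier trips the lint:
const warnings = lintPipeline([
  { $group: { _id: "$item", totalSales: { $sum: "$price" } } },
  { $match: { totalSales: { $gt: 100 } } },
]);
console.log(warnings.length); // 1
```

Note that a `$match` on a field computed by `$group` (like `totalSales` above) genuinely has to come after it; treat the lint as a prompt to check, not a hard rule.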
Profiling aggregations in Atlas
You can observe the impact of pipeline reordering visually in the Atlas UI. Go to your cluster’s Query Profiler, then filter by operation type to show only aggregate
queries. Locate your slow group-first query in the scatterplot and click to expand the timeline. You’ll see each stage listed with timing information.
Look specifically at the execution time for the $group
stage. It will likely dominate the operation. Then, compare it to the reordered pipeline where $match
runs first—stage timings will show a reduced burden on $group
and often a significantly lower execution time overall.
Diagnosing $lookup inefficiencies
Now, consider a common source of hidden aggregation slowness: unindexed $lookup
joins.
Let’s look at a real example using the sample_analytics
dataset. We’ll join the transactions
collection with customers
, matching the customer_id
field from transactions to the email
field in customers. This is deliberately inefficient: no index exists on email
in the foreign collection, and neither field is guaranteed to be unique.
```javascript
db.transactions.aggregate([
  {
    $lookup: {
      from: "customers",
      localField: "customer_id",
      foreignField: "email",
      as: "customerDetails"
    }
  }
]);
```
This query returns some documents. In fact, it returned 1,746 documents when we ran it. But behind the scenes, it examined hundreds more than it returned.
Here’s the important diagnostic output from explain("executionStats")
:
```javascript
{
  "executionStats": {
    "nReturned": 1746,
    "executionTimeMillis": 8,
    "totalDocsExamined": 2246,
    "collectionScans": 2,
    "indexesUsed": [],
    "executionStages": {
      "stage": "hash_lookup",
      "usedDisk": false,
      "spilledRecords": 0
    }
  },
  "winningPlan": {
    "stage": "EQ_LOOKUP",
    "strategy": "HashJoin",
    "foreignCollection": "sample_analytics.customers",
    "inputStage": { "stage": "COLLSCAN" }
  }
}
```
The query returned fast, but:
- `collectionScans: 2`: Both collections were fully scanned.
- `indexesUsed: []`: No index was used on the join fields.
- `totalDocsExamined: 2246`: 2.2K documents read to return 1.7K. When `totalDocsExamined` is more than `nReturned`, that is wasted I/O effort.
- `strategy: HashJoin`: MongoDB built an in-memory hash table to support the join. This is a memory-intensive join strategy.
These are red flags for joins at scale. As your data grows, this will become a bottleneck. Each transaction document triggers a scan on the foreign collection, and if the field isn’t indexed, or worse, contains duplicate values, the cost increases linearly.
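The `totalDocsExamined` vs. `nReturned` gap can be turned into a simple efficiency ratio worth tracking over time. A minimal sketch over executionStats-shaped input (field names match the explain output above; a ratio of 1.0 means every examined document was returned):

```javascript
// Compute the examined-to-returned ratio from an executionStats document.
function scanEfficiency({ totalDocsExamined, nReturned }) {
  if (nReturned === 0) return { ratio: Infinity, wastedDocs: totalDocsExamined };
  return {
    ratio: totalDocsExamined / nReturned,
    wastedDocs: totalDocsExamined - nReturned,
  };
}

// Numbers from the $lookup example above: ratio ≈ 1.29, wastedDocs: 500.
console.log(scanEfficiency({ totalDocsExamined: 2246, nReturned: 1746 }));
```

A ratio near 1.0 is healthy; a ratio that grows with your dataset is the linear-cost bottleneck described above.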
To correct it, let’s add an index on customers.email
:
```javascript
db.customers.createIndex({ email: 1 });
```
Then, re-run the aggregation with explain("executionStats")
and compare the difference.
```javascript
{
  "nReturned": 1746,
  "executionTimeMillis": 9,
  "totalDocsExamined": 1746,
  "indexesUsed": ["email_1"],
  "winningPlan": {
    "stage": "EQ_LOOKUP",
    "strategy": "IndexedLoopJoin",
    "indexName": "email_1",
    "inputStage": { "stage": "COLLSCAN" }
  }
}
```
Even though execution time rose slightly (possibly due to internal re-planning or I/O timing variance), the structure is much healthier. We’ve replaced a memory-intensive hash join with an index-backed nested loop, and reduced total document examination to match the number of documents returned.
This is what we want from performance tuning: not always instant speed wins, but structural improvements that scale.
- Always check for `$lookup` joins over unindexed fields.
- Use `explain("executionStats")` to reveal strategy and index usage.
- Look for `strategy: IndexedLoopJoin` and avoid `HashJoin` where possible.
If we want this information without running .explain()
in mongosh again and again, we can pop back over to the Query Profiler in the Atlas UI and examine the live queries coming from our applications. The profiler will pick them up as long as they run slowly enough to cross its threshold.
Sorting before filtering: A hidden cost
Another subtle performance trap is using $sort
early in a pipeline, especially before a $match
. Sorting large datasets in memory is expensive, and doing so before filtering means we’re paying that cost for documents we might never use. Using the sample_supplies
database, let’s run the same aggregation with two different stage orderings and compare the outputs.
Try this pipeline:
```javascript
db.sales.aggregate([
  { $sort: { saleDate: -1 } },
  { $match: { "items.name": "binder" } }
]).explain("executionStats");
```
The explain plan will show that the sort is operating on the full dataset, not just the relevant subset. The executionTimeMillisEstimate
for the sort stage may be significantly higher than the filter.
```javascript
{
  queryPlanner: {
    stage: 'SORT',
    sortPattern: { saleDate: -1 },
    inputStage: {
      stage: 'COLLSCAN',
      filter: { 'items.name': { '$eq': 'binder' } },
      docsExamined: 5000
    }
  },
  executionStats: {
    executionTimeMillis: 28,
    totalDocsExamined: 5000,
    nReturned: 3396,
    executionStages: {
      stage: 'SORT',
      executionTimeMillisEstimate: 20,
      inputStage: { stage: 'COLLSCAN' }
    }
  }
}
```
MongoDB scans all 5,000 documents and then returns the 3,396 matches, taking 28 ms to run. Now, let’s simply reverse the order so that the $match
runs first:
```javascript
db.sales.aggregate([
  { $match: { "items.name": "binder" } },
  { $sort: { saleDate: -1 } }
]).explain("executionStats");
```
And our output should show some favourable changes:
```javascript
{
  queryPlanner: {
    stage: 'SORT',
    sortPattern: { saleDate: -1 },
    inputStage: {
      stage: 'COLLSCAN',
      filter: { 'items.name': { '$eq': 'binder' } },
      docsExamined: 5000
    }
  },
  executionStats: {
    executionTimeMillis: 15,
    totalDocsExamined: 5000,
    nReturned: 3396,
    executionStages: {
      stage: 'SORT',
      executionTimeMillisEstimate: 13,
      inputStage: { stage: 'COLLSCAN' }
    }
  }
}
```
The same number of documents is examined, but by filtering first we nearly halve the total execution time (28 ms down to 15 ms), with most of the savings coming from the $sort
stage.
Also note that if you sort on a field that has an index, MongoDB can use that index to avoid an in-memory sort. Use planSummary
in the explain output to confirm whether that’s happening. This is your cue to check stage order and index coverage.
Aggregations with MongoDB Atlas Search
If you're using Atlas Search, remember that $search
is a special stage that must appear first in your pipeline. Unlike $match
, $search
does not use MongoDB’s native indexes, and you won’t see it in explain()
output.
To diagnose slow $search
pipelines, you’ll need to use Atlas Profiler. Find the query under operation type aggregate
, and click through to inspect the total duration and number of documents returned. MongoDB Atlas Search queries typically show their performance in terms of latency and result count rather than index scan stats.
If $search
is slow, check whether your search index includes the right fields and analyzers, and whether you're retrieving more fields than necessary after $search
. Adding a $project
stage immediately after $search
can reduce document size and improve throughput.
To simulate a slow query, we can create an Atlas Search index on our cluster (as long as we have admin permissions):
```javascript
db.runCommand({
  createSearchIndexes: "movies",
  indexes: [
    {
      name: "titleIndex",
      definition: {
        mappings: {
          dynamic: false,
          fields: { title: { type: "string" } }
        }
      }
    }
  ]
});
```
Give this a minute to create, then we can run the following query. This uses a common term to trigger a high-result count and returns full documents without limiting or projecting fields:
```javascript
db.movies.aggregate([
  {
    $search: {
      index: "titleIndex",
      text: { query: "the", path: "title" }
    }
  }
]);
```
A query as generic as “the” in movie titles will give us quite high latency, and we can dissect it and learn how to improve it in our profiler.
Aggregation diagnostics summary
Aggregation pipelines offer incredible power but carry significant performance implications when not constructed thoughtfully. By using explain("executionStats")
at every step, you can see how each stage behaves, whether indexes are used, and where MongoDB is spending the most time.
In MongoDB Atlas, use the Query Profiler to correlate slow queries with real execution data, and use the Indexes tab to verify whether your lookups and sorts are supported by indexes. Pay close attention to ordering: filter early, reduce early, and avoid letting costly stages like $group
, $sort
, or $lookup
operate on more documents than necessary.
When profiling, don’t just look at total execution time. Look at stage execution time, document counts, and index plans. MongoDB is giving us a detailed play-by-play.
Conclusion
We’ve now got the full diagnostic toolkit: a profiler to see what ran slowly, an execution plan to understand why, a live view to watch it happen in real time, and a deep dive into how performance scales. We’re not stomping around, hunting problems. We’re observant and well tooled. That’s the foundation every great optimization effort is built on.
This wasn’t a guide to optimization—it was a guide to observation. Because you can’t fix what you can’t see.
If you want to keep digging deeper, here are a few useful next stops. Check out the Atlas Query Profiler Documentation, or read up on Aggregation Pipeline Optimization Best Practices. MongoDB gives you a lot of visibility if you know where to look. Now, you do.