Your Complete Guide to Diagnose Slow Queries in MongoDB
- June 17, 2025
- 24 min read
- What we’ll need
- What is MongoDB’s Profiler?
- MongoDB Atlas Query Profiler in the Atlas UI
- A few important considerations
- MongoDB Database Profiler
- Let's make something slow on purpose
- What we'll need
- What is explain()?
- Verbosity modes
- Let’s use explain() on a write
- How to read execution stages
- How about when we need to compare index usage?
- What we'll need
- What is db.currentOp()?
- Killing a long-running query
- What to look for in db.currentOp()
- A few notes
- What we'll need
- Look for query patterns that don’t scale
- Measuring indexes that aren’t helping with $indexStats
- Watch for document growth and update pain
- When to watch for sharding symptoms
- Recap:
- Diagnosing poor stage ordering
- Improving pipeline efficiency through reordering
- Profiling aggregations in Atlas
- Diagnosing $lookup inefficiencies
- Sorting before filtering: A hidden cost
- Aggregations with MongoDB Atlas Search
- Aggregation diagnostics summary
MongoDB is built to be fast. The real win comes from knowing how to keep it that way, even as your app grows and your data gets more complex.
This tutorial is for those developers and engineers who want to understand how MongoDB behaves when queries or writes slow down. We’re not here to hand you indexing tips or premature advice about schema design. We’re here to walk through the actual tools MongoDB gives you to observe, measure, and reason about query performance, before you start optimizing anything.
We’ll cover the full stack of diagnostics available to you:
- The Query Profiler (via the Atlas UI and `system.profile`)
- The `explain()` method to understand how queries are planned and executed
- The `db.currentOp()` command for watching operations in real time
- Deep dives into detecting scaling issues and aggregation pipeline inefficiencies
In order to follow along, we'll be interacting with MongoDB Atlas and MongoDB Shell (`mongosh`). I recommend you have both set up before starting. Some tools shown in this tutorial are restricted to M10+ clusters, but I'll highlight where that's relevant, and there should be something to take away for everyone, whether you're on a free-forever M0 cluster and just getting started, or a seasoned MongoDB veteran running enterprise workloads. We'll run the commands, see what the output looks like, and learn how to read it.
1. MongoDB’s Query Profiler
We’ve all done it. We run a query. It hangs. We check the app logs. We stare. Maybe we slap an index on something and hope it helps. Or maybe, just maybe, we start guessing at what our database is actually doing behind the scenes.
But MongoDB doesn’t want us to guess. It wants to show us. And it gives us the tools to do just that:
- The Atlas Query Profiler: A beautiful, scatterplot-filled experience in the MongoDB Atlas UI
- The Database Profiler: A raw, firehose-level stream of operations you can tap into via `mongosh`
We’re gonna use both.
What we’ll need
Before we get started, make sure you have:
- A MongoDB Atlas cluster (M10 or higher).
- A database with the sample data loaded (Atlas makes this easy—click the “Load Sample Dataset” button).
- Access to mongosh (or your preferred shell).
- Permissions to set profiler levels (`atlasAdmin` or `readWriteAnyDatabase` is fine).
What is MongoDB’s Profiler?
MongoDB can keep track of which operations are running slowly and log them for us. We just need to tell it what we care about.
The MongoDB Atlas Query Profiler helps diagnose and monitor slow-running queries using log data from your cluster. It aggregates, filters, and visualizes performance statistics via the Query Insights interface.
The Database Profiler is more granular, capturing all operations that match a profiling threshold and storing them in the `system.profile` collection.
> Note: Changing the profiling level via `db.setProfilingLevel()` does not affect what is shown in the MongoDB Atlas UI. Atlas uses its own internal logic to determine what is "slow" unless configured otherwise.
MongoDB Atlas Query Profiler in the Atlas UI
Let’s check the Atlas Query Profiler now. It is enabled by default.
- Go to your cluster.
- Click Monitoring.
- Head to Query Insights -> Query Profiler.
> Note: The Atlas Query Profiler is available on M10+ clusters.
Now, ours will be pretty empty, so let's run a few expensive queries and give ourselves something to look at. Let's head over to our terminal and open `mongosh`. First, we'll select our `sample_mflix` database with the command `use sample_mflix`. We're going to run a few intentionally slow queries to give the profiler something to log:
```javascript
db.movies.find({ title: /The/ }).hint({ $natural: 1 });

db.movies.find(
  { plot: /love/i },
  { title: 1, runtime: 1, complexCalc: { $multiply: ["$runtime", 42] } }
).hint({ $natural: 1 });

db.movies.find({
  $or: [{ directors: { $size: 2 } }, { year: { $lt: 1980 } }],
  $where: function () {
    return this.cast && this.cast.includes("Robert De Niro") && this.runtime > 90;
  }
}).hint({ $natural: 1 });
```
These queries are intentionally inefficient. They ignore indexes and force full collection scans. Perfect for generating profiler activity.
If we go back to our UI, we should see some values populating our scatterplot. Each dot represents one or more slow operations grouped by similar characteristics. Click one and we’ll see:
- The operation type (`find`, `update`, etc.).
- The execution time.
- Whether an index was used.
- The number of docs scanned.
And if we click “View more details,” we can have a deep dive into more metrics to give us a clear picture of how the query executed and why it might be slow.
Here’s what a slow query might look like:
If `docsExamined` is huge and `keysExamined` is zero (or a low number), you're scanning the whole collection. That's your signal to add an index, or rewrite the query.
You’ll also see the full raw query object, which includes the query shape, execution plan summary, and any JavaScript used.
Start with these three:

- `millis`: How long did it take?
- `docsExamined`: How much work was done?
- `keysExamined`: Did the query use an index?
```javascript
"planSummary": "COLLSCAN",
"docsExamined": 21349,
"nreturned": 8,
"keysExamined": 0,
"cpuNanos": 236256326,
"durationMillis": 656,
"queryFramework": "classic",
"command": {
  "find": "movies",
  "filter": {
    "$or": [
      { "directors": { "$size": 2 } },
      { "year": { "$lt": 1980 } }
    ],
    "$where": {
      "$code": "function () { return this.cast && this.cast.includes(\"Robert De Niro\") && this.runtime > 90; }"
    }
  },
  "hint": { "$natural": 1 }
}
```
This is textbook collection scan territory:
- COLLSCAN means the query didn't use an index
- 21K docs scanned, only eight returned
- 656 ms runtime, mostly wasted doing extra work
You should consider taking action when the profiler reveals signs of inefficiency. For example, if the `Examined:Returned` ratio is high, it often means MongoDB is doing a lot of work to return very little, suggesting it's time to add an index or restructure the query. If the query didn't use an index at all and it's one that runs frequently, that's a strong signal that an index should be created. And if the execution time (shown as `millis`) is consistently high, it may be worth optimizing the query logic, batching results, or even rethinking how the data is being accessed altogether.
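If you export profiler entries (e.g., with `mongosh --eval` or from the Atlas logs), the same checks can be scripted. A minimal sketch: the field names (`planSummary`, `docsExamined`, `nreturned`) match the log output shown above, but the 100:1 ratio threshold is an assumed heuristic, not a MongoDB default:

```javascript
// Flag a profiler entry as inefficient when it collection-scans, or when the
// examined:returned ratio is poor. The maxRatio of 100 is an assumption.
function isInefficient(entry, maxRatio = 100) {
  const returned = Math.max(entry.nreturned || 0, 1);
  const ratio = (entry.docsExamined || 0) / returned;
  return entry.planSummary === "COLLSCAN" || ratio > maxRatio;
}

// The slow query above: a COLLSCAN that examined 21,349 docs to return 8.
isInefficient({ planSummary: "COLLSCAN", docsExamined: 21349, nreturned: 8 }); // true
```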
Did you know? The Performance Advisor (next tab over from the Query Profiler in Atlas) can suggest indexes based on these logs. If you're seeing a lot of full scans, this is a great next stop.
A few important considerations
- Security: The Query Profiler may surface query parameters, which can include PII or sensitive business logic. Ensure this visibility aligns with your security policies.
- Performance: Changing the profiling level on a production cluster (`db.setProfilingLevel`) can affect performance and logging volume. Avoid overly broad settings in high-throughput environments.
- Limitations:
- Up to 100,000 operations shown per selected time range
- Up to five-minute delay between log ingestion and display
- Heavy spikes in log volume may temporarily halt log collection
- Logs are sampled and binned, so zooming is required for granularity
You’ve now got full visibility into what MongoDB is doing under the hood when it runs your queries.
MongoDB Database Profiler
Atlas gives you a macro view: what's noisy overall. The shell gives you a micro view: what's happening right now. The profiler collects statistics for operations like `find()`, `update()`, `delete()`, and `aggregate()`:

- Time taken to execute.
- Whether an index was used.
- How many documents were scanned vs. returned.
- Command size and response length.
Let's launch `mongosh` and connect to our cluster. Now, let's tell MongoDB to start tracking slow queries. First, we'll select the `sample_airbnb` database (`use sample_airbnb`):

```javascript
db.setProfilingLevel(2);
```
This will log everything. It is excessive, but for our examples, we want to see exactly what’s going on. Make sure not to leave this on for your production database, though. You will feel it!
It's important to select the database we wish to profile before we turn on the profiler, as profiling is a per-database operation.
We can adjust the threshold based on our use case:

- `0`: Off
- `1`: Log slow operations only (`db.setProfilingLevel(1, 100)` will log anything slower than 100 ms)
- `2`: Log all operations
Check that it worked:
```javascript
db.getProfilingStatus();
```
We should see something like this:
```javascript
{ "was": 2, "slowms": 100, "sampleRate": 1, "filter": { } }
```
Let's make something slow on purpose
We're using the `sample_airbnb` database now (`use sample_airbnb`). This one has a `listingsAndReviews` collection, great for some slow fun.
```javascript
db.listingsAndReviews.find({
  $or: [
    { "address.market": "Berlin" },
    { "review_scores.review_scores_cleanliness": { $lt: 5 } }
  ],
  $where: function () {
    return this.amenities && this.amenities.length > 15;
  }
}).sort({ description: 1 }).hint({ $natural: 1 });
```
The `.hint({ $natural: 1 })` tells MongoDB to read documents in the order they're stored on disk, bypassing all indexes. It's a guaranteed collection scan, perfect for generating profiler logs, terrible for performance.
Let’s do another one with writes:
```javascript
db.listingsAndReviews.updateMany(
  {
    $or: [
      { "address.market": "Berlin" },
      { "review_scores.review_scores_cleanliness": { $lt: 5 } }
    ],
    $where: function () {
      return this.amenities && this.amenities.length > 15;
    }
  },
  { $set: { needs_review: true } },
  { hint: { $natural: 1 } }
);
```
Now, we can go check what just got logged:
```javascript
db.system.profile.find().sort({ ts: -1 }).limit(2).pretty();
```
And we'll see something like this (abbreviated) output:

```javascript
{
  op: 'query',
  ns: 'sample_airbnb.listingsAndReviews',
  command: {
    find: 'listingsAndReviews',
    filter: {
      '$or': [
        { 'address.market': 'Berlin' },
        { 'review_scores.review_scores_cleanliness': { '$lt': 5 } }
      ],
      '$where': Code('function () {\n  return this.amenities && this.amenities.length > 15;\n}')
    },
    sort: { description: 1 },
    hint: { '$natural': 1 },
    '$db': 'sample_airbnb'
  },
  keysExamined: 0,
  docsExamined: 5555,
  hasSortStage: true,
  nBatches: 1,
  cursorExhausted: true,
  numYield: 1,
  nreturned: 12,
  responseLength: 73402,
  millis: 46,
  planSummary: 'COLLSCAN',
  planningTimeMicros: 22775,
  execStats: {
    isCached: false,
    stage: 'SORT',
    nReturned: 12,
    executionTimeMillisEstimate: 23,
    works: 5569,
    advanced: 12,
    needTime: 5556,
    needYield: 0,
    saveState: 1,
    restoreState: 1,
    isEOF: 1,
    sortPattern: { description: 1 },
    type: 'simple',
    totalDataSizeSorted: 83047,
    inputStage: {
      stage: 'COLLSCAN',
      filter: {
        '$and': [
          {
            '$or': [
              { 'address.market': { '$eq': 'Berlin' } },
              { 'review_scores.review_scores_cleanliness': { '$lt': 5 } }
            ]
          },
          {
            '$where': Code('function () {\n  return this.amenities && this.amenities.length > 15;\n}')
          }
        ]
      },
      nReturned: 12,
      executionTimeMillisEstimate: 23,
      docsExamined: 5555
    }
  },
  ...
}
```
We’re looking for these fields:
Field | What it tells you |
---|---|
millis |
The time the operation took to execute, in milliseconds. This is our go-to number for, “Was it fast or slow?” Keep in mind it includes everything: query parsing, planning, execution, and response serialization. |
docsExamined |
The number of documents MongoDB had to read from disk or memory to complete the operation. This is key for spotting collection scans or poor index use. The lower this is, the better it should match or be close to the number of results returned. |
keysExamined |
The number of index keys MongoDB had to scan. This helps us understand how selective your query is. A high number here (with a low docsExamined) often means the query used an index efficiently. A value of 0 means no index was used—i.e., we’re likely doing a full collection scan. |
That's step one. You've got the x-ray. Next, let's grab the scalpel: We'll dissect how MongoDB is actually executing these queries using `explain()`.
2. Understanding execution plans with explain()
MongoDB's profiler told us what ran slow. Now, we want to know why. That's where `explain()` comes in.

When you run a query with `explain()`, MongoDB walks you through how it planned and executed it. You get details about indexes, scan strategies, how many documents were looked at, and how much time was spent. It helps you detect when you're scanning too many documents, spot missing or inefficient indexes, and validate whether your query changes actually help.
What we'll need
Before we get started, make sure you have:
- `mongosh` connected to your Atlas cluster.
- A sample dataset loaded (especially the `sample_mflix` database).
- Permission to run the underlying command to use `explain()` (e.g., `find.explain()` requires read access only).
What is `explain()`?

`explain()` gives you insight into how MongoDB processes a query: the winning plan it selected, whether indexes were used, and how many documents were scanned or returned. You can use `explain()` on:

- `find()`
- `update()`
- `delete()`
- `aggregate()`
The method comes in two syntactic forms:
```javascript
// Most common usage
db.collection.find({ ... }).explain("executionStats");

// Alternative chaining syntax
db.collection.explain("executionStats").find({ ... });
```
Both syntaxes are valid, but the first is more common in shell and scripts. The second (`db.collection.explain().find(...)`) is helpful when chaining query modifiers like:

- `.sort()`
- `.limit()`
- `.projection()`
- `.hint()`
For example, let's switch back to our `sample_mflix` database:

```javascript
db.movies.explain("executionStats")
  .find({ year: { $gt: 2000 } })
  .sort({ year: -1 })
  .limit(5);
```
This version lets you see how all those combined modifiers affect the query plan, without needing to nest multiple `.explain()` calls or wrap in extra parentheses.
This returns a ton of useful info. Here’s what to look for:
| Field | What it tells you |
|---|---|
| `executionTimeMillis` | How long the query actually took to run |
| `totalDocsExamined` | How many documents were scanned |
| `totalKeysExamined` | How many index entries were scanned |
| `nReturned` | How many documents matched the query |
| `stage` | What kind of scan was performed (`COLLSCAN`, `IXSCAN`, etc.) |
Here’s an abbreviated version of the output:
```javascript
{
  queryPlanner: {
    winningPlan: {
      stage: "SORT",
      sortPattern: { year: -1 },
      limitAmount: 5,
      inputStage: {
        stage: "COLLSCAN",
        filter: { year: { $gt: 2000 } }
      }
    }
  },
  executionStats: {
    nReturned: 5,
    executionTimeMillis: 96,
    totalDocsExamined: 21349,
    totalKeysExamined: 0,
    executionStages: {
      stage: "SORT",
      inputStage: { stage: "COLLSCAN", docsExamined: 21349 }
    }
  }
}
```
How to read this:
- `stage: "COLLSCAN"`: MongoDB scanned the entire collection (no index used).
- `docsExamined: 21349`: It had to read 21K documents to return just five.
- `totalKeysExamined: 0`: Confirms no index was touched.
- `stage: "SORT"`: MongoDB sorted results in memory.
- `executionTimeMillis: 96`: The whole thing took just under 100 ms.
Even though it ran fast on this small dataset, this plan doesn't scale. A filter + sort like this should ideally be served by an index on the sorted field, e.g., `{ year: -1 }`.
Tip: When using `.explain().find()`, the result is a document (not a cursor), so you get back the full plan immediately without needing `.next()` or `.toArray()`.
Verbosity modes
The `explain()` method accepts a verbosity mode that controls how much detail you get:
| Mode | What it shows |
|---|---|
| `"queryPlanner"` (default) | The query shape and chosen plan—no execution |
| `"executionStats"` | Runs the query and shows real metrics for the winning plan |
| `"allPlansExecution"` | Shows execution stats for all candidate plans—great for debugging plan selection |
Important behavior notes:
- `explain()` ignores the plan cache—this is intentional so you can inspect the plan fresh.
- It also prevents creating a new plan cache entry.
- `explain()` is supported in all MongoDB Atlas, Enterprise, and Community environments.
Now, let’s add an index:
```javascript
db.movies.createIndex({ year: -1 });
```
And once this is created, we can run our query again:
```javascript
db.movies.explain("executionStats")
  .find({ year: { $gt: 2000 } })
  .sort({ year: -1 })
  .limit(5);
```
In our new (abbreviated) output:
```javascript
{
  queryPlanner: {
    winningPlan: {
      stage: "LIMIT",
      inputStage: {
        stage: "FETCH",
        inputStage: {
          stage: "IXSCAN",
          indexName: "year_-1",
          indexBounds: { year: ["[inf.0, 2000)"] }
        }
      }
    }
  },
  executionStats: {
    nReturned: 5,
    executionTimeMillis: 1,
    totalKeysExamined: 5,
    totalDocsExamined: 5
  }
}
```
What we're seeing in the new `explain("executionStats")` output is a radically more efficient query plan, thanks to the `year_-1` index we added. Instead of scanning the entire collection and then sorting the results in memory like before, MongoDB now uses an index scan (`IXSCAN`) to jump directly to the relevant documents, already ordered by `year` in descending order. The `FETCH` stage pulls only the actual documents for the top five matches, and the `LIMIT` stage stops execution as soon as it has what it needs.
As a result, MongoDB examines just five index keys and five documents, returning the desired results in a single millisecond. No in-memory sorting is needed, and there's no wasted effort. This is the kind of plan we want to see: tight, precise, and powered by the right index.
Let's use `explain()` on a write
Unlike `find()` or `aggregate()`, you can't call `.explain()` directly on a write operation like `updateOne()`. Instead, you'll need to use the lower-level `runCommand()` method to get execution stats for a write.
Here’s how to do it:
```javascript
db.runCommand({
  explain: {
    update: "movies",
    updates: [
      {
        q: { title: "The Matrix" },
        u: { $set: { status: "classic" } },
        multi: false
      }
    ]
  },
  verbosity: "executionStats"
});
```
This runs an `updateOne()` under the hood, but wraps it with `explain` so you can analyze its performance—just like with reads.
And here’s what the (shortened) output looks like:
```javascript
{
  queryPlanner: {
    winningPlan: {
      stage: "UPDATE",
      inputStage: {
        stage: "COLLSCAN",
        filter: { title: "The Matrix" }
      }
    }
  },
  executionStats: {
    nMatched: 1,
    nWouldModify: 1,
    totalDocsExamined: 7675,
    totalKeysExamined: 0,
    executionTimeMillis: 7
  }
}
```
This tells us that MongoDB had to scan 7,675 documents just to find one to update. That's a full collection scan—confirmed by the `COLLSCAN` stage and `totalKeysExamined: 0`. It matched one document (`nMatched: 1`) and would've updated it (`nWouldModify: 1`), but the operation wasn't actually run since this was an `explain`, not a real write.
If this is a write pattern you rely on often (like updating by `title`, `customerId`, or some other frequent lookup), you should absolutely consider creating an index to reduce the scan cost.
How to read execution stages
MongoDB breaks your query into stages—like steps on an assembly line. Each stage does a specific job, and together, they describe how your query was executed.
Here are some common stage types you might see in an `explain()` plan:
| Stage | What it does |
|---|---|
| `COLLSCAN` | Scans all documents—slow, no index |
| `IXSCAN` | Scans an index—fast |
| `FETCH` | Fetches full documents after matching keys |
| `SORT` | Sorts results in-memory (unless covered by index) |
| `PROJECTION` | Trims returned fields |
Not every query will hit all of these, and some may include others (like `LIMIT`, `SKIP`, or `SHARD_MERGE`). But if you understand the building blocks, you can read almost any query plan like a flowchart.

You'll often see nested stages: `IXSCAN` → `FETCH` means MongoDB scanned the index, then fetched full documents.
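Walking those nested stages can be done mechanically. A small sketch, assuming the single-chain `inputStage` shape shown in the explain outputs above (it doesn't handle multi-child stages like `SHARD_MERGE`):

```javascript
// Flatten a winningPlan's nested inputStage chain into an ordered list,
// innermost (first-executed) stage first, e.g. ["COLLSCAN", "SORT"].
function stageChain(plan) {
  const stages = [];
  for (let s = plan; s; s = s.inputStage) stages.unshift(s.stage);
  return stages;
}

// The pre-index plan from earlier: a SORT fed by a COLLSCAN.
stageChain({ stage: "SORT", inputStage: { stage: "COLLSCAN" } });
// → ["COLLSCAN", "SORT"]
```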
Quick check:
- `stage: "COLLSCAN"`: No index used
- `docsExamined ≫ nReturned`: Inefficient filtering
- `executionTimeMillis` high? Consider indexing or reworking the query
How about when we need to compare index usage?
If you're curious how MongoDB chooses one query plan over others, try this:
Before we run the query, let's make sure we have indexes on both `year` and `rated`:

```javascript
db.movies.createIndex({ year: 1 });
db.movies.createIndex({ rated: 1 });
```
Now, we can run the query with full plan evaluation:
```javascript
db.movies.find({
  year: { $gt: 1990 },
  rated: { $in: ["PG", "PG-13"] }
}).explain("allPlansExecution");
```
This query references two indexed fields: `year` and `rated`. Since both are eligible, MongoDB considers multiple plans and tests each one before choosing a winner.
Here’s a simplified excerpt of what you'll see:
```javascript
{
  queryPlanner: {
    winningPlan: {
      stage: "FETCH",
      inputStage: { stage: "IXSCAN", indexName: "year_1" }
    },
    rejectedPlans: [
      { inputStage: { stage: "IXSCAN", indexName: "rated_1" } },
      { inputStage: { stage: "IXSCAN", indexName: "year_-1" } }
    ]
  },
  executionStats: {
    totalKeysExamined: 15902,
    totalDocsExamined: 15902,
    nReturned: 3091
  }
}
```
MongoDB tested three potential strategies:
- Use the `year_1` index and filter by `rated`.
- Use the `rated_1` index and filter by `year`.
- Try the `year_-1` index, which didn't match the filter direction.
Even though all three plans returned results, the one using `year_1` was the most efficient, so it won. You can see the others listed in `rejectedPlans` and `allPlansExecution`, along with stats like documents scanned, returned, and estimated cost.
This is incredibly useful when you're wondering why your index wasn’t used. Often, it was just not chosen. Now, you can prove it.
Now that you know how to dissect a query, let's move from diagnosis to live observation. In Section 3, we'll track what's running right now using `db.currentOp()`, and how to kill it if needed.
3. Watching queries in real time with db.currentOp()
So far, we've looked at what was slow (`system.profile`) and why it was slow (`explain()`). But what if something is running slowly right now?

MongoDB gives us a built-in way to monitor live operations—both queries and writes—using `db.currentOp()`. This is like opening a terminal into the brain of your database.
You can see what's currently running, how long it's been running, what operation it’s doing, and even kill it if needed.
What we'll need
Before we get started, make sure you have:
- `mongosh` connected to your cluster.
- `read` access to admin-level commands.
- `killOp` permission if you want to terminate operations (e.g., `atlasAdmin`, `dbAdminAnyDatabase`, or custom roles).
What is `db.currentOp()`?

`db.currentOp()` returns a list of all operations currently in progress on the node (`mongod` process) you're connected to—including queries, updates, aggregations, and internal commands.
You can filter it to just your operations or long-running ones.
```javascript
db.currentOp({ "secs_running": { $gt: 2 } });
```
This will return all operations running longer than two seconds.
Here’s what a live operation might look like:
```javascript
{
  "opid": "12345",
  "secs_running": 5,
  "active": true,
  "ns": "sample_mflix.movies",
  "command": {
    "find": "movies",
    "filter": { "title": { "$regex": "The" } }
  },
  "planSummary": "COLLSCAN",
  "locks": { ... },
  "client": "203.0.113.25:43210"
}
```
`opid` is the operation ID you can use to track or kill the operation.
If we run this on our collection now, we’ll likely see a bunch of admin operations that run in the background on our MongoDB database.
Want to find all active queries on any namespace?
```javascript
db.currentOp({ "active": true, "op": "query" });
```
Killing a long-running query
If you need to stop an operation—like an accidental full collection scan—use `db.killOp()`:

```javascript
db.killOp("12345");
```
Just replace `12345` with the `opid` of the operation you want to stop. You need the `killOp` privilege on the database to run this command.
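You can also script this triage. A minimal sketch that filters the `inprog` array returned by `db.currentOp()` and collects the `opid`s you would then pass to `db.killOp()` in `mongosh`; the 30-second threshold is an arbitrary assumption:

```javascript
// Collect opids of long-running, active queries from a db.currentOp() result.
// In mongosh you would then call db.killOp(opid) for each returned id.
function opsToKill(currentOpResult, maxSecs = 30) {
  return currentOpResult.inprog
    .filter((op) => op.active && op.op === "query" && op.secs_running > maxSecs)
    .map((op) => op.opid);
}

opsToKill({
  inprog: [
    { opid: "12345", active: true, op: "query", secs_running: 95 },
    { opid: "12346", active: true, op: "update", secs_running: 40 },
    { opid: "12347", active: false, op: "query", secs_running: 500 },
  ],
}); // → ["12345"]
```

Be deliberate here: killing writes mid-flight can leave application-level work half done, so this kind of helper is safest when restricted to read operations, as above.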
What to look for in db.currentOp()
| Field | What it tells you |
|---|---|
| `opid` | Unique operation ID |
| `ns` | Namespace (e.g., `sample_mflix.movies`) |
| `secs_running` | How long the operation has been running |
| `command` | The exact query or update being run |
| `planSummary` | The planner's quick summary (`COLLSCAN`, `IXSCAN`, etc.) |
| `active` | Whether the operation is still in progress |
| `waitingForLock` | If the op is blocked on a lock |
A few notes
- This only shows operations on the node you're connected to. For sharded clusters or replica sets, you'll need to query each `mongod`.
- Use `maxTimeMS()` to cap long-running queries so they don't clog up your cluster.
- Combine filters to reduce noise:

```javascript
db.currentOp({ "active": true, "secs_running": { $gt: 2 }, "ns": /movies/ });
```
Now that we can observe slow queries as they happen, let’s look at why they happen as data scales. In Section 4, we’ll explore inefficient patterns, unused indexes, and how things change when collections grow.
4. Identifying inefficient query and write patterns in large data sets
Some queries feel fast… until they don’t. Performance issues in MongoDB often creep in as data grows, silently at first, until something starts to lag.
This section is our early-warning radar: We’re not fixing slow queries yet. We’re spotting them before they hurt us. We’ll look at key patterns that don’t scale, and how to recognize trouble using both the shell and the Atlas UI.
What we'll need
- Sample dataset loaded (e.g., `sample_mflix` or `sample_supplies`)
- `mongosh` connected to your cluster
connected to your cluster- Access to a MongoDB Atlas M10+ cluster
- Atlas role: Project Data Access Read Only or higher
Look for query patterns that don’t scale
MongoDB gives us visibility into how queries behave—use it early and often.
Use `.explain()` to diagnose inefficiencies even when queries "feel fast." A fast query can just be lucky rather than well structured. And luck runs out. Be intentional and proactive, and you can spot these scaling problems before they get a chance to slow you down.
| Metric | What to watch |
|---|---|
| `docsExamined ≫ nReturned` | Inefficient filter |
| `stage: "COLLSCAN"` | No index used |
| `executionTimeMillis` grows with `collection.count()` | Performance degrades with scale |
| `keysExamined` is high with few matches | Index exists but is not selective enough |
Measuring indexes that aren’t helping with $indexStats
MongoDB can show you how often each index on a collection is being used. This is helpful when deciding which indexes to keep, which are going unused, and whether your query patterns align with what you’ve optimized for.
Let's try it on the `movies` collection:

```javascript
db.movies.aggregate([{ $indexStats: {} }]);
```
This returns a document per index, with stats about how frequently each one has been accessed since the server started tracking.
Here’s a sample of the output:
```javascript
{ name: "year_1", accesses: { ops: 0, since: ISODate("2025-04-11T12:31:22.682Z") } }
```
The `ops` field tells us how many operations have used this index. In this case, we can see that several indexes, including `year_1`, `year_-1`, and `rated_1`, have `ops: 0`. This means no queries have used them since tracking began.
On the other hand:
```javascript
{ name: "_id_", accesses: { ops: 1, since: ISODate("2025-04-09T14:50:50.878Z") } }
```
The `_id_` index has been used once. That's expected; MongoDB always creates a unique index on `_id` by default.
If we’re seeing a lot of unused indexes, it may be time to clean them up. Indexes that aren't used take up memory and can slow down writes. But don’t drop them just yet. Always double-check query patterns and workloads across environments before making changes.
This also works great after running queries in development. We can immediately see which ones actually triggered index access, and which ones just triggered full scans.
Tip: If all your indexes say `ops: 0`, and you know your queries should be using them, it might just be that the server was restarted recently. `accesses.since` will show you when tracking began.
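A small sketch of how you might post-process the `$indexStats` documents (e.g., collected with `.toArray()` in `mongosh`) into a list of drop candidates. Skipping `_id_` is deliberate, since that index is mandatory and can't be dropped:

```javascript
// Return names of indexes with zero recorded accesses, excluding the
// mandatory _id_ index. Input shape mirrors the $indexStats output above.
function unusedIndexes(indexStats) {
  return indexStats
    .filter((s) => s.name !== "_id_" && s.accesses.ops === 0)
    .map((s) => s.name);
}

unusedIndexes([
  { name: "_id_", accesses: { ops: 1 } },
  { name: "year_1", accesses: { ops: 0 } },
  { name: "rated_1", accesses: { ops: 42 } },
]); // → ["year_1"]
```

As the text above stresses, treat the output as candidates to investigate, not an automatic drop list; usage can differ across environments and restart windows.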
Some things to check for:
| Symptom | What it suggests |
|---|---|
| `accesses.ops = 0` | Index isn't being used at all |
| Similar compound indexes with different key orders | Redundant indexes |
| Many indexes with little usage | Possible memory bloat and write slowdown |
MongoDB Atlas tracks index usage in the Indexes tab for each collection.
Watch for document growth and update pain
MongoDB documents can be large, and that’s often a good thing. But…
- Updates to large or nested documents may rewrite the entire document.
- Frequently growing arrays cause fragmentation.
- Large docs mean fewer fit in memory.
Check this in the shell:
```javascript
db.movies.stats().avgObjSize;
```

And we can have a look at the output:

```
1598
```
The average document size in the `movies` collection is 1,598 bytes. This number helps estimate:
- Memory usage when scanning N documents (e.g., in a COLLSCAN).
- Working set size and whether it fits in RAM.
- The I/O cost of operations that don’t use indexes.
If `totalDocsExamined` is 7,675 and the average doc size is 1,598 bytes:

```
7,675 docs × 1,598 bytes ≈ 12.26 MB
```
So, our full collection scan is reading about 12 MB from disk or memory for that single operation.
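That back-of-the-envelope math is easy to reuse. A sketch, using the decimal-megabyte convention from the estimate above:

```javascript
// Estimate the bytes a non-indexed operation touches: documents examined
// (from explain/profiler output) times avgObjSize (from db.<coll>.stats()).
function scanBytes(docsExamined, avgObjSize) {
  return docsExamined * avgObjSize;
}

const bytes = scanBytes(7675, 1598); // 12264650
const mb = bytes / 1_000_000;        // ≈ 12.26 MB
```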
If our working set exceeds RAM, this can lead to page faults and slow performance, which will definitely show up in our Atlas Profiler.
In the Atlas UI, we can see how the memory of our MongoDB instance is affected as documents bloat.
- In Atlas, go to Clusters.
- Click on the name of your cluster.
- From the cluster dashboard, click the "Metrics" tab at the top.
- At the bottom of the screen, you can toggle what you want to inspect.
We’ll want to toggle:
- Cache Usage (shows cache dirty and used bytes—how much memory MongoDB is using and how much of it hasn’t been flushed to disk).
- Cache Activity (tracks how much data is being read into or written from cache, which reflects memory churn).
- Cache Ratio (shows fill ratio and dirty fill ratio, which help you understand how full the memory is and how many changes are waiting to be written).
- Memory (gives you memory_resident and memory_virtual to gauge working set pressure).
- System Memory (available and used—so you can correlate MongoDB’s memory usage with overall RAM limits).
These charts let you visualize how much of your instance’s memory is in active use, and how much of that is getting clogged by updates or large documents. If cache dirty steadily rises or you see frequent spikes in eviction activity, it’s a sign that your working set is too big for your RAM. Document growth, overly large updates, and inefficient schemas all make this worse.
This is where performance starts to degrade: MongoDB has to flush dirty pages more frequently or pull data from disk more often, which slows queries and increases I/O. If we see high Cache Usage, low System Memory Available, and rising Disk IOPS or Latency, it’s a strong signal to review our schema, indexing strategy, or scale up the instance.
This is our live view into how MongoDB balances memory and disk as our documents grow.
All of these are signs that MongoDB is under memory pressure and spending more time loading documents from disk rather than serving them directly from memory.
> Tip: Combine this with `$indexStats` and `avgObjSize` to understand why: Is it bloated documents? Unused indexes? Or just too much data for the current instance size?
When to watch for sharding symptoms
We might not need sharding yet, but MongoDB will start telling us when vertical scaling is no longer enough.
Check for:
| Symptom | What it suggests |
|---|---|
| Queries slow down as data grows | Indexes can't keep up—look at `executionTimeMillis` trends |
| Working set no longer fits in memory | High disk I/O, cache evictions in Atlas |
| Insert-heavy workloads bottlenecking writes | CPU/memory pressure on primaries |
| Collection is approaching storage limits | Time to scale out |
If we see regular performance degradation due to growth, especially on a single hot collection, it’s time to evaluate sharding as a next step.
Recap:
What to monitor | How |
---|---|
Query inefficiencies | .explain("executionStats") + Atlas Query Profiler |
Index effectiveness | $indexStats + Atlas Indexes tab |
Document bloat | avgObjSize , storageSize , update patterns |
Memory pressure | Cache + disk I/O metrics |
Scaling pressure | Query performance vs. dataset growth |
You're now equipped to spot performance concerns before they explode, whether it’s slow queries hiding in plain sight or documents quietly growing out of control.
In Section 5, we’ll turn our attention to aggregation pipelines and how to diagnose performance issues stage by stage using explain()
.
5. Analyzing aggregation query performance
Aggregation pipelines are often where performance issues hide in MongoDB. On the surface, everything seems clean: an elegant series of stages reshaping documents, filtering out noise, grouping and sorting. But underneath, inefficiencies can quietly stack up. Poor pipeline ordering, unindexed lookups, broad scans, or early memory-intensive operations like $group
or $sort
may not be obvious until they degrade performance significantly at scale.
This section is about teaching you how to read a pipeline, profile its performance, and understand where execution cost is being incurred, before it becomes a real problem. We’ll use .explain("executionStats")
to trace each stage and observe where time and resources are spent. As always, the goal here is not to solve the performance problem, but to identify the parts of the pipeline that need attention.
Diagnosing poor stage ordering
To illustrate how pipeline structure affects performance, let’s start with a deliberately inefficient example. In the sample_supplies
database, let's run the following:
```javascript
db.sales.aggregate([
  { $group: { _id: "$item", totalSales: { $sum: "$price" } } },
  { $match: { totalSales: { $gt: 100 } } }
]).explain("executionStats");
```
This pipeline groups all documents in the collection before applying a filter on the computed totalSales
value. From a functional standpoint, it’s valid. From a performance perspective, it’s costly. We’re forcing MongoDB to load every document into memory to compute the groupings, regardless of whether that document will later be filtered out.
The explain
output will confirm this. We should expect to see a COLLSCAN
stage in the cursor source, along with high values for totalDocsExamined
and executionTimeMillis
. For example:
```javascript
{
  "stages": [
    {
      "$cursor": {
        "executionStats": {
          "stage": "COLLSCAN",
          "nReturned": 220,
          "totalDocsExamined": 250000,
          "executionTimeMillisEstimate": 92
        }
      }
    },
    { "$group": { "executionTimeMillisEstimate": 31 } },
    { "$match": { "executionTimeMillisEstimate": 1 } }
  ]
}
```
This output shows that most of the time is spent scanning and grouping all 250,000 documents, even though only a small number are returned. There’s no filtering until the final stage, meaning the $group
stage is carrying a much heavier load than necessary.
Improving pipeline efficiency through reordering
Now, let’s reverse the order of the pipeline so that the filter occurs before the grouping:
```javascript
db.sales.aggregate([
  { $match: { price: { $gt: 1 } } },
  { $group: { _id: "$item", totalSales: { $sum: "$price" } } }
]).explain("executionStats");
```
In this case, the explain output should show dramatically fewer documents examined, a lower execution time, and, if we’ve indexed the price
field, an IXSCAN
instead of a COLLSCAN
. We’ve offloaded unnecessary work from $group
by letting $match
reduce the input set early.
This is one of the most important structural concepts when reading or profiling aggregations: place filtering stages as early as possible. $match
, $project
, and $limit
reduce the document set and should appear toward the top of your pipeline. Stages like $group
, $sort
, and $lookup
that grow memory usage or execution time should appear later, after the document set has already been narrowed.
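Because an aggregation pipeline is just an array of stage objects, this ordering rule can even be linted mechanically. The sketch below is a simplified illustration, not a substitute for MongoDB's own optimizer (which already reorders some stages automatically):

```javascript
// Warn when a reducing stage appears after a memory-heavy stage.
const REDUCING = new Set(["$match", "$limit", "$project"]);
const HEAVY = new Set(["$group", "$sort", "$lookup"]);

function lintPipeline(pipeline) {
  const warnings = [];
  let heavySeen = null;
  for (const stage of pipeline) {
    const name = Object.keys(stage)[0];
    if (HEAVY.has(name)) heavySeen = name;
    else if (REDUCING.has(name) && heavySeen) {
      warnings.push(`${name} appears after ${heavySeen}; check whether it can move earlier`);
    }
  }
  return warnings;
}

// The group-first pipeline from earlier trips the lint:
const warnings = lintPipeline([
  { $group: { _id: "$item", totalSales: { $sum: "$price" } } },
  { $match: { totalSales: { $gt: 100 } } },
]);
console.log(warnings.length); // 1
```

Note that a `$match` on a field computed by `$group` (like `totalSales` above) genuinely has to come after it; treat the lint as a prompt to check, not a hard rule.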
Profiling aggregations in Atlas
You can observe the impact of pipeline reordering visually in the Atlas UI. Go to your cluster’s Query Profiler, then filter by operation type to show only aggregate
queries. Locate your slow group-first query in the scatterplot and click to expand the timeline. You’ll see each stage listed with timing information.
Look specifically at the execution time for the $group
stage. It will likely dominate the operation. Then, compare it to the reordered pipeline where $match
runs first—stage timings will show a reduced burden on $group
and often a significantly lower execution time overall.
Diagnosing $lookup inefficiencies
Now, consider a common source of hidden aggregation slowness: unindexed $lookup
joins.
Let’s look at a real example using the sample_analytics
dataset. We’ll join the transactions
collection with customers
, matching the customer_id
field from transactions to the email
field in customers. This is deliberately inefficient: no index exists on email
in the foreign collection, and neither field is guaranteed to be unique.
```javascript
db.transactions.aggregate([
  {
    $lookup: {
      from: "customers",
      localField: "customer_id",
      foreignField: "email",
      as: "customerDetails"
    }
  }
]);
```
This query returns some documents. In fact, it returned 1,746 documents when we ran it. But behind the scenes, it examined hundreds more than it returned.
Here’s the important diagnostic output from explain("executionStats")
:
```javascript
{
  "executionStats": {
    "nReturned": 1746,
    "executionTimeMillis": 8,
    "totalDocsExamined": 2246,
    "collectionScans": 2,
    "indexesUsed": [],
    "executionStages": {
      "stage": "hash_lookup",
      "usedDisk": false,
      "spilledRecords": 0
    }
  },
  "winningPlan": {
    "stage": "EQ_LOOKUP",
    "strategy": "HashJoin",
    "foreignCollection": "sample_analytics.customers",
    "inputStage": { "stage": "COLLSCAN" }
  }
}
```
The query returned fast, but:
- `collectionScans: 2`: Both collections were fully scanned.
- `indexesUsed: []`: No index was used on the join fields.
- `totalDocsExamined: 2246`: 2.2K documents read to return 1.7K. When `totalDocsExamined` is more than `nReturned`, that is wasted I/O effort.
- `strategy: HashJoin`: MongoDB built an in-memory hash table to support the join. This is a memory-intensive join strategy.
These are red flags for joins at scale. As your data grows, this will become a bottleneck. Each transaction document triggers a scan on the foreign collection, and if the field isn’t indexed, or worse, contains duplicate values, the cost increases linearly.
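The `totalDocsExamined` vs. `nReturned` gap can be turned into a simple efficiency ratio worth tracking over time. A minimal sketch over executionStats-shaped input (field names match the explain output above; a ratio of 1.0 means every examined document was returned):

```javascript
// Compute the examined-to-returned ratio from an executionStats document.
function scanEfficiency({ totalDocsExamined, nReturned }) {
  if (nReturned === 0) return { ratio: Infinity, wastedDocs: totalDocsExamined };
  return {
    ratio: totalDocsExamined / nReturned,
    wastedDocs: totalDocsExamined - nReturned,
  };
}

// Numbers from the $lookup example above: ratio ≈ 1.29, wastedDocs: 500.
console.log(scanEfficiency({ totalDocsExamined: 2246, nReturned: 1746 }));
```

A ratio near 1.0 is healthy; a ratio that grows with your dataset is the linear-cost bottleneck described above.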
To correct it, let’s add an index on customers.email
:
```javascript
db.customers.createIndex({ email: 1 });
```
Then, re-run the aggregation with explain("executionStats")
and compare the difference.
```javascript
{
  "nReturned": 1746,
  "executionTimeMillis": 9,
  "totalDocsExamined": 1746,
  "indexesUsed": ["email_1"],
  "winningPlan": {
    "stage": "EQ_LOOKUP",
    "strategy": "IndexedLoopJoin",
    "indexName": "email_1",
    "inputStage": { "stage": "COLLSCAN" }
  }
}
```
Even though execution time rose slightly (possibly due to internal re-planning or I/O timing variance), the structure is much healthier. We’ve replaced a memory-intensive hash join with an index-backed nested loop, and reduced total document examination to match the number of documents returned.
This is what we want from performance tuning: not always instant speed wins, but structural improvements that scale.
- Always check for `$lookup` joins over unindexed fields.
- Use `explain("executionStats")` to reveal strategy and index usage.
- Look for `strategy: IndexedLoopJoin` and avoid `HashJoin` where possible.
If we want this information without running .explain()
in mongosh again and again, we can pop back over to the Query Profiler in the Atlas UI and examine the live queries coming from our applications. The profiler will pick them up as long as they run slowly enough to cross its threshold.
Sorting before filtering: A hidden cost
Another subtle performance trap is using $sort
early in a pipeline, especially before a $match
. Sorting large datasets in memory is expensive, and doing so before filtering means we’re paying that cost for documents we might never use. Using the sample_supplies
database, let’s run the same aggregation with two different stage orderings and compare the outputs.
Try this pipeline:
```javascript
db.sales.aggregate([
  { $sort: { saleDate: -1 } },
  { $match: { "items.name": "binder" } }
]).explain("executionStats");
```
The explain plan will show that the sort is operating on the full dataset, not just the relevant subset. The executionTimeMillisEstimate
for the sort stage may be significantly higher than the filter.
```javascript
{
  queryPlanner: {
    stage: 'SORT',
    sortPattern: { saleDate: -1 },
    inputStage: {
      stage: 'COLLSCAN',
      filter: { 'items.name': { '$eq': 'binder' } },
      docsExamined: 5000
    }
  },
  executionStats: {
    executionTimeMillis: 28,
    totalDocsExamined: 5000,
    nReturned: 3396,
    executionStages: {
      stage: 'SORT',
      executionTimeMillisEstimate: 20,
      inputStage: { stage: 'COLLSCAN' }
    }
  }
}
```
MongoDB scans all 5,000 documents and then returns the 3,396 matches, taking 28 ms to run. Now, let’s simply reverse the order so that the $match
runs first:
```javascript
db.sales.aggregate([
  { $match: { "items.name": "binder" } },
  { $sort: { saleDate: -1 } }
]).explain("executionStats");
```
And our output should show some favourable changes:
```javascript
{
  queryPlanner: {
    stage: 'SORT',
    sortPattern: { saleDate: -1 },
    inputStage: {
      stage: 'COLLSCAN',
      filter: { 'items.name': { '$eq': 'binder' } },
      docsExamined: 5000
    }
  },
  executionStats: {
    executionTimeMillis: 15,
    totalDocsExamined: 5000,
    nReturned: 3396,
    executionStages: {
      stage: 'SORT',
      executionTimeMillisEstimate: 13,
      inputStage: { stage: 'COLLSCAN' }
    }
  }
}
```
The same number of documents is examined, but by filtering first we nearly halve the total execution time (28 ms down to 15 ms), with most of the savings coming from the $sort
stage.
Also note that if you sort on a field that has an index, MongoDB can use that index to avoid an in-memory sort. Use planSummary
in the explain output to confirm whether that’s happening. This is your cue to check stage order and index coverage.
Aggregations with MongoDB Atlas Search
If you're using Atlas Search, remember that $search
is a special stage that must appear first in your pipeline. Unlike $match
, $search
does not use MongoDB’s native indexes, and you won’t see it in explain()
output.
To diagnose slow $search
pipelines, you’ll need to use Atlas Profiler. Find the query under operation type aggregate
, and click through to inspect the total duration and number of documents returned. MongoDB Atlas Search queries typically show their performance in terms of latency and result count rather than index scan stats.
If $search
is slow, check whether your search index includes the right fields and analyzers, and whether you're retrieving more fields than necessary after $search
. Adding a $project
stage immediately after $search
can reduce document size and improve throughput.
To simulate a slow query, we can create an Atlas Search index on our cluster (as long as we have admin permissions):
```javascript
db.runCommand({
  createSearchIndexes: "movies",
  indexes: [
    {
      name: "titleIndex",
      definition: {
        mappings: {
          dynamic: false,
          fields: { title: { type: "string" } }
        }
      }
    }
  ]
});
```
Give this a minute to create, then we can run the following query. This uses a common term to trigger a high-result count and returns full documents without limiting or projecting fields:
```javascript
db.movies.aggregate([
  {
    $search: {
      index: "titleIndex",
      text: { query: "the", path: "title" }
    }
  }
]);
```
A query as generic as “the” in movie titles will give us quite high latency, and we can dissect it and learn how to improve it in our profiler.
Aggregation diagnostics summary
Aggregation pipelines offer incredible power but carry significant performance implications when not constructed thoughtfully. By using explain("executionStats")
at every step, you can see how each stage behaves, whether indexes are used, and where MongoDB is spending the most time.
In MongoDB Atlas, use the Query Profiler to correlate slow queries with real execution data, and use the Indexes tab to verify whether your lookups and sorts are supported by indexes. Pay close attention to ordering: filter early, reduce early, and avoid letting costly stages like $group
, $sort
, or $lookup
operate on more documents than necessary.
When profiling, don’t just look at total execution time. Look at stage execution time, document counts, and index plans. MongoDB is giving us a detailed play-by-play.
Conclusion
We’ve now got the full diagnostic toolkit: a profiler to see what ran slowly, an execution plan to understand why, a live view to watch it happen in real time, and a deep dive into how performance scales. We’re not stomping around, hunting problems. We’re observant and well tooled. That’s the foundation every great optimization effort is built on.
This wasn’t a guide to optimization—it was a guide to observation. Because you can’t fix what you can’t see.
If you want to keep digging deeper, here are a few useful next stops. Check out the Atlas Query Profiler Documentation, or read up on Aggregation Pipeline Optimization Best Practices. MongoDB gives you a lot of visibility if you know where to look. Now, you do.