Fix: MongoDB Aggregation Pipeline Not Working — Wrong Results or Empty Array
Part of: Database Errors
Quick Answer
How to fix MongoDB aggregation pipeline issues — $lookup field matching, $unwind on missing fields, $match placement, $group _id, type mismatches, and pipeline debugging.
The Problem
A MongoDB aggregation pipeline returns an empty array when it shouldn’t:
db.orders.aggregate([
{
$lookup: {
from: 'users',
localField: 'userId',
foreignField: '_id',
as: 'user',
},
},
{ $unwind: '$user' },
{ $match: { 'user.role': 'admin' } },
]);
// Returns [] — but matching documents existOr $group produces unexpected results:
db.sales.aggregate([
{
$group: {
_id: '$category',
total: { $sum: 'amount' }, // Returns 0 for all groups
},
},
]);Or $lookup returns an empty array for every document despite matching data existing in the joined collection.
Why This Happens
MongoDB aggregation operates on a stream of documents flowing through a series of stages, where each stage transforms or filters the stream and passes its output to the next stage. The model is powerful but unforgiving: stages do not raise errors when they receive unexpected input, they simply produce different output. A $match against a misspelled field name returns no documents instead of throwing. A $sum against a missing field returns zero instead of complaining. A $lookup with a type mismatch returns an empty array per input document instead of warning.
The most common silent failure is a type mismatch between localField and foreignField in $lookup. MongoDB does not coerce types during equality comparisons in joins. An ObjectId does not equal the string representation of the same hex value. This is the single largest source of “the aggregation runs but returns nothing” bug reports.
The second class of issues is around the cache-like behavior of pipeline optimization. The query planner moves and merges stages — pushing $match before $project, merging adjacent $match stages. The stages you wrote are not necessarily the stages that ran when you read .explain() output.
MongoDB aggregation has several non-obvious behaviors:
- Type mismatch in
$lookup—localFieldis often astringwhileforeignField(_id) is anObjectId. These don’t match. The$lookupsilently returns an empty array for each document. $unwindremoves documents with missing or empty arrays — if$lookupreturns[]for a document,$unwinddrops that document entirely. This silently filters your results.$matchplacement affects performance and results —$matchearly in the pipeline uses indexes. After$lookup, it operates on the joined documents. Order matters.$sumrequires a field reference with$—$sum: 'amount'is a literal string (value 0).$sum: '$amount'is a field reference. The missing$is a silent bug.$group _idmust be exact —$group: { _id: '$category' }groups by the exact value.nullgroups everything together.- Case sensitivity — field names and string values in
$matchare case-sensitive.$match: { status: 'Active' }won’t match{ status: 'active' }.
Version History That Changes the Failure Mode
Aggregation operators and stages have been added to MongoDB on almost every major release. A pipeline written against MongoDB 7 will throw Unrecognized pipeline stage name on a 4.4 cluster, and a 4.4 pipeline may miss optimizations that ship later.
- MongoDB 4.4 (Jul 2020) — added
$merge(write aggregation results back into a collection),$unionWith(combine multiple collections), and union pipelines.$lookupwithlet/pipelinewas stabilized. Most modern aggregation patterns work from 4.4 onward. - MongoDB 5.0 (Jul 2021) — added
$setWindowFieldsfor window functions (running totals, rank, lag/lead). Time-series collections introduced, with optimizations to aggregations grouped by time buckets. Versioned API stable v1. - MongoDB 6.0 (Jul 2022) — added Queryable Encryption,
$densify,$fill, and improvements to$lookupwith concise correlated subqueries. Change streams gained pre/post-image support, often consumed by aggregation downstream. - MongoDB 7.0 (Aug 2023) — added compound wildcard indexes (which finally make wildcard indexes useful behind
$matchin aggregations),$percentileand$medianoperators, and approximate sharded$count. Time series scalability significantly improved. - MongoDB 8.0 (Oct 2024) — query optimizer rewrites for
$lookup, sharded$lookupperformance improvements, and explicit$rankFusion/$scorefor hybrid search pipelines. Improvedexplainoutput for staged pipelines. - Atlas Search aggregation stages (rolling, 2022 onward) —
$searchand$searchMetaonly work on Atlas. On self-hosted clusters they fail with “Unrecognized pipeline stage name: ‘$search’”. Don’t copy/paste Atlas examples into a community-edition cluster.
Confirm your cluster version with db.version() before assuming a stage exists.
Fix 1: Fix $lookup Type Mismatches
The most common $lookup failure is a type mismatch between the joined fields:
// Documents in 'orders' collection:
// { _id: ObjectId("..."), userId: "64a1b2c3d4e5f6789abcdef0" } ← string
// Documents in 'users' collection:
// { _id: ObjectId("64a1b2c3d4e5f6789abcdef0"), name: "Alice" } ← ObjectId
// WRONG — string vs ObjectId, $lookup returns [] for every order
db.orders.aggregate([
{
$lookup: {
from: 'users',
localField: 'userId', // string: "64a1b2c3d4e5f6789abcdef0"
foreignField: '_id', // ObjectId: ObjectId("64a1b2...")
as: 'user',
},
},
]);Fix option 1: Convert the local field to ObjectId with $addFields:
db.orders.aggregate([
{
$addFields: {
userObjectId: { $toObjectId: '$userId' }, // Convert string → ObjectId
},
},
{
$lookup: {
from: 'users',
localField: 'userObjectId', // Now matches ObjectId _id
foreignField: '_id',
as: 'user',
},
},
]);Fix option 2: Fix the data model — store userId as ObjectId, not string. This is the root cause:
// Fix existing documents in the orders collection
db.orders.find({ userId: { $type: 'string' } }).forEach(doc => {
db.orders.updateOne(
{ _id: doc._id },
{ $set: { userId: new ObjectId(doc.userId) } }
);
});Fix option 3: Use $lookup with a pipeline for more control:
db.orders.aggregate([
{
$lookup: {
from: 'users',
let: { orderId: { $toObjectId: '$userId' } },
pipeline: [
{ $match: { $expr: { $eq: ['$_id', '$$orderId'] } } }
],
as: 'user',
},
},
]);Fix 2: Handle $unwind on Missing or Empty Arrays
$unwind drops documents where the field is missing, null, or an empty array by default:
// Orders where $lookup found no matching user → user: []
// $unwind will DROP these documents
// WRONG — silently drops orders with no matching user
db.orders.aggregate([
{ $lookup: { from: 'users', localField: 'userId', foreignField: '_id', as: 'user' } },
{ $unwind: '$user' }, // Drops orders where user: []
]);
// CORRECT — use preserveNullAndEmptyArrays to keep all documents
db.orders.aggregate([
{ $lookup: { from: 'users', localField: 'userId', foreignField: '_id', as: 'user' } },
{
$unwind: {
path: '$user',
preserveNullAndEmptyArrays: true, // Keep documents with no match
},
},
]);Debug $unwind results by running the pipeline up to that stage:
// Run only the $lookup stage to check what 'user' contains
db.orders.aggregate([
{ $lookup: { from: 'users', localField: 'userId', foreignField: '_id', as: 'user' } },
{ $limit: 5 }, // Check a sample
]);
// If user: [] for all documents → type mismatch in $lookup
// If user: [{ ... }] → $lookup works, check $unwindFix 3: Fix $group and $sum
Missing $ prefix on field references is a silent bug — $sum treats a bare string as a literal value:
// WRONG — 'amount' is a string literal (not a field reference)
// $sum of string literal = 0
db.sales.aggregate([
{
$group: {
_id: '$category',
total: { $sum: 'amount' }, // ← Missing $, sums the string 'amount' = 0
count: { $sum: 1 }, // ← Literal 1, this is correct for counting
},
},
]);
// CORRECT — use $ prefix for field references
db.sales.aggregate([
{
$group: {
_id: '$category',
total: { $sum: '$amount' }, // ← $amount references the 'amount' field
avgPrice: { $avg: '$price' },
maxQty: { $max: '$quantity' },
items: { $push: '$name' }, // Collect all names into an array
},
},
]);Group by multiple fields:
db.sales.aggregate([
{
$group: {
_id: {
category: '$category',
year: { $year: '$date' },
},
total: { $sum: '$amount' },
},
},
{ $sort: { '_id.year': -1, total: -1 } },
]);Group all documents (no grouping key):
db.sales.aggregate([
{
$group: {
_id: null, // null groups everything into one document
grandTotal: { $sum: '$amount' },
docCount: { $sum: 1 },
},
},
]);Fix 4: Optimize $match Placement
$match early in the pipeline uses indexes and filters documents before expensive operations like $lookup:
// WRONG — $match after $lookup processes all documents first
db.orders.aggregate([
{
$lookup: {
from: 'users',
localField: 'userId',
foreignField: '_id',
as: 'user',
},
},
{ $match: { status: 'completed' } }, // Filters AFTER joining all users
]);
// CORRECT — filter early, join fewer documents
db.orders.aggregate([
{ $match: { status: 'completed' } }, // Uses index, reduces documents
{
$lookup: {
from: 'users',
localField: 'userId',
foreignField: '_id',
as: 'user',
},
},
]);$match with $expr for comparing fields:
// Match documents where endDate > startDate (compare two fields in same doc)
db.events.aggregate([
{
$match: {
$expr: { $gt: ['$endDate', '$startDate'] },
},
},
]);
// After $lookup, match on joined field
db.orders.aggregate([
{ $lookup: { from: 'users', localField: 'userId', foreignField: '_id', as: 'user' } },
{ $unwind: { path: '$user', preserveNullAndEmptyArrays: false } },
{ $match: { 'user.active': true } }, // Filter on joined field
]);Fix 5: Use $project and $addFields Correctly
Control which fields appear in the output:
db.orders.aggregate([
{ $match: { status: 'completed' } },
{
$lookup: {
from: 'users',
localField: 'userId',
foreignField: '_id',
as: 'user',
},
},
{ $unwind: '$user' },
{
$project: {
// 1 = include, 0 = exclude
orderId: '$_id', // Rename _id to orderId
amount: 1, // Include amount
status: 1, // Include status
customerName: '$user.name', // Promote nested field
customerEmail: '$user.email',
_id: 0, // Exclude _id from output
user: 0, // Exclude the full user object
},
},
]);
// $addFields — add computed fields without removing others
db.orders.aggregate([
{
$addFields: {
totalWithTax: { $multiply: ['$amount', 1.1] },
isLargeOrder: { $gte: ['$amount', 1000] },
year: { $year: '$createdAt' },
},
},
]);Fix 6: Debug with $explain and Stage-by-Stage
Break a complex pipeline into stages to find where it goes wrong:
// Step 1: Run only the first stage
db.orders.aggregate([
{ $match: { status: 'completed' } },
{ $limit: 5 },
]).toArray();
// Step 2: Add the next stage
db.orders.aggregate([
{ $match: { status: 'completed' } },
{ $lookup: { from: 'users', localField: 'userId', foreignField: '_id', as: 'user' } },
{ $limit: 5 },
]).toArray();
// Step 3: Check what 'user' looks like — empty array or populated?
// Continue adding stages until you find the stage that breaks results
// Check if the pipeline uses indexes
db.orders.explain('executionStats').aggregate([
{ $match: { status: 'completed' } },
{ $lookup: { from: 'users', localField: 'userId', foreignField: '_id', as: 'user' } },
]);
// Look for 'IXSCAN' (index scan) vs 'COLLSCAN' (full scan)Count documents at each stage:
// Add a $count stage to verify documents aren't being filtered out
db.orders.aggregate([
{ $match: { status: 'completed' } },
{ $count: 'afterMatch' },
]);
db.orders.aggregate([
{ $match: { status: 'completed' } },
{ $lookup: { from: 'users', localField: 'userId', foreignField: '_id', as: 'user' } },
{ $count: 'afterLookup' },
]);
// If afterMatch > 0 but afterLookup = 0 → problem is in the $lookupFix 7: Common Aggregation Patterns
Frequently used patterns that are easy to get wrong:
// Pagination with $skip and $limit
db.orders.aggregate([
{ $match: { status: 'completed' } },
{ $sort: { createdAt: -1 } },
{ $skip: 20 }, // Skip first 20
{ $limit: 10 }, // Take next 10
]);
// Faceted search — multiple aggregations in parallel
db.products.aggregate([
{ $match: { inStock: true } },
{
$facet: {
byCategory: [
{ $group: { _id: '$category', count: { $sum: 1 } } },
{ $sort: { count: -1 } },
],
priceStats: [
{ $group: { _id: null, avg: { $avg: '$price' }, max: { $max: '$price' } } },
],
total: [{ $count: 'count' }],
},
},
]);
// Date-based grouping
db.sales.aggregate([
{
$group: {
_id: {
year: { $year: '$date' },
month: { $month: '$date' },
},
monthlyRevenue: { $sum: '$amount' },
},
},
{ $sort: { '_id.year': 1, '_id.month': 1 } },
]);
// Array operations — filter and transform array fields
db.posts.aggregate([
{
$project: {
title: 1,
// Filter tags array to only include specific values
techTags: {
$filter: {
input: '$tags',
as: 'tag',
cond: { $in: ['$$tag', ['javascript', 'typescript', 'python']] },
},
},
tagCount: { $size: '$tags' },
},
},
]);Still Not Working?
$lookup returns multiple documents when you expect one — if the foreign collection has duplicate values in foreignField, $lookup returns all matches as an array. Use $unwind to flatten, or $arrayElemAt: ['$user', 0] to take the first element.
Aggregation is slow — use $match and $sort as early as possible. Ensure indexes exist on fields used in $match, $lookup localField/foreignField, and $sort. Run .explain('executionStats') to verify index usage.
$lookup across databases — $lookup can only join collections within the same database. To join across databases, use $merge to copy data to the same database first, or perform the join in application code.
Numbers stored as strings — if numeric fields were imported as strings, $sum and $avg return 0. Convert with $toDouble or $toInt in a $addFields stage before grouping.
allowDiskUse: false with large $group or $sort — pipeline stages have a 100 MB memory limit. Hitting it returns an error rather than a wrong result, but only when allowDiskUse is explicitly false. Add allowDiskUse: true or restructure the pipeline to filter earlier.
Stage order changed by the optimizer — .explain() shows the executed pipeline, not the source. If you see $match stages appearing in unexpected positions, the optimizer rewrote them. This is normally fine, but if you rely on side-effects of stage order (rare but possible with $out), disable optimization with aggregate(..., { hint: ... }) or restructure.
Atlas-only stages on self-hosted — $search, $searchMeta, and $vectorSearch require Atlas. On self-hosted clusters they raise Unrecognized pipeline stage name.
For related MongoDB issues, see Fix: MongoDB Duplicate Key Error, Fix: MongoDB Schema Validation Error, Fix: Next.js API Route Not Working, and Fix: Jest Async Test Timeout.
Solo developer based in Japan. Every solution is cross-referenced with official documentation and tested before publishing.
Was this article helpful?
Related Articles
Fix: MongoDB Schema Validation Error — Document Failed Validation
How to fix MongoDB schema validation errors — $jsonSchema rules, required fields, type mismatches, enum constraints, bypassing validation for migrations, and Mongoose schema conflicts.
Fix: MongoDB "not primary" Write Error (Replica Set)
How to fix MongoDB 'not primary' errors when writing to a replica set — read preference misconfiguration, connecting to a secondary, replica set elections, and write concern settings.
Fix: PocketBase Not Working — Auth Failing, Real-time Subscriptions Broken, or Collection Rules Blocking Requests
How to fix PocketBase issues — authentication, collection access rules, real-time subscriptions, file uploads, relations, and self-hosted deployment.
Fix: Neon Database Not Working — Connection Timeout, Branching Errors, or Serverless Driver Issues
How to fix Neon Postgres issues — connection string setup, serverless HTTP driver vs TCP, database branching, connection pooling, Drizzle and Prisma integration, and cold start optimization.