Case Study · Personal Project

Inserting 1 Million Records into MongoDB — What Actually Makes It Fast

Performance · Backend · Node.js · Go · MongoDB

A batch job that used to run over 3 minutes — or crash outright — now finishes in 20 seconds.

1M records processed in 20s using batched inserts and concurrency, down from 3m 19s or an out-of-memory crash.

Dataset: 1M rows

Fastest: 20s

Slowest (completed): 3m 19s

The Question

What’s the most efficient way to insert 1 million records into MongoDB on constrained hardware? I ran this benchmark on a GCP e2-micro VM (1 shared vCPU, 1 GB RAM, Intel Broadwell) using the Yelp dataset (1 million rows). The constraint is intentional: decisions that don’t matter on a 64 GB machine start to matter here.

Approaches

main.js — row-by-row insert

Loaded the entire CSV into memory, then inserted each record individually. It crashed after ~2 minutes with a heap out-of-memory error. On a 1 GB VM, holding 1 million records in RAM at once is not viable.

mainV2.js — streaming with 1K batch

Switched to a streaming CSV parser with insertMany in batches of 1,000. Memory stayed stable at 609 MB and the import completed in 3 minutes 19 seconds. The bottleneck here is round trips: 1 million rows in batches of 1,000 means 1,000 separate insertMany calls to MongoDB.

mainV3.js — streaming with 10K batch

Increased batch size to 10,000 records (~3.45 MB per batch, 100 total batches). Memory usage barely changed (625 MB) but time dropped to 54 seconds, 3.7× faster than V2. Fewer round trips to MongoDB are the main driver: 100 insertMany calls instead of 1,000.
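
The round-trip counts and the 3.7× figure follow from simple arithmetic, sketched here in Go. The helper names `batchCount` and `speedup` are illustrative, and 3m 19s is taken as 199 seconds.

```go
package main

// batchCount returns how many insert calls are needed to send totalRows rows
// in batches of batchSize. It uses ceiling division so that a final partial
// batch still counts as one call.
func batchCount(totalRows, batchSize int) int {
	return (totalRows + batchSize - 1) / batchSize
}

// speedup returns how many times faster the new run is than the old one,
// given both durations in seconds.
func speedup(oldSeconds, newSeconds float64) float64 {
	return oldSeconds / newSeconds
}
```

With the numbers from this benchmark, `batchCount(1000000, 1000)` is 1000 and `batchCount(1000000, 10000)` is 100, while `speedup(199, 54)` is roughly 3.69, which rounds to the 3.7× reported above.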

main.go — goroutines with 10K batch

Rewrote mainV3.js in Go using a buffered channel (capacity 10) and goroutines for concurrent batch insertion. It finished in 20 seconds with ~400 MB memory, lower than the Node.js versions despite doing more work concurrently. In this benchmark, goroutines handled this I/O-bound workload more efficiently than the Node.js event loop. Channel capacity was capped at 10 because running more goroutines on a 1 GB VM caused OOM.

Benchmark Results

Approach	Lang	Strategy	Time	Memory
main.js	Node.js	Row-by-row	DNF	OOM
mainV2.js	Node.js	Batch 1K	3m 19s	609 MB
mainV3.js	Node.js	Batch 10K	54s	625 MB
main.go	Go	Goroutine + Batch 10K	20s	~400 MB

Takeaway

Never load the full dataset into memory. Stream it.
Batch size directly impacts performance — 10K vs 1K is a 3.7× difference with almost no memory cost.
In this benchmark, Go’s goroutines outperformed the Node.js event loop on concurrent I/O work while using less memory (~400 MB vs. 625 MB).
On constrained hardware, the concurrency level needs to be tuned: more goroutines are not always faster, and too many can exhaust memory.