Facial records tested: 20K+
Fastest endpoint: 737 ms
Model: FaceNet512
I was learning computer vision and wanted to go beyond tutorials. As a backend engineer, the natural next step was to build something production-shaped. After sanity-checking the idea with a few AI tools, I started building a public face recognition API: three endpoints, real inference, and a real dataset.
GPU vs CPU inference
First decision: whether to run FaceNet512 on GPU or CPU. On GPU, cold start took 10+ seconds because of CUDA driver loading. On CPU, cold start was faster, and inference latency was close enough to GPU that the difference didn't justify the overhead. I went with CPU on a private server.
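A comparison like this comes down to two numbers per device: cold start (loading plus the first inference) and warm per-request latency. Below is a minimal Python sketch of how such a measurement could be taken. The `load_model` and `infer` callables are hypothetical stand-ins, not the project's actual code; in practice they would wrap the FaceNet512 load and a single embedding call.

```python
import time
from typing import Callable, Tuple, TypeVar

M = TypeVar("M")


def measure_cold_and_warm(
    load_model: Callable[[], M],
    infer: Callable[[M], object],
    warm_runs: int = 5,
) -> Tuple[float, float]:
    """Return (cold_start_seconds, mean_warm_seconds).

    load_model and infer are placeholders supplied by the caller.
    """
    # Cold start: model (and any driver) loading plus the first inference.
    start = time.perf_counter()
    model = load_model()
    infer(model)
    cold = time.perf_counter() - start

    # Warm latency: mean over repeated calls on the already-loaded model.
    start = time.perf_counter()
    for _ in range(warm_runs):
        infer(model)
    warm = (time.perf_counter() - start) / warm_runs

    return cold, warm
```

Running this once per device, each time in a fresh process, keeps the cold-start number honest, since a second load in the same process may reuse already-initialized state.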
Microservice architecture
The original plan was a microservice design built for scale: separate services, a multi-region pricing model, and an evaluation of Firebase vs Supabase vs full self-hosting. This was the wrong frame for an MVP, since everything was running on a single server anyway.
Storage and embedding strategy
Facial photos are stored resized to 512px. DeepFace automatically generates embeddings and caches them in a .pkl file on first scan, so search runs against cached embeddings, not raw images. Metadata is stored in Firestore. At 20K+ records, Firestore latency became noticeable, and I considered migrating to self-hosted PostgreSQL on the same server.
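To make the caching idea concrete, here is a minimal Python sketch of the general pattern: compute embeddings once, persist them with pickle, and search the cached vectors with a linear scan. This illustrates the pattern only; it is not DeepFace's internal cache format or search code. The `embed` callable, the dictionary layout, and the choice of cosine distance are all assumptions made for the example.

```python
import math
import os
import pickle
from typing import Callable, Dict, List, Sequence, Tuple

Embedding = List[float]


def load_or_build_cache(
    cache_path: str,
    image_paths: Sequence[str],
    embed: Callable[[str], Embedding],
) -> Dict[str, Embedding]:
    """Load cached embeddings, computing them only if no cache file exists."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    # First scan: run the (expensive) model once per image, then persist.
    cache = {path: embed(path) for path in image_paths}
    with open(cache_path, "wb") as f:
        pickle.dump(cache, f)
    return cache


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def find_nearest(
    query: Sequence[float], cache: Dict[str, Embedding]
) -> Tuple[str, float]:
    """Linear scan over cached embeddings; returns (image_path, distance)."""
    return min(
        ((path, cosine_distance(query, emb)) for path, emb in cache.items()),
        key=lambda pair: pair[1],
    )
```

With this shape, a search request costs one embedding of the query image plus a scan over stored vectors, and none of the enrolled images are re-processed. Pickle files should only be loaded from trusted locations, since unpickling can execute arbitrary code.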
The over-engineering realization
The microservice design, multi-region pricing research, and database provider debates were all premature. None of them mattered at MVP stage. The API worked. The latency was acceptable. The lesson: solve the problem in front of you, not the one three steps ahead.
| Endpoint | Description | Latency |
|---|---|---|
| /enroll | Register a new face | ~1,844 ms |
| /verify | 1:1 identity check | ~1,037 ms |
| /find | 1:N search across 20K+ records | ~737 ms |
Tested against 20K+ facial embeddings on a private server using CPU inference.
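For readers unfamiliar with the 1:1 case, a verification check typically reduces to comparing two embeddings against a distance threshold. The Python sketch below shows that idea under stated assumptions: the function names are hypothetical, cosine distance is one possible metric, and the `0.30` default is an illustrative value, not a setting taken from this project.

```python
import math
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def verify(
    probe: Sequence[float],
    reference: Sequence[float],
    threshold: float = 0.30,  # illustrative value; tune per model and metric
) -> bool:
    """1:1 check: True when the two embeddings are close enough to match."""
    return cosine_distance(probe, reference) <= threshold
```

A lower threshold means fewer false accepts but more false rejects, so the value is usually tuned on labeled pairs for the specific model in use.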
The API works. The latency is acceptable for the use case. But the bigger lesson was about scope: microservice design, multi-region pricing, and database provider debates are not MVP problems. Ship first. Optimize the parts that actually break.