End-to-end sales automation for car retailers. Lead capture to qualified conversation, zero manual intervention.
200 → 4
leads per day
7–15 min
response delay
24 / 7
no supervision
THE PROBLEM
Car dealerships running Meta ad campaigns receive hundreds of leads daily. Most are not ready to buy. Sales managers were spending the majority of their time chasing cold prospects instead of closing warm ones, and real buyers were waiting days for a first response. The economics only get worse as ad spend grows.
200+
Daily leads from Meta campaigns
Most unqualified, all requiring manual triage
Hours
Wasted per manager per day
Cold calling prospects with no buying intent
Days
Average first response time
Buyers engage the first dealership that responds
SYSTEM ARCHITECTURE
Every component is purpose-built for throughput. The pipeline runs continuously, handles concurrency without degradation, and surfaces only what managers need to act on.
STAGE 01
Facebook Lead Ads webhook fires on form submission. Lead data streams directly into PostgreSQL; name, contact, vehicle interest with zero latency and zero manual input.
STAGE 02
Gemini 2.5 initiates a WhatsApp conversation 7 to 15 minutes after capture, randomized to avoid bot detection. Natural dialogue extracts intent, timeline, and budget.
STAGE 03
Custom RAG pipeline, no framework overhead, queries 768-dimensional pgvector embeddings over company inventory, pricing, and regulations. Every response grounded. Zero hallucinations.
STAGE 04
Qualified leads scored and routed to the manager dashboard. From 200 daily inputs, managers see 4 to 5 prospects ready to close. Cold calling eliminated entirely.
TECHNICAL STACK
No unnecessary abstractions. Every library chosen for a specific reason, every integration justified by throughput or precision.
Drives all WhatsApp conversations. Context-aware, grounded through RAG, never hallucinates.
768-dimensional embeddings over company knowledge. Semantic search in milliseconds.
Conversation state and hot queries cached. Sub-second response times under any load.
Native WhatsApp integration with randomized 7 to 15 minute delays for human-paced behavior.
Webhook-driven real-time lead ingestion from Facebook and Instagram ad campaigns.
Drip email campaigns for leads not ready to convert. Keeps prospects warm until buying intent peaks.
WHAT IT TAUGHT ME
Building the RAG pipeline without a framework forced a deeper understanding of how retrieval actually works, what the embeddings represent, where semantic search breaks down, and why chunking strategy matters more than model choice.
The 7 to 15 minute delay was not a technical decision. It was the most important product decision in the whole system. A response in seconds gets ignored. A response that feels human gets a conversation.