Designing a Fraud Detection System
Build a real-time fraud detection engine using rule-based systems, ML anomaly detection, velocity checks, and device fingerprinting.
srikanthtelkalapally888@gmail.com
Fraud detection identifies and blocks fraudulent transactions in real time, before financial loss occurs.
Types of Fraud
Payment fraud: Stolen card used for purchases
Account takeover: Login with stolen credentials
Promo abuse: Multiple accounts for bonuses
Bot attacks: Automated fake account creation
Friendly fraud: Legitimate customer disputes valid charges
Architecture
Transaction/Event
↓
Feature Extraction
↓
Rules Engine → Block/Allow
↓
ML Scorer → Risk Score 0-100
↓
Decision Engine
├── Score < 30: Auto-approve
├── Score 30-70: Step-up auth (2FA, CAPTCHA)
└── Score > 70: Block + review queue
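As a rough illustration, the decision step above can be sketched as a small function that maps a 0-100 risk score to one of the three actions. The names `Action` and `decide` are hypothetical, and treating a score of exactly 30 or 70 as step-up follows the ranges in the diagram.

```python
from enum import Enum


class Action(Enum):
    APPROVE = "auto-approve"
    STEP_UP = "step-up auth"
    BLOCK = "block + review queue"


def decide(risk_score: float) -> Action:
    """Map a 0-100 risk score to an action using the thresholds in the diagram."""
    if not 0 <= risk_score <= 100:
        raise ValueError("risk score must be in [0, 100]")
    if risk_score < 30:
        return Action.APPROVE
    if risk_score <= 70:
        return Action.STEP_UP
    return Action.BLOCK
```

Keeping the thresholds in one place like this makes them easy to tune when the false positive rate drifts.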
Feature Engineering
Device features:
- Device fingerprint
- IP geolocation
- VPN/proxy/Tor detection
Behavioral features:
- Time since account creation
- Number of transactions today
- Amount vs historical average
- Transaction velocity
Relationship features:
- Recipient seen before?
- Merchant risk category
- Card country vs IP country
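A minimal sketch of how a few of these features might be computed for a single transaction. The `Transaction` and `AccountHistory` types and all field names are assumptions made for illustration, and "transactions today" is approximated here as a trailing 24-hour count.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Transaction:  # hypothetical shape of an incoming event
    amount: float
    timestamp: datetime
    recipient_id: str
    card_country: str
    ip_country: str


@dataclass
class AccountHistory:  # hypothetical per-account lookup result
    created_at: datetime
    past_amounts: list[float]
    past_timestamps: list[datetime]
    known_recipients: set[str]


def extract_features(txn: Transaction, history: AccountHistory) -> dict:
    """Compute a handful of behavioral and relationship features."""
    avg = (
        sum(history.past_amounts) / len(history.past_amounts)
        if history.past_amounts
        else 0.0
    )
    day_ago = txn.timestamp - timedelta(days=1)
    return {
        # Time since account creation, in days
        "account_age_days": (txn.timestamp - history.created_at).total_seconds() / 86400,
        # Number of earlier transactions in the trailing 24 hours
        "txn_count_24h": sum(
            1 for t in history.past_timestamps if day_ago < t <= txn.timestamp
        ),
        # Amount relative to the historical average (0.0 when there is no history)
        "amount_vs_avg": txn.amount / avg if avg > 0 else 0.0,
        "recipient_seen_before": txn.recipient_id in history.known_recipients,
        "country_mismatch": txn.card_country != txn.ip_country,
    }
```

In a real system the history lookup would need to come from a low-latency store, since these features sit on the request path.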
Velocity Checks
Rules:
3 failed logins in 5 min → Lock account
5 payments in 1 hour → Flag for review
$1000+ in 30 min → Step-up auth
New device + large txn → Block until 2FA succeeds
Implement with Redis sliding window counters.
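To show the sliding window logic itself, here is an in-memory sketch that needs no Redis server. The class name `SlidingWindowCounter` and its `hit` method are hypothetical. With Redis, the same idea is commonly built on a sorted set per key: add the event with `ZADD`, trim old entries with `ZREMRANGEBYSCORE`, and count with `ZCARD`.

```python
from collections import defaultdict, deque


class SlidingWindowCounter:
    """Count events per key inside a trailing time window (in-memory sketch)."""

    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window = window_seconds
        self._events = defaultdict(deque)  # key -> timestamps, oldest first

    def hit(self, key: str, now: float) -> bool:
        """Record an event at time `now`; return True once the limit is reached."""
        q = self._events[key]
        q.append(now)
        cutoff = now - self.window
        # Drop timestamps that have fallen out of the window
        while q and q[0] <= cutoff:
            q.popleft()
        return len(q) >= self.limit


# Rule from the list above: 3 failed logins in 5 min -> lock account
failed_logins = SlidingWindowCounter(limit=3, window_seconds=5 * 60)
for t in (0, 60, 120):
    locked = failed_logins.hit("user:42", now=t)
# locked is True after the third failure inside the window
```

Unlike a fixed-bucket counter, this variant does not reset at bucket boundaries, so a burst that straddles two buckets is still caught.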
ML Model
Model type: Gradient Boosted Trees (XGBoost)
Training data: Historical transactions + fraud labels
Features: 150+ features per transaction
Latency: <20ms inference
Challenge: Class imbalance (fraud = 0.1% of transactions)
Solution: Oversampling (SMOTE) + cost-sensitive learning
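One simple form of cost-sensitive learning is to weight each fraud example by the ratio of legitimate to fraudulent samples, so both classes contribute equally to the training loss. The sketch below computes that ratio in plain Python; the function name is hypothetical. XGBoost exposes a `scale_pos_weight` parameter for this purpose, and the negative-to-positive ratio is a common starting value for it.

```python
def positive_class_weight(labels):
    """Return negatives / positives, a common weight for the rare class."""
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0:
        raise ValueError("no fraud examples in the training labels")
    return negatives / positives


# Toy label set with 0.1% fraud: 10 fraud cases among 10,000 transactions
labels = [1] * 10 + [0] * 9_990
weight = positive_class_weight(labels)  # 999.0

# Per-sample weights: each fraud example counts as much as 999 legitimate ones
sample_weights = [weight if y == 1 else 1.0 for y in labels]
```

The weight is a starting point rather than a fixed rule; in practice it is usually tuned against the false positive target.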
False Positive Cost
Fraud systems must balance:
False Positive: Block legitimate transaction → Customer friction
False Negative: Miss fraud → Financial loss
Target: <0.5% false positive rate
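The false positive rate is the share of legitimate transactions that were blocked: FP / (FP + TN). A small sketch of checking it against the 0.5% target follows; the function name and the toy data are illustrative assumptions.

```python
def false_positive_rate(y_true, y_blocked):
    """FP / (FP + TN): fraction of legitimate transactions that were blocked.

    y_true: 1 for fraud, 0 for legitimate. y_blocked: True if the system blocked it.
    """
    fp = sum(1 for t, b in zip(y_true, y_blocked) if t == 0 and b)
    tn = sum(1 for t, b in zip(y_true, y_blocked) if t == 0 and not b)
    legit = fp + tn
    if legit == 0:
        raise ValueError("no legitimate transactions to evaluate")
    return fp / legit


# Toy data: 1,000 legitimate transactions (4 wrongly blocked) and 5 blocked frauds
y_true = [0] * 1000 + [1] * 5
y_blocked = [True] * 4 + [False] * 996 + [True] * 5

fpr = false_positive_rate(y_true, y_blocked)  # 4 / 1000 = 0.004
meets_target = fpr < 0.005                    # True: under the 0.5% target
```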
Feedback Loop
Decision → Outcome (chargeback? user confirms?)
→ Label data
→ Retrain model weekly
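The labeling step of this loop might look like the sketch below. The `Outcome` values and function names are assumptions for illustration. Treating every chargeback as fraud is a simplification, since friendly fraud produces chargebacks on valid charges; decisions without a resolved outcome are left out of the training set rather than guessed.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):  # hypothetical outcome states
    CHARGEBACK = "chargeback"
    USER_CONFIRMED = "user_confirmed"
    PENDING = "pending"


def label_from_outcome(outcome: Outcome) -> Optional[int]:
    """Return 1 for fraud, 0 for legitimate, None when the outcome is unresolved."""
    if outcome is Outcome.CHARGEBACK:
        return 1
    if outcome is Outcome.USER_CONFIRMED:
        return 0
    return None


def build_training_rows(decisions):
    """Turn (features, outcome) pairs into (features, label) rows, skipping unresolved ones."""
    rows = []
    for features, outcome in decisions:
        label = label_from_outcome(outcome)
        if label is not None:
            rows.append((features, label))
    return rows
```

The resulting rows would then be appended to the historical training data ahead of the weekly retrain.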
Conclusion
Fraud detection is a layered system: fast rules for known patterns, ML for novel ones, and human review for high-value edge cases.