NoSQL databases solve problems relational databases struggle with—flexible schemas, horizontal scaling, and specialized data models. Here's how to choose the right one.
NoSQL Categories
Document Stores:
- MongoDB, CouchDB, Firebase
- JSON-like documents
- Flexible schema
- Use for: Content management, user profiles, catalogs
Key-Value Stores:
- Redis, DynamoDB, Memcached
- Simple key→value mapping
- Extremely fast
- Use for: Caching, sessions, real-time data
Column-Family:
- Cassandra, HBase, ScyllaDB
- Wide columns, sparse data
- Write-optimized
- Use for: Time series, IoT, analytics
Graph Databases:
- Neo4j, ArangoDB, Neptune
- Nodes and relationships
- Pattern matching
- Use for: Social networks, recommendations, fraud detection
Document Databases (MongoDB)
import { MongoClient, ObjectId } from 'mongodb';

// Connection string and database name are placeholders
const client = new MongoClient('mongodb://localhost:27017');
const db = client.db('myapp');

// Flexible schema
interface Address {
  city: string; // other address fields omitted
}

interface User {
  _id: ObjectId;
  name: string;
  email: string;
  addresses: Address[]; // Embedded documents
  preferences: { // Nested objects
    theme: string;
    notifications: boolean;
  };
  createdAt: Date; // used for sorting below
}

// Query examples: dot notation reaches into embedded documents
const users = await db.collection<User>('users')
  .find({
    'addresses.city': 'New York',
    'preferences.notifications': true,
  })
  .sort({ createdAt: -1 })
  .limit(10)
  .toArray();

// Aggregation pipeline: total spend and order count per user
const stats = await db.collection('orders').aggregate([
  { $match: { status: 'completed' } },
  { $group: {
    _id: '$userId',
    totalSpent: { $sum: '$total' },
    orderCount: { $count: {} }, // MongoDB 5.0+; use { $sum: 1 } on older versions
  }},
  { $sort: { totalSpent: -1 } },
]).toArray();
Best for:
✓ Flexible/evolving schemas
✓ Hierarchical data
✓ Rapid development
✓ Content management
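Avoid when:
✗ Complex multi-document transactions dominate (supported since MongoDB 4.0, but at a performance cost)
✗ Many-to-many relationships
✗ Strong consistency is critical and reads are served from secondaries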
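To illustrate why many-to-many relationships are awkward in a document model, here is a minimal in-memory sketch that contrasts embedding with referencing. Plain arrays stand in for collections, and the student/course data, `renameEmbedded`, and `coursesFor` are hypothetical names invented for this example.

```typescript
interface Course { _id: string; title: string }
interface StudentEmbedded { _id: string; name: string; courses: Course[] }
interface StudentReferenced { _id: string; name: string; courseIds: string[] }

// Embedded: every student document carries its own copy of each course.
const embedded: StudentEmbedded[] = [
  { _id: 's1', name: 'Ana', courses: [{ _id: 'c1', title: 'Databases' }] },
  { _id: 's2', name: 'Ben', courses: [{ _id: 'c1', title: 'Databases' }] },
];

// Renaming a course means rewriting every document that embeds it.
// Returns the number of embedded copies that had to be updated.
function renameEmbedded(students: StudentEmbedded[], courseId: string, title: string): number {
  let touched = 0;
  for (const student of students) {
    for (const course of student.courses) {
      if (course._id === courseId) {
        course.title = title;
        touched += 1;
      }
    }
  }
  return touched;
}

// Referenced: each course is stored once; reads need a second lookup
// (an application-side join).
const courses: Course[] = [{ _id: 'c1', title: 'Databases' }];
const referenced: StudentReferenced[] = [
  { _id: 's1', name: 'Ana', courseIds: ['c1'] },
  { _id: 's2', name: 'Ben', courseIds: ['c1'] },
];

function coursesFor(student: StudentReferenced, all: Course[]): Course[] {
  return all.filter((c) => student.courseIds.includes(c._id));
}

console.log(renameEmbedded(embedded, 'c1', 'Databases 101')); // 2
courses[0].title = 'Databases 101'; // a single write in the referenced model
console.log(coursesFor(referenced[0], courses)[0].title); // Databases 101
```

Embedding keeps reads to a single document but multiplies writes; referencing does the opposite. When most of the data is shared between many parents, neither option is comfortable.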
Key-Value Stores (Redis)
import Redis from 'ioredis';

const redis = new Redis();

// Sample values so the snippet is self-contained
const user = { id: 123, name: 'John' };
const sessionData = JSON.stringify({ userId: 123 });
const score = 4200;

// Simple key-value
await redis.set('user:123', JSON.stringify(user));
const raw = await redis.get('user:123'); // null if the key is missing
const cachedUser = raw ? JSON.parse(raw) : null;

// With TTL (in seconds)
await redis.setex('session:abc', 3600, sessionData);

// Hash (structured data); a separate key, because 'user:123' already holds a string
await redis.hset('user:123:profile', {
  name: 'John',
  email: 'john@example.com',
  lastLogin: Date.now().toString(),
});
const name = await redis.hget('user:123:profile', 'name');

// Sorted sets (leaderboards)
await redis.zadd('leaderboard', score, 'player:123');
const topPlayers = await redis.zrevrange('leaderboard', 0, 9, 'WITHSCORES');

// Pub/sub: a subscribed connection cannot run other commands, so use a second one
const subscriber = new Redis();
await subscriber.subscribe('notifications');
subscriber.on('message', (channel, message) => {
  console.log('Received:', message);
});
await redis.publish('notifications', JSON.stringify({ text: 'hello' }));
Best for:
✓ Caching
✓ Session storage
✓ Real-time leaderboards
✓ Rate limiting
✓ Pub/sub messaging
Avoid when:
✗ Complex queries needed
✗ Data larger than memory
✗ Relationships between data
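Rate limiting appears in the list above but not in the code. The sketch below shows a fixed-window counter, a pattern commonly built on the Redis `INCR` and `EXPIRE` commands. To keep it runnable without a server, an in-memory `Map` stands in for Redis; `createLimiter` and the limits used are hypothetical and chosen only for illustration.

```typescript
// Fixed-window rate limiter. A Map stands in for Redis here; against a real
// server the counter would be an INCR on the key and the reset an EXPIRE.
type Counter = { count: number; expiresAt: number };

function createLimiter(limit: number, windowMs: number) {
  const counters = new Map<string, Counter>();

  // Returns true if the caller identified by `key` may proceed at time `now`.
  return function allow(key: string, now: number = Date.now()): boolean {
    const entry = counters.get(key);

    // No counter yet, or the window has expired: start a new window.
    if (!entry || now >= entry.expiresAt) {
      counters.set(key, { count: 1, expiresAt: now + windowMs });
      return true;
    }

    // Inside the current window: count the request, then compare to the limit.
    entry.count += 1;
    return entry.count <= limit;
  };
}

// Usage: 3 requests per 1000 ms per user (timestamps passed in for clarity).
const allow = createLimiter(3, 1000);
const results = [0, 10, 20, 30, 1000].map((t) => allow('user:123', t));
console.log(results); // [ true, true, true, false, true ]
```

A fixed window is the simplest variant and allows short bursts at window boundaries; sliding-window or token-bucket schemes trade more bookkeeping for smoother limits.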
Column-Family (Cassandra)
-- Schema definition
CREATE KEYSPACE myapp
WITH replication = {
  'class': 'NetworkTopologyStrategy',
  'dc1': 3
};

CREATE TABLE events (
  partition_key text,
  event_time timestamp,
  event_type text,
  payload text,
  PRIMARY KEY (partition_key, event_time)
) WITH CLUSTERING ORDER BY (event_time DESC);

-- Write (optimized for high throughput)
INSERT INTO events (partition_key, event_time, event_type, payload)
VALUES ('user:123', toTimestamp(now()), 'click', '{"page": "/home"}');

-- Read by partition key (fast)
SELECT * FROM events
WHERE partition_key = 'user:123'
AND event_time > '2024-01-01'
LIMIT 100;
Best for:
✓ Time series data
✓ High write throughput
✓ Geographic distribution
✓ IoT sensor data
Avoid when:
✗ Ad-hoc queries needed
✗ Frequent updates to same records
✗ Strong consistency required
✗ Complex joins needed
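One caveat with the `events` table above: a partition keyed only by `user:123` keeps growing for as long as that user generates events. A common mitigation is to add a time bucket to the partition key. The sketch below shows how a client might compute bucketed keys and the list of partitions a time-range read has to visit; `bucketKey`, `bucketsForRange`, and the monthly bucket size are assumptions for illustration and are not part of the schema above.

```typescript
// Build a partition key such as "user:123:2024-01" (entity plus UTC month).
function bucketKey(entity: string, time: Date): string {
  const year = time.getUTCFullYear();
  const month = String(time.getUTCMonth() + 1).padStart(2, '0');
  return `${entity}:${year}-${month}`;
}

// List every monthly partition that a read over [from, to] has to query.
function bucketsForRange(entity: string, from: Date, to: Date): string[] {
  const keys: string[] = [];
  // Walk month by month, starting at the first day of the `from` month.
  const cursor = new Date(Date.UTC(from.getUTCFullYear(), from.getUTCMonth(), 1));
  while (cursor.getTime() <= to.getTime()) {
    keys.push(bucketKey(entity, cursor));
    cursor.setUTCMonth(cursor.getUTCMonth() + 1);
  }
  return keys;
}

const partitionKeys = bucketsForRange(
  'user:123',
  new Date('2023-11-15T00:00:00Z'),
  new Date('2024-01-10T00:00:00Z'),
);
console.log(partitionKeys);
// [ 'user:123:2023-11', 'user:123:2023-12', 'user:123:2024-01' ]
```

The bucket size is a trade-off: smaller buckets keep partitions small, but a wide time-range read then fans out to more partitions.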
Graph Databases (Neo4j)
// Create nodes and relationships
CREATE (john:Person {name: 'John', age: 30})
CREATE (jane:Person {name: 'Jane', age: 28})
CREATE (tech:Company {name: 'TechCorp'})
CREATE (john)-[:WORKS_AT {since: 2020}]->(tech)
CREATE (jane)-[:WORKS_AT {since: 2021}]->(tech)
CREATE (john)-[:KNOWS]->(jane);

// Find connections
MATCH (p:Person)-[:WORKS_AT]->(c:Company)
WHERE c.name = 'TechCorp'
RETURN p.name, p.age;

// Friends of friends
MATCH (me:Person {name: 'John'})-[:KNOWS*2]->(fof:Person)
WHERE NOT (me)-[:KNOWS]->(fof) AND me <> fof
RETURN DISTINCT fof.name;

// Shortest path
MATCH path = shortestPath(
  (a:Person {name: 'John'})-[:KNOWS*]-(b:Person {name: 'Alice'})
)
RETURN path;

// Recommendation: products bought by the users most similar to u
MATCH (u:User)-[:PURCHASED]->(p:Product)<-[:PURCHASED]-(other:User)
WHERE u.id = '123'
WITH u, other, COUNT(p) AS common // keep u in scope for the filter below
ORDER BY common DESC LIMIT 10
MATCH (other)-[:PURCHASED]->(rec:Product)
WHERE NOT (u)-[:PURCHASED]->(rec)
RETURN rec, COUNT(*) AS score
ORDER BY score DESC LIMIT 5;
Best for:
✓ Social networks
✓ Recommendation engines
✓ Fraud detection
✓ Knowledge graphs
✓ Network analysis
Avoid when:
✗ Simple CRUD operations
✗ No relationship queries
✗ Bulk data processing
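To make the friends-of-friends query above concrete, here is a rough in-memory equivalent over an adjacency list. It only shows what the two-hop traversal computes; a graph database evaluates the same pattern natively and at far larger scale. The `knows` data and the `friendsOfFriends` function are hypothetical.

```typescript
// Directed KNOWS edges as an adjacency list: person -> people they know.
const knows: Record<string, string[]> = {
  John: ['Jane', 'Bob'],
  Jane: ['Alice', 'Bob'],
  Bob: ['Carol', 'John'],
};

// People exactly two KNOWS hops away, excluding the start node and anyone
// the start node already knows directly.
function friendsOfFriends(graph: Record<string, string[]>, start: string): string[] {
  const direct = graph[start] ?? [];
  const result: string[] = [];

  for (const friend of direct) {
    for (const fof of graph[friend] ?? []) {
      const isNew = !result.includes(fof); // same effect as RETURN DISTINCT
      if (fof !== start && !direct.includes(fof) && isNew) {
        result.push(fof);
      }
    }
  }
  return result.sort();
}

console.log(friendsOfFriends(knows, 'John')); // [ 'Alice', 'Carol' ]
```

In a relational or document store the same question needs a self-join or a second round of lookups per hop, which is why deeper traversals favor a graph model.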
Comparison Matrix
| Feature          | Document   | Key-Value  | Column     | Graph     |
|------------------|------------|------------|------------|-----------|
| Schema           | Flexible   | None       | Flexible   | Optional  |
| Query Language   | Rich       | Get/Set    | CQL        | Cypher    |
| Scaling          | Horizontal | Horizontal | Horizontal | Limited   |
| Transactions     | Limited    | Limited    | No         | Yes       |
| Relationships    | Embedded   | None       | Limited    | Native    |
| Best Performance | Reads      | Both       | Writes     | Traversal |
Decision Framework
1. What's your data structure?
- Hierarchical/nested → Document
- Simple values → Key-Value
- Wide/sparse → Column-Family
- Connected → Graph
2. What are your access patterns?
- By unique key → Key-Value
- By attributes → Document
- By time range → Column-Family
- By relationships → Graph
3. What's your dominant workload?
- Read-heavy → Document, Key-Value
- Write-heavy → Column-Family
- Relationship-heavy → Graph
4. What's your consistency need?
- Strong → Document, Graph
- Eventual → Column-Family, Key-Value
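The four questions above can be read as a simple voting exercise: each answer points to one or more store types. The sketch below encodes that mapping directly. The `Needs` type, its field values, and the equal-weight tally are assumptions made for illustration; a real decision should weigh the questions rather than count them equally.

```typescript
type Store = 'Document' | 'Key-Value' | 'Column-Family' | 'Graph';

interface Needs {
  structure: 'nested' | 'simple' | 'wide' | 'connected';
  access: 'key' | 'attributes' | 'timeRange' | 'relationships';
  workload: 'readHeavy' | 'writeHeavy' | 'relationshipHeavy';
  consistency: 'strong' | 'eventual';
}

// One lookup table per question, copied from the framework above.
const byStructure: Record<Needs['structure'], Store[]> = {
  nested: ['Document'],
  simple: ['Key-Value'],
  wide: ['Column-Family'],
  connected: ['Graph'],
};
const byAccess: Record<Needs['access'], Store[]> = {
  key: ['Key-Value'],
  attributes: ['Document'],
  timeRange: ['Column-Family'],
  relationships: ['Graph'],
};
const byWorkload: Record<Needs['workload'], Store[]> = {
  readHeavy: ['Document', 'Key-Value'],
  writeHeavy: ['Column-Family'],
  relationshipHeavy: ['Graph'],
};
const byConsistency: Record<Needs['consistency'], Store[]> = {
  strong: ['Document', 'Graph'],
  eventual: ['Column-Family', 'Key-Value'],
};

// Tally one vote per matching answer and return the top-scoring store(s).
function recommend(needs: Needs): Store[] {
  const tally: Record<Store, number> = {
    'Document': 0, 'Key-Value': 0, 'Column-Family': 0, 'Graph': 0,
  };
  const picks = [
    ...byStructure[needs.structure],
    ...byAccess[needs.access],
    ...byWorkload[needs.workload],
    ...byConsistency[needs.consistency],
  ];
  for (const store of picks) tally[store] += 1;

  const stores = Object.keys(tally) as Store[];
  const best = Math.max(...stores.map((s) => tally[s]));
  return stores.filter((s) => tally[s] === best); // ties return several stores
}

// Example: IoT sensor readings queried by time range.
console.log(recommend({
  structure: 'wide',
  access: 'timeRange',
  workload: 'writeHeavy',
  consistency: 'eventual',
})); // [ 'Column-Family' ]
```

A tie in the result is a useful signal in itself: it usually means the workload spans two models and a hybrid setup is worth considering.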
Hybrid Approaches
// Use multiple databases for different needs
import { Db } from 'mongodb';
import Redis from 'ioredis';
import { Driver } from 'neo4j-driver';

// IDs are assumed to be strings shared across all three stores
interface User {
  _id: string;
  name: string;
}

class DataLayer {
  constructor(
    private db: Db,        // Primary data (MongoDB)
    private redis: Redis,  // Caching
    private neo4j: Driver, // Relationships
  ) {}

  async getUser(id: string): Promise<User | null> {
    // Try cache first
    const cached = await this.redis.get(`user:${id}`);
    if (cached) return JSON.parse(cached);

    // Fetch from primary
    const user = await this.db.collection<User>('users').findOne({ _id: id });

    // Cache result (skip misses so "null" is never cached)
    if (user) {
      await this.redis.setex(`user:${id}`, 3600, JSON.stringify(user));
    }

    return user;
  }

  async getUserFriends(id: string): Promise<User[]> {
    // Graph traversal for relationships; queries run on a session, not the driver
    const session = this.neo4j.session();
    try {
      const result = await session.run(
        'MATCH (u:User {id: $id})-[:FRIENDS]->(f:User) RETURN f.id AS friendId',
        { id }
      );
      const friendIds = result.records.map(r => r.get('friendId') as string);

      // Batch fetch from primary
      return await this.db.collection<User>('users')
        .find({ _id: { $in: friendIds } })
        .toArray();
    } finally {
      await session.close();
    }
  }
}
Conclusion
NoSQL databases aren't replacements for SQL—they're specialized tools. Choose based on your data model, access patterns, and scaling needs.
Often, the best architecture combines multiple database types, each handling what it does best.