Caching Strategy and Cost Control
Overview
Entitybase uses multiple cache layers to achieve sub-second response times while controlling infrastructure costs. This strategy is designed for 1M+ entities/week scale with efficient hit rates and appropriate TTL policies.
Cache Architecture
Client Request
↓
┌───────────────────────────────────────┐
│ 1. Browser/Client Cache │
│ - HTTP caching headers │
│ - ETag / Last-Modified │
│ - Max-age directives │
└───────────────────────────────────────┘
↓ (cache miss)
┌───────────────────────────────────────┐
│ 2. CDN Cache │
│ - CloudFront / Cloudflare │
│ - Edge location caching │
│ - S3 snapshots (immutable) │
│ - Cache-Control: public, max-age=31536000 (1 year) │
└───────────────────────────────────────┘
↓ (CDN miss)
┌───────────────────────────────────────┐
│ 3. Application Object Cache │
│ - Valkey / Memcached │
│ - entity_id_mapping lookups │ ← NEW: Hybrid ID translation
│ - entity_head lookups │
│ - entity metadata │
└───────────────────────────────────────┘
↓ (object cache miss)
┌───────────────────────────────────────┐
│ 4. Vitess Database │
│ - entity_id_mapping table │
│ - entity_head table │
│ - entity_revisions table │
└───────────────────────────────────────┘
↓ (database miss)
┌───────────────────────────────────────┐
│ 5. S3 Object Store │
│ - Immutable snapshots │
│ - S3 GET operations │
└───────────────────────────────────────┘
Cache Layers
Layer 1: Client Cache
Purpose: Eliminate redundant requests from clients
Mechanism: - HTTP caching headers (ETag, Last-Modified) - Cache-Control directives - Conditional requests (If-None-Match, If-Modified-Since)
Configuration:
Cache-Control: public, max-age=3600, s-maxage=86400
ETag: "revision_id:content_hash"
Last-Modified: revision.created_at
Expected hit rate: 40-60% (depending on client usage patterns)
Layer 2: CDN Cache (Optional)
Purpose: Serve immutable snapshots from edge locations worldwide
Key characteristics: - Immutable snapshots: S3 objects never change after write - Infinite cacheability: No invalidation needed - Edge delivery: Sub-50ms latency globally
S3 Cache Configuration:
S3 Object Metadata:
Cache-Control: "public, max-age=31536000, immutable" # 1 year
Expires: 1 year from now
x-amz-meta-revision-id: "42"
x-amz-meta-content-hash: "sha256:..."
CDN Configuration (CloudFront/Cloudflare): - Cache S3 GET responses for 1 year - Enable Gzip/Brotli compression - Enable HTTP/2 and HTTP/3
Expected hit rate: 70-85% for hot entities (Q42, Q5, etc.)
Layer 3: Application Object Cache
Purpose: Cache database queries and identifier mappings
Technologies: Valkey (recommended) or Memcached
3.1 Entity ID Mapping Cache (NEW for Hybrid ID Strategy)
Purpose: Cache entity_id_mapping lookups for fast external → internal ID translation
Cache key format:
entity_id:{external_id}
Examples:
entity_id:Q123
entity_id:P42
entity_id:L999
Cache value:
{
"internal_id": 1424675744195114,
"entity_type": "item",
"created_at": "2025-01-15T10:30:00Z"
}
TTL: 3600 seconds (1 hour) - Mappings rarely change after creation - Long enough to handle load spikes - Short enough to pick up changes if needed
Implementation:
from redis import redis
import json
class EntityIdCache:
def __init__(self, valkey_client: redis.Redis):
self.valkey = valkey_client # Valkey uses redis-py client (Redis-compatible)
self.ttl = 3600 # 1 hour
self.key_prefix = "entity_id:"
def get_internal_id(self, external_id: str) -> int:
"""Lookup internal_id by external_id (Q123, P42, L999)"""
cache_key = f"{self.key_prefix}{external_id}"
cached = self.valkey.get(cache_key)
if cached:
# Cache hit
cache_data = json.loads(cached)
return cache_data['internal_id']
# Cache miss: query database
result = db.query(
"SELECT internal_id, entity_type, created_at FROM entity_id_mapping WHERE external_id = %s",
(external_id,)
)
if not result:
raise NotFoundError(f"Entity {external_id} not found")
# Populate cache
cache_value = json.dumps({
"internal_id": result['internal_id'],
"entity_type": result['entity_type'],
"created_at": str(result['created_at'])
})
self.valkey.setex(cache_key, self.ttl, cache_value)
return result['internal_id']
def invalidate(self, external_id: str):
"""Invalidate cache entry on entity update/delete"""
cache_key = f"{self.key_prefix}{external_id}"
self.valkey.delete(cache_key)
def warm_up(self, external_ids: list[str]):
"""Warm up cache for frequently accessed entities"""
with self.valkey.pipeline() as pipe:
for external_id in external_ids:
cache_key = f"{self.key_prefix}{external_id}"
# Query database (batch)
results = db.query_batch(
"SELECT internal_id, entity_type, created_at FROM entity_id_mapping WHERE external_id IN (%s)" %
','.join(['%s'] * len(external_ids)),
external_ids
)
for result in results:
cache_value = json.dumps({
"internal_id": result['internal_id'],
"entity_type": result['entity_type'],
"created_at": str(result['created_at'])
})
pipe.setex(cache_key, self.ttl, cache_value)
pipe.execute()
3.2 Entity Head Cache
Purpose: Cache current head revision pointer to avoid frequent entity_head table queries
Cache key format:
entity_head:{internal_id}
Example:
entity_head:1424675744195114
Cache value:
{
"internal_id": 1424675744195114,
"head_revision_id": 42,
"updated_at": "2025-01-15T10:30:00Z"
}
TTL: 300 seconds (5 minutes) - Head revisions update frequently on active entities - Short TTL to stay current
3.3 Entity Metadata Cache
Purpose: Cache entity metadata (labels, descriptions, types) for search and browse endpoints
Cache key format:
entity_meta:{internal_id}
Example:
entity_meta:1424675744195114
Cache value:
{
"internal_id": 1424675744195114,
"external_id": "Q123",
"entity_type": "item",
"labels": {"en": "Douglas Adams", "de": "Douglas Adams"},
"descriptions": {"en": "British author"}
}
TTL: 1800 seconds (30 minutes)
Layer 4: Vitess Database Query Cache
Purpose: Cache frequently executed SQL queries at database level
Configuration:
-- Enable Vitess query cache
SET GLOBAL query_cache_size = 256M;
SET GLOBAL query_cache_type = ON;
-- Cache common lookups
-- (entity_id_mapping queries, entity_head lookups)
Expected hit rate: 30-40% for repetitive queries
Layer 5: S3 Object Store
Purpose: System of record for all entity snapshots
Characteristics: - Immutable: Never modified after write - Cache-friendly: CDN caches for 1 year - Global replication: Multi-region S3 for low latency
No caching strategy needed at S3 layer due to immutability.
Cache Invalidation Strategy
Immutable Data (No Invalidation)
S3 Snapshots: - Never invalidated - immutable by design - If content is wrong, create new revision
entity_id_mapping (after creation): - Rarely changes - mapping is immutable - Only invalidated if entity is deleted and recreated (very rare)
Mutable Data (Active Invalidation)
entity_head (head revision pointer): - Invalidated on successful write - TTL fallback: 5 minutes ensures eventual correctness
Entity metadata (labels, descriptions): - Invalidated on entity update - TTL fallback: 30 minutes
Invalidation Implementation
class CacheInvalidator:
def invalidate_entity(self, external_id: str, internal_id: int):
"""Invalidate all cache entries for entity"""
# 1. Invalidate entity_id_mapping cache
entity_id_cache.invalidate(external_id)
# 2. Invalidate entity_head cache
entity_head_cache_key = f"entity_head:{internal_id}"
valkey_client.delete(entity_head_cache_key)
# 3. Invalidate entity metadata cache
entity_meta_cache_key = f"entity_meta:{internal_id}"
valkey_client.delete(entity_meta_cache_key)
# 4. Clear CDN cache (if needed)
# Note: S3 snapshots are immutable, no CDN invalidation needed
# Example: Invalidate after entity update
def update_entity(entity_data: dict):
external_id = entity_data['external_id']
internal_id = entity_data['internal_id']
# Write new S3 snapshot
s3.put(f"revisions/{external_id}/r{entity_data['revision_id']}.json", entity_data)
# Update Vitess (CAS update entity_head)
db.update_entity_head(internal_id, entity_data['revision_id'])
# Invalidate caches
cache_invalidator.invalidate_entity(external_id, internal_id)
return {"status": "success"}
Performance Targets
Expected Hit Rates
| Cache Layer | Expected Hit Rate | Target P50 Latency | Target P99 Latency |
|---|---|---|---|
| Client cache | 40-60% | 0ms (local) | 0ms (local) |
| CDN cache | 70-85% | 50ms | 200ms |
| entity_id_mapping cache | >95% | <1ms | <5ms |
| entity_head cache | 80-90% | <2ms | <10ms |
| Entity metadata cache | 70-80% | <5ms | <20ms |
| Vitess query cache | 30-40% | <10ms | <50ms |
Latency Targets
GET /entity/Q123 (hot entity, cached)
↓ Client cache HIT: <1ms (local)
GET /entity/Q123 (hot entity, not in client cache)
↓ CDN HIT: 50ms (edge)
GET /entity/Q123 (cold entity)
↓ CDN MISS → object cache HIT: 60ms
↓ Object cache MISS → Vitess: 100ms
↓ Vitess → S3: 200ms (total P99)
Cost Control
Cost Optimization Strategies
1. Prefer Cache Over Database
Rule: Check all cache layers before querying database
Cost impact: - Valkey: $0.01/10,000 operations - Vitess: $0.10/10,000 operations - 10x cheaper to cache
2. Long TTLs for Immutable Data
Strategy: - S3 snapshots: 1 year (never invalidated) - entity_id_mapping: 1 hour (rarely changes)
Cost impact: - Reduces database queries by >90% for hot entities - Estimated savings: $500-2000/month at scale
3. CDN Over Direct S3 Access
Strategy: - All public reads go through CDN (CloudFront/Cloudflare) - CDN caching reduces S3 GET operations by 80%
Cost impact: - CDN: $0.085/GB (vs S3: $0.09/GB) - Data transfer cost similar, but performance much better - S3 operations reduced significantly
4. Optimize Cache Memory Usage
Strategy: - Compress cached values (gzip) - Use efficient data structures - Set appropriate TTL to prevent memory bloat
Implementation:
import gzip
def compress_cache_value(value: dict) -> bytes:
"""Compress cache value to reduce memory usage"""
json_str = json.dumps(value)
return gzip.compress(json_str.encode('utf-8'))
def decompress_cache_value(compressed: bytes) -> dict:
"""Decompress cache value"""
return json.loads(gzip.decompress(compressed).decode('utf-8'))
# Usage in cache
cache_value = {"internal_id": 1424675744195114, ...}
compressed = compress_cache_value(cache_value)
valkey_client.set("entity_id:Q123", compressed)
# Decompress on read
compressed = valkey_client.get("entity_id:Q123")
cache_value = decompress_cache_value(compressed)
Monitoring
Key Metrics
Cache Hit Rates
entity_id_mapping_cache_hit_rate
- labels: {cache: valkey}
- gauge: 0.95-0.99 (target >95%)
- alert: <90% for >5 minutes
entity_head_cache_hit_rate
- labels: {cache: valkey}
- gauge: 0.80-0.90
- alert: <70% for >5 minutes
cdn_cache_hit_rate
- labels: {cdn: cloudfront}
- gauge: 0.70-0.85
- alert: <60% for >15 minutes
client_cache_hit_rate
- labels: {source: api}
- gauge: 0.40-0.60
- alert: <30% for >1 hour
Cache Performance
cache_lookup_latency_p50
- labels: {cache: valkey, operation: entity_id_mapping}
- gauge: <1ms
- alert: >2ms for >5 minutes
cache_lookup_latency_p99
- labels: {cache: valkey, operation: entity_id_mapping}
- gauge: <5ms
- alert: >10ms for >5 minutes
valkey_memory_usage_bytes
- gauge: <50GB (for 10M entities, 100 bytes each)
- alert: >40GB (90% capacity)
- alert: >45GB (90% capacity)
Cost Metrics
s3_get_operations_total
- counter: ~1M/week (baseline)
- alert: >2M/week (cache not effective)
vitess_query_latency_p99
- gauge: <50ms
- alert: >100ms (cache not effective)
cdn_bytes_served_total
- counter: ~10TB/week
- target: >80% of traffic
Warm-up Strategies
Initial Warm-up on Deployment
Purpose: Pre-populate caches with frequently accessed entities
Strategy:
def warm_up_entity_id_cache():
"""Warm up entity_id_mapping cache for top entities"""
# Query top 10,000 entities by access count (or all if small dataset)
top_entities = db.query(
"SELECT external_id, internal_id, entity_type FROM entity_id_mapping "
"ORDER BY access_count DESC LIMIT 10000"
)
# Batch populate cache
cache.warm_up([e['external_id'] for e in top_entities])
logging.info(f"Warmed up cache for {len(top_entities)} entities")
# Run on deployment
warm_up_entity_id_cache()
Periodic Warm-up
Schedule: Every 6 hours
Purpose: Refresh cache for entities that expired
Implementation:
# Cron job to warm up cache
0 */6 * * * * warm-up-cache.sh
Configuration
Valkey Configuration
# valkey.conf
maxmemory 64gb
maxmemory-policy allkeys-lru # Evict least recently used keys
save 900 1 # Save to disk every 15 minutes if 1+ changes
save 300 10 # Save every 5 minutes if 10+ changes
save 60 10000 # Save every minute if 10000+ changes
appendonly yes # AOF persistence for durability
tcp-keepalive 300 # Keep connections alive
timeout 300 # Close idle connections after 5 minutes
Cache Client Configuration
# Python redis-py client (Valkey-compatible)
from redis import redis
valkey_client = redis.Redis(
host='valkey.internal',
port=6379,
db=0,
socket_timeout=5,
socket_connect_timeout=5,
retry_on_timeout=True,
health_check_interval=30,
max_connections=50
)
Operational Procedures
Cache Clearing
Full cache clear (emergency):
# Flush all Valkey keys (use with caution)
valkey-cli FLUSHDB
# Verify cache is empty
valkey-cli DBSIZE
Selective cache clear:
# Clear only entity_id_mapping cache
valkey-cli --scan --pattern 'entity_id:*' | xargs valkey-cli DEL
# Clear only entity_head cache
valkey-cli --scan --pattern 'entity_head:*' | xargs valkey-cli DEL
Cache Backfill
Scenario: Cache cleared or Valkey replaced
Procedure:
def backfill_entity_id_cache():
"""Backfill entity_id_mapping cache from database"""
# Query all entity_id_mapping records
all_mappings = db.query(
"SELECT external_id, internal_id, entity_type, created_at "
"FROM entity_id_mapping ORDER BY created_at"
)
# Batch populate cache
batch_size = 1000
for i in range(0, len(all_mappings), batch_size):
batch = all_mappings[i:i+batch_size]
cache.warm_up([m['external_id'] for m in batch])
logging.info(f"Backfilled {i+batch_size}/{len(all_mappings)} mappings")
time.sleep(0.1) # Prevent overwhelming Valkey
References
- STORAGE-ARCHITECTURE.md - S3 + Vitess storage design
- ENTITY-MODEL.md - Entity identifiers and usage patterns
- SCALING-PROPERTIES.md - System scaling characteristics