Distributed Inventory Management at Scale

Building mission-critical real-time inventory systems serving as the single source of truth for multi-city marketplace operations

Inventory management in a distributed marketplace isn't just about tracking items—it's about building the foundational system that every part of the business depends on. When customers search for vehicles, hosts manage their fleet, and multiple business verticals make real-time decisions, they all rely on one truth: the inventory system.

This is the technical story of a distributed inventory management system that handled 300K+ daily operations across 25 cities, serving as the authoritative source of truth while maintaining real-time consistency and graceful conflict resolution.

The Mission-Critical Systems Challenge

Inventory systems in marketplaces face a unique engineering challenge: they must serve as the single source of truth for multiple stakeholders with different requirements, all while maintaining real-time accuracy under high concurrency.

System Requirements: Serve as authoritative inventory source for customers (search/booking), vendors (fleet management), and business verticals (analytics/operations) while handling 300K+ daily operations with sub-second response times.

Scale and Complexity:

Stakeholder Requirements

Customer-Facing Systems:

Vendor/Host Operations:

Business Verticals:

Distributed Concurrency Architecture

The core challenge in distributed inventory management is handling simultaneous operations from multiple actors while maintaining data consistency and preventing race conditions.

Multi-Level Concurrency Control

The system employs a layered approach to concurrency control, combining JVM-level synchronization with distributed atomic operations:

CONCURRENCY CONTROL ARCHITECTURE

APPLICATION LAYER
+----------------+    +----------------+    +----------------+    +----------------+
| Customer       |--->| ReentrantLock  |--->| Booking        |--->| Status         |
| Booking        |    | (JVM Level)    |    | Validation     |    | Update         |
| Request        |    |                |    |                |    |                |
+----------------+    +----------------+    +----------------+    +----------------+
                              |                     |                     |
                              v                     v                     v
+----------------+    +----------------+    +----------------+    +----------------+
| Host Fleet     |--->| Thread         |--->| Business       |--->| Conflict       |
| Management     |    | Synchronization|    | Logic          |    | Resolution     |
+----------------+    +----------------+    +----------------+    +----------------+

DISTRIBUTED LAYER
+----------------+    +----------------+    +----------------+    +----------------+
| Redis          |--->| Lua Script     |--->| Atomic         |--->| Database       |
| Coordination   |    | Execution      |    | Operations     |    | Propagation    |
+----------------+    +----------------+    +----------------+    +----------------+

ReentrantLock for JVM-Level Synchronization

At the application level, ReentrantLock provides fine-grained concurrency control for critical sections involving inventory state changes:

Locking Strategy: Vehicle-level locks prevent simultaneous booking attempts for the same inventory item while allowing concurrent operations on different vehicles to proceed without blocking.

Concurrency Control Logic:
1. Acquire the ReentrantLock for the specific vehicle_id
2. Validate inventory state and business rules
3. Execute booking/blocking/status change logic
4. Trigger atomic Redis update via Lua script
5. Release the lock once the operation completes, including when it fails

Lock Granularity: Per-vehicle locking minimizes contention while ensuring consistency
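The locking steps above can be sketched in a few lines of Java. This is a minimal illustration rather than the production code: the class name VehicleLockRegistry and its methods are hypothetical, and the critical section is passed in as a Supplier so the sketch stays independent of any booking logic.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

/** Hypothetical helper: one ReentrantLock per vehicle_id, created on first use. */
class VehicleLockRegistry {
    private final ConcurrentHashMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    /** Runs the critical section while holding the lock for this vehicle only. */
    <T> T withVehicleLock(String vehicleId, Supplier<T> criticalSection) {
        // Step 1: acquire the lock for this specific vehicle_id.
        ReentrantLock lock = locks.computeIfAbsent(vehicleId, id -> new ReentrantLock());
        lock.lock();
        try {
            // Steps 2-4: validate, apply the change, trigger the Redis update.
            return criticalSection.get();
        } finally {
            // Step 5: always release, even if the critical section throws.
            lock.unlock();
        }
    }

    /** True while some thread holds the lock for this vehicle. */
    boolean isLocked(String vehicleId) {
        ReentrantLock lock = locks.get(vehicleId);
        return lock != null && lock.isLocked();
    }

    public static void main(String[] args) {
        VehicleLockRegistry registry = new VehicleLockRegistry();
        String result = registry.withVehicleLock("vehicle-42", () -> "booked");
        System.out.println(result);
    }
}
```

Two caveats apply to this sketch. The lock map grows with the number of distinct vehicles seen, so a long-running service would likely need some form of eviction. A JVM-level lock also only serializes threads inside one process, which is why the Lua script in the distributed layer is still needed when several application instances run side by side.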

Lua Scripts for Atomic Distributed Operations

Redis Lua scripts provide atomic execution of complex operations that span multiple data structures and business logic:

Atomic Operations: Lua scripts handle vehicle booking, completion, accident blocking, maintenance scheduling, and status transitions as single atomic operations in Redis.

Lua Script Responsibilities:

Example Lua Script Logic (Vehicle Booking):

    -- Atomic booking operation in Redis
    -- KEYS[1] = vehicle id, ARGV[1] = booking details
    -- ARGV[2] = city id (assumed here; the caller must supply it)
    local vehicle_id = KEYS[1]
    local booking_details = ARGV[1]
    local city_id = ARGV[2]
    local vehicle_key = 'vehicle:' .. vehicle_id

    -- Check current availability (a missing vehicle is also rejected)
    local current_status = redis.call('HGET', vehicle_key, 'status')
    if current_status ~= 'available' then
      return {err = 'Vehicle not available'}
    end

    -- Atomic status update
    redis.call('HSET', vehicle_key, 'status', 'booked')
    redis.call('HSET', vehicle_key, 'booking_id', booking_details)
    redis.call('SADD', 'booked_vehicles', vehicle_id)
    redis.call('SREM', 'available_vehicles', vehicle_id)

    -- Update city-level availability counters
    redis.call('HINCRBY', 'city_inventory', city_id, -1)

    return {ok = 'Booking successful'}

CQRS and Event-Driven Architecture

With a 50:1 read-to-write ratio, the system employs Command Query Responsibility Segregation (CQRS) to optimize for different access patterns while maintaining strong consistency where required.

Read-Write Separation Strategy

Architectural Decision: Separate read and write paths with dedicated infrastructure for each pattern, enabling optimization for high-volume reads while ensuring write consistency and durability.

CQRS ARCHITECTURE

WRITE PATH
+----------------+    +----------------+    +----------------+    +----------------+
| Write          |--->| Business       |--->| Concurrency    |--->| Redis          |
| Commands       |    | Validation     |    | Control        |    | Update         |
| (Booking/      |    |                |    | (Lock + Lua)   |    | (Atomic)       |
| Status Change) |    |                |    |                |    |                |
+----------------+    +----------------+    +----------------+    +----------------+
                                                                          |
                                                                          v
                                                                  +----------------+
                                                                  | Event          |
                                                                  | Publishing     |
                                                                  | (Status        |
                                                                  | Changes)       |
                                                                  +----------------+
                                                                          |
                                                                          v
READ PATH
+----------------+    +----------------+    +----------------+    +----------------+
| Read           |--->| Read           |--->| Cached         |<---| Database       |
| Queries        |    | Replicas       |    | Responses      |    | Sync           |
| (Search/       |    | (Optimized     |    | (High          |    | (Eventual)     |
| Availability)  |    | for Reads)     |    | Performance)   |    |                |
+----------------+    +----------------+    +----------------+    +----------------+
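To make the read/write split concrete, here is a small in-memory sketch. It is an illustration under stated assumptions, not the system's actual code: the class InventoryCqrs and its method names are hypothetical, two maps stand in for the operational store and the read-optimized view, and event publishing is collapsed into a direct method call.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical CQRS split: commands mutate the write model, queries use a separate read model. */
class InventoryCqrs {
    private final Map<String, String> writeModel = new ConcurrentHashMap<>(); // operational store
    private final Map<String, String> readModel = new ConcurrentHashMap<>();  // search-optimized view

    /** Write path: add a vehicle and publish its initial status. */
    void register(String vehicleId) {
        writeModel.put(vehicleId, "available");
        publish(vehicleId, "available");
    }

    /** Write path: book a vehicle only if it is still available. */
    boolean handleBook(String vehicleId) {
        // replace() succeeds only when the current value is still "available".
        boolean applied = writeModel.replace(vehicleId, "available", "booked");
        if (applied) {
            publish(vehicleId, "booked");
        }
        return applied;
    }

    /** Stand-in for event publishing; a real system would update the read side asynchronously. */
    private void publish(String vehicleId, String status) {
        readModel.put(vehicleId, status);
    }

    /** Read path: answers from the read model and never touches the write model. */
    long countAvailable() {
        return readModel.values().stream().filter("available"::equals).count();
    }

    public static void main(String[] args) {
        InventoryCqrs inventory = new InventoryCqrs();
        inventory.register("v1");
        inventory.register("v2");
        inventory.handleBook("v1");
        System.out.println(inventory.countAvailable());
    }
}
```

The point of the split is that the read side can be reshaped, cached, or replicated for the high read volume without changing how commands are validated and applied.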

Event-Driven Synchronization

The system maintains consistency between Redis (operational data store) and PostgreSQL (persistent storage) through event-driven synchronization rather than polling mechanisms:

Event Flow Architecture:
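One way to picture event-driven synchronization is a queue of status-change events that a sync worker applies to the persistent store. The sketch below is an assumption-laden illustration: the InventorySync class, the StatusChanged event, and the per-vehicle version counter are all hypothetical, an in-process queue stands in for a message broker, and a map stands in for PostgreSQL rows.

```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

class InventorySync {
    /** Hypothetical status-change event; version is an assumed per-vehicle counter. */
    static final class StatusChanged {
        final String vehicleId;
        final String status;
        final long version;

        StatusChanged(String vehicleId, String status, long version) {
            this.vehicleId = vehicleId;
            this.status = status;
            this.version = version;
        }
    }

    private final Queue<StatusChanged> events = new ConcurrentLinkedQueue<>(); // stand-in for a broker
    private final Map<String, StatusChanged> rows = new ConcurrentHashMap<>(); // stand-in for PostgreSQL

    /** Called by the write path right after the operational store is updated. */
    void publish(String vehicleId, String status, long version) {
        events.add(new StatusChanged(vehicleId, status, version));
    }

    /** Applies queued events; assumes a single sync worker. Stale or duplicate versions are skipped. */
    int drain() {
        int applied = 0;
        StatusChanged event;
        while ((event = events.poll()) != null) {
            StatusChanged current = rows.get(event.vehicleId);
            if (current == null || event.version > current.version) {
                rows.put(event.vehicleId, event);
                applied++;
            }
        }
        return applied;
    }

    /** Returns the persisted status, or null if the vehicle has no row yet. */
    String persistedStatus(String vehicleId) {
        StatusChanged row = rows.get(vehicleId);
        return row == null ? null : row.status;
    }

    public static void main(String[] args) {
        InventorySync sync = new InventorySync();
        sync.publish("v1", "booked", 1);
        sync.drain();
        System.out.println(sync.persistedStatus("v1"));
    }
}
```

The version check makes applying an event idempotent, so redelivered or out-of-order events do not overwrite newer state. In a running service the worker would block waiting for events instead of being drained on demand.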

Optimistic Business Model for Conflict Resolution

Rather than implementing strict pessimistic locking that could impact availability, the system employs an optimistic approach aligned with business reality.

Strategic Trade-off: Accept potential double-booking scenarios and resolve conflicts gracefully through user communication and alternative vehicle offers, prioritizing business continuity over perfect technical consistency.

Conflict Resolution Strategy

Optimistic Acceptance Model:

Conflict Resolution Flow:
1. Accept booking request optimistically
2. Process through normal booking validation
3. Detect conflicts during post-processing validation
4. If conflict detected:
   a. Maintain user engagement (don't cancel immediately)
   b. Search for equivalent or upgraded vehicle alternatives
   c. Present options with potential incentives/upgrades
   d. Allow user choice rather than forced cancellation
5. Track conflict resolution success rates for system optimization
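Step 4b of the flow, searching for equivalent or upgraded alternatives, might look roughly like the following. Everything here is hypothetical: the ConflictResolver class, the Vehicle fields, and in particular the idea of a numeric tier where a higher number means an upgrade are assumptions made for the sake of the sketch.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class ConflictResolver {
    /** Hypothetical vehicle view; tier is an assumed ordinal category (higher means upgrade). */
    static final class Vehicle {
        final String id;
        final int tier;
        final boolean available;

        Vehicle(String id, int tier, boolean available) {
            this.id = id;
            this.tier = tier;
            this.available = available;
        }
    }

    /** Step 4b: available vehicles of the same or a higher tier, equivalents first. */
    static List<Vehicle> alternatives(Vehicle requested, List<Vehicle> fleet, int limit) {
        List<Vehicle> candidates = new ArrayList<>();
        for (Vehicle v : fleet) {
            boolean sameVehicle = v.id.equals(requested.id);
            if (v.available && !sameVehicle && v.tier >= requested.tier) {
                candidates.add(v);
            }
        }
        // Stable sort: equivalents before upgrades, fleet order kept within a tier.
        candidates.sort(Comparator.comparingInt((Vehicle v) -> v.tier));
        return new ArrayList<>(candidates.subList(0, Math.min(limit, candidates.size())));
    }

    public static void main(String[] args) {
        Vehicle requested = new Vehicle("v1", 2, false);
        List<Vehicle> fleet = List.of(requested, new Vehicle("v2", 3, true), new Vehicle("v3", 2, true));
        for (Vehicle v : alternatives(requested, fleet, 2)) {
            System.out.println(v.id + " tier " + v.tier);
        }
    }
}
```

Returning a short, ordered list rather than a single replacement matches step 4d: the user picks from the options instead of being moved to another vehicle automatically.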

Business Impact of Optimistic Model

Revenue Protection: The optimistic model converts 70%+ of potential booking conflicts into successful alternative bookings through proactive customer service rather than technical rejection.

This approach recognizes that in marketplace businesses, customer acquisition and retention often outweigh perfect technical consistency, especially when conflicts can be resolved satisfactorily.

API Design for Multi-Consumer Patterns

Supporting diverse stakeholders requires carefully designed APIs that optimize for different usage patterns while maintaining a consistent data model.

Consumer-Specific API Design

Customer-Facing APIs:

Vendor/Host APIs:

Business Vertical APIs:

API Design Patterns:
• Real-time APIs: WebSocket connections for live updates
• Batch APIs: Bulk operations for administrative tasks
• Analytics APIs: Aggregated data with appropriate caching
• Integration APIs: Standardized formats for cross-system compatibility

Rate Limiting: Consumer-specific limits based on usage patterns
Authentication: Role-based access with fine-grained permissions
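Consumer-specific rate limits can be illustrated with a token bucket per consumer. The article does not say which algorithm the system used, so the token bucket is a stand-in chosen for the sketch, and the class ConsumerRateLimiter, its method names, and the capacity and refill values are all hypothetical. Time is passed in explicitly so the behavior is easy to reason about.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical per-consumer token bucket; capacities and refill rates are illustrative. */
class ConsumerRateLimiter {
    private static final class Bucket {
        final double capacity;
        final double refillPerSecond;
        double tokens;
        long lastNanos;

        Bucket(double capacity, double refillPerSecond, long nowNanos) {
            this.capacity = capacity;
            this.refillPerSecond = refillPerSecond;
            this.tokens = capacity; // start full
            this.lastNanos = nowNanos;
        }
    }

    private final Map<String, Bucket> buckets = new ConcurrentHashMap<>();

    /** Sets the limit for one consumer type, e.g. "customer", "host", "analytics". */
    void configure(String consumer, double capacity, double refillPerSecond, long nowNanos) {
        buckets.put(consumer, new Bucket(capacity, refillPerSecond, nowNanos));
    }

    /** Returns true if the consumer may proceed at nowNanos; unknown consumers are rejected. */
    boolean tryAcquire(String consumer, long nowNanos) {
        Bucket bucket = buckets.get(consumer);
        if (bucket == null) {
            return false;
        }
        synchronized (bucket) {
            // Refill in proportion to elapsed time, capped at the bucket capacity.
            double elapsedSeconds = (nowNanos - bucket.lastNanos) / 1_000_000_000.0;
            bucket.tokens = Math.min(bucket.capacity, bucket.tokens + elapsedSeconds * bucket.refillPerSecond);
            bucket.lastNanos = nowNanos;
            if (bucket.tokens >= 1.0) {
                bucket.tokens -= 1.0;
                return true;
            }
            return false;
        }
    }

    public static void main(String[] args) {
        ConsumerRateLimiter limiter = new ConsumerRateLimiter();
        long now = System.nanoTime();
        limiter.configure("analytics", 2, 1, now);
        System.out.println(limiter.tryAcquire("analytics", now));
    }
}
```

Giving each consumer type its own bucket lets a bursty batch or analytics client exhaust its own budget without affecting customer-facing traffic.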

Strong Consistency Requirements

Despite the optimistic business model, certain aspects of the inventory system require strong consistency to maintain user trust and business integrity.

Non-Negotiable Consistency: Customer-facing search results must reflect real-time inventory status. Displaying unavailable vehicles as available damages user experience and business credibility.

Consistency Guarantees

Strong Consistency Domains:

Eventual Consistency Domains:

Read Replica Strategy

All stakeholders read from designated read replicas that maintain strong consistency for customer-facing operations while providing eventual consistency for analytical workloads:

Read Replica Architecture:
• Primary Read Replica: Strong consistency for customer-facing operations
• Analytics Replica: Optimized for complex queries with eventual consistency
• Geographic Replicas: City-specific replicas for performance optimization
• Cross-Region Replicas: Disaster recovery and geographic distribution

Replication Strategy:
• Synchronous replication for customer-facing reads
• Asynchronous replication for analytics and reporting
• Automatic failover with consistency verification
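The replica split above amounts to a routing decision per workload. A minimal sketch of such a routing table follows; the ReplicaRouter class and the workload names are hypothetical, and only the two replica roles with a clearly stated consistency model are included.

```java
import java.util.EnumMap;
import java.util.Map;

class ReplicaRouter {
    /** Hypothetical workload categories, loosely based on the consumers named in the text. */
    enum Workload { CUSTOMER_SEARCH, CUSTOMER_BOOKING, ANALYTICS, REPORTING }

    enum Replica { PRIMARY_READ_REPLICA, ANALYTICS_REPLICA }

    private static final Map<Workload, Replica> ROUTES = new EnumMap<>(Workload.class);

    static {
        // Customer-facing reads go to the synchronously replicated replica.
        ROUTES.put(Workload.CUSTOMER_SEARCH, Replica.PRIMARY_READ_REPLICA);
        ROUTES.put(Workload.CUSTOMER_BOOKING, Replica.PRIMARY_READ_REPLICA);
        // Analytical reads tolerate lag and go to the asynchronously replicated replica.
        ROUTES.put(Workload.ANALYTICS, Replica.ANALYTICS_REPLICA);
        ROUTES.put(Workload.REPORTING, Replica.ANALYTICS_REPLICA);
    }

    static Replica route(Workload workload) {
        return ROUTES.get(workload);
    }

    /** True when the replica is kept in sync synchronously, per the replication strategy. */
    static boolean isSynchronous(Replica replica) {
        return replica == Replica.PRIMARY_READ_REPLICA;
    }

    public static void main(String[] args) {
        System.out.println(route(Workload.CUSTOMER_SEARCH));
        System.out.println(route(Workload.ANALYTICS));
    }
}
```

Keeping the mapping in one place makes the consistency expectation of each workload explicit and easy to audit.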

Performance Results and System Impact

Mission-Critical Reliability: Achieved 99.9%+ availability while serving as the authoritative inventory source for entire marketplace operations across 25 cities.

Performance Characteristics:

Business Impact:

Key Engineering Insights

Business-aligned technical decisions outperform perfect consistency: Optimistic conflict resolution with graceful degradation provides better business outcomes than strict technical consistency in marketplace environments.
CQRS enables optimization for different stakeholder patterns: Separating read and write paths allows optimization for high-volume customer reads while maintaining consistency for critical operations.
Multi-level concurrency control scales with complexity: Combining JVM-level locks with distributed atomic operations provides both performance and consistency at different system layers.

Evolution and Scaling Considerations

Geographic Distribution: As marketplace operations expand globally, the system requires evolution toward geo-distributed consistency models with regional autonomy and cross-region synchronization.

Real-time Personalization: Future iterations could incorporate real-time inventory allocation based on user preferences and booking patterns, optimizing availability presentation for individual users.

Predictive Availability: Advanced systems could leverage machine learning to predict inventory availability patterns and proactively manage conflicts before they occur.
