Distributed Inventory Management at Scale

Building mission-critical real-time inventory systems serving as the single source of truth for multi-city marketplace operations

Inventory management in a distributed marketplace isn't just about tracking items—it's about building the foundational system that every part of the business depends on. When customers search for vehicles, hosts manage their fleet, and multiple business verticals make real-time decisions, they all rely on one truth: the inventory system.

This is the technical story of a distributed inventory management system that handled 300K+ daily operations across 25 cities, serving as the authoritative source of truth while maintaining real-time consistency and graceful conflict resolution.

The Mission-Critical Systems Challenge

Inventory systems in marketplaces face a unique engineering challenge: they must serve as the single source of truth for multiple stakeholders with different requirements, all while maintaining real-time accuracy under high concurrency.

System Requirements: Serve as authoritative inventory source for customers (search/booking), vendors (fleet management), and business verticals (analytics/operations) while handling 300K+ daily operations with sub-second response times.

Scale and Complexity:

300K+ daily operations with 50:1 read-to-write ratio
25 cities with geo-distributed operations
10K+ vehicles with real-time availability tracking
Multiple stakeholders requiring different data views and update patterns
Zero tolerance for stale data in customer-facing applications
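As a rough sanity check on the figures above, the daily volume and the 50:1 ratio can be turned into per-day and per-second numbers. This is a back-of-the-envelope sketch only: the class and method names are made up for illustration, and it computes averages, since the source does not state peak load.

```java
// Back-of-the-envelope load estimate; all names are hypothetical.
class LoadEstimate {
    static final long SECONDS_PER_DAY = 86_400L;

    // With a readsPerWrite:1 ratio, one in every (readsPerWrite + 1) operations is a write.
    static long writesPerDay(long totalOps, int readsPerWrite) {
        return totalOps / (readsPerWrite + 1);
    }

    static long readsPerDay(long totalOps, int readsPerWrite) {
        return totalOps - writesPerDay(totalOps, readsPerWrite);
    }

    // Average rate only; real traffic is bursty and peaks will be higher.
    static double averageOpsPerSecond(long totalOps) {
        return (double) totalOps / SECONDS_PER_DAY;
    }

    public static void main(String[] args) {
        long totalOps = 300_000L;
        System.out.println("writes/day: " + writesPerDay(totalOps, 50));
        System.out.println("reads/day:  " + readsPerDay(totalOps, 50));
        System.out.println("avg ops/s:  " + averageOpsPerSecond(totalOps));
    }
}
```

For 300,000 operations at 50:1 this works out to about 5,882 writes and 294,118 reads per day, or roughly 3.5 operations per second on average. The engineering difficulty therefore lies less in raw throughput than in concurrent access to the same vehicle and in keeping every consumer's view fresh.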

Stakeholder Requirements

Customer-Facing Systems:

Search Results: Real-time availability with accurate inventory status
Booking Flow: Immediate inventory reservation with conflict detection
Dynamic Updates: Live availability changes during browsing sessions

Vendor/Host Operations:

Fleet Management: Real-time vehicle status updates (available, booked, maintenance)
Availability Control: Immediate inventory blocking for maintenance or personal use
Booking Notifications: Instant updates when vehicles are reserved

Business Verticals:

Analytics: Consistent inventory metrics across all reporting systems
Operations: Real-time fleet utilization and availability insights
Finance: Accurate revenue attribution and utilization calculations

Distributed Concurrency Architecture

The core challenge in distributed inventory management is handling simultaneous operations from multiple actors while maintaining data consistency and preventing race conditions.

Multi-Level Concurrency Control

The system employs a layered approach to concurrency control, combining JVM-level synchronization with distributed atomic operations:

ReentrantLock for JVM-Level Synchronization

At the application level, ReentrantLock provides fine-grained concurrency control for critical sections involving inventory state changes:

Locking Strategy: Vehicle-level locks prevent simultaneous booking attempts for the same inventory item while allowing concurrent operations on different vehicles to proceed without blocking.

Concurrency Control Logic:

1. Acquire ReentrantLock for specific vehicle_id
2. Validate inventory state and business rules
3. Execute booking/blocking/status change logic
4. Trigger atomic Redis update via Lua script
5. Release lock once the operation finishes, whether it succeeded or failed

Lock Granularity: Per-vehicle locking minimizes contention while ensuring consistency
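The locking steps above can be sketched in Java as follows. This is a minimal illustration, not the production code: the class names, the in-memory status map, and the string statuses are all assumptions, and the Redis call is only marked by a comment.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// In-memory stand-in for the inventory service; all names here are hypothetical.
class LockedInventorySketch {
    private final VehicleLockRegistry locks = new VehicleLockRegistry();
    private final Map<String, String> statusByVehicle = new ConcurrentHashMap<>();

    void addVehicle(String vehicleId) {
        statusByVehicle.put(vehicleId, "available");
    }

    // Lock the vehicle, validate, change state, then unlock.
    boolean book(String vehicleId) {
        return locks.<Boolean>withVehicleLock(vehicleId, () -> {
            if (!"available".equals(statusByVehicle.get(vehicleId))) {
                return false; // validation failed: unknown or already taken
            }
            // A real implementation would invoke the Redis Lua script at this point.
            statusByVehicle.put(vehicleId, "booked");
            return true;
        });
    }

    String statusOf(String vehicleId) {
        return statusByVehicle.get(vehicleId);
    }

    public static void main(String[] args) {
        LockedInventorySketch inventory = new LockedInventorySketch();
        inventory.addVehicle("v1");
        System.out.println(inventory.book("v1")); // first attempt
        System.out.println(inventory.book("v1")); // second attempt on the same vehicle
    }
}

// One ReentrantLock per vehicle_id, created lazily on first use.
class VehicleLockRegistry {
    private final Map<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    <T> T withVehicleLock(String vehicleId, Supplier<T> criticalSection) {
        ReentrantLock lock = locks.computeIfAbsent(vehicleId, id -> new ReentrantLock());
        lock.lock();
        try {
            return criticalSection.get();
        } finally {
            lock.unlock(); // always released, even if the critical section throws
        }
    }
}
```

The demo prints true for the first booking and false for the second. Because each vehicle has its own lock, bookings for different vehicles never wait on each other. A JVM lock only serializes threads inside one process, which is why the design pairs it with an atomic Redis operation for coordination across application instances.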

Lua Scripts for Atomic Distributed Operations

Redis Lua scripts provide atomic execution of complex operations that span multiple data structures and business logic:

Atomic Operations: Lua scripts handle vehicle booking, completion, accident blocking, maintenance scheduling, and status transitions as single atomic operations in Redis.

Lua Script Responsibilities:

Vehicle Booking: Atomic reservation with availability validation and status updates
Booking Completion: Status transitions with timeline updates and availability restoration
Accident/Maintenance Blocking: Immediate inventory removal with proper status tracking
Status Synchronization: Ensuring Redis state consistency across multiple data structures

Example Lua Script Logic (Vehicle Booking):

    -- Atomic booking operation in Redis
    local vehicle_id = KEYS[1]
    local booking_details = ARGV[1]
    -- Assumption: the caller passes city_id as a second argument
    local city_id = ARGV[2]

    -- Check current availability
    local current_status = redis.call('HGET', 'vehicle:' .. vehicle_id, 'status')
    if current_status ~= 'available' then
        return {err = 'Vehicle not available'}
    end

    -- Atomic status update
    redis.call('HSET', 'vehicle:' .. vehicle_id, 'status', 'booked')
    redis.call('HSET', 'vehicle:' .. vehicle_id, 'booking_id', booking_details)
    redis.call('SADD', 'booked_vehicles', vehicle_id)
    redis.call('SREM', 'available_vehicles', vehicle_id)

    -- Update city-level availability counters
    redis.call('HINCRBY', 'city_inventory', city_id, -1)

    return {ok = 'Booking successful'}

CQRS and Event-Driven Architecture

With a 50:1 read-to-write ratio, the system employs Command Query Responsibility Segregation (CQRS) to optimize for different access patterns while maintaining strong consistency where required.

Read-Write Separation Strategy

Architectural Decision: Separate read and write paths with dedicated infrastructure for each pattern, enabling optimization for high-volume reads while ensuring write consistency and durability.

Event-Driven Synchronization

The system maintains consistency between Redis (operational data store) and PostgreSQL (persistent storage) through event-driven synchronization rather than polling mechanisms:

Event Flow Architecture:

Write Operations: Update Redis immediately for real-time availability
Event Publishing: Publish status change events after successful Redis updates
Database Sync: Consume events to update PostgreSQL with eventual consistency
Read Operations: Serve from optimized read replicas with strong consistency guarantees
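The event flow above can be illustrated with a small in-memory sketch. Everything here is a stand-in chosen for the example: two maps play the roles of Redis and PostgreSQL, a queue plays the role of the event channel, and the class and method names are hypothetical. The source does not say which broker or client libraries were used.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical sketch of write -> publish -> consume -> persist.
class EventSyncSketch {
    private final Map<String, String> operationalStore = new HashMap<>(); // stands in for Redis
    private final Map<String, String> durableStore = new HashMap<>();     // stands in for PostgreSQL
    private final Queue<StatusChangeEvent> pendingEvents = new ArrayDeque<>();

    // Write path: update the operational store first, then publish the change.
    void writeStatus(String vehicleId, String newStatus) {
        operationalStore.put(vehicleId, newStatus);
        pendingEvents.add(new StatusChangeEvent(vehicleId, newStatus));
    }

    // Consumer: applies queued events in order and returns how many were applied.
    int syncPendingEvents() {
        int applied = 0;
        StatusChangeEvent event;
        while ((event = pendingEvents.poll()) != null) {
            durableStore.put(event.vehicleId, event.newStatus);
            applied++;
        }
        return applied;
    }

    String operationalStatus(String vehicleId) {
        return operationalStore.get(vehicleId);
    }

    String durableStatus(String vehicleId) {
        return durableStore.get(vehicleId);
    }

    public static void main(String[] args) {
        EventSyncSketch sync = new EventSyncSketch();
        sync.writeStatus("v1", "booked");
        System.out.println("before sync: " + sync.durableStatus("v1"));
        sync.syncPendingEvents();
        System.out.println("after sync:  " + sync.durableStatus("v1"));
    }
}

// Immutable description of one status change.
class StatusChangeEvent {
    final String vehicleId;
    final String newStatus;

    StatusChangeEvent(String vehicleId, String newStatus) {
        this.vehicleId = vehicleId;
        this.newStatus = newStatus;
    }
}
```

Between the write and the consumer run, the durable store lags the operational store. That window is the eventual-consistency gap the design accepts for persistence while real-time reads are served from the operational side.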

Optimistic Business Model for Conflict Resolution

Rather than implementing strict pessimistic locking that could impact availability, the system employs an optimistic approach aligned with business reality.

Strategic Trade-off: Accept potential double-booking scenarios and resolve conflicts gracefully through user communication and alternative vehicle offers, prioritizing business continuity over perfect technical consistency.

Conflict Resolution Strategy

Optimistic Acceptance Model:

Accept Bookings: Allow potentially conflicting bookings rather than rejecting requests
Detect Conflicts: Identify double-booking or unavailability issues post-acceptance
Graceful Resolution: Offer alternative vehicles with user communication and incentives
Business Continuity: Convert potential rejections into customer retention opportunities

Conflict Resolution Flow:

1. Accept booking request optimistically
2. Process through normal booking validation
3. Detect conflicts during post-processing validation
4. If conflict detected:
   a. Maintain user engagement (don't cancel immediately)
   b. Search for equivalent or upgraded vehicle alternatives
   c. Present options with potential incentives/upgrades
   d. Allow user choice rather than forced cancellation
5. Track conflict resolution success rates for system optimization
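Step 4b, the search for equivalent or upgraded alternatives, might look something like the sketch below. The matching rule is an assumption made for illustration: a numeric tier field stands in for "equivalent or upgraded", candidates must be in the same city, and the class names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical alternative-vehicle search for a conflicted booking.
class ConflictResolver {
    // Returns up to maxOptions available vehicles in the same city with an equal or higher tier,
    // closest tier first so equivalent vehicles are offered before upgrades.
    static List<FleetVehicle> findAlternatives(FleetVehicle conflicted, List<FleetVehicle> fleet, int maxOptions) {
        List<FleetVehicle> options = new ArrayList<>();
        for (FleetVehicle candidate : fleet) {
            boolean sameVehicle = candidate.id.equals(conflicted.id);
            if (!sameVehicle && candidate.available
                    && candidate.city.equals(conflicted.city)
                    && candidate.tier >= conflicted.tier) {
                options.add(candidate);
            }
        }
        // Sort by tier, then by id so the result is deterministic.
        options.sort((a, b) -> a.tier != b.tier ? Integer.compare(a.tier, b.tier) : a.id.compareTo(b.id));
        int limit = Math.min(Math.max(maxOptions, 0), options.size());
        return new ArrayList<>(options.subList(0, limit));
    }

    public static void main(String[] args) {
        FleetVehicle conflicted = new FleetVehicle("v1", "city-a", 2, false);
        List<FleetVehicle> fleet = new ArrayList<>();
        fleet.add(conflicted);
        fleet.add(new FleetVehicle("v2", "city-a", 3, true));
        fleet.add(new FleetVehicle("v3", "city-a", 2, true));
        for (FleetVehicle option : findAlternatives(conflicted, fleet, 2)) {
            System.out.println(option.id + " (tier " + option.tier + ")");
        }
    }
}

// Minimal vehicle record; the tier field is an assumption used to rank upgrades.
class FleetVehicle {
    final String id;
    final String city;
    final int tier;
    final boolean available;

    FleetVehicle(String id, String city, int tier, boolean available) {
        this.id = id;
        this.city = city;
        this.tier = tier;
        this.available = available;
    }
}
```

The search returns a short ranked list rather than a single replacement, which leaves the choice with the user as step 4d requires.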

Business Impact of Optimistic Model

Revenue Protection: The system converted 70%+ of potential booking conflicts into successful alternative bookings by resolving them through proactive customer service rather than rejecting them outright.

This approach recognizes that in marketplace businesses, customer acquisition and retention often outweigh perfect technical consistency, especially when conflicts can be resolved satisfactorily.

API Design for Multi-Consumer Patterns

Supporting diverse stakeholders requires carefully designed APIs that optimize for different usage patterns while maintaining a consistent data model.

Consumer-Specific API Design

Customer-Facing APIs:

Search API: Optimized for bulk availability queries with geographic filtering
Real-time Availability: WebSocket connections for live inventory updates during browsing
Booking API: Immediate reservation with optimistic conflict handling

Vendor/Host APIs:

Fleet Management: Batch operations for managing multiple vehicle statuses
Status Control: Immediate blocking/unblocking for maintenance or personal use
Revenue Tracking: Real-time utilization and booking analytics

Business Vertical APIs:

Analytics APIs: Aggregate data queries optimized for reporting and dashboards
Operations APIs: City-wide inventory insights and utilization metrics
Integration APIs: Standardized data access for downstream business systems

API Design Patterns:

• Real-time APIs: WebSocket connections for live updates
• Batch APIs: Bulk operations for administrative tasks
• Analytics APIs: Aggregated data with appropriate caching
• Integration APIs: Standardized formats for cross-system compatibility

Rate Limiting: Consumer-specific limits based on usage patterns
Authentication: Role-based access with fine-grained permissions
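Consumer-specific rate limiting can be sketched with one token bucket per consumer type. The source does not say which algorithm was used, so the token bucket is a swapped-in, commonly used technique, and the consumer names, capacities, and class names below are illustrative assumptions. The clock is passed in explicitly to keep the example deterministic.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-consumer limiter: each consumer type gets its own bucket.
class ConsumerRateLimiter {
    private final Map<String, TokenBucket> buckets = new HashMap<>();

    synchronized void configure(String consumerType, double capacity, double refillPerSecond, long nowNanos) {
        buckets.put(consumerType, new TokenBucket(capacity, refillPerSecond, nowNanos));
    }

    // Unknown consumer types are rejected rather than given a default limit.
    synchronized boolean allow(String consumerType, long nowNanos) {
        TokenBucket bucket = buckets.get(consumerType);
        return bucket != null && bucket.tryAcquire(nowNanos);
    }

    public static void main(String[] args) {
        ConsumerRateLimiter limiter = new ConsumerRateLimiter();
        long start = 0L;
        limiter.configure("analytics", 2, 1, start); // small burst, slow refill
        System.out.println(limiter.allow("analytics", start));
        System.out.println(limiter.allow("analytics", start));
        System.out.println(limiter.allow("analytics", start)); // bucket is empty here
    }
}

// Classic token bucket: holds up to `capacity` tokens, refilled continuously over time.
class TokenBucket {
    private final double capacity;
    private final double refillPerSecond;
    private double tokens;
    private long lastRefillNanos;

    TokenBucket(double capacity, double refillPerSecond, long nowNanos) {
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
        this.tokens = capacity; // start full so an initial burst is allowed
        this.lastRefillNanos = nowNanos;
    }

    boolean tryAcquire(long nowNanos) {
        double elapsedSeconds = (nowNanos - lastRefillNanos) / 1_000_000_000.0;
        tokens = Math.min(capacity, tokens + elapsedSeconds * refillPerSecond);
        lastRefillNanos = nowNanos;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

Keeping a separate bucket per consumer type means a heavy analytics job can exhaust its own budget without slowing customer search, which matches the goal of limits based on usage patterns.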

Strong Consistency Requirements

Despite the optimistic business model, certain aspects of the inventory system require strong consistency to maintain user trust and business integrity.

Non-Negotiable Consistency: Customer-facing search results must reflect real-time inventory status. Displaying unavailable vehicles as available damages user experience and business credibility.

Consistency Guarantees

Strong Consistency Domains:

Search Results: Real-time availability reflection in customer-facing applications
Booking Validation: Immediate inventory status verification during reservation flow
Host Dashboard: Current fleet status for host decision-making

Eventual Consistency Domains:

Analytics Reporting: Slight delays acceptable for aggregate metrics and dashboards
Historical Data: Audit trails and long-term analytics can tolerate sync delays
Cross-System Integration: Downstream systems can handle eventual consistency patterns

Read Replica Strategy

All stakeholders read from designated read replicas that maintain strong consistency for customer-facing operations while providing eventual consistency for analytical workloads:

Read Replica Architecture:

• Primary Read Replica: Strong consistency for customer-facing operations
• Analytics Replica: Optimized for complex queries with eventual consistency
• Geographic Replicas: City-specific replicas for performance optimization
• Cross-Region Replicas: Disaster recovery and geographic distribution

Replication Strategy:

• Synchronous replication for customer-facing reads
• Asynchronous replication for analytics and reporting
• Automatic failover with consistency verification
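One way to picture the routing between these replicas is a small lookup from workload to replica. The enum values mirror the consistency domains listed earlier in this section. Sending historical and integration reads to the analytics replica is an assumption made for this example, as are all type and method names.

```java
// Hypothetical router that picks a replica based on the workload's consistency needs.
class ReplicaRouter {
    static ReadReplica route(ReadWorkload workload) {
        switch (workload) {
            case CUSTOMER_SEARCH:
            case BOOKING_VALIDATION:
            case HOST_DASHBOARD:
                // Strong-consistency domains read from the synchronously replicated copy.
                return ReadReplica.PRIMARY_READ_REPLICA;
            case ANALYTICS_REPORTING:
            case HISTORICAL_DATA:
            case CROSS_SYSTEM_INTEGRATION:
                // Eventual-consistency domains tolerate asynchronous replication lag.
                return ReadReplica.ANALYTICS_REPLICA;
            default:
                throw new IllegalArgumentException("Unmapped workload: " + workload);
        }
    }

    public static void main(String[] args) {
        for (ReadWorkload workload : ReadWorkload.values()) {
            System.out.println(workload + " -> " + route(workload));
        }
    }
}

enum ReadWorkload {
    CUSTOMER_SEARCH,
    BOOKING_VALIDATION,
    HOST_DASHBOARD,
    ANALYTICS_REPORTING,
    HISTORICAL_DATA,
    CROSS_SYSTEM_INTEGRATION
}

enum ReadReplica {
    PRIMARY_READ_REPLICA,
    ANALYTICS_REPLICA
}
```

Making the mapping explicit keeps the consistency decision in one place, so a new consumer has to declare which guarantee it needs instead of inheriting one by accident.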

Performance Results and System Impact

Mission-Critical Reliability: Achieved 99.9%+ availability while serving as the authoritative inventory source for entire marketplace operations across 25 cities.

Performance Characteristics:

Throughput: 300K+ daily operations with sub-second response times
Concurrency: Handled simultaneous operations from multiple business verticals
Consistency: Strong consistency for customer-facing reads, eventual consistency for analytics
Conflict Resolution: 70%+ success rate in converting booking conflicts to alternative bookings

Business Impact:

Revenue Protection: Optimistic booking model prevented revenue loss from technical rejections
Operational Efficiency: Single source of truth eliminated data inconsistencies across business units
User Experience: Real-time availability updates maintained customer trust and engagement
System Reliability: Mission-critical uptime requirements met consistently across all markets

Key Engineering Insights

Business-aligned technical decisions outperform perfect consistency: Optimistic conflict resolution with graceful degradation provides better business outcomes than strict technical consistency in marketplace environments.

CQRS enables optimization for different stakeholder patterns: Separating read and write paths allows optimization for high-volume customer reads while maintaining consistency for critical operations.

Multi-level concurrency control scales with complexity: Combining JVM-level locks with distributed atomic operations provides both performance and consistency at different system layers.

Evolution and Scaling Considerations

Geographic Distribution: As marketplace operations expand globally, the system requires evolution toward geo-distributed consistency models with regional autonomy and cross-region synchronization.

Real-time Personalization: Future iterations could incorporate real-time inventory allocation based on user preferences and booking patterns, optimizing availability presentation for individual users.

Predictive Availability: Advanced systems could leverage machine learning to predict inventory availability patterns and proactively manage conflicts before they occur.
