Problem
Create a URL shortener.
Input: long URL
Output: short URL
Details
Two main problems:
- How to generate a short code
- Handle collision (if applicable)
Two Approaches
- Counter + Base62 encoding: No collisions
- Hash + Collision Handling
Approach 1: Counter + Base62 encoding
Table: urls
- id (primary key, auto-increment)
- long_url (text, not null)
- short_url (varchar(100), not null)
- created_at (timestamp)
Algorithm
- Check if long_url exists in DB
- If yes, return existing short_url
- If no, continue
- Get row count → increment by 1 → base62 encode
- Insert (long_url, short_url) into table
- Return short_url
Result: There is no chance of collision here.
Code: encoding.py
Approach 2: Hash + Collision Resolution
Algorithm
- Hash long_url with md5 → take first 7 chars
- Check if hash exists in DB:
- Exists + same URL → duplicate → return existing
- Exists + different URL → collision → append counter (or re-hash with salt) until empty slot is found
- Doesn't exist → insert directly
- Return short_url
Code: hashing.py
When to Use
Counter + Base62:
- Guaranteed uniqueness (no collision handling)
- Simpler implementation
- NOT idempotent: Same URL gets different codes on repeated insertions (depends on primary key/insertion order)
Hash:
- Deterministic (same URL → always same code)
- Idempotent: Multiple insertions of same URL return same short code
- Requires collision handling