back to home

URL Shortener: Counter vs Hash

Problem

Create a URL shortener.

Input: long URL
Output: short URL

Details

Two main problems:

  1. How to generate a short code
  2. Handle collision (if applicable)

Two Approaches

  1. Counter + Base62 encoding: No collisions
  2. Hash + Collision Handling

Approach 1: Counter + Base62 encoding

Table: urls

- id (primary key, auto-increment)
- long_url (text, not null)
- short_url (varchar(100), not null)
- created_at (timestamp)

Algorithm

  1. Check if long_url exists in DB
    • If yes, return existing short_url
    • If no, continue
  2. Get row count → increment by 1 → base62 encode
  3. Insert (long_url, short_url) into table
  4. Return short_url

Result: There is no chance of collision here.

Code: encoding.py

Approach 2: Hash + Collision Resolution

Algorithm

  1. Hash long_url with md5 → take first 7 chars
  2. Check if hash exists in DB:
    • Exists + same URL → duplicate → return existing
    • Exists + different URL → collision → append counter (or re-hash with salt) until empty slot is found
    • Doesn't exist → insert directly
  3. Return short_url

Code: hashing.py

When to Use

Counter + Base62:

  • Guaranteed uniqueness (no collision handling)
  • Simpler implementation
  • NOT idempotent: Same URL gets different codes on repeated insertions (depends on primary key/insertion order)

Hash:

  • Deterministic (same URL → always same code)
  • Idempotent: Multiple insertions of same URL return same short code
  • Requires collision handling
back to all posts