Knowledge Guide
HomeDSA

Sorting

Step 15 in the DSA path · 2 concepts · 7 problems

0 / 9 complete

📘 Learn Sorting from zero

Sorting arranges elements into a defined order (numeric, lexicographic, by frequency, by a custom key) so that a later scan can rely on adjacency and monotonicity. Most "hard" problems become trivial once sorted, because sorting turns a global question ("do any intervals overlap?", "what is the k-th largest?") into a local, left-to-right one. Trigger: reach for sorting when the answer depends on order, ranks, or relative position rather than identity — overlapping intervals, top-K / frequency ordering, pairing smallest-with-largest, or "make adjacent elements comparable." Canonical example: Meeting Rooms II — given intervals, find the minimum number of rooms needed so no two overlapping meetings share a room.

✨ Added by the guide to build intuition — not from the source course.

🏗️ Visual walkthrough — trace it step by step

Step 1 — The input and the key idea
meetings = [[0,30], [5,10], [15,20]]

We never need to know WHICH meeting is in a room,
only HOW MANY rooms are busy at once.
So split each interval into two timeline events:

  starts = [ 0,  5, 15]   (a room is needed)
  ends   = [30, 10, 20]   (a room frees up)

A room is needed at every start and freed at every end. The peak number of simultaneously-busy rooms is the answer. Why this reframing is safe: overlap is purely a function of how many starts have happened before a given end — meeting identity is irrelevant. (Note: extracted in input order the ends are [30,10,20]; we sort them next.)

Step 2 — Sort both arrays independently
starts (sorted): [ 0,  5, 15]
ends   (sorted): [10, 20, 30]
                  ^         ^
                 si=0      ei=0

rooms = 0   (busy rooms right now)
max   = 0   (peak ever seen)

We sort starts and ends separately in ascending order (ends becomes [10, 20, 30]). Why this is safe: we are sweeping a timeline, so we only ever care about the next start vs the next end — not which end belongs to which start. Two pointers si and ei walk the two sorted lists in time order, and the loop runs only while si has starts left.

Step 3 — Process the earliest event: start at t=0
starts: [ 0,  5, 15]   ends: [10, 20, 30]
          ^                    ^
         si=0                 ei=0

starts[si]=0  vs  ends[ei]=10
0 < 10  -> a meeting STARTS first

rooms = 0 -> 1     si: 0 -> 1
max   = max(0,1) = 1

The earliest event in the whole timeline is the start at t=0 (since 0 < 10). A new meeting begins, so we allocate a room and advance si. Why safe: comparing the two pointer fronts always finds the globally next event, because both lists are sorted.

Step 4 — Next event: start at t=5 (overlap!)
starts: [ 0,  5, 15]   ends: [10, 20, 30]
              ^              ^
             si=1           ei=0

starts[1]=5 < ends[0]=10  -> another meeting STARTS
before the soonest end (10) has passed.

rooms = 1 -> 2     si: 1 -> 2
max   = max(1,2) = 2

At t=5 a second meeting starts while the [0,30] meeting is still running and the earliest free time is 10 (5 < 10). They overlap, so we need a second room. Why safe: because ends is sorted, ends[ei]=10 is the earliest possible free time; if even that hasn't passed, no room is free.

Step 5 — Next event: end at t=10 (a room frees)
starts: [ 0,  5, 15]   ends: [10, 20, 30]
                  ^          ^
                 si=2        ei=0

starts[2]=15  vs  ends[0]=10
15 >= 10  -> the next event is an END

rooms = 2 -> 1     ei: 0 -> 1
max   = max(2,1) = 2   (unchanged)

The next chronological event is now an end at t=10 (since 15 >= 10): the [5,10] meeting finishes before the next meeting starts. A room is released, so we advance ei (not si). Why safe: when the next start is not earlier than the soonest end, that room genuinely becomes reusable, so we decrement instead of allocating.

Step 6 — Next event: start at t=15 (reuse the freed room)
starts: [ 0,  5, 15]   ends: [10, 20, 30]
                  ^               ^
                 si=2            ei=1

starts[2]=15 < ends[1]=20  -> a meeting STARTS
A room freed at t=10, so we reuse it.

rooms = 1 -> 2     si: 2 -> 3 (all starts consumed)
max   = max(2,2) = 2

At t=15 the third meeting begins (15 < 20, the soonest remaining end). One room freed at t=10, so the running count goes 1 -> 2 — it does not exceed the previous peak. After advancing, si=3 reaches the end of starts, which is our loop stop condition. Why safe: remaining ends only decrease the count and can never raise the peak, so the sweep can halt here.

Step 7 — Terminate and read the answer
All starts consumed (si=3). Stop the sweep
after 4 events processed.

Timeline of busy-room count:
  t=0  : 1
  t=5  : 2   <- PEAK
  t=10 : 1
  t=15 : 2

answer = max = 2

We stop as soon as every start is placed; trailing ends can only lower the count. The peak concurrency was 2, so 2 meeting rooms suffice. Complexity: sorting dominates at O(n log n) time, O(n) space — the sweep itself is linear. The sort is what made a single left-to-right pass correct.

🎯 Guided practice

  1. Easy — Sort Array by Increasing Frequency. Given [2,3,1,3,2], sort so least-frequent values come first; ties broken by larger value first. Step 1: recognize "by frequency" → build a count map: {2:2, 3:2, 1:1}. Step 2: the sort key for each element is a pair (freq, -value) — freq ascending, value descending on ties. Step 3: sort the original elements with comparator key key(x) = (count[x], -x): 1 (freq 1) sorts first; then 3 and 2 both have freq 2, but -3 < -2 so 3 precedes 2. Result: [1,3,3,2,2]. Pattern learned: a hash map turns a derived property (frequency) into a sortable key, and a tuple comparator encodes multi-level tie-breaks cleanly — negating a numeric field flips just that level's direction. O(n log n) time, O(n) space.
  2. Medium — Meeting Rooms II (minimum rooms for overlapping intervals). Given [[0,30],[5,10],[15,20]], find the fewest rooms. Step 1: "overlapping intervals / max concurrency" → sort by start time: already [[0,30],[5,10],[15,20]]. Step 2: keep a min-heap of end times for rooms currently in use. Process [0,30]: heap empty → open a room, heap = [30]. Step 3: [5,10]: earliest end is 30 > start 5, so the meeting overlaps → open a new room, heap = [10,30]. Step 4: [15,20]: earliest end 10 ≤ start 15 → that room is free (touching/finished); reuse it (pop 10, push 20), heap = [20,30]. Answer: peak heap size = 2 rooms. Pattern learned: sort by start to impose chronological order, then a min-heap tracks the soonest-freeing room so each arrival checks "can I reuse?" in O(log n). Sort sets up the sweep; the heap answers reuse. Total O(n log n) time, O(n) space.

✨ Added by the guide — work these before the full problem set.

Lessons in this topic