Searching
Step 16 in the DSA path · 1 concepts · 6 problems
📘 Learn Searching from zero
The Searching pattern means reframing a problem as binary search: keep a candidate range, probe its midpoint, and discard the half that provably cannot contain the answer — halving the space each step for O(log N). The classic trigger is a sorted array, but the real FAANG unlock is binary search on the answer: when the answer lives in a numeric range and a predicate is monotonic (true before some threshold, false after it), you binary-search the value space itself. Canonical example:mySqrt(x) — find floor(sqrt(17)). The "array" is the conceptual range [1..17], and the monotone predicate is mid*mid <= x.✨ Added by the guide to build intuition — not from the source course.
🏗️ Visual walkthrough — trace it step by step
x = 17, find largest m with m*m <= 17 candidate value space (conceptual, sorted): 1 2 3 4 5 6 7 8 9 10 11 ... 17 ^ ^ lo hi predicate P(m) = (m*m <= 17): m: 1 2 3 4 5 6 7 ... 17 P: T T T T F F F ... F <- monotone: once false, stays false
There is no real array — we binary-search the answer range lo=1, hi=17. The predicate m*m <= 17 is monotone (a True block then a False block), so a threshold exists and binary search is valid. We want the last True, so we track ans whenever the predicate holds.
lo=1 hi=17 1 2 3 4 5 6 7 8 [9]10 ... 17 ^ ^ ^ lo mid hi mid = 1 + (17-1)/2 = 9 mid*mid = 81 > 17 -> predicate FALSE
mid = lo + (hi-lo)/2 = 9, and 9*9 = 81 > 17. Since the predicate is monotone, every value >= 9 also fails, so the entire right half (9..17) is safe to discard. We move hi = mid - 1 = 8. State: lo=1, hi=8, ans=-1.
lo=1 hi=8 1 2 3 [4] 5 6 7 8 ^ ^ ^ lo mid hi mid = 1 + (8-1)/2 = 4 mid*mid = 16 <= 17 -> predicate TRUE ans = 4 (record this candidate)
mid = 1 + (8-1)/2 = 4, and 4*4 = 16 <= 17, so 4 is a valid answer. Because we seek the largest valid value, we record ans = 4 and search higher: lo = mid + 1 = 5. Everything <= 4 is now known-good and discarded. State: lo=5, hi=8, ans=4.
lo=5 hi=8 5 [6] 7 8 ^ ^ ^ lo mid hi mid = 5 + (8-5)/2 = 6 mid*mid = 36 > 17 -> predicate FALSE
mid = 5 + (8-5)/2 = 6, and 6*6 = 36 > 17. The right half (6..8) all fails by monotonicity, so we discard it: hi = mid - 1 = 5. State: lo=5, hi=5, ans=4 — the window has shrunk to a single value.
lo=5 hi=5 [5] ^ lo=mid=hi mid = 5 + (5-5)/2 = 5 mid*mid = 25 > 17 -> predicate FALSE
The window is one element: mid = 5, and 5*5 = 25 > 17, predicate FALSE. We discard it with hi = mid - 1 = 4. Now lo = 5 > hi = 4, so the loop will terminate — the range is empty and no candidate is left to test. State: lo=5, hi=4, ans=4.
lo=5 > hi=4 -> loop ends (empty range)
final predicate picture:
m : 1 2 3 4 | 5 6 7 ... 17
P : T T T T | F F F ... F
^
threshold between last-True (4) and first-False (5)
return ans = 4 (floor(sqrt(17)) = 4, since 4^2=16<=17<25=5^2)The loop exits exactly when lo > hi; ans holds the last value that satisfied the predicate, which is the floor of sqrt we wanted: 4. We probed only 4 values (9, 4, 6, 5) instead of 17 — that is the O(log N) win.
BINARY SEARCH ON ANSWER (template)
lo = min feasible, hi = max feasible
ans = -1
while lo <= hi:
mid = lo + (hi - lo) / 2 # avoids overflow
if P(mid): ans = mid; lo = mid + 1 # want largest True
else: hi = mid - 1
return ans
Map to the in-scope problems:
Sqrt(x) P(m): m*m <= x
Freq of Most Frequent Element P(len): window costs <= k ops
Minimize Max of Two Arrays P(v): v has enough multiples
Search a 2D Matrix II staircase from top-right cornerWhenever the answer is a number in a range and a yes/no test on a guess is monotone, binary-search the value space. Use mid = lo + (hi-lo)/2 to dodge integer overflow, and decide the lo=mid+1 vs hi=mid-1 update by whether you want the last-True or first-True boundary.
🎯 Guided practice
- Easy — Maximum Count of Positive and Negative Integers. Given a sorted array (e.g.
[-3,-1,0,0,2,4,5]), returnmax(count of negatives, count of positives). Brute force is an O(n) scan, but "sorted" is the trigger for O(log n). Step 1: negatives form a contiguous prefix, positives a contiguous suffix, zeros sit between — a monotonic structure. Step 2: binary-search the first index where value ≥ 0; since every negative is strictly before it, that index equals the count of negatives (here, index 2 → 2 negatives). Step 3: binary-search the first index where value ≥ 1 (the first strictly-positive element, index 4); positives =n - that index=7 - 4 = 3. Step 4: returnmax(2, 3) = 3. Core pattern learned: "first index satisfying a monotonic condition" (the lower-bound template) converts counting into two O(log n) boundary searches. - Medium — Frequency of the Most Frequent Element. Given
numsand budgetk(allowed total increments), maximize the count of one equal value. Step 1 (recognize): increments only go up, and it's cheapest to raise smaller numbers up to a target — so sort the array first:[1,2,4],k=5. Step 2 (window invariant): for a window[l..r], the cheapest equalization makes everything equal to the largest,nums[r]. Cost =nums[r]*(r-l+1) - sum(window). Step 3 (slide): expandrone step, add to a running window sum; while cost exceedsk, shrink froml. Track the max window size. Walk it on sorted[1,2,4]: window[1]cost 0 (size 1); window[1,2]cost2*2-3=1 ≤ 5(size 2); window[1,2,4]cost4*3-7=5 ≤ 5(size 3). Answer: 3. Why it's a searching pattern: the cost is monotonically non-decreasing as the window grows for a fixed right end, so the two-pointer sweep is O(n) after the O(n log n) sort — far better than O(n²) checking every pair. (Binary-searching the leftmost validlfor eachralso works; two pointers is the tighter version of the same monotonic idea.)
✨ Added by the guide — work these before the full problem set.