Knowledge Guide
HomeDSA

Searching

Step 16 in the DSA path · 1 concepts · 6 problems

0 / 7 complete

📘 Learn Searching from zero

The Searching pattern means reframing a problem as binary search: keep a candidate range, probe its midpoint, and discard the half that provably cannot contain the answer — halving the space each step for O(log N). The classic trigger is a sorted array, but the real FAANG unlock is binary search on the answer: when the answer lives in a numeric range and a predicate is monotonic (true before some threshold, false after it), you binary-search the value space itself. Canonical example: mySqrt(x) — find floor(sqrt(17)). The "array" is the conceptual range [1..17], and the monotone predicate is mid*mid <= x.

✨ Added by the guide to build intuition — not from the source course.

🏗️ Visual walkthrough — trace it step by step

Step 1 — Frame the search space
x = 17,  find largest m with m*m <= 17

candidate value space (conceptual, sorted):
  1  2  3  4  5  6  7  8  9 10 11 ... 17
  ^                                    ^
  lo                                  hi

predicate P(m) = (m*m <= 17):
  m:  1  2  3  4  5  6  7  ...  17
  P:  T  T  T  T  F  F  F  ...   F   <- monotone: once false, stays false

There is no real array — we binary-search the answer range lo=1, hi=17. The predicate m*m <= 17 is monotone (a True block then a False block), so a threshold exists and binary search is valid. We want the last True, so we track ans whenever the predicate holds.

Step 2 — Probe mid = 9
lo=1                              hi=17
  1  2  3  4  5  6  7  8 [9]10 ... 17
  ^                      ^          ^
  lo                    mid        hi

mid = 1 + (17-1)/2 = 9
mid*mid = 81  >  17   -> predicate FALSE

mid = lo + (hi-lo)/2 = 9, and 9*9 = 81 > 17. Since the predicate is monotone, every value >= 9 also fails, so the entire right half (9..17) is safe to discard. We move hi = mid - 1 = 8. State: lo=1, hi=8, ans=-1.

Step 3 — Probe mid = 4
lo=1         hi=8
  1  2  3 [4] 5  6  7  8
  ^        ^           ^
  lo      mid         hi

mid = 1 + (8-1)/2 = 4
mid*mid = 16  <=  17  -> predicate TRUE
ans = 4  (record this candidate)

mid = 1 + (8-1)/2 = 4, and 4*4 = 16 <= 17, so 4 is a valid answer. Because we seek the largest valid value, we record ans = 4 and search higher: lo = mid + 1 = 5. Everything <= 4 is now known-good and discarded. State: lo=5, hi=8, ans=4.

Step 4 — Probe mid = 6
      lo=5     hi=8
  5  [6] 7  8
  ^   ^      ^
  lo mid    hi

mid = 5 + (8-5)/2 = 6
mid*mid = 36  >  17  -> predicate FALSE

mid = 5 + (8-5)/2 = 6, and 6*6 = 36 > 17. The right half (6..8) all fails by monotonicity, so we discard it: hi = mid - 1 = 5. State: lo=5, hi=5, ans=4 — the window has shrunk to a single value.

Step 5 — Probe mid = 5
  lo=5
  hi=5
 [5]
  ^
  lo=mid=hi

mid = 5 + (5-5)/2 = 5
mid*mid = 25  >  17  -> predicate FALSE

The window is one element: mid = 5, and 5*5 = 25 > 17, predicate FALSE. We discard it with hi = mid - 1 = 4. Now lo = 5 > hi = 4, so the loop will terminate — the range is empty and no candidate is left to test. State: lo=5, hi=4, ans=4.

Step 6 — Terminate and return ans
lo=5 > hi=4   -> loop ends (empty range)

final predicate picture:
  m :  1  2  3  4 | 5  6  7 ... 17
  P :  T  T  T  T | F  F  F ...  F
                  ^
        threshold between last-True (4) and first-False (5)

return ans = 4        (floor(sqrt(17)) = 4, since 4^2=16<=17<25=5^2)

The loop exits exactly when lo > hi; ans holds the last value that satisfied the predicate, which is the floor of sqrt we wanted: 4. We probed only 4 values (9, 4, 6, 5) instead of 17 — that is the O(log N) win.

Step 7 — Generalize the pattern
BINARY SEARCH ON ANSWER (template)

 lo = min feasible, hi = max feasible
 ans = -1
 while lo <= hi:
     mid = lo + (hi - lo) / 2     # avoids overflow
     if P(mid):      ans = mid; lo = mid + 1   # want largest True
     else:           hi = mid - 1
 return ans

Map to the in-scope problems:
  Sqrt(x)                         P(m): m*m <= x
  Freq of Most Frequent Element   P(len): window costs <= k ops
  Minimize Max of Two Arrays      P(v): v has enough multiples
  Search a 2D Matrix II           staircase from top-right corner

Whenever the answer is a number in a range and a yes/no test on a guess is monotone, binary-search the value space. Use mid = lo + (hi-lo)/2 to dodge integer overflow, and decide the lo=mid+1 vs hi=mid-1 update by whether you want the last-True or first-True boundary.

🎯 Guided practice

  1. Easy — Maximum Count of Positive and Negative Integers. Given a sorted array (e.g. [-3,-1,0,0,2,4,5]), return max(count of negatives, count of positives). Brute force is an O(n) scan, but "sorted" is the trigger for O(log n). Step 1: negatives form a contiguous prefix, positives a contiguous suffix, zeros sit between — a monotonic structure. Step 2: binary-search the first index where value ≥ 0; since every negative is strictly before it, that index equals the count of negatives (here, index 2 → 2 negatives). Step 3: binary-search the first index where value ≥ 1 (the first strictly-positive element, index 4); positives = n - that index = 7 - 4 = 3. Step 4: return max(2, 3) = 3. Core pattern learned: "first index satisfying a monotonic condition" (the lower-bound template) converts counting into two O(log n) boundary searches.
  2. Medium — Frequency of the Most Frequent Element. Given nums and budget k (allowed total increments), maximize the count of one equal value. Step 1 (recognize): increments only go up, and it's cheapest to raise smaller numbers up to a target — so sort the array first: [1,2,4], k=5. Step 2 (window invariant): for a window [l..r], the cheapest equalization makes everything equal to the largest, nums[r]. Cost = nums[r]*(r-l+1) - sum(window). Step 3 (slide): expand r one step, add to a running window sum; while cost exceeds k, shrink from l. Track the max window size. Walk it on sorted [1,2,4]: window [1] cost 0 (size 1); window [1,2] cost 2*2-3=1 ≤ 5 (size 2); window [1,2,4] cost 4*3-7=5 ≤ 5 (size 3). Answer: 3. Why it's a searching pattern: the cost is monotonically non-decreasing as the window grows for a fixed right end, so the two-pointer sweep is O(n) after the O(n log n) sort — far better than O(n²) checking every pair. (Binary-searching the leftmost valid l for each r also works; two pointers is the tighter version of the same monotonic idea.)

✨ Added by the guide — work these before the full problem set.

Lessons in this topic