hard Minimize Malware Spread
Problem Statement
You are given a network of n nodes, represented as an n x n adjacency matrix called graph. In this graph, graph[i][j] = 1 indicates that node i is directly connected to node j. If graph[i][j] = 0, there is no direct connection between them.
Some nodes in this network are initially infected by malware, and these nodes are listed in the array initial. When a node is infected, the malware spreads to all nodes that are directly connected to it. This spread continues until no more nodes can be infected.
Suppose M(initial) is the final number of nodes infected with malware in the entire network after the spread of malware stops. We will remove exactly one node from the initial.
Return the node that, if removed, would minimize M(initial). If there is more than one possible node that can be removed to achieve this goal, return the node with the smallest index.
Note: Even if a node is removed from the initial list, it might still become infected later due to the spread of malware from other connected nodes.
Example
Example 1:
- Input: graph:
[[1, 0, 0], [0, 1, 1], [0, 1, 1]]- initial:
[0, 1]
- initial:
-
Expected Output:
1 -
Justification: In this example,
1is connected with2. If we remove0, there will be2infected nodes which are1and2. If we remove1, there will be only1infected node, which is0.
Example 2:
- Input: graph:
[[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]- initial:
[1, 2]
- initial:
- Expected Output:
1
- Justification: Here, the graph contains 2 components. In the first component
0and1are connected and in the 2nd component2and3are connected. So, if we remove either1or2, the number of infected nodes will be reduced from4to2. However, since node1has a smaller index, we choose node1.
Example 3:
-
Input: graph:
[[1, 0, 0, 1], [0, 1, 1, 0], [0, 1, 1, 1], [1, 0, 1, 1]]- initial:
[0, 2]
- initial:
-
Expected Output:
0
- Justification: Initially, all 4 nodes are infected with Malware. If we remove either
0or2, still, all nodes remain infected. For example, if we remove0, then node2will be infected as it is connected to3and3is connected to the0. Also, the same goes for the0. So, we remove the element with smallest index, which is0.
Try it yourself
Try solving this question here:
✅ Solution Minimize Malware Spread
Problem Statement
You are given a network of n nodes, represented as an n x n adjacency matrix called graph. In this graph, graph[i][j] = 1 indicates that node i is directly connected to node j. If graph[i][j] = 0, there is no direct connection between them.
Some nodes in this network are initially infected by malware, and these nodes are listed in the array initial. When a node is infected, the malware spreads to all nodes that are directly connected to it. This spread continues until no more nodes can be infected.
Suppose M(initial) is the final number of nodes infected with malware in the entire network after the spread of malware stops. We will remove exactly one node from the initial.
Return the node that, if removed, would minimize M(initial). If there is more than one possible node that can be removed to achieve this goal, return the node with the smallest index.
Note: Even if a node is removed from the initial list, it might still become infected later due to the spread of malware from other connected nodes.
Example
Example 1:
-
Input: graph:
[[1, 0, 0], [0, 1, 1], [0, 1, 1]]- initial:
[0, 1]
- initial:
-
Expected Output:
1
- Justification: In this example,
1is connected with2. If we remove0, there will be2infected nodes which are1and2. If we remove1, there will be only1infected node, which is0.
Example 2:
- Input: graph:
[[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]- initial:
[1, 2]
- initial:
- Expected Output:
1
- Justification: Here, the graph contains 2 components. In the first component
0and1are connected and in the 2nd component2and3are connected. So, if we remove either1or2, the number of infected nodes will be reduced from4to2. However, since node1has a smaller index, we choose node1.
Example 3:
- Input: graph:
[[1, 0, 0, 1], [0, 1, 1, 0], [0, 1, 1, 1], [1, 0, 1, 1]]- initial:
[0, 2]
- initial:
-
Expected Output:
0 -
Justification: Initially, all 4 nodes are infected with Malware. If we remove either
0or2, still, all nodes remain infected. For example, if we remove0, then node2will be infected as it is connected to3and3is connected to the0. Also, the same goes for the0. So, we remove the element with smallest index, which is0.
Solution
To solve the problem, we utilize Tarjan's algorithm to identify all the strongly connected components (SCCs) in the network graph. Tarjan's algorithm is an efficient way to find SCCs using Depth-First Search (DFS). Once we find all the SCCs, we map each node to its respective component and calculate the size of each SCC. For each SCC, we also count how many initially infected nodes it contains. The key idea is that removing an initially infected node from an SCC that has only one such node will minimize the total number of infected nodes, as it prevents further spread within that SCC.
After finding all SCCs and counting the infected nodes in each, we iterate through the initially infected nodes to determine which node's removal minimizes the spread the most. We prioritize nodes belonging to SCCs that contain exactly one infected node. Among these, we select the node whose removal results in the largest number of nodes saved from infection. If there is a tie, we choose the node with the smallest index to satisfy the problem's constraints.
Step-by-Step Algorithm
-
Initialize Variables:
- Define an integer
nas the size of the graph, which represents the total number of nodes. - Create arrays
discoveryandlowof sizento store the discovery times and the lowest reachable discovery time for each node, respectively. Initialize them to-1to indicate that no nodes have been visited yet. - Create a
parentarray of sizento keep track of the parent nodes during DFS traversal, initializing all values to-1. - Create a boolean array
inStackof sizento check if a node is currently in the stack. Initialize all values tofalse. - Create a
Stack<Integer>namedstackto store the nodes in the current DFS path. - Create an
ArrayList<List<Integer>>namedsccto hold all strongly connected components (SCCs).
- Define an integer
-
Find SCCs Using Tarjan's Algorithm:
- Loop through each node
ifrom0ton - 1. Ifdiscovery[i]is-1, this node has not been visited. Call thetarjanSCCfunction to find all SCCs starting from this nodei. - Inside the
tarjanSCCfunction:- Set the
discovery[u]andlow[u]values to the current time (time++) for the nodeubeing visited. This marks the node's discovery time. - Push node
uonto thestackand markinStack[u]astrue. - Loop through all nodes
vfrom0ton - 1. If there is a direct edge fromutov(graph[u][v] == 1), perform the following:- If
vhas not been visited (discovery[v] == -1):- Set
parent[v]touand calltarjanSCCrecursively forv. - Update the
lowvalue ofuaslow[u] = min(low[u], low[v])to keep track of the lowest discovery time reachable fromu.
- Set
- If
vis already in the stack (inStack[v]istrue), it means a back edge is found. Updatelow[u]aslow[u] = min(low[u], discovery[v]).
- If
- After visiting all the neighbors of
u, check ifuis the root of an SCC (low[u] == discovery[u]). If true, pop nodes from the stack untiluis reached, forming an SCC, and add this SCC to thescclist.
- Set the
- Loop through each node
-
Map Nodes to Their SCCs and Count Sizes:
- Create three arrays:
infectedCountto store the number of infected nodes in each SCC,componentSizeto store the size of each SCC, andcomponentMapto map each node to its respective SCC. - Loop through each SCC in
sccand for each node in the SCC, updatecomponentMapto map the node to its SCC index and increment the size of that SCC incomponentSize.
- Create three arrays:
-
Count Infected Nodes in Each SCC:
- Loop through the
initialarray, which contains initially infected nodes. For each node, find its SCC usingcomponentMapand increment the count ininfectedCountfor that SCC.
- Loop through the
-
Sort the Initial Array:
- Sort the
initialarray in ascending order to prioritize removing nodes with smaller indices in case of a tie.
- Sort the
-
Determine the Best Node to Remove:
- Initialize variables
resultto the first element of the sortedinitialarray andmaxSavedNodesto-1. - Loop through each node in the
initialarray:- Find the SCC of the node using
componentMap. - If the SCC contains exactly one infected node (
infectedCount[comp] == 1):- Check if the size of this SCC (
componentSize[comp]) is greater thanmaxSavedNodes. If true, updatemaxSavedNodesand setresultto the current node.
- Check if the size of this SCC (
- Find the SCC of the node using
- Initialize variables
-
Return the Result:
- Return
resultas the node whose removal minimizes the total number of infected nodes.
- Return
Algorithm Walkthrough
Input:
- graph:
[[1, 0, 0], [0, 1, 1], [0, 1, 1]] - initial:
[0, 1]
Step-by-Step Walkthrough:
-
Initialize Variables:
n = 3(number of nodes in the graph).discovery = [-1, -1, -1](initially all nodes are undiscovered).low = [-1, -1, -1](initially all nodes have no lowest reachable discovery time).parent = [-1, -1, -1](initially all nodes have no parent).inStack = [false, false, false](no nodes are in the stack initially).stack = [](empty stack for DFS traversal).scc = [](empty list to store SCCs).
-
Finding SCCs using Tarjan's Algorithm:
-
Start with Node 0:
discovery[0] = 0,low[0] = 0(set discovery and low time to 0).- Push
0onto the stack:stack = [0],inStack[0] = true. - Loop through all nodes
v:- For
v = 0:graph[0][0] = 1butu == v, skip. - For
v = 1andv = 2:graph[0][1] = 0andgraph[0][2] = 0, no direct connections, continue.
- For
- Since
low[0] == discovery[0], node0is the root of an SCC:- Pop
0from the stack:stack = [],inStack[0] = false. - Create SCC
[0]and add toscc:scc = [[0]].
- Pop
-
Continue with Node 1:
-
discovery[1] = 1,low[1] = 1(set discovery and low time to 1). -
Push
1onto the stack:stack = [1],inStack[1] = true. -
Loop through all nodes
v:- For
v = 0:graph[1][0] = 0, no direct connection. - For
v = 1:graph[1][1] = 1butu == v, skip. - For
v = 2:graph[1][2] = 1, direct connection exists:- Since
discovery[2] = -1, node2is unvisited. Setparent[2] = 1and calltarjanSCCfor node2.
- Since
- For
-
DFS on Node 2:
discovery[2] = 2,low[2] = 2(set discovery and low time to 2).- Push
2onto the stack:stack = [1, 2],inStack[2] = true. - Loop through all nodes
v:- For
v = 0andv = 1:graph[2][0] = 0,graph[2][1] = 1butv = 1is already visited and in stack. Updatelow[2] = min(low[2], discovery[1]) = min(2, 1) = 1. - For
v = 2:graph[2][2] = 1butu == v, skip.
- For
- Since
low[2] == discovery[2], node2is part of an SCC:- Pop
2from the stack:stack = [1],inStack[2] = false. - Create SCC
[2]and add toscc:scc = [[0], [2]].
- Pop
-
Back to Node 1:
- After DFS on
2, updatelow[1] = min(low[1], low[2]) = min(1, 1) = 1. - Since
low[1] == discovery[1], node1is the root of an SCC:- Pop
1from the stack:stack = [],inStack[1] = false. - Create SCC
[1, 2]and add toscc:scc = [[0], [1, 2]].
- Pop
- After DFS on
-
-
-
Map Nodes to Their SCCs and Count Sizes:
componentMap = [0, 1, 1](node0in SCC 0, nodes1and2in SCC 1).componentSize = [1, 2](SCC 0 has 1 node, SCC 1 has 2 nodes).
-
Count Infected Nodes in Each SCC:
- Loop through
initial = [0, 1]:- For
node = 0,comp = componentMap[0] = 0. IncrementinfectedCount[0] = 1. - For
node = 1,comp = componentMap[1] = 1. IncrementinfectedCount[1] = 1.
- For
- Loop through
-
Sort the Initial Array:
initial = [0, 1](already sorted).
-
Determine the Best Node to Remove:
- Initialize
result = 0andmaxSavedNodes = -1. - Loop through
initial = [0, 1]:- For
node = 0,comp = componentMap[0] = 0:infectedCount[0] == 1(only one malware node in SCC 0).componentSize[0] = 1>maxSavedNodes = -1. UpdatemaxSavedNodes = 1andresult = 0.
- For
node = 1,comp = componentMap[1] = 1:infectedCount[1] == 1(only one malware node in SCC 1).componentSize[1] = 2>maxSavedNodes = 1. UpdatemaxSavedNodes = 2andresult = 1.
- For
- Initialize
-
Return the Result:
- The best node to remove is
1, as it minimizes the total infected nodes. Return1.
- The best node to remove is
Code
import java.util.*;
class Solution {
private int time = 0;
public int minimizeMalwareSpread(int[][] graph, int[] initial) {
int n = graph.length;
int[] discovery = new int[n]; // Time of discovery of nodes
int[] low = new int[n]; // Lowest point that can be reached
int[] parent = new int[n]; // Parent of nodes in DFS
boolean[] inStack = new boolean[n]; // To check if a node is in the stack
Stack<Integer> stack = new Stack<>();
Arrays.fill(discovery, -1);
Arrays.fill(low, -1);
Arrays.fill(parent, -1);
List<List<Integer>> scc = new ArrayList<>(); // List to hold all SCCs
// Tarjan's algorithm to find SCCs
for (int i = 0; i < n; i++) {
if (discovery[i] == -1) {
tarjanSCC(i, graph, discovery, low, parent, inStack, stack, scc);
}
}
int[] infectedCount = new int[scc.size()];
int[] componentSize = new int[scc.size()];
int[] componentMap = new int[n];
// Map nodes to their SCC and count the size of each SCC
for (int i = 0; i < scc.size(); i++) {
for (int node : scc.get(i)) {
componentMap[node] = i;
componentSize[i]++;
}
}
// Count infected nodes in each SCC
for (int node : initial) {
int comp = componentMap[node];
infectedCount[comp]++;
}
Arrays.sort(initial);
int result = initial[0];
int maxSavedNodes = -1;
// Determine the best node to remove
for (int node : initial) {
int comp = componentMap[node];
if (infectedCount[comp] == 1) {
// Only one malware node in this SCC
if (componentSize[comp] > maxSavedNodes) {
maxSavedNodes = componentSize[comp];
result = node;
}
}
}
return result;
}
private void tarjanSCC(
int u,
int[][] graph,
int[] discovery,
int[] low,
int[] parent,
boolean[] inStack,
Stack<Integer> stack,
List<List<Integer>> scc
) {
discovery[u] = low[u] = time++;
stack.push(u);
inStack[u] = true;
for (int v = 0; v < graph.length; v++) {
if (graph[u][v] == 1 && u != v) {
// There's an edge from u to v
if (discovery[v] == -1) {
// If v is not visited
parent[v] = u;
tarjanSCC(v, graph, discovery, low, parent, inStack, stack, scc);
low[u] = Math.min(low[u], low[v]); // Update the low-link value of u
} else if (inStack[v]) {
// Back edge found
low[u] = Math.min(low[u], discovery[v]); // Update the low-link value of u
}
}
}
// If u is a root node, pop the stack and form an SCC
if (low[u] == discovery[u]) {
List<Integer> currentSCC = new ArrayList<>();
while (true) {
int v = stack.pop();
inStack[v] = false;
currentSCC.add(v);
if (v == u) break;
}
scc.add(currentSCC);
}
}
public static void main(String[] args) {
Solution sol = new Solution();
int[][] graph1 = { { 1, 0, 0 }, { 0, 1, 1 }, { 0, 1, 1 } };
int[] initial1 = { 0, 1 };
System.out.println(sol.minimizeMalwareSpread(graph1, initial1)); // Output: 1
int[][] graph2 = {
{ 1, 1, 0, 0 },
{ 1, 1, 0, 0 },
{ 0, 0, 1, 1 },
{ 0, 0, 1, 1 },
};
int[] initial2 = { 1, 2 };
System.out.println(sol.minimizeMalwareSpread(graph2, initial2)); // Output: 1
int[][] graph3 = {
{ 1, 0, 0, 1 },
{ 0, 1, 1, 0 },
{ 0, 1, 1, 1 },
{ 1, 0, 1, 1 },
};
int[] initial3 = { 0, 2 };
System.out.println(sol.minimizeMalwareSpread(graph3, initial3)); // Output: 0
}
}
Complexity Analysis:
Time Complexity
-
Finding Strongly Connected Components (SCCs):
The Tarjan's algorithm used in the solution runs a Depth-First Search (DFS) on the graph. Since each node and each edge are visited exactly once, the time complexity for this part is, where Vis the number of nodes (vertices) andEis the number of edges. In the worst case, we can haveNvertices and N2 edges. -
Mapping Nodes to SCCs and Counting Sizes/Infected Nodes:
After identifying all SCCs, the solution iterates over the nodes to map them to their respective SCCs and compute the sizes of the SCCs. This takestime. -
Sorting the Initial Infected Nodes:
Sorting the initial array takestime, where N is the number of initially infected nodes if we assume initially all vertices are infected. -
Determining the Best Node to Remove:
This involves iterating over the initial array, which takestime.
Thus, the overall time complexity of the solution is
Space Complexity
-
Storage for Discovery, Low, Parent, In-Stack Arrays:
Each of these arrays takesspace, where N is the number of nodes. -
Stack for DFS and SCC List:
The stack and list to store SCCs both requirespace in the worst case. -
Component Mapping Arrays:
Arrays such asinfectedCount,componentSize, andcomponentMapalso requirespace.
Thus, the overall space complexity is
🤖 Don't fully get this? Learn it with Claude
Stuck on Minimize Malware Spread? Open Claude, copy a block below, and it'll teach you this exact concept — visually and interactively.
Progressively stronger hints — you still solve it.
I'm working on the problem **Minimize Malware Spread** (DSA). Give me a HINT LADDER: start with the tiniest nudge, then wait. Only reveal the next, stronger hint when I ask. Do NOT show the full solution unless I type 'show solution'. Keep me doing the thinking. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
See the technique, not just code.
Explain the optimal approach to **Minimize Malware Spread** with a VISUAL walkthrough: trace it on a small concrete example using ASCII art / a step-by-step diagram, narrate what changes each step, then give time & space complexity with a one-line derivation. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
Catch bugs, edge cases, sub-optimality.
I'll paste my solution to **Minimize Malware Spread**. Review it for correctness, missed edge cases, and time/space complexity, then coach me toward the optimal — don't just rewrite it. Ask me to paste my code now. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
Lock in recognition with look-alikes.
Give me 2 problems that use the SAME underlying pattern as **Minimize Malware Spread**. For each, let me attempt first, then review my answer and name the trigger signal that reveals the pattern. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.