hard Minimize Malware Spread II
Problem Statement
You are given a network of n nodes, represented as an n x n adjacency matrix called graph. In this graph, graph[i][j] = 1 indicates that node i is directly connected to node j. If graph[i][j] = 0, there is no direct connection between them.
Some nodes in this network are initially infected by malware, and these nodes are listed in the array initial. When a node is infected, the malware spreads to all nodes that are directly connected to it. This spread continues until no more nodes can be infected.
Suppose M(initial) is the final number of nodes infected with malware in the entire network after the spread of malware stops.
We will remove exactly one node from initial, completely removing it and any connections from this node to any other node.
Return the node that, if removed, would minimize M(initial). If there is more than one possible node that can be removed to achieve this goal, return the node with the smallest index.
Examples
Example 1:
- Input: graph:
[[1, 0, 1], [0, 1, 1], [1, 1, 1]]- initial:
[0, 2]
- initial:
- Expected Output:
2 - Justification: By removing node
2from the initial list, we stop the malware from spreading from node2to node1. Thus, only node0remains infected, minimizing the total infected nodes to1.
Example 2:
- Input: graph:
[[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]- initial:
[1, 2]
- initial:
- Expected Output:
1
- Justification: Removing node
1from the initial list will prevent the malware from spreading in the first cluster. The total number of infected nodes will be minimized to2since only nodes2and3will remain infected.
Example 3:
- Input: graph:
[[1, 0, 0, 1], [0, 1, 1, 0], [0, 1, 1, 1], [1, 0, 1, 1]]- initial:
[0, 2]
- initial:
- Expected Output:
2
- Justification: If node
2is removed, the malware spread will be restricted only to node0and node3. This minimizes the spread of malware, reducing the total infected nodes to2.
Try it yourself
Try solving this question here:
✅ Solution Minimize Malware Spread II
Problem Statement
You are given a network of n nodes, represented as an n x n adjacency matrix called graph. In this graph, graph[i][j] = 1 indicates that node i is directly connected to node j. If graph[i][j] = 0, there is no direct connection between them.
Some nodes in this network are initially infected by malware, and these nodes are listed in the array initial. When a node is infected, the malware spreads to all nodes that are directly connected to it. This spread continues until no more nodes can be infected.
Suppose M(initial) is the final number of nodes infected with malware in the entire network after the spread of malware stops.
We will remove exactly one node from initial, completely removing it and any connections from this node to any other node.
Return the node that, if removed, would minimize M(initial). If there is more than one possible node that can be removed to achieve this goal, return the node with the smallest index.
Examples
Example 1:
- Input: graph:
[[1, 0, 1], [0, 1, 1], [1, 1, 1]]- initial:
[0, 2]
- initial:
- Expected Output:
2
- Justification: By removing node
2from the initial list, we stop the malware from spreading from node2to node1. Thus, only node0remains infected, minimizing the total infected nodes to1.
Example 2:
- Input: graph:
[[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]- initial:
[1, 2]
- initial:
- Expected Output:
1
- Justification: Removing node
1from the initial list will prevent the malware from spreading in the first cluster. The total number of infected nodes will be minimized to2since only nodes2and3will remain infected.
Example 3:
- Input: graph:
[[1, 0, 0, 1], [0, 1, 1, 0], [0, 1, 1, 1], [1, 0, 1, 1]]- initial:
[0, 2]
- initial:
- Expected Output:
2
- Justification: If node
2is removed, the malware spread will be restricted only to node0and node3. This minimizes the spread of malware, reducing the total infected nodes to2.
Solution
To solve the problem, the approach involves using Tarjan's algorithm, which is typically employed to find strongly connected components in a graph. The algorithm performs a Depth-First Search (DFS) from each initially infected node to determine the impact of removing that node from the network. The DFS is used to compute the discovery times and the lowest reachable times for each node. These times help identify the connected components and how the malware spreads throughout the network.
The primary objective is to find the node whose removal results in the smallest number of infections. We use a recursive DFS function to traverse the graph from each infected node. During this traversal, we keep track of nodes visited, compute the size of each component affected by malware, and identify nodes that, when removed, minimize the spread of malware. The node with the maximum impact on reducing infections is chosen, and in case of ties, the node with the smallest index is selected.
Step-by-Step Algorithm
-
Initialization:
- Determine the number of nodes in the graph (
numNodes) by getting the length of the adjacency matrix. - Set
nodeToRemoveto the first element of theinitiallist, assuming it as the default node to be removed. - Initialize
maxSavedNodesto0to keep track of the maximum number of nodes that can be saved by removing a single node.
- Determine the number of nodes in the graph (
-
Mark Initially Infected Nodes:
- Create a boolean array
isInfectedof sizenumNodesto mark which nodes are initially infected. - Loop through each node in the
initialarray and setisInfected[node]totrueto indicate that these nodes are infected.
- Create a boolean array
-
Prepare for DFS Traversal:
- Initialize three arrays
discoveryTime,lowestTime, andmalwareSpreadCount, each of sizenumNodes, filled with0. discoveryTime: Tracks the discovery time of each node during the DFS traversal.lowestTime: Tracks the lowest discovery time reachable from each node.malwareSpreadCount: Tracks the number of nodes saved by removing each initially infected node.
- Initialize three arrays
-
DFS Traversal for Each Infected Node:
- Iterate through each node in the
initialarray:- If
discoveryTime[node]is0, perform a DFS from that node:- Set
discoveryTime[currentNode]andlowestTime[currentNode]tocurrentTime. - Initialize
isMalwareSpreadtotrueif the current node (currentNode) is infected, otherwisefalse. - Initialize
componentSizeto1as the node itself is a part of its connected component. - Traverse all connected nodes of
currentNode:- If there is a connection (
graph[currentNode][neighbor] == 1):- Check if the
neighbornode has been visited (discoveryTime[neighbor] == 0):- If not visited, recursively call DFS for
neighborwith updatedcurrentTime(currentTime + 1). - After returning from DFS call:
- If
subTreeSizeis0(indicating that a subtree contains a malware node), setisMalwareSpreadtotrue. - If
subTreeSizeis greater than0, add it tocomponentSize. - If
lowestTime[neighbor]is greater than or equal todiscoveryTime[currentNode], updatemalwareSpreadCount[currentNode]by addingsubTreeSize. - Update
lowestTime[currentNode]to the minimum oflowestTime[currentNode]andlowestTime[neighbor].
- If
- If not visited, recursively call DFS for
- If
neighboris not the parent of thecurrentNode, updatelowestTime[currentNode]to the minimum oflowestTime[currentNode]anddiscoveryTime[neighbor].
- Check if the
- If there is a connection (
- Set
- If
- Iterate through each node in the
-
Compute Results:
- After DFS traversal for all initially infected nodes, compute the number of nodes saved by removing each initially infected node.
- Iterate through the
malwareSpreadCountarray:- If the number of saved nodes (
malwareSpreadCount[node]) is greater thanmaxSavedNodesor if it equalsmaxSavedNodesbut the node index is smaller, updatemaxSavedNodesandnodeToRemoveaccordingly.
- If the number of saved nodes (
-
Return the Result:
- Return
nodeToRemoveas the node whose removal results in minimizing the spread of malware.
- Return
Algorithm Walkthrough
Given the input:
- graph:
[[1, 0, 1], [0, 1, 1], [1, 1, 1]] - initial:
[0, 2]
-
Initialization:
numNodesis 3 (since there are 3 nodes in the graph).nodeToRemoveis initialized to0(first node in the initial list).maxSavedNodesis initialized to0.isInfectedis[false, false, false]initially.
-
Mark Initially Infected Nodes:
isInfected[0]is set totrue.isInfected[2]is set totrue.- Now,
isInfectedbecomes[true, false, true].
-
Prepare for DFS Traversal:
discoveryTimeis[0, 0, 0].lowestTimeis[0, 0, 0].malwareSpreadCountis[0, 0, 0].
-
DFS Traversal for Each Infected Node:
-
Start with node
0(first infected node):discoveryTime[0]is0, so perform DFS on node0:- Set
discoveryTime[0]andlowestTime[0]to1. - Initialize
isMalwareSpreadtotrue(node0is infected). - Initialize
componentSizeto1. - Traverse neighbors of node
0:- Neighbor 0: Already visited (it's the current node), skip.
- Neighbor 1: No direct connection (
graph[0][1]is0), skip. - Neighbor 2: Connected (
graph[0][2]is1) anddiscoveryTime[2]is0:- Perform DFS on node
2:- Set
discoveryTime[2]andlowestTime[2]to2. - Initialize
isMalwareSpreadtotrue(node2is infected). - Initialize
componentSizeto1. - Traverse neighbors of node
2:- Neighbor 0: Connected (
graph[2][0]is1), but it's the parent, so updatelowestTime[2]tomin(2, 1) = 1. - Neighbor 1: Connected (
graph[2][1]is1),discoveryTime[1]is0:- Perform DFS on node
1:- Set
discoveryTime[1]andlowestTime[1]to3. - Initialize
isMalwareSpreadtofalse(node1is not infected). - Initialize
componentSizeto1. - Traverse neighbors of node
1:- Neighbor 0: No direct connection (
graph[1][0]is0), skip. - Neighbor 1: Already visited (it's the current node), skip.
- Neighbor 2: Connected (
graph[1][2]is1), but already visited, updatelowestTime[1]tomin(3, 2) = 2.
- Neighbor 0: No direct connection (
- Return
1(size of the component not affected by malware).
- Set
- Perform DFS on node
- Node
2:subTreeSizeis1(node1is not infected), add tocomponentSizemaking it2. - Update
malwareSpreadCount[2]to1(contributed bysubTreeSize).
- Neighbor 0: Connected (
- Neighbor 2: Already visited (it's the current node), skip.
- Set
- Return
0(since node2is infected).
- Perform DFS on node
- Node
0:subTreeSizeis0(node2is infected), setisMalwareSpreadtotrue. - Node
0: UpdatelowestTime[0]tomin(1, 1) = 1.
- Set
-
Compute Result:
- After DFS,
malwareSpreadCountis[0, 0, 1]. - Check node
0:malwareSpreadCount[0]is0,maxSavedNodesremains0. - Check node
2:malwareSpreadCount[2]is1, updatemaxSavedNodesto1, setnodeToRemoveto2.
- After DFS,
-
-
Return Result:
- The optimal node to remove is
2, as it results in minimizing the spread of malware.
- The optimal node to remove is
Code
import java.util.*;
public class Solution {
// Helper method to perform DFS using Tarjan's algorithm
private int dfs(
int[][] graph,
int currentNode,
int parent,
int currentTime,
int[] discoveryTime,
int[] lowestTime,
boolean[] isInfected,
int[] malwareSpreadCount
) {
// Initialize discovery and lowest reachable time for the current node
lowestTime[currentNode] = discoveryTime[currentNode] = currentTime;
boolean isMalwareSpread = isInfected[currentNode];
int componentSize = 1; // Initialize component size
// Traverse all connected nodes
for (int neighbor = 0; neighbor < graph[currentNode].length; neighbor++) {
if (graph[currentNode][neighbor] == 1) {
// Check if there is a connection
if (discoveryTime[neighbor] == 0) {
// If the neighbor has not been visited
int subTreeSize = dfs(
graph,
neighbor,
currentNode,
currentTime + 1,
discoveryTime,
lowestTime,
isInfected,
malwareSpreadCount
);
if (subTreeSize == 0) {
// If subtree contains a malware node
isMalwareSpread = true;
} else {
componentSize += subTreeSize; // Add subtree size to the current component size
}
// Update malware spread count if the lowest time of the neighbor is greater than or equal to the discovery time
if (lowestTime[neighbor] >= discoveryTime[currentNode]) {
malwareSpreadCount[currentNode] += subTreeSize;
}
lowestTime[currentNode] = Math.min(
lowestTime[currentNode],
lowestTime[neighbor]
); // Update lowest reachable time
} else if (neighbor != parent) {
// If the neighbor is not the parent, update the lowest reachable time
lowestTime[currentNode] = Math.min(
lowestTime[currentNode],
discoveryTime[neighbor]
);
}
}
}
return isMalwareSpread ? 0 : componentSize; // Return the size of the component if malware doesn't spread
}
public int minMalwareSpread(int[][] graph, int[] initial) {
int numNodes = graph.length; // Number of nodes in the graph
int nodeToRemove = initial[0]; // Start with the first node in the initial array
int maxSavedNodes = 0; // Maximum number of saved nodes
boolean[] isInfected = new boolean[numNodes];
for (int node : initial) isInfected[node] = true; // Mark initially infected nodes
int[] discoveryTime = new int[numNodes];
int[] lowestTime = new int[numNodes];
int[] malwareSpreadCount = new int[numNodes];
// Perform DFS for each initially infected node
for (int node : initial) {
if (discoveryTime[node] == 0) {
dfs(
graph,
node,
-1,
1,
discoveryTime,
lowestTime,
isInfected,
malwareSpreadCount
);
}
// Choose the node that maximizes the number of saved nodes
if (
malwareSpreadCount[node] > maxSavedNodes ||
(malwareSpreadCount[node] == maxSavedNodes && node < nodeToRemove)
) {
maxSavedNodes = malwareSpreadCount[node];
nodeToRemove = node;
}
}
return nodeToRemove;
}
// Main method for testing
public static void main(String[] args) {
Solution solution = new Solution();
int[][] graph1 = { { 1, 0, 1 }, { 0, 1, 1 }, { 1, 1, 1 } };
int[] initial1 = { 0, 2 };
System.out.println(
"Result for Example 1: " + solution.minMalwareSpread(graph1, initial1)
);
int[][] graph2 = {
{ 1, 1, 0, 0 },
{ 1, 1, 0, 0 },
{ 0, 0, 1, 1 },
{ 0, 0, 1, 1 },
};
int[] initial2 = { 1, 2 };
System.out.println(
"Result for Example 2: " + solution.minMalwareSpread(graph2, initial2)
);
int[][] graph3 = {
{ 1, 0, 0, 1 },
{ 0, 1, 1, 0 },
{ 0, 1, 1, 1 },
{ 1, 0, 1, 1 },
};
int[] initial3 = { 0, 2 };
System.out.println(
"Result for Example 3: " + solution.minMalwareSpread(graph3, initial3)
);
}
}
Complexity Analysis
Time Complexity
The time complexity of the above solution is
- The DFS (Depth First Search) traversal using Tarjan's algorithm is done for each node, and each edge is explored at most once. Thus, the DFS takes
time.
Overall, the time complexity is
Space Complexity
The space complexity is
- Arrays like
discoveryTime,lowestTime,malwareSpreadCount, andisInfected, each of size. - The recursion stack used in the DFS, which can go up to
in the worst case.
Therefore, the total space complexity is
🤖 Don't fully get this? Learn it with Claude
Stuck on Minimize Malware Spread II? Open Claude, copy a block below, and it'll teach you this exact concept — visually and interactively.
Progressively stronger hints — you still solve it.
I'm working on the problem **Minimize Malware Spread II** (DSA). Give me a HINT LADDER: start with the tiniest nudge, then wait. Only reveal the next, stronger hint when I ask. Do NOT show the full solution unless I type 'show solution'. Keep me doing the thinking. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
See the technique, not just code.
Explain the optimal approach to **Minimize Malware Spread II** with a VISUAL walkthrough: trace it on a small concrete example using ASCII art / a step-by-step diagram, narrate what changes each step, then give time & space complexity with a one-line derivation. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
Catch bugs, edge cases, sub-optimality.
I'll paste my solution to **Minimize Malware Spread II**. Review it for correctness, missed edge cases, and time/space complexity, then coach me toward the optimal — don't just rewrite it. Ask me to paste my code now. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.
Lock in recognition with look-alikes.
Give me 2 problems that use the SAME underlying pattern as **Minimize Malware Spread II**. For each, let me attempt first, then review my answer and name the trigger signal that reveals the pattern. If you're unsure or a claim isn't standard, say so and reason from first principles instead of guessing.