Operators of distributed systems often find themselves needing to answer a diagnostic or forensic question. Some part of the system is found to be in an unexpected state; for example, a suspicious routing table entry is discovered, or a proxy cache is found to contain an unusually large number of advertisements. The operators must determine the causes of this state before they can decide on an appropriate response. On the one hand, there may be an innocent explanation: the routing table entry could be the result of a misconfiguration, and the cache entries could have appeared due to a workload change. On the other hand, the unexpected state may be the symptom of an ongoing attack: the routing table entry could be the result of route hijacking, and the cache entries could be a side effect of a malware infection. In this situation, it would be helpful to be able to ask the system to “explain” its own state, e.g., by describing a chain of events that links the state to its root causes, such as external inputs.

Emerging network provenance techniques can construct such explanations. However, the situation is more complicated if some of the nodes are faulty or have been compromised: an adversary can cause the nodes under their control to lie, suppress information, tamper with existing data, or report nonexistent events. This can turn the provenance system from an advantage into a liability: its answers may lead operators to stop investigating an ongoing attack because everything appears to be fine. Moreover, current provenance techniques mostly focus on explaining the results of computations (“Why did this happen?”), but in reality the range of diagnostic questions is much broader – for instance, it may be necessary to explain why something did not happen, or why it took such a long time.


The goal of this project is to extend and generalize network provenance by adding capabilities to deal with a wide variety of diagnostic and forensic questions. We are also working on better ways to extract, store, and query provenance, as well as on stronger theoretical underpinnings. We have been evaluating our techniques in the context of concrete applications such as Hadoop MapReduce, BGP interdomain routing, and Software-Defined Networks; our case studies so far include data center diagnostics, detecting net neutrality violations, and identifying advanced persistent threats.

For more information, see NetDB@Penn and Secure Network Provenance.

Collaborators

Funding

  • NSF

Students

  • W. Brad Moore
  • Arjun Narayan
  • Tian Yang
  • Mingchen Zhao