Complex inter-dependent data: Looking beyond conventional networks can lead to better predictions
Zebra mussels, a ship-borne invasive species, are such a problem in American waters that they cost the U.S. power industry alone $3.1 billion in economic losses from 1993 to 1999, mainly by blocking the pipes that deliver water to cooling plants. To predict where the mussels might end up next, so that preventive measures can be taken, researchers have relied on network science, a way to identify patterns and meaningful connections in fields as varied as invasive species, international terrorism, social networks and infectious diseases.
Network science enables an understanding and modeling of the interconnected world, whether social, biological, physical or organizational. New research from a team of University of Notre Dame researchers led by Nitesh Chawla, Frank M. Freimann Professor of Computer Science and Engineering and Director of the Interdisciplinary Center for Network Science and Applications (iCeNSA), suggests that current algorithms to represent networks have not truly considered the complex inter-dependencies in data, which can lead to erroneous analysis or predictions. Chawla’s team has developed a new algorithm that offers the promise of more precise network representation and accurate analysis.
“With this paper, we have made a significant advance in network theory to more accurately and precisely represent complex dependencies in data,” Chawla said.
One example the researchers cite in the paper to show how the algorithm works is the study of invasive species driven by the global shipping network.
“Species may be carried unintentionally by ships from port to port and cause invasions, thus ship movements connect ports in the world in an implicit species flow network,” Chawla said. “By identifying higher-order dependencies in ship movements, namely where a ship is more likely to go next given its previous steps, we can more accurately model ship movements and therefore species flow dynamics, for the analysis and prediction of invasive species.”
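The idea of a higher-order dependency can be illustrated with a small sketch. This is not the team's algorithm; it is a toy comparison, under assumed data, of a first-order model (the next port depends only on the current port) with a second-order model (the next port depends on the previous two). The routes and the function name `transition_probs` are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def transition_probs(trajectories, order):
    """Estimate P(next port | previous `order` ports) from port sequences."""
    counts = defaultdict(Counter)
    for route in trajectories:
        for i in range(order, len(route)):
            # The context is the last `order` ports visited before step i.
            context = tuple(route[i - order:i])
            counts[context][route[i]] += 1
    return {
        context: {port: n / sum(nexts.values()) for port, n in nexts.items()}
        for context, nexts in counts.items()
    }


# Hypothetical toy routes, invented for illustration only.
routes = [
    ["Shanghai", "Singapore", "Los Angeles"],
    ["Shanghai", "Singapore", "Los Angeles"],
    ["Tokyo", "Singapore", "Seattle"],
    ["Tokyo", "Singapore", "Seattle"],
]

first_order = transition_probs(routes, order=1)
second_order = transition_probs(routes, order=2)

# Knowing only the current port, the next stop from Singapore looks like a coin flip.
print(first_order[("Singapore",)])
# Knowing the previous port as well removes the ambiguity in this toy data.
print(second_order[("Shanghai", "Singapore")])
print(second_order[("Tokyo", "Singapore")])
```

In this toy data, a conventional first-order network would treat a ship leaving Singapore as equally likely to head for either destination, while conditioning on where the ship came from separates the two flows. A variable-order representation, as described in the article, would keep the longer context only where it changes the prediction.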
Chawla described the new method as a general approach that can potentially influence a broad range of fields.
“For example, more accurately representing flow of information on networks can give a more accurate representation of the complex social interactions and the flow of information, which can be of interest for telecom companies, social media, and so on,” he said. “This work also has strong applications for modeling infectious disease spreads, which are a function of complex dependencies (human to human, human to species, etc.). Our method can readily be applied to other types of traffic data such as taxi movements and human trajectories, which the government can leverage for urban planning, and merchants can use for customer behavior analysis and prediction. The ability to extract and represent higher-order navigation patterns can also be used to analyze web clickstreams and network access patterns, with potential applications from website optimization to intruder detection (based on anomalous access patterns) for security and defense.”
Chawla and his fellow researchers are the first to develop a variable higher-order network representation algorithm.
“It is a fundamental and transformative advance in network representation to automatically discover the orders of dependency among components of a complex interconnected world,” he said.
Jian Xu and Thanuka Wickramarathne from Notre Dame are coauthors of the paper, which appears in the journal Science Advances.