Methodology: Non-traditional Data Mining

Data science uses scientific methods, programming tools, and data infrastructures to seek a better understanding of the system(s) under investigation. A range of useful techniques has emerged, such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), and graph neural networks (GNNs). These techniques can be valuable for deriving insight from data to inform a model's structure or to calibrate and validate agent-based models. Below is information on one type of non-traditional data mining: machine learning via neural networks.

Machine learning via neural networks: Among the machine learning techniques that go beyond standard regression, neural networks have emerged as one of the most powerful. A typical neural network consists of nodes connected by links. As input data are fed into the network, each node receives messages from its sending nodes and ‘fires’ messages to its receiving nodes depending on some pre-determined conditions. As in a regression equation, these conditions depend on a set of parameters, which can be optimized.
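The firing mechanism described above can be illustrated with a minimal sketch. It assumes a sigmoid as the firing condition and uses made-up weights and biases (the parameters that training would normally optimize); the function names are hypothetical and only the Python standard library is used.

```python
import math

def sigmoid(x):
    """Smooth 'firing' rule: maps any real input to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, weights, biases):
    """Each receiving node sums its weighted incoming messages, adds a bias, and fires."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]

def network_forward(inputs, layers):
    """Pass the input through each layer in turn; one layer's output feeds the next."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations

# Hypothetical parameters (not trained): one row of weights per receiving node.
layers = [
    ([[0.5, -0.5], [1.0, 1.0]], [0.0, -1.0]),  # hidden layer: 2 nodes, 2 incoming links each
    ([[1.0, -1.0]], [0.0]),                    # output layer: 1 node
]

output = network_forward([1.0, 1.0], layers)
print(output)
```

Optimizing the network means adjusting the numbers in `layers` until the outputs match observed data, much as regression coefficients are fitted.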

Neural networks mimic the structure of human or animal brains, which makes them interesting and useful to researchers of complex systems. Instead of defining decision rules for agents by hand, each agent could be implemented with its own neural network (Zhang et al. 2016). Calibrating the model would then involve optimizing the neural networks of all the agents. However, the use of neural networks to control the behavior of agents directly is relatively rare. This could be because calibrating such a model can be extremely difficult: a single neural network typically requires a large amount of training data, which makes a model with a large number of neural networks (one per agent) very challenging to train.

GNNs learn node representations by recursively aggregating information from neighboring nodes. Classical GNN tasks include graph classification (e.g., molecular structure classification; Ying et al. 2018), node classification (e.g., publication classification in an article network; Kipf and Welling 2016; Karimi et al. 2019), link prediction (e.g., predicting interactions in a social network; Zhang and Chen 2018), and collaborative filtering (e.g., recommendation systems like those of Amazon or Netflix; Wang et al. 2019). Most population-level interactions come naturally in the form of graphs and can be modeled as graph edges. For example, in geospatial data, spatial objects can be represented as nodes and their topological/attribute relationships as links, making GNNs a natural model choice. Furthermore, GNNs can be cascaded with other models such as CNNs, for joint information extraction at the individual level (through CNNs or similar models) and interaction modeling at the population level (through a GNN).
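To make "recursively aggregating information from neighboring nodes" concrete, here is a minimal sketch of the aggregation step alone. It assumes plain mean aggregation with no learned weights or nonlinearity, which a real GNN layer would add; the three-node graph and its feature vectors are hypothetical.

```python
def aggregate_neighbors(features, adjacency):
    """One aggregation round: each node averages its own feature vector with its neighbors'."""
    updated = {}
    for node, own in features.items():
        group = [own] + [features[n] for n in adjacency.get(node, [])]
        updated[node] = [sum(values) / len(group) for values in zip(*group)]
    return updated

# Hypothetical path graph a - b - c with two-dimensional node features.
adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
features = {"a": [1.0, 0.0], "b": [0.0, 0.0], "c": [0.0, 1.0]}

one_hop = aggregate_neighbors(features, adjacency)  # a sees only b
two_hop = aggregate_neighbors(one_hop, adjacency)   # a now carries information from c

print(one_hop["a"], two_hop["a"])
```

After one round, node a holds no trace of node c; after two rounds it does, which is the sense in which repeated aggregation spreads information across the graph.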

A concrete example of using GNNs to derive behavioral rules of networked entities can be found in a recent application to the autonomous flocking of multi-agent robot swarms. The authors consider a decentralized network of moving robot agents: each agent is viewed as a node in a dynamic graph, any two agents within communication range are connected by an edge, and changes in this graph are determined by both a GNN and an RNN. A GNN is applied on top of the graph to aggregate and forecast population-level behavior patterns. Further, each individual agent (node) perceives the visual environment and extracts features with its own convolutional neural network (CNN), which processes each drone’s visual input like “eyes”. The resulting model is then a CNN-GNN stack that can be trained end to end.
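The dynamic graph in this example can be sketched as follows. This shows only how edges appear and disappear as agents move, not the flocking model itself; the agent names, positions, and communication range are hypothetical.

```python
import math

def communication_graph(positions, comm_range):
    """Link every pair of agents whose distance is at most comm_range."""
    ids = list(positions)
    neighbors = {i: set() for i in ids}
    for idx, i in enumerate(ids):
        for j in ids[idx + 1:]:
            if math.dist(positions[i], positions[j]) <= comm_range:
                neighbors[i].add(j)
                neighbors[j].add(i)
    return neighbors

# Hypothetical robot positions at two consecutive time steps.
positions_t0 = {"r1": (0.0, 0.0), "r2": (1.0, 0.0), "r3": (5.0, 0.0)}
positions_t1 = {"r1": (0.0, 0.0), "r2": (3.0, 0.0), "r3": (4.0, 0.0)}

graph_t0 = communication_graph(positions_t0, comm_range=1.5)
graph_t1 = communication_graph(positions_t1, comm_range=1.5)

print(graph_t0)
print(graph_t1)
```

As r2 moves away from r1 and toward r3, the edge r1-r2 is dropped and the edge r2-r3 is created, so the graph a GNN operates on must be rebuilt at every time step.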

Similar ideas could potentially be extended to any dynamic network with semantically rich nodes, such as forecasting COVID-19 transmission by exploiting multimedia information from a social network, where people are the nodes and person-to-person contacts are the links that change over time. This would enable modelers to use a separate RNN for each agent to model each person’s health status (a node attribute) over time. An RNN alone, which ignores population influences, can serve as a naïve baseline, while a GNN can be used to model the population-level interactions (edges) that change over time.
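The per-agent RNN baseline can be sketched with a single recurrent unit. This is a simplified illustration under stated assumptions: the hidden state is one number, the weights are hand-picked rather than trained, and the exposure series is invented for the example.

```python
import math

def rnn_step(hidden, observation, w_hidden, w_input, bias):
    """Elman-style update: the new state mixes the previous state with the new observation."""
    return math.tanh(w_hidden * hidden + w_input * observation + bias)

def run_rnn(observations, w_hidden=0.8, w_input=1.0, bias=0.0):
    """Run one agent's RNN over its own time series, returning the state after each step."""
    hidden = 0.0
    states = []
    for obs in observations:
        hidden = rnn_step(hidden, obs, w_hidden, w_input, bias)
        states.append(hidden)
    return states

# Hypothetical daily exposure signal for one person: a single exposure on day 2.
exposure = [0.0, 1.0, 0.0, 0.0]
states = run_rnn(exposure)

print(states)
```

The state stays elevated after the exposure and then decays, so the unit carries a memory of the person's own history. Nothing in it depends on other people, which is why such a model only serves as a baseline without a GNN for the contact edges.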

We can better understand why agents make their decisions through methods such as reinforcement learning, in which positive behaviors are “learned” through repeated exposure to an environment by building and refining deep neural networks. Such networks may capture uncertainty and incomplete knowledge representations. With large-scale individual tracking data, it may be possible to teach agents to navigate spaces as if they were human. These ‘learning’ agents may both better reflect the actual behaviors of humans and model their behavior under changing conditions. Progress is rapidly being made elsewhere (Banino et al. 2018), but integration into geographical modelling remains a challenge (but see, e.g., Abdulkareem et al. 2019, who use Bayesian networks to help simulate complex decision-making with regard to potential cholera infection).

Agent behavioral learning can also happen with the aid of big data. For instance, given time series data on particles’ mass, charge, and position, a GNN can be trained to derive closed-form, symbolic expressions of Newtonian force laws and Hamiltonians (Cranmer et al. 2020). The authors begin with a starting graph (say with n1 particles and n2 edges describing their relationships). They use 1) an edge model to represent the links/edges among all n1 particles. This is the key to their work: there are many candidate equations, expressed as inductive biases, that represent potential mathematical forms of Newtonian force laws, and the goal is to use the GNN to select the function type and fine-tune the values of all parameters in the corresponding function (say one of the functions is named f1). They then use 2) a node model, in which each node (particle) receives messages from all the other (n1-1) particles, with the magnitude of each message (i.e., the amount of gravity) calculated from the candidate function (say f1). Finally, they use 3) a global model to aggregate and update the status of all messages and nodes over time.
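The three components above can be sketched as plain functions. This is not the authors' implementation: the candidate function f1 is fixed by hand to Newtonian gravity, whereas in the work described it is the quantity to be discovered, and the explicit Euler update in the global step, the constant G, and the particle values are all assumptions made for illustration.

```python
import math

G = 1.0  # gravitational constant in arbitrary units (assumption)

def edge_model(receiver, sender):
    """Candidate function f1: force on receiver from sender, G*m1*m2/r^2 along the line joining them.

    Assumes the two particles are at distinct positions (r > 0).
    """
    dx = sender["pos"][0] - receiver["pos"][0]
    dy = sender["pos"][1] - receiver["pos"][1]
    r = math.hypot(dx, dy)
    magnitude = G * receiver["mass"] * sender["mass"] / r ** 2
    return (magnitude * dx / r, magnitude * dy / r)

def node_model(index, particles):
    """Each particle sums the messages it receives from the other n1-1 particles."""
    fx = fy = 0.0
    for j, other in enumerate(particles):
        if j != index:
            mx, my = edge_model(particles[index], other)
            fx += mx
            fy += my
    return (fx, fy)

def global_model(particles, dt):
    """Update all particles by one time step from their aggregated messages (Euler step)."""
    forces = [node_model(i, particles) for i in range(len(particles))]
    updated = []
    for p, (fx, fy) in zip(particles, forces):
        vx = p["vel"][0] + dt * fx / p["mass"]
        vy = p["vel"][1] + dt * fy / p["mass"]
        updated.append({
            "mass": p["mass"],
            "pos": (p["pos"][0] + dt * vx, p["pos"][1] + dt * vy),
            "vel": (vx, vy),
        })
    return updated

# Hypothetical two-particle system, initially at rest.
particles = [
    {"mass": 2.0, "pos": (0.0, 0.0), "vel": (0.0, 0.0)},
    {"mass": 1.0, "pos": (2.0, 0.0), "vel": (0.0, 0.0)},
]

force_on_first = node_model(0, particles)
force_on_second = node_model(1, particles)
next_state = global_model(particles, dt=0.1)

print(force_on_first, force_on_second)
```

In a learned version, `edge_model` would be replaced by a trainable function, and the symbolic expression would be recovered from the messages it produces.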


Readings and References:

Abdulkareem, S. A., Y. T. Mustafa, E.-W. Augustijn, and T. Filatova. 2019. Bayesian networks for spatial learning: a workflow on using limited survey data for intelligent learning in spatial agent-based models. GeoInformatica 23 (2):243–268.

An, L., V. Grimm, A. Sullivan, B. L. Turner II, Z. Wang, N. Malleson, R. Huang, A. Heppenstall, C. Vincenot, D. Robinson, X. Ye, J. Liu, E. Lindvist, and W. Tang. In review. Agent-based complex systems science in the light of data mining and artificial intelligence.

Banino, A., C. Barry, B. Uria, C. Blundell, T. Lillicrap, P. Mirowski, A. Pritzel, M. J. Chadwick, T. Degris, J. Modayil, G. Wayne, H. Soyer, F. Viola, B. Zhang, R. Goroshin, N. Rabinowitz, R. Pascanu, C. Beattie, S. Petersen, A. Sadik, S. Gaffney, H. King, K. Kavukcuoglu, D. Hassabis, R. Hadsell, and D. Kumaran. 2018. Vector-based navigation using grid-like representations in artificial agents. Nature 557:429–433.

Cranmer, M., A. Sanchez-Gonzalez, P. Battaglia, R. Xu, K. Cranmer, D. Spergel, and S. Ho. 2020. Discovering symbolic models from deep learning with inductive biases. arXiv:2006.11287 [cs.LG]. https://arxiv.org/abs/2006.11287.

Karimi, M., D. Wu, Z. Wang, and Y. Shen. 2019. Explainable deep relational networks for predicting compound-protein affinities and contacts. arXiv:1912.12553.

Kipf, T. N., and M. Welling. 2016. Semi-supervised classification with graph convolutional networks. arXiv:1609.02907 [cs.LG].

Wang, X., X. He, M. Wang, F. Feng, and T.-S. Chua. 2019. Neural graph collaborative filtering. In SIGIR’19: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 165–174.

Ying, Z., J. You, C. Morris, X. Ren, W. Hamilton, and J. Leskovec. 2018. Hierarchical graph representation learning with differentiable pooling. In Advances in Neural Information Processing Systems 31, 4800–4810.

Zhang, H., Y. Vorobeychik, J. Letchford, and K. Lakkaraju. 2016. Data-driven agent-based modeling, with application to rooftop solar adoption. Autonomous Agents and Multi-Agent Systems 30 (6):1023–1049.

Zhang, M., and Y. Chen. 2018. Link prediction based on graph neural networks. In Advances in Neural Information Processing Systems 31 (NIPS 2018), 5165–5175.


Notes:

The above text was modified from the appendix of the paper by An et al. (in review). Drs. Nicolas Malleson, Zhangyang Wang, and Ruihong Huang substantially contributed to the text.
