The problem of design, reverse engineering and retrofitting for robust operation of large-scale interconnected dynamical systems is perhaps the engineering grand challenge of our time. Mathematical and engineering tools for the treatment of individual components have been developed to a high degree of sophistication. However, when these components are connected - whether physically or by communication devices - new, collective phenomena can emerge that are not necessarily related to properties of individual components. The local consequences of such phenomena can be sensed - and the drive towards reduced cost and ubiquity of sensors leads to a massive amount of dynamically changing data. The phenomena indicated by sensed data have to be recognized, counteracted or perhaps even utilized dynamically in attempts to achieve optimal design and operation. Below we list some of the critical elements of the applied problem at hand, "Big Data Dynamics in Systems of Systems", together with our viewpoint on the associated research directions.

The typical engineered System of Systems is large and consists of many heterogeneous components. Although commercial modeling tools provide insight into the dynamical behavior of such a system, they do not currently enable parametric investigation of the design space on the time scales required for design. What is needed are methods for such parametric investigation of large-scale, interconnected, hybrid, nonlinear systems with stochastic disturbances.

The nature of the problem is typically cyber-physical. The physical or cyber couplings between diverse subsystems introduce global, emergent phenomena that are hidden from the component viewpoint but can be revealed by combining dynamical systems tools with a systems engineering perspective.

Transients (finite-time phenomena) are important, and thus dynamic measurement data cannot be treated using traditional big data methods or stationary stochastic process theory. What is needed are data-driven methods, which require a nontrivial extension of stationary stochastic process theory and its coupling to the operator-theoretic approach to dynamical systems.

The mathematical structure of the problem is that of a hybrid dynamical system (discrete-time dynamics coupled with continuous-time dynamics) with possibly stochastic elements. At the coarse level, the components of such a system are represented as a product of ordinary differential equations (e.g. flows of air and heat), differential-algebraic equations (constraints) and finite-state machine models for switching components (e.g. communication on/communication off). While the research community is devoting increasing effort to the analysis of such hybrid systems, this effort is concentrated mostly on low-dimensional situations and is context-specific. The Koopman operator-theoretic approach [1] can enable treatment of such diverse features in a unified framework.
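For orientation, the standard definition of the Koopman operator for a discrete-time system can be written as follows. This is a minimal sketch of the textbook setting rather than the full hybrid, stochastic framework described above; the symbols T (the state-update map), M (the state space), f and g (scalar observables), U (the Koopman operator), and \phi, \lambda (an eigenfunction and its eigenvalue) are notation assumed here for illustration.

```latex
% State evolution under a (possibly nonlinear) map T on the state space M
x_{k+1} = T(x_k), \qquad x_k \in M .

% The Koopman operator U acts on observables f : M \to \mathbb{C} by composition
(U f)(x) = f\bigl(T(x)\bigr) .

% U is linear even when T is nonlinear
U(a f + b g) = a\, U f + b\, U g , \qquad a, b \in \mathbb{C} .

% An eigenfunction \phi with eigenvalue \lambda evolves multiplicatively along trajectories
U \phi = \lambda \phi
\quad \Longrightarrow \quad
\phi(x_k) = \lambda^{k} \phi(x_0) .
```

The linearity of U on the space of observables is what allows spectral tools to be applied to systems whose state-space dynamics are nonlinear.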

The dynamics of the coupled system are nonlinear. The dynamical and control systems communities have made an enormous amount of progress on understanding and controlling such phenomena for low-dimensional systems, for example in the context of bifurcation theory. However, a unified approach to the analysis of high-dimensional systems is still lacking. In our work we combine the theory of global (Koopman) modes for model reduction [2], graph decompositions [3] into global-mode defined subsystems, and fast analysis of subsystems using operator-theoretic methods [4] to achieve such a unified approach even in high-dimensional systems.

The definition of the model and its inputs is uncertain. Parameters (the physical constants) of the problem are often known only as a distribution. Alternatively, only the bounds of their range are specified. Inputs into models, such as weather conditions and schedules of use, are inherently stochastic or uncertain. Tools for understanding the performance of the system and providing an envelope of that performance are needed. The probabilistic outputs will be non-Gaussian in nature due to the nonlinearity of the underlying dynamics. For this reason, novel uncertainty metrics [5] need to be defined and coupled to operator-theory based uncertainty propagation tools. The performance under these metrics is related to structural and dynamical features in Systems of Systems.
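The way nonlinearity turns a Gaussian parameter into a non-Gaussian output can be seen in a toy case. The sketch below is an illustration only and is not taken from the cited work: it assumes a scalar logistic model dx/dt = r x (1 - x) with a Gaussian growth rate r, and the function names, the initial state 0.1, the horizon t = 3, and the distribution N(1, 0.3^2) are all hypothetical choices. It uses plain Monte Carlo sampling and the sample skewness as a crude indicator of departure from Gaussianity, not the uncertainty metrics of [5].

```python
import math
import random


def logistic_state(r, x0=0.1, t=3.0):
    """Closed-form solution of dx/dt = r*x*(1 - x) at time t, started from x0."""
    growth = math.exp(r * t)
    return x0 * growth / (1.0 - x0 + x0 * growth)


def sample_skewness(values):
    """Third standardized moment; it is zero for a symmetric (e.g. Gaussian) sample."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def output_skewness(num_samples=20000, seed=0):
    """Propagate a Gaussian growth rate (assumed N(1, 0.3^2)) through the model."""
    rng = random.Random(seed)
    outputs = [logistic_state(rng.gauss(1.0, 0.3)) for _ in range(num_samples)]
    return sample_skewness(outputs)


if __name__ == "__main__":
    print("skewness of the output distribution:", output_skewness())
```

Because the state saturates near 1, the output distribution is compressed on one side and stretched on the other, so a mean and variance alone do not describe it well.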

Sampling is the only way to understand global aspects of performance in truly high-dimensional systems, due to the curse of dimensionality present in all the other approaches. Unfortunately, a classical Monte-Carlo sampling procedure - the only method that is truly dimension-independent so far - produces a convergence error of order N^{-1/2}, where N is the number of samples (and that only in expectation). This could be termed the "curse of convergence". For problems with a certain degree of smoothness, we have developed a deterministic sampling method, DSample [6], that provably has error of order c/N, from every initial seed (as opposed to only in expectation). Development of such deterministic algorithms is also needed (and possible) for non-smooth problems.
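The gap between N^{-1/2} and c/N convergence can be illustrated on a one-dimensional toy integral. The sketch below does not implement DSample, whose details are not given here; as a stand-in it uses a classical deterministic sequence, the irrational rotation x_k = k*alpha mod 1 with alpha the golden-ratio conjugate. The integrand 1 + cos(2*pi*x), the function names, and the seed are assumptions made for illustration. For this particular smooth integrand the rotation error is bounded by 1/(N |sin(pi*alpha)|), which follows from the closed form of a sum of cosines in arithmetic progression.

```python
import math
import random

# Golden-ratio conjugate: an irrational rotation number (illustrative choice).
GOLDEN_ROTATION = (math.sqrt(5.0) - 1.0) / 2.0


def integrand(x):
    """Smooth periodic test function; its exact integral over [0, 1] is 1."""
    return 1.0 + math.cos(2.0 * math.pi * x)


def monte_carlo_estimate(n, seed=0):
    """Average of the integrand over n pseudo-random points in [0, 1)."""
    rng = random.Random(seed)
    return sum(integrand(rng.random()) for _ in range(n)) / n


def rotation_estimate(n):
    """Average of the integrand over the deterministic points k*alpha mod 1."""
    return sum(integrand((k * GOLDEN_ROTATION) % 1.0) for k in range(1, n + 1)) / n


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        mc_error = abs(monte_carlo_estimate(n) - 1.0)
        rot_error = abs(rotation_estimate(n) - 1.0)
        print(n, "Monte Carlo error:", mc_error, "rotation error:", rot_error)
```

The Monte Carlo error depends on the seed and shrinks like N^{-1/2} only on average, whereas the rotation error obeys its 1/N bound for every N. Whether a comparable guarantee carries over to high-dimensional or non-smooth integrands is exactly the kind of question the paragraph above raises.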

The combination of operator-theoretic methods for coarse-graining with graph theoretic methods for (dynamical) structural decomposition, and fast sampling methods can, we believe, lead to fast progress in the analysis and design of Big Data Dynamics in Systems of Systems.

References

[1] Marko Budišić, Ryan Mohr, and Igor Mezić. Applied Koopmanism. Chaos: An Interdisciplinary Journal of Nonlinear Science, 22(4):047510, 2012.

[2] Igor Mezić. Spectral properties of dynamical systems, model reduction and decompositions. Nonlinear Dynamics, 41(1-3):309-325, 2005.

[3] I. Mezić. Coupled nonlinear dynamical systems: Asymptotic behavior and uncertainty propagation. Proceedings of the CDC 2004, 2004.

[4] I. Mezić and A. Banaszuk. Comparison of systems with complex behavior. Physica D, 197, 2004.

[5] I. Mezić and T. Runolfsson. Uncertainty propagation in dynamical systems. Automatica, 44(12):3003-3013, 2008.

[6] I. Mezić. DSample: A deterministic algorithm for sampling with o(1/n) error. Submitted to CDC 2014, 2014.