Title:

Theory and practice of coordination algorithms exploiting the generalised distributive law

A key challenge for modern computer science is the development of technologies that allow interacting computer systems, typically referred as agents, to coordinate their decisions whilst operating in an environment with minimal human intervention. By so doing, the decision making capabilities of each of these agents should be improved by making decisions that take into account what the remaining agents intend to do. Against this background, the focus of this thesis is to study and design new coordination algorithms capable of achieving this improved performance. In this line of work, there are two key research challenges that need to be addressed. First, the current stateoftheart coordination algorithms have only been tested in simulation. This means that their practical performance still needs to be demonstrated in the real world. Second, none of the existing algorithms are capable of solving problems where the agents need to coordinate over complex decisions which typically require to trade off several parameters such as multiple objectives, the parameters of a sufficient statistic and the sample value and the bounds of an estimator. However, such parameters typically characterise the agents’ interactions within many real world domains. For this reason, deriving algorithms capable of addressing such complex interactions is a key challenge to bring research in coordination algorithms one step closer to successful deployment. The aim of this thesis is to address these two challenges. To achieve this, we make two types of contribution. First, we develop a set practical contributions to address the challenge of testing the performance of stateoftheart coordination algorithms in the real world. More specifically, we perform a case study on the deployment of the maxsum algorithm, a well known coordination algorithm, on a system that is couched in terms of allowing the first responders at the scene of a disaster to request imagery collection tasks of some of the most relevant areas to a team of unmanned aerial vehicles (UAVs). These agents then coordinate to complete the largest number of tasks. In more detail, maxsum is based on the generalised distributive law (GDL), a well known algebraic framework that has been used in disciplines such as artificial intelligence, machine learning and statistical physics, to derive effective algorithms to solve optimisation problems. Our iv contribution is the deployment of maxsum on real hardware and the evaluation of its performance in a real world setting. More specifically, we deploy maxsum on two UAVs (hexacopters) and test it a number of different settings. These tests show that maxsum does indeed perform well when confronted with the complexity and the unpredictability of the real world. The second category of contributions are theoretical in nature. More specifically, we propose a new framework and a set of solution techniques to address the complex interactions requirement. To achieve this, we move back to theory and tackle a new class of problem involving agents engaged in complex interactions defined by multiple parameters. We name this class partially ordered distributed constraint optimisation problems (PODCOPs). Essentially, this generalises the well known distributed constraint optimisation problem (DCOP) framework to settings in which agents make decisions over multiple parameters such as multiple objectives, the parameters of a sufficient statistic and the sample value and the bounds of an estimator. To measure the quality of these decisions, it becomes necessary to strike a balance between these parameters and to achieve this, the outcome of these decisions is represented using partially ordered constraint functions. Given this framework, we present three subclasses of PODCOPs, each focusing on a different type of complex interaction. More specifically, we study (i) multiobjective DCOPs (MODCOPs) in which the agents’ decisions are defined over multiple objectives, (ii) riskaware DCOPs (RADCOPs) in which the outcome of the agents’ decisions is not known with certainty and thus, where the agents need to carefully weigh the risk of making decisions that might lead to poor and unexpected outcomes and, (iii) multiarm bandit DCOPs (MABDCOPs) where the agents need to learn the outcome of their decisions online. To solve these problems, we again exploit the GDL framework. In particular, we employ the flexibility of the GDL to obtain either optimal or bounded approximate algorithms to solve PODCOPs. The key insight is to use the algebraic properties of the GDL to instantiate well known DCOP algorithms such as DPOP, Action GDL or bounded maxsum to solve PODCOPs. Given the properties of these algorithms, we derive a new set of solution techniques. To demonstrate their effectiveness, we study the properties of these algorithms empirically on various instances of MODCOPs, RADCOPs and MABDCOPs. Our experiments emphasize two key traits of the algorithms. First, bounded approximate algorithms perform well in terms of our requirements. Second, optimal algorithms incur an increase in both the computation and communication load necessary to solve PODCOPs because they are trying to optimally solve a problem which is potentially more complex than canonical DCOPs.
