PhD Viva – Saeedeh Ghanadbashi

Title of the thesis: Ontology-Enhanced Decision-Making for Autonomous Agents in Dynamic and Partially Observable Environments
Abstract: Agents are intelligent software or hardware entities that can perceive their environment through sensors and act on it through actuators. Such agents usually operate in dynamic and partially observable environments and decide what actions to take to achieve their goals. In such environments, they often encounter unforeseen situations, and their observations may differ from the actual state of the environment. Moreover, in such challenging environments, not all goals are known or can be predefined, so agents might need to identify changes and define new goals or adapt their predefined goals on the fly. In doing so, agents face the following challenges: (1) Agents’ observations are often based on incomplete, ambiguous, and noisy sensed data. As a result, distinct states of the environment might appear identical to the agents, and consequently, they fail to take suitable actions. (2) Unforeseen situations (i.e., unpredictable or rare events) are unavoidable, and agents need to make decisions and take actions on the fly while accessing only incomplete information, which causes uncertainty in the agents’ action selection process.
To address these challenges, collaboration and practical reasoning techniques have been used to handle agents’ limited knowledge and restricted capabilities, enabling them to adapt to new circumstances or emerging requirements by choosing an action from a predefined set. However, the uncertainty caused by partially observable environments makes reasoning more complex and leads to inconsistencies in many traditional reasoning systems. Furthermore, unforeseen situations may require generating new goals, because the agent’s initial goals may no longer be relevant or achievable. Machine Learning (ML), and in particular Reinforcement Learning (RL), algorithms have been applied to these challenges; however, their dependence on large amounts of data, on predefined goals, or on goals inferred from expert human demonstrations, together with their difficulty in handling very large action spaces and long exploration periods, limits their applicability. Other works have employed ontologies, serving as formal knowledge representations, to enable efficient information fusion from diverse sources and provide a comprehensive understanding of the dynamic environment, thus enhancing the agents’ decision-making capabilities in complex and uncertain scenarios. Using ontologies, agents can exploit the integrated conceptual features extracted from domain ontologies, which allows them to find optimal or near-optimal actions faster.
This thesis proposes a novel ontology-enhanced decision-making model for autonomous agents, enhancing their performance in dynamic and partially observable environments. The model introduces novel techniques for employing ontologies and reasoning mechanisms to enrich the agents’ domain knowledge, enabling them to interpret unforeseen events, generate new goals or evolve their current ones accordingly, and ultimately make suitable decisions and improve their performance in the real world. Specifically, the contributions of this thesis are as follows: (1) An ontology-based observation modeling method is proposed in which agents’ partial observations are improved in real time using prior ontological knowledge. (2) A novel ontology-enhanced decision-making model (OntoDeM) is proposed that allows agents to successfully handle unforeseen situations on the fly in dynamic and partially observable environments by enabling them to evolve their goals or generate new ones. (3) The proposed model is implemented and evaluated in four different real-world application areas, where its relevant methods are employed to address their challenging problems.
The proposed model is compared to traditional learning algorithms, including Q-learning, SARSA, and Deep Q-Network (DQN), and to state-of-the-art methods, including Proximal Policy Optimization (PPO), Trust Region Policy Optimization (TRPO), and Deep Deterministic Policy Gradient (DDPG). The results show that OntoDeM improves the agents’ observation and decision-making, which consequently improves their performance in dynamic and partially observable environments.