Representation learning for acting and planning

Hector Geffner
RWTH Aachen University

Abstract

Recent progress in deep learning and deep reinforcement learning (DRL) has been truly
remarkable, yet two key problems remain: structural policy generalization and policy reuse.
The first is about getting policies that generalize in a reliable way; the second is about
getting policies that can be reused and combined in a flexible, goal-oriented manner. Both
problems have been studied in DRL, but only experimentally, and the results are neither crisp nor clear.
In our work, we have tackled these problems in a slightly different way, separating what is
to be learned from how. For this, we have developed languages for expressing, studying,
and learning general policies, using combinatorial and DRL approaches. We have also developed
languages and methods for expressing, studying, and learning general subgoal structures and
hierarchical policies. Open challenges and general methodological lessons will be discussed.

This is joint work with Blai Bonet, Simon Stahlberg, Dominik Drexler, and other members of the RLeap team.

About the speaker

Hector Geffner is an Alexander von Humboldt Professor at RWTH Aachen University, Germany.
Before joining RWTH in 2023, he was an ICREA Research Professor at the Universitat Pompeu Fabra in Barcelona, Spain. Hector obtained a Ph.D. in Computer Science at UCLA and worked at the IBM T.J. Watson Research Center in New York and at the Universidad Simon Bolivar in Caracas. Distinctions for his work include the 1990 ACM Doctoral Dissertation Award and three ICAPS Influential Paper Awards. He currently leads a project on representation learning for acting and planning (RLeap), funded by an ERC grant.

Against the clock: lessons learned by applying temporal planning in practice

Andrea Micheli
Fondazione Bruno Kessler, Trento

Abstract

Automated Planning is a foundational area of AI research, focusing on the automated synthesis of courses of action that achieve a desired goal within a formally modeled system. When dealing with time and temporal constraints, this problem is known as Temporal Planning. In this talk, I will present my research on the application of temporal planning to real-world scenarios and highlight the open research directions in this field. Starting from a series of projects in different application domains, including robotics, manufacturing, and logistics, I will explore the key challenges encountered, the (sometimes hard) lessons learned, and the techniques, tools, and methodologies that have emerged from these efforts. Additionally, I will introduce and discuss preliminary results on applying Reinforcement Learning techniques to tailor temporal planners to specific application contexts.

About the speaker

Andrea Micheli (https://andrea.micheli.website) is the head of the “Planning, Scheduling and Optimization” research unit at Fondazione Bruno Kessler, Trento, Italy (https://pso.fbk.eu). His research focuses on the development and technology transfer of automated planning technologies. He obtained his PhD in Computer Science from the University of Trento in 2016; his PhD thesis, “Planning and Scheduling in Temporally Uncertain Domains”, won several awards, including the EurAI Best Dissertation Award and an honorable mention for the ICAPS Best Dissertation Award. He currently works in the field of temporal planning and is the main developer of the TAMER planner (tamer.fbk.eu). He is also the lead developer of the pysmt open-source project (https://www.pysmt.org), which aims to provide a standard Python API for satisfiability modulo theories solvers. Andrea coordinated the AIPlan4EU project (https://www.aiplan4eu-project.eu), which aimed to remove the access barriers to automated planning technology and to bring it to the European AI On-Demand Platform. He has authored more than 30 papers in the fields of Formal Methods and Artificial Intelligence. He recently won an ERC Starting Grant (https://pso.fbk.eu/articles/step-rl) to research novel solutions combining temporal planning and reinforcement learning.