Accurate and interpretable representations of environments with anticipatory learning classifier systems

Abstract

Anticipatory Learning Classifier Systems (ALCS) are rule-based machine learning algorithms that can simultaneously develop a complete representation of their environment and a decision policy based on this representation to solve their learning tasks. This paper introduces BEACS (Behavioral Enhanced Anticipatory Classifier System) to handle non-deterministic, partially observable environments and to help users better understand the environmental representations the system produces. BEACS is an ALCS that enhances and merges the Probability-Enhanced Predictions and Behavioral Sequences approaches used in ALCS to handle such environments. Probability-Enhanced Predictions enable the anticipation of several states, while Behavioral Sequences permit the construction of sequences of actions. The capabilities of BEACS have been studied on a thorough benchmark of 23 mazes, and the results show that BEACS can handle different kinds of non-determinism in partially observable environments while describing such environments completely and more accurately. BEACS thus provides explanatory insights into the decision policies and environmental representations it creates.