[FRIAM] Fwd: Reminder: Thesis Proposal - December 1, 2020 - Nicholay Topin - Unifying State and Policy Level Explanations for Reinforcement Learning

Frank Wimberly wimberly3 at gmail.com
Mon Nov 30 15:21:15 EST 2020


I think we were talking recently about explanations of reasoning in AI
systems. This thesis proposal is relevant.

---
Frank C. Wimberly
140 Calle Ojo Feliz,
Santa Fe, NM 87505

505 670-9918

---------- Forwarded message ---------
From: Diane Stidle <stidle at andrew.cmu.edu>
Date: Mon, Nov 30, 2020, 8:04 AM
Subject: Reminder: Thesis Proposal - December 1, 2020 - Nicholay Topin -
Unifying State and Policy Level Explanations for Reinforcement Learning
To: ml-seminar at cs.cmu.edu, marie.desjardins at simmons.edu


*Thesis Proposal*

Date: December 1, 2020
Time: 10:00am (EST)
Speaker: Nicholay Topin

Zoom Meeting Link:
https://cmu.zoom.us/j/99269721240?pwd=a3c5QytZbE01a0w4WEpIS3RpSjFSdz09
Meeting ID: 992 6972 1240
Password: 068976

*Title*: Unifying State and Policy Level Explanations for Reinforcement
Learning

Abstract:
In an off-policy reinforcement learning setting, an agent learns a
reward-maximizing policy from observed interactions with an environment.
Before the agent is allowed to follow its learned policy, a human operator
can use explanations to gauge the agent's competency and to understand its
behavior. Policy-level behavior explanations illustrate the agent's
long-term behavior. Feature importance explanations identify the features
of a state that affect the agent's action choice in that state. Experience
importance explanations show which past experiences led to the current
behavior. Previous explanation methods have provided a subset of these
information types, but never all three at once. In this thesis, we address
the problem of creating explanations for a reinforcement learning agent
that include the full set of information types. We contribute a novel
explanation method that unifies and extends these existing explanation
types.
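
As a rough illustration of the off-policy setting described above (this
sketch is not from the proposal, and the toy `env` interface with
reset(), step(), and actions is assumed for the example), tabular
Q-learning is one standard off-policy algorithm of this kind:

import random
from collections import defaultdict

def q_learning(env, episodes=500, alpha=0.1, gamma=0.99, epsilon=0.1):
    """Learn Q-values from epsilon-greedy interactions (off-policy)."""
    Q = defaultdict(float)  # maps (state, action) to estimated return
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            # Behavior policy: explore with probability epsilon.
            if random.random() < epsilon:
                action = random.choice(env.actions)
            else:
                action = max(env.actions, key=lambda a: Q[(state, a)])
            next_state, reward, done = env.step(action)
            # Off-policy target: bootstrap from the greedy action,
            # not from the action the behavior policy will take next.
            best_next = max(Q[(next_state, a)] for a in env.actions)
            Q[(state, action)] += alpha * (
                reward + gamma * best_next - Q[(state, action)])
            state = next_state
    return Q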

We have created a method for producing feature importance explanations by
learning a decision tree policy using reinforcement learning. This method
formulates the problem of learning the tree as a Markov decision process,
so standard off-policy learning algorithms can be used to learn an optimal
decision tree. Likewise, we have created an algorithm for summarizing
policy-level behavior as a Markov chain over abstract states; our approach
uses a set of decision trees to map concrete states to abstract states. In
addition, we have introduced a method for creating experience importance
explanations that identifies sets of similarly treated inputs and shows
how these sets affected training.
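
To make the policy-level summary concrete, a minimal sketch (not the
proposal's algorithm; it assumes scikit-learn's DecisionTreeClassifier as
a stand-in for the learned trees) could map each concrete state to the
tree leaf it reaches and count transitions between leaves:

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def summarize_as_markov_chain(tree, trajectories):
    """Estimate a Markov chain over abstract states (tree leaves).

    trajectories: list of state sequences, each state a feature vector.
    """
    n = tree.tree_.node_count  # leaf ids are node ids, so n bounds them
    counts = np.zeros((n, n))
    for states in trajectories:
        leaves = tree.apply(np.asarray(states))  # concrete -> abstract
        for a, b in zip(leaves[:-1], leaves[1:]):
            counts[a, b] += 1
    # Normalize each visited row into transition probabilities.
    row_sums = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, row_sums,
                     out=np.zeros_like(counts), where=row_sums > 0)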

We propose two lines of future work. First, we will integrate the two
decision tree explanations (for feature importance and for policy-level
behavior) via a shared state featurization. Second, we will extend the
experience importance explanation algorithm to identify important
experiences both for the division into abstract states and for the
agent's choice of features to examine.

*Thesis committee:*
Manuela Veloso (Chair)
Tom Mitchell
Ameet Talwalkar
Marie desJardins (Simmons University)