AIMA Artificial Intelligence a modern approch

AIMA-exercises is an open-source community of students, instructors and developers. Anyone can add an exercise, suggest answers to existing questions, or simply help us improve the platform. We accept contributions on this github repository.

Exercise 21.4

The direct utility estimation method in Section passive-rl-section uses distinguished terminal states to indicate the end of a trial. How could it be modified for environments with discounted rewards and no terminal states?

View Answer