[?]Agents that train in different episodes
How should an agent be informed that an episode has ended? Some RL algorithms, such as Q-learning, need this signal. Does the notion of an episode make sense in recommender systems (RecSys), where there is no clear goal or terminal state?
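One common convention, sketched here as an illustration rather than a settled answer, is to pass a terminal flag along with each transition. In tabular Q-learning the flag only changes the update target: when the episode has ended, the bootstrapped value of the next state is dropped. The names `q_update`, `done`, and the item and state labels are hypothetical, and treating the end of a user session as the end of an episode is an assumption. The alternative is to treat recommendation as a continuing task, never set `done`, and rely on a discount factor below 1.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, done, actions,
             alpha=0.1, gamma=0.9):
    """One tabular Q-learning step; `done` marks the end of an episode."""
    # After a terminal transition there is no future return to bootstrap from.
    future = 0.0 if done else max(q[(next_state, a)] for a in actions)
    target = reward + gamma * future
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical setup: two recommendable items, states are opaque labels.
q = defaultdict(float)
actions = ["item_a", "item_b"]
q[("s1", "item_a")] = 1.0  # pretend the next state already has some value

# Same reward and next state, once mid-session and once at session end.
q_update(q, "s0", "item_a", 1.0, "s1", False, actions)
q_update(q, "s0", "item_b", 1.0, "s1", True, actions)
```

With the flag set, the second update ignores the value of `s1`, so `item_b` ends up with a smaller estimate than `item_a` even though both transitions earned the same reward. Whether a session boundary deserves that treatment in a recommender is exactly the open question above.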