
Consider the following gridworld MDP. The states are grid squares, identified by their row and column number (row first). The agent always starts in state (1,1), marked with the letter S. There are two terminal goal states: (2,3) with reward +5 and (1,3) with reward -5. Rewards are 0 in non-terminal states. (The reward for a state is received as the agent moves into that state.) The transition function is such that the intended agent movement (Up, Down, Left, or Right) happens with probability 0.8. With probability 0.1 each, the agent instead moves in one of the two directions perpendicular to the intended one. If the agent collides with a wall, it stays in the same state.

.    .    +5
S    .    -5

Which of the following is the optimal policy for this grid?
A. Right  Right  +5
   Up     Left   -5
B. Down   Left   +5
   Right  Up     -5
C. Right  Down   +5
   Up     Right  -5
D. Right  Right  +5
   Right  Right  -5
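
One way to check the options is to run value iteration on the grid and read off the greedy action in each non-terminal square. The sketch below is only an illustration under stated assumptions: the question gives no discount factor, so GAMMA = 0.9 is a guess, and row 1 is taken to be the bottom row because S at (1,1) is drawn in the lower left. All names (GAMMA, step, q_value, value_iteration) are made up for this example.

```python
# Value iteration sketch for the 2x3 gridworld in the question.
# ASSUMPTION: GAMMA = 0.9; the question does not state a discount factor.
# ASSUMPTION: row 1 is the bottom row, so "Up" increases the row number.
GAMMA = 0.9
ROWS, COLS = 2, 3
TERMINALS = {(2, 3): 5.0, (1, 3): -5.0}  # reward received on entering the square
MOVES = {"Up": (1, 0), "Down": (-1, 0), "Left": (0, -1), "Right": (0, 1)}
PERPENDICULAR = {
    "Up": ("Left", "Right"),
    "Down": ("Left", "Right"),
    "Left": ("Up", "Down"),
    "Right": ("Up", "Down"),
}


def step(state, move):
    """Return the square reached by `move`, or `state` itself on a wall collision."""
    row, col = state
    d_row, d_col = MOVES[move]
    new_row, new_col = row + d_row, col + d_col
    if 1 <= new_row <= ROWS and 1 <= new_col <= COLS:
        return (new_row, new_col)
    return state


def q_value(values, state, action):
    """Expected return of `action`: 0.8 intended move, 0.1 for each perpendicular slip."""
    outcomes = [(0.8, action)] + [(0.1, slip) for slip in PERPENDICULAR[action]]
    total = 0.0
    for prob, move in outcomes:
        nxt = step(state, move)
        total += prob * (TERMINALS.get(nxt, 0.0) + GAMMA * values[nxt])
    return total


def value_iteration(tol=1e-9):
    """Sweep Bellman optimality backups until the largest change is below `tol`."""
    states = [(r, c) for r in range(1, ROWS + 1) for c in range(1, COLS + 1)]
    values = {s: 0.0 for s in states}  # terminal squares keep value 0
    while True:
        delta = 0.0
        for s in states:
            if s in TERMINALS:
                continue
            best = max(q_value(values, s, a) for a in MOVES)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    policy = {
        s: max(MOVES, key=lambda a: q_value(values, s, a))
        for s in states
        if s not in TERMINALS
    }
    return values, policy


values, policy = value_iteration()

# Print the greedy policy with the top row (row 2) first, matching the option layout.
for row in range(ROWS, 0, -1):
    cells = []
    for col in range(1, COLS + 1):
        if (row, col) in TERMINALS:
            cells.append(f"{TERMINALS[(row, col)]:+.0f}")
        else:
            cells.append(policy[(row, col)])
    print("  ".join(cells))
```

The printed grid can be compared directly against options A to D. Because the discount factor is an assumption, it is worth rerunning with other values of GAMMA to see whether the greedy action in square (1,2), next to the -5 terminal, changes.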
