Multi-Step Prediction for Curiosity Driven Learning
Ruchir Aggarwal*
Kushantha U. Attanayake*
Dennis Li*
Julio Soldevilla+
Poorani Ravindhiran^
For the results we present here:
- α = scaling factor for the loss at timestep t + 1 (fixed at 1)
- β = scaling factor for the loss at timestep t + 2
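As a rough illustration of how α and β might enter training, the sketch below assumes the two prediction losses are combined as a weighted sum, α · loss(t + 1) + β · loss(t + 2), and that each loss is a mean-squared error. Neither assumption is stated on this page, and the function name `multi_step_loss` and its arguments are hypothetical.

```python
import numpy as np


def multi_step_loss(pred_t1, target_t1, pred_t2, target_t2, alpha=1.0, beta=0.5):
    """Weighted sum of prediction errors at t+1 and t+2.

    Assumption: each per-step loss is a mean-squared error, and the two
    are combined as alpha * loss_t1 + beta * loss_t2.
    """
    pred_t1 = np.asarray(pred_t1, dtype=float)
    target_t1 = np.asarray(target_t1, dtype=float)
    pred_t2 = np.asarray(pred_t2, dtype=float)
    target_t2 = np.asarray(target_t2, dtype=float)

    # Error of the one-step-ahead prediction (timestep t + 1).
    loss_t1 = float(np.mean((pred_t1 - target_t1) ** 2))
    # Error of the two-step-ahead prediction (timestep t + 2).
    loss_t2 = float(np.mean((pred_t2 - target_t2) ** 2))

    return alpha * loss_t1 + beta * loss_t2


# Baseline setting (beta = 0) ignores the t + 2 error entirely.
baseline = multi_step_loss([1, 1], [0, 0], [2, 2], [0, 0], alpha=1.0, beta=0.0)
# Multi-step setting adds half of the t + 2 error.
multi_step = multi_step_loss([1, 1], [0, 0], [2, 2], [0, 0], alpha=1.0, beta=0.5)
```

Under this reading, setting β = 0 recovers the single-step baseline, and larger β values place more weight on the two-step-ahead prediction.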
Mario
Comparison Video
(Baseline) α = 1 , β = 0
α = 1 , β = 0.2
α = 1 , β = 0.5
Breakout
Space Invaders
Pacman
Comparison Video
(Baseline) α = 1 , β = 0
α = 1 , β = 0.5