Mean-Field Theory for Batched TD(λ)

Fernando J. Pineda

Research output: Contribution to journal › Article › peer-review

9 Scopus citations

Abstract

A representation-independent mean-field dynamics is presented for batched TD(λ). The task is learning to predict the outcome of an indirectly observed absorbing Markov process. In the case of linear representations, the discrete-time deterministic iteration is an affine map whose fixed point can be expressed in closed form without the assumption of linearly independent observation vectors. Batched linear TD(λ) is proved to converge with probability 1 for all λ. Theory and simulation agree on a random walk example.
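As a rough illustration of the setting the abstract describes, the following is a minimal sketch of batched linear TD(λ) on a small absorbing random walk. It is not the paper's mean-field derivation. The five-state walk, the unit-vector observation matrix `X`, the step size `alpha`, the sweep count, and all function names are assumptions chosen for illustration. Weight changes are accumulated over a fixed batch of walks with the weights held constant, averaged, and applied once per sweep, so each sweep is an affine map of the weight vector.

```python
import numpy as np

def generate_walk(rng, n_states=5):
    """Simulate one walk from the middle state (assumed setup).

    Returns the visited nonterminal state indices and the outcome z:
    1.0 if the walk is absorbed on the right, 0.0 if on the left.
    """
    s = n_states // 2
    states = []
    while 0 <= s < n_states:
        states.append(s)
        s += 1 if rng.random() < 0.5 else -1
    z = 1.0 if s >= n_states else 0.0
    return states, z

def batch_delta(w, walks, X, lam):
    """Average TD(lambda) weight change over the batch, with w held fixed."""
    total = np.zeros_like(w)
    for states, z in walks:
        e = np.zeros_like(w)  # eligibility trace, reset for each walk
        for t, s in enumerate(states):
            e = lam * e + X[s]
            p = w @ X[s]
            # The last prediction in a walk is compared with the outcome z.
            p_next = z if t == len(states) - 1 else w @ X[states[t + 1]]
            total += (p_next - p) * e
    return total / len(walks)

def batched_td_lambda(walks, X, lam, alpha=0.1, n_sweeps=5000):
    """Repeat the batch update; each sweep is an affine map of w."""
    w = np.full(X.shape[1], 0.5)
    for _ in range(n_sweeps):
        w = w + alpha * batch_delta(w, walks, X, lam)
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    walks = [generate_walk(rng) for _ in range(100)]
    X = np.eye(5)  # one unit observation vector per nonterminal state
    for lam in (0.0, 0.5, 1.0):
        print(lam, batched_td_lambda(walks, X, lam))
```

With unit observation vectors and λ = 1, the accumulated update reduces to a sum of (z − prediction) terms over state visits, so the fixed point of this sketch is the per-visit average outcome for each state. Other values of λ generally give different fixed points on a finite batch.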

Original language: English (US)
Pages (from-to): 1403-1419
Number of pages: 17
Journal: Neural Computation
Volume: 9
Issue number: 7
State: Published - Oct 1 1997
Externally published: Yes

ASJC Scopus subject areas

  • Arts and Humanities (miscellaneous)
  • Cognitive Neuroscience
