Mean-Field Theory for Batched TD(λ)

Fernando J. Pineda

doi:10.1162/neco.1997.9.7.1403

Research output: Contribution to journal › Article › peer-review

9 Scopus citations

Abstract

A representation-independent mean-field dynamics is presented for batched TD(λ). The task is learning to predict the outcome of an indirectly observed absorbing Markov process. In the case of linear representations, the discrete-time deterministic iteration is an affine map whose fixed point can be expressed in closed form without the assumption of linearly independent observation vectors. Batched linear TD(λ) is proved to converge with probability 1 for all λ. Theory and simulation agree on a random walk example.
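As a rough illustration of the setting the abstract describes, the following is a minimal sketch of batched linear TD(λ) on a random walk. It is not the paper's mean-field analysis. It assumes the standard eligibility-trace form of the update, a five-state walk with one-hot observation vectors, outcomes of 0 (left exit) and 1 (right exit), and hypothetical values for the step size, batch size, and number of sweeps. The names `random_walk_episode` and `batched_td_lambda` are introduced here for illustration only.

```python
import numpy as np

def random_walk_episode(rng, n_states=5):
    """Simulate one walk; return visited nonterminal states and the outcome."""
    s = n_states // 2                  # assumption: start in the middle state
    states = []
    while 0 <= s < n_states:           # leaving the range means absorption
        states.append(s)
        s += 1 if rng.random() < 0.5 else -1
    outcome = 1.0 if s >= n_states else 0.0   # right exit pays 1, left exit 0
    return states, outcome

def batched_td_lambda(episodes, X, lam, alpha=0.1, n_sweeps=300):
    """Sweep a fixed batch repeatedly; apply the summed update once per sweep."""
    w = np.zeros(X.shape[1])
    for _ in range(n_sweeps):
        delta_w = np.zeros_like(w)
        for states, outcome in episodes:
            trace = np.zeros_like(w)           # eligibility trace, reset per episode
            for t, s in enumerate(states):
                trace = lam * trace + X[s]
                p_now = X[s] @ w
                last = t == len(states) - 1
                # The final prediction target is the observed outcome.
                p_next = outcome if last else X[states[t + 1]] @ w
                delta_w += (p_next - p_now) * trace
        # Weights change only between sweeps; averaging over episodes is an
        # assumption made here to keep the step size independent of batch size.
        w += alpha * delta_w / len(episodes)
    return w

rng = np.random.default_rng(0)
episodes = [random_walk_episode(rng) for _ in range(100)]
X = np.eye(5)                                  # one-hot observation vectors
w_by_lambda = {lam: batched_td_lambda(episodes, X, lam) for lam in (0.0, 0.5, 1.0)}
for lam, w in w_by_lambda.items():
    print(lam, np.round(w, 3))
```

With one-hot observations and λ = 1, the temporal differences telescope, so the iteration settles on the average outcome over all visits to each state. For this walk the true outcome probabilities are 1/6, 2/6, ..., 5/6, which gives a reference point for the printed weights.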

Original language	English (US)
Pages (from-to)	1403-1419
Number of pages	17
Journal	Neural Computation
Volume	9
Issue number	7
DOIs	https://doi.org/10.1162/neco.1997.9.7.1403
State	Published - Oct 1 1997
Externally published	Yes

ASJC Scopus subject areas

Arts and Humanities (miscellaneous)
Cognitive Neuroscience

Cite this

@article{41860cd1b34b43b7ba7c3ece6facf5d8,
title = "Mean-Field Theory for Batched TD(λ)",
abstract = "A representation-independent mean-field dynamics is presented for batched TD(λ). The task is learning to predict the outcome of an indirectly observed absorbing Markov process. In the case of linear representations, the discrete-time deterministic iteration is an affine map whose fixed point can be expressed in closed form without the assumption of linearly independent observation vectors. Batched linear TD(λ) is proved to converge with probability 1 for all λ. Theory and simulation agree on a random walk example.",
author = "Pineda, {Fernando J.}",
year = "1997",
month = oct,
day = "1",
doi = "10.1162/neco.1997.9.7.1403",
language = "English (US)",
volume = "9",
pages = "1403--1419",
journal = "Neural Computation",
issn = "0899-7667",
publisher = "MIT Press Journals",
number = "7",
}

TY - JOUR
T1 - Mean-Field Theory for Batched TD(λ)
AU - Pineda, Fernando J.
PY - 1997/10/1
Y1 - 1997/10/1
N2 - A representation-independent mean-field dynamics is presented for batched TD(λ). The task is learning to predict the outcome of an indirectly observed absorbing Markov process. In the case of linear representations, the discrete-time deterministic iteration is an affine map whose fixed point can be expressed in closed form without the assumption of linearly independent observation vectors. Batched linear TD(λ) is proved to converge with probability 1 for all λ. Theory and simulation agree on a random walk example.
UR - http://www.scopus.com/inward/record.url?scp=0003276733&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0003276733&partnerID=8YFLogxK
U2 - 10.1162/neco.1997.9.7.1403
DO - 10.1162/neco.1997.9.7.1403
M3 - Article
AN - SCOPUS:0003276733
SN - 0899-7667
VL - 9
SP - 1403
EP - 1419
JO - Neural Computation
JF - Neural Computation
IS - 7
ER -
