Mean-Field Theory for Batched TD(λ)

Research output: Contribution to journal, Article

Abstract

A representation-independent mean-field dynamics is presented for batched TD(λ). The task is learning to predict the outcome of an indirectly observed absorbing Markov process. In the case of linear representations, the discrete-time deterministic iteration is an affine map whose fixed point can be expressed in closed form without the assumption of linearly independent observation vectors. Batched linear TD(λ) is proved to converge with probability 1 for all λ. Theory and simulation agree on a random walk example.
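The affine-map structure mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard linear TD(λ) update, with weight increments accumulated over a fixed batch of sequences and applied once per sweep. It works on an empirical batch and is not the paper's mean-field derivation. The five-state walk, the one-hot observation vectors, the step size, and all function names are hypothetical choices made for illustration. The fixed point is found here with a numerical least-squares solve, which is a stand-in for, not a reproduction of, the closed-form expression in the paper.

```python
import numpy as np


def random_walk_episode(rng, n_states=5):
    """Simulate one walk from the centre state; return (visited states, outcome z)."""
    s = n_states // 2
    visited = []
    while 0 <= s < n_states:
        visited.append(s)
        s += 1 if rng.random() < 0.5 else -1
    # Absorbing on the right gives outcome 1, on the left outcome 0.
    return visited, (1.0 if s >= n_states else 0.0)


def batch_affine_terms(episodes, X, lam):
    """Accumulate A and b so that one batch sweep changes w by (b - A @ w)."""
    d = X.shape[1]
    A = np.zeros((d, d))
    b = np.zeros(d)
    for visited, z in episodes:
        e = np.zeros(d)  # eligibility trace: sum of lam**(t-k) * x_k
        for t, s in enumerate(visited):
            e = lam * e + X[s]
            # The prediction after the last observation is the outcome z,
            # so the "next observation" there contributes nothing to A.
            if t + 1 < len(visited):
                x_next = X[visited[t + 1]]
            else:
                x_next = np.zeros(d)
            A += np.outer(e, X[s] - x_next)
        b += z * e  # e now holds the trace at the final step
    n = len(episodes)
    return A / n, b / n


def iterate_batched_td(A, b, alpha=0.1, sweeps=3000):
    """Repeat the affine map w <- w + alpha * (b - A @ w) from w = 0."""
    w = np.zeros_like(b)
    for _ in range(sweeps):
        w = w + alpha * (b - A @ w)
    return w


def fixed_point(A, b):
    """Least-squares solution of A @ w = b; usable even when A is singular."""
    return np.linalg.lstsq(A, b, rcond=None)[0]


rng = np.random.default_rng(0)
episodes = [random_walk_episode(rng) for _ in range(2000)]
X = np.eye(5)  # one observation vector per non-absorbing state (assumed)
A, b = batch_affine_terms(episodes, X, lam=0.0)
print("iterated weights:", iterate_batched_td(A, b))
print("fixed point:     ", fixed_point(A, b))
```

For this symmetric walk the true outcome probabilities are 1/6, 2/6, ..., 5/6, so both printed vectors should lie near those values, with the gap reflecting the finite batch. Changing `lam` changes which fixed point the same batch produces.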

Original language: English (US)
Pages (from-to): 1403-1419
Number of pages: 17
Journal: Neural Computation
Volume: 9
Issue number: 7
State: Published - Oct 1 1997

Fingerprint

Markov Chains
Mean field theory
Markov processes
Observation
Learning
Field Theory
Random Walk
Fixed Point
Iteration
Simulation

ASJC Scopus subject areas

  • Artificial Intelligence
  • Control and Systems Engineering
  • Neuroscience(all)

Cite this

Mean-Field Theory for Batched TD(λ). / Pineda, Fernando J.

In: Neural Computation, Vol. 9, No. 7, 01.10.1997, p. 1403-1419.

Pineda, Fernando J. / Mean-Field Theory for Batched TD(λ). In: Neural Computation. 1997 ; Vol. 9, No. 7. pp. 1403-1419.
@article{41860cd1b34b43b7ba7c3ece6facf5d8,
title = "Mean-Field Theory for Batched TD(λ)",
abstract = "A representation-independent mean-field dynamics is presented for batched TD(λ). The task is learning to predict the outcome of an indirectly observed absorbing Markov process. In the case of linear representations, the discrete-time deterministic iteration is an affine map whose fixed point can be expressed in closed form without the assumption of linearly independent observation vectors. Batched linear TD(λ) is proved to converge with probability 1 for all λ. Theory and simulation agree on a random walk example.",
author = "Pineda, {Fernando J}",
year = "1997",
month = "10",
day = "1",
language = "English (US)",
volume = "9",
pages = "1403--1419",
journal = "Neural Computation",
issn = "0899-7667",
publisher = "MIT Press Journals",
number = "7",

}

TY - JOUR

T1 - Mean-Field Theory for Batched TD(λ)

AU - Pineda, Fernando J

PY - 1997/10/1

Y1 - 1997/10/1

N2 - A representation-independent mean-field dynamics is presented for batched TD(λ). The task is learning to predict the outcome of an indirectly observed absorbing Markov process. In the case of linear representations, the discrete-time deterministic iteration is an affine map whose fixed point can be expressed in closed form without the assumption of linearly independent observation vectors. Batched linear TD(λ) is proved to converge with probability 1 for all λ. Theory and simulation agree on a random walk example.

UR - http://www.scopus.com/inward/record.url?scp=0003276733&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0003276733&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:0003276733

VL - 9

SP - 1403

EP - 1419

JO - Neural Computation

JF - Neural Computation

SN - 0899-7667

IS - 7

ER -