Limited-memory BFGS (L-BFGS or LM-BFGS) is an optimization algorithm in the family of quasi-Newton methods that approximates the Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm using a limited amount of computer memory. It is a popular algorithm for parameter estimation in machine learning. Like the original BFGS, L-BFGS uses an estimate of the inverse Hessian matrix to steer its search through variable space, but where BFGS stores a dense ''n''×''n'' approximation to the inverse Hessian (''n'' being the number of variables in the problem), L-BFGS stores only a few vectors that represent the approximation implicitly. Due to its resulting linear memory requirement, the L-BFGS method is particularly well suited for optimization problems with a large number of variables. Instead of the inverse Hessian <math>H_k</math>, L-BFGS maintains a history of the past ''m'' updates of the position <math>x</math> and gradient <math>\nabla f(x)</math>, where generally the history size ''m'' can be small (often <math>m < 10</math>). These updates are used to implicitly perform operations requiring the <math>H_k</math>-vector product.

==Algorithm==
L-BFGS shares many features with other quasi-Newton algorithms, but differs in how the matrix–vector multiplication for finding the search direction is carried out. There are multiple published approaches using a history of updates to form this direction vector. Here, we give a common approach, the so-called "two-loop recursion."

We take as given <math>x_k</math>, the position at the ''k''-th iteration, and <math>g_k \equiv \nabla f(x_k)</math>, where ''f'' is the function being minimized and all vectors are column vectors. We also assume that we have stored the last ''m'' updates of the form

:<math>s_i = x_{i+1} - x_i</math>
:<math>y_i = g_{i+1} - g_i.</math>

We define <math>\rho_i = \frac{1}{y_i^\top s_i}</math>, and <math>H_k^0</math> will be the 'initial' approximation of the inverse Hessian that our estimate at iteration ''k'' begins with. Then we can compute the (uphill) direction <math>z = H_k g_k</math> as follows:

:<math>q = g_k</math>
:For <math>i = k-1, k-2, \ldots, k-m</math>:
::<math>\alpha_i = \rho_i s_i^\top q</math>
::<math>q = q - \alpha_i y_i</math>
:<math>z = H_k^0 q</math>
:For <math>i = k-m, k-m+1, \ldots, k-1</math>:
::<math>\beta_i = \rho_i y_i^\top z</math>
::<math>z = z + s_i(\alpha_i - \beta_i)</math>
:Stop with <math>H_k g_k = z</math>.

This formulation is valid whether we are minimizing or maximizing. Note that if we are minimizing, the search direction would be the negative of <math>z</math> (since <math>z</math> is "uphill"), and if we are maximizing, <math>H_k^0</math> should be negative definite rather than positive definite. We would typically perform a backtracking line search in the search direction (any line search would be valid, but L-BFGS does not require exact line searches in order to converge).

Commonly, the initial inverse Hessian <math>H_k^0</math> is represented as a diagonal matrix, so that initially setting <math>z = H_k^0 q</math> requires only an element-by-element multiplication; a common choice is <math>H_k^0 = \gamma_k I</math> with <math>\gamma_k = \frac{s_{k-1}^\top y_{k-1}}{y_{k-1}^\top y_{k-1}}</math>. This two-loop update only works for the inverse Hessian. Approaches to implementing L-BFGS using the direct approximate Hessian have also been developed, as have other means of approximating the inverse Hessian.
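The following is a minimal sketch of the two-loop recursion in Python with NumPy (an illustration, not part of the article; the function name <code>lbfgs_direction</code> and the argument layout are assumptions). It computes <math>z = H_k g_k</math> from the stored <math>(s_i, y_i)</math> pairs, using the diagonal scaling <math>H_k^0 = \gamma_k I</math> described above.

<syntaxhighlight lang="python">
import numpy as np

def lbfgs_direction(g, s_hist, y_hist):
    """Two-loop recursion: return z = H_k g (the "uphill" direction).

    g      -- current gradient g_k, shape (n,)
    s_hist -- last m position updates s_i = x_{i+1} - x_i, oldest first
    y_hist -- last m gradient updates y_i = g_{i+1} - g_i, oldest first
    """
    if not s_hist:                      # no history yet: use H_k^0 = I
        return g.copy()

    rho = [1.0 / np.dot(y, s) for s, y in zip(s_hist, y_hist)]
    alpha = [0.0] * len(s_hist)

    # First loop: walk the history from newest to oldest update.
    q = g.copy()
    for i in reversed(range(len(s_hist))):
        alpha[i] = rho[i] * np.dot(s_hist[i], q)
        q -= alpha[i] * y_hist[i]

    # Initial approximation H_k^0 = gamma_k * I (common diagonal scaling).
    gamma = np.dot(s_hist[-1], y_hist[-1]) / np.dot(y_hist[-1], y_hist[-1])
    z = gamma * q

    # Second loop: walk the history from oldest to newest update.
    for i in range(len(s_hist)):
        beta = rho[i] * np.dot(y_hist[i], z)
        z += s_hist[i] * (alpha[i] - beta)

    return z   # for minimization, step along -z
</syntaxhighlight>

A typical minimization step would move along <math>-z</math> with a backtracking line search, append the new pair <math>(s_k, y_k)</math> to the history, and discard the oldest pair once more than ''m'' updates are stored.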