In statistics, multicollinearity (also collinearity) is a phenomenon in which two or more predictor variables in a multiple regression model are highly correlated, meaning that one can be linearly predicted from the others with a substantial degree of accuracy. In this situation the coefficient estimates of the multiple regression may change erratically in response to small changes in the model or the data. Multicollinearity does not reduce the predictive power or reliability of the model as a whole, at least within the sample data set; it only affects calculations regarding individual predictors. That is, a multiple regression model with correlated predictors can indicate how well the entire bundle of predictors predicts the outcome variable, but it may not give valid results about any individual predictor, or about which predictors are redundant with respect to others.

In the case of perfect multicollinearity the predictor matrix is singular and therefore cannot be inverted. Under these circumstances, for a general linear model <math>y = X\beta + \varepsilon</math>, the ordinary least squares estimator <math>\hat{\beta}_{OLS} = (X^\mathsf{T}X)^{-1}X^\mathsf{T}y</math> does not exist.

Note that in statements of the assumptions underlying regression analyses such as ordinary least squares, the phrase "no multicollinearity" is sometimes used to mean the absence of perfect multicollinearity, which is an exact (non-stochastic) linear relation among the regressors.

==Definition==
Collinearity is a linear association between ''two'' explanatory variables. Two variables are perfectly collinear if there is an exact linear relationship between them. For example, <math>X_1</math> and <math>X_2</math> are perfectly collinear if there exist parameters <math>\lambda_0</math> and <math>\lambda_1</math> such that, for all observations ''i'', we have

: <math>X_{2i} = \lambda_0 + \lambda_1 X_{1i}.</math>

Multicollinearity refers to a situation in which two or more explanatory variables in a multiple regression model are highly linearly related. We have perfect multicollinearity if, for example as in the equation above, the correlation between two independent variables is equal to 1 or −1. In practice, we rarely face perfect multicollinearity in a data set. More commonly, the issue of multicollinearity arises when there is an approximate linear relationship among two or more independent variables.

Mathematically, a set of variables is perfectly multicollinear if there exist one or more exact linear relationships among some of the variables. For example, we may have

: <math>\lambda_0 + \lambda_1 X_{1i} + \lambda_2 X_{2i} + \cdots + \lambda_k X_{ki} = 0</math>

holding for all observations ''i'', where <math>\lambda_j</math> are constants and <math>X_{ji}</math> is the ''i''th observation on the ''j''th explanatory variable. We can explore one issue caused by multicollinearity by examining the process of attempting to obtain estimates for the parameters of the multiple regression equation

: <math>Y_i = \beta_0 + \beta_1 X_{1i} + \cdots + \beta_k X_{ki} + \varepsilon_i.</math>

The ordinary least squares estimates involve inverting the matrix <math>X^\mathsf{T}X</math>, where

: <math>X = \begin{bmatrix} 1 & X_{11} & \cdots & X_{k1} \\ 1 & X_{12} & \cdots & X_{k2} \\ \vdots & \vdots & & \vdots \\ 1 & X_{1N} & \cdots & X_{kN} \end{bmatrix}</math>

is an ''N'' × (''k''+1) matrix of observations. If there is an exact linear relationship (perfect multicollinearity) among the independent variables, the rank of <math>X</math> (and therefore of <math>X^\mathsf{T}X</math>) is less than ''k''+1, and the matrix <math>X^\mathsf{T}X</math> will not be invertible.

Perfect multicollinearity is fairly common when working with raw datasets, which frequently contain redundant information. Once redundancies are identified and removed, however, nearly multicollinear variables often remain due to correlations inherent in the system being studied. In such a case, instead of the above equation holding, we have that equation in modified form with an error term <math>v_i</math>:

: <math>\lambda_0 + \lambda_1 X_{1i} + \lambda_2 X_{2i} + \cdots + \lambda_k X_{ki} + v_i = 0.</math>

In this case, there is no exact linear relationship among the variables, but the <math>X_j</math> variables are nearly perfectly multicollinear if the variance of <math>v_i</math> is small for some set of values for the <math>\lambda</math>'s.
The matrix <math>X^\mathsf{T}X</math> then has an inverse, but it is ill-conditioned, so a given computer algorithm may or may not be able to compute an approximate inverse; even if it can, the resulting computed inverse may be highly sensitive to slight variations in the data (because of magnified effects of rounding error) and so may be very inaccurate.
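The singular case described above can be checked numerically. The following sketch uses Python with NumPy on purely illustrative data (the particular values <math>\lambda_0 = 3</math> and <math>\lambda_1 = 2</math> are assumptions made for the example, not anything prescribed by the theory): the third column of the design matrix is an exact linear function of the second, so <math>X^\mathsf{T}X</math> is rank-deficient.

<syntaxhighlight lang="python">
import numpy as np

# Illustrative (assumed) data: N = 5 observations, two predictors,
# where x2 is an exact linear function of x1 (perfect collinearity):
# x2 = 3 + 2 * x1, i.e. lambda_0 = 3, lambda_1 = 2.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = 3.0 + 2.0 * x1

# Design matrix X with an intercept column, as in the equation above.
X = np.column_stack([np.ones_like(x1), x1, x2])
XtX = X.T @ X

# The rank of X (and of X^T X) is 2 rather than k + 1 = 3,
# so X^T X is singular and OLS has no unique solution.
print(np.linalg.matrix_rank(X))              # 2
print(np.linalg.matrix_rank(XtX))            # 2
print(np.isclose(np.linalg.det(XtX), 0.0))   # True

# Depending on rounding, inverting X^T X either fails outright or
# silently returns numerically meaningless values.
try:
    inv = np.linalg.inv(XtX)
    print("inverse 'computed', condition number:", np.linalg.cond(XtX))
except np.linalg.LinAlgError as err:
    print("cannot invert X^T X:", err)
</syntaxhighlight>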
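The nearly multicollinear case can be illustrated in the same way. In the sketch below (again Python with NumPy; the data-generating values and noise scales are illustrative assumptions), the second predictor is an almost exact linear function of the first, so <math>X^\mathsf{T}X</math> is invertible but badly ill-conditioned; re-estimating after a tiny perturbation of the response moves the individual coefficients substantially while the fitted values barely change, consistent with multicollinearity harming inference about individual predictors rather than the fit of the model as a whole.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (assumed) data: x2 is *nearly* a linear function of x1,
# i.e. lambda_0 + lambda_1*x1 - x2 + v_i = 0 with Var(v_i) small.
n = 50
x1 = rng.normal(size=n)
x2 = 3.0 + 2.0 * x1 + rng.normal(scale=1e-4, size=n)   # small-variance v_i
X = np.column_stack([np.ones(n), x1, x2])
y = 1.0 + 0.5 * x1 + 0.5 * x2 + rng.normal(scale=0.1, size=n)

XtX = X.T @ X
print("condition number of X^T X:", np.linalg.cond(XtX))  # very large

# OLS via the normal equations: beta_hat = (X^T X)^{-1} X^T y.
beta_hat = np.linalg.solve(XtX, X.T @ y)

# Re-estimate after a tiny perturbation of the response.
y_perturbed = y + rng.normal(scale=1e-3, size=n)
beta_hat_perturbed = np.linalg.solve(XtX, X.T @ y_perturbed)

# The individual coefficients on x1 and x2 change substantially,
# but the fitted values (and hence overall predictive accuracy)
# are almost unaffected.
print("original coefficients:  ", beta_hat)
print("perturbed coefficients: ", beta_hat_perturbed)
print("max change in fitted values:",
      np.max(np.abs(X @ (beta_hat - beta_hat_perturbed))))
</syntaxhighlight>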