In probability theory and statistics, variance measures how far a set of numbers is spread out. A variance of zero indicates that all the values are identical. Variance is always non-negative: a small variance indicates that the data points tend to be very close to the mean (expected value) and hence to each other, while a high variance indicates that the data points are spread far from the mean and from each other. An equivalent measure is the square root of the variance, called the standard deviation. The standard deviation has the same dimension as the data and is therefore directly comparable to deviations from the mean.

There are two distinct concepts that are both called "variance". One is a characteristic of a set of observations; the other is part of a theoretical probability distribution and is defined by an equation. When variance is calculated from observations, those observations are either measured from a real-world system or generated by a theoretical probability distribution or other generating model. If all possible observations of the system are present, the calculated variance is called the population variance. Normally, however, only a subset is available, and the variance calculated from it is called the sample variance. The variance calculated from a sample is considered an estimate of the full population variance. There are multiple ways to calculate an estimate of the population variance, as discussed in the section below.

The two kinds of variance are closely related. To see how, consider that a theoretical probability distribution can be used as a generator of observations. If an unlimited number of observations is generated from a distribution, the sample variance calculated from that set converges to the value given by the distribution's equation for variance (see the simulation sketch at the end of this section).

The variance is one of several descriptors of a probability distribution. In particular, it is one of the moments of a distribution, and in that context it forms part of a systematic approach to distinguishing between probability distributions. While other such approaches have been developed, those based on moments are advantageous in terms of mathematical and computational simplicity.

==Definition==

The variance of a random variable ''X'' is its second central moment, the expected value of the squared deviation from the mean <math>\mu = \operatorname{E}[X]</math>:

: <math>\operatorname{Var}(X) = \operatorname{E}\left[(X - \mu)^2\right].</math>

This definition encompasses random variables that are generated by processes that are discrete, continuous, neither, or mixed. The variance can also be thought of as the covariance of a random variable with itself:

: <math>\operatorname{Var}(X) = \operatorname{Cov}(X, X).</math>

The variance is also equivalent to the second cumulant of the probability distribution that generates ''X''. The variance is typically designated as <math>\operatorname{Var}(X)</math>, <math>\sigma_X^2</math>, or simply <math>\sigma^2</math> (pronounced "sigma squared"). The expression for the variance can be expanded by multiplying out the square and using the linearity of expectation:

: <math>\operatorname{Var}(X) = \operatorname{E}\left[X^2 - 2\mu X + \mu^2\right] = \operatorname{E}\left[X^2\right] - 2\mu\operatorname{E}[X] + \mu^2 = \operatorname{E}\left[X^2\right] - \left(\operatorname{E}[X]\right)^2.</math>

A mnemonic for this expression is "mean of square minus square of mean". In floating-point arithmetic, however, this formula should not be used, because it suffers from catastrophic cancellation when its two terms are similar in magnitude. Numerically stable alternatives exist; one well-known example is sketched below.
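The following is a minimal Python sketch of the cancellation problem and of one widely used stable alternative, Welford's online algorithm. The specific sample values and the large offset are illustrative assumptions, chosen only to make the two terms of the naive formula nearly equal in magnitude.

<syntaxhighlight lang="python">
# Sketch: the "mean of square minus square of mean" formula can lose all
# precision in floating point, while Welford's online algorithm stays stable.

def variance_naive(xs):
    """Population variance via E[X^2] - (E[X])^2 (prone to cancellation)."""
    n = len(xs)
    mean_sq = sum(x * x for x in xs) / n
    sq_mean = (sum(xs) / n) ** 2
    return mean_sq - sq_mean

def variance_welford(xs):
    """Population variance via Welford's numerically stable one-pass update."""
    mean, m2 = 0.0, 0.0
    for k, x in enumerate(xs, start=1):
        delta = x - mean
        mean += delta / k
        m2 += delta * (x - mean)  # uses the updated mean; keeps m2 well-conditioned
    return m2 / len(xs)

# Small spread around a huge offset: both terms of the naive formula are
# around 1e18, so their true difference of 22.5 is lost to rounding.
data = [1e9 + v for v in (4.0, 7.0, 13.0, 16.0)]

print(variance_naive(data))    # wrong (often 0.0 or even negative) due to cancellation
print(variance_welford(data))  # ~22.5, the correct population variance
</syntaxhighlight>

Welford's update also works incrementally, so it suits streaming data where a second pass over the observations is impossible.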
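As a complement, here is the simulation sketch referenced above, illustrating the link between the two kinds of variance: observations are generated from a distribution whose theoretical variance is known, and the sample estimate approaches that value as the number of observations grows. The choice of a normal distribution, the parameter values, and the sample sizes are illustrative assumptions.

<syntaxhighlight lang="python">
# Sketch: a probability distribution used as a generator of observations.
# The sample variance converges to the distribution's theoretical variance.
import random
import statistics

random.seed(0)          # for reproducibility of the sketch
MU, SIGMA = 5.0, 3.0    # theoretical variance is SIGMA**2 = 9.0

for n in (10, 1_000, 100_000):
    sample = [random.gauss(MU, SIGMA) for _ in range(n)]
    s2 = statistics.variance(sample)    # unbiased estimator, divides by n - 1
    p2 = statistics.pvariance(sample)   # treats the sample as a full population
    print(f"n={n:>7}: sample variance={s2:.4f}, population formula={p2:.4f}")

# Both columns converge toward 9.0; the n versus (n - 1) divisor
# distinction matters only for small n.
</syntaxhighlight>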