|
In probability theory and statistics, kurtosis (from Greek: κυρτός, ''kyrtos'' or ''kurtos'', meaning "curved, arching") is a measure of the "tailedness" of the probability distribution of a real-valued random variable. Like skewness, ''kurtosis'' describes the shape of a probability distribution, and, as with skewness, there are different ways of quantifying it for a theoretical distribution and corresponding ways of estimating it from a sample drawn from a population. Depending on the measure used, kurtosis has been interpreted in several ways: primarily as tail weight, as peakedness (width of the peak), and as lack of shoulders (a distribution that is mostly peak and tails, with little in between).

The standard measure of kurtosis, originating with Karl Pearson, is based on a scaled version of the fourth moment of the data or population. This number measures heavy tails, not peakedness;〔SAS Elementary Statistics Procedures, SAS Institute (section on Kurtosis)〕〔Westfall, P.H. (2014), "Kurtosis as Peakedness, 1905 - 2014. R.I.P.", The American Statistician 68, 191 - 195.〕 hence, the "peakedness" interpretation is misleading. For this measure, higher kurtosis means that more of the variance comes from infrequent extreme deviations, as opposed to frequent modestly sized deviations.

The kurtosis of any univariate normal distribution is 3, and it is common to compare the kurtosis of a distribution to this value. Distributions with kurtosis less than 3 are said to be ''platykurtic''. An example is the uniform distribution, which has bounded support and therefore no tails extending to infinity. Distributions with kurtosis greater than 3 are said to be ''leptokurtic''. An example is the Laplace distribution, whose tails decay to zero more slowly than those of a Gaussian.
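The platykurtic and leptokurtic examples above can be checked numerically. The following is a minimal sketch, not a reference implementation: it estimates Pearson's kurtosis from simulated draws using only the Python standard library. The helper name `pearson_kurtosis`, the sample size, and the seed are illustrative assumptions, and the estimator is the simple (biased) ratio of sample central moments.

```python
import random

def pearson_kurtosis(xs):
    """Biased sample estimate of Pearson's kurtosis: m4 / m2**2."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in xs) / n  # fourth central moment
    return m4 / m2 ** 2

rng = random.Random(0)  # fixed seed (assumption) so the run is repeatable
n = 200_000             # illustrative sample size

samples = {
    "uniform": [rng.uniform(-1.0, 1.0) for _ in range(n)],
    "normal": [rng.gauss(0.0, 1.0) for _ in range(n)],
    # The difference of two independent Exp(1) draws is Laplace(0, 1).
    "laplace": [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(n)],
}

kurtoses = {name: pearson_kurtosis(xs) for name, xs in samples.items()}
for name, k in kurtoses.items():
    print(f"{name:8s} kurtosis ~ {k:.2f}   excess ~ {k - 3:.2f}")
```

The theoretical values are 1.8 for the uniform distribution, 3 for the normal, and 6 for the Laplace distribution, so the estimates should fall below, near, and above 3 respectively. The Laplace estimate is the noisiest of the three because heavy tails inflate the sampling variance of fourth-moment estimators.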
It is also common practice to use an adjusted version of Pearson's kurtosis, the excess kurtosis, which is the kurtosis minus 3, to provide the comparison to the normal distribution. Some authors use "kurtosis" by itself to refer to the excess kurtosis. For clarity and generality, however, this article follows the non-excess convention and explicitly indicates where excess kurtosis is meant.

Alternative measures of kurtosis include the L-kurtosis, which is a scaled version of the fourth L-moment, and measures based on four population or sample quantiles. These correspond to the alternative measures of skewness that are not based on ordinary moments.

== Pearson moments ==

The kurtosis is the fourth standardized moment, defined as

: <math>\operatorname{Kurt}[X] = \operatorname{E}\left[\left(\frac{X - \mu}{\sigma}\right)^4\right] = \frac{\mu_4}{\sigma^4},</math>

where μ4 is the fourth moment about the mean and σ is the standard deviation.

Several letters are used in the literature to denote the kurtosis. A very common choice is κ, which is fine as long as it is clear that it does not refer to a cumulant. Other choices include γ2, by analogy with the notation for skewness, although this symbol is sometimes reserved for the excess kurtosis.

The kurtosis is bounded below by the squared skewness plus 1:

: <math>\frac{\mu_4}{\sigma^4} \geq \left(\frac{\mu_3}{\sigma^3}\right)^2 + 1,</math>

where μ3 is the third moment about the mean. The lower bound is attained by the Bernoulli distribution with ''p'' = ½ (a fair "coin toss"), whose kurtosis is 1. There is no upper limit to the kurtosis of a general probability distribution, and it may be infinite.

Under the above definition, the kurtosis of any univariate normal distribution is 3. The excess kurtosis, defined as the kurtosis minus 3, then takes a value of 0 for the normal distribution. Much of the statistics literature prefers the excess kurtosis, ostensibly because the fourth ''cumulant'' (not moment) of a normal distribution vanishes. Unfortunately, this has resulted in a schism in which the excess kurtosis is sometimes simply called "kurtosis" without the qualifier.
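As an illustration of the definition and of the lower bound (kurtosis at least squared skewness plus 1), the sketch below computes skewness and kurtosis exactly from a discrete probability mass function. The function names `standardized_moments` and `bernoulli` and the example distributions are hypothetical, chosen only for this illustration.

```python
def standardized_moments(values, probs):
    """Return (skewness, kurtosis) of a discrete distribution given by its pmf."""
    mean = sum(p * v for v, p in zip(values, probs))

    def central(k):
        # k-th moment about the mean
        return sum(p * (v - mean) ** k for v, p in zip(values, probs))

    var = central(2)
    skew = central(3) / var ** 1.5  # mu_3 / sigma^3
    kurt = central(4) / var ** 2    # mu_4 / sigma^4 (Pearson, non-excess)
    return skew, kurt

def bernoulli(p):
    """Skewness and kurtosis of a Bernoulli(p) variable on {0, 1}."""
    return standardized_moments([0.0, 1.0], [1.0 - p, p])

# Fair coin: zero skewness, kurtosis equal to the minimum value 1.
fair_skew, fair_kurt = bernoulli(0.5)
print("fair coin:", fair_skew, fair_kurt)

# A biased coin is skewed, yet it still sits exactly on the bound.
biased_skew, biased_kurt = bernoulli(0.1)
print("biased coin gap:", biased_kurt - (biased_skew ** 2 + 1))

# A symmetric three-point distribution lies strictly above the bound.
tri_skew, tri_kurt = standardized_moments([-1.0, 0.0, 1.0], [0.25, 0.5, 0.25])
print("three-point:", tri_skew, tri_kurt)
```

Two-point distributions attain the bound with equality for any ''p'', which is why the biased coin shows a gap of zero up to rounding; the fair coin is the case where the bound itself reaches its smallest value.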
Some software packages popular in the pure mathematics and science communities, such as Mathematica, MATLAB, and Maple, use "kurtosis" in the original sense defined above, for which the kurtosis of a normal distribution is 3. Other software popular in the statistics and finance communities, including Excel and R, returns the excess kurtosis from "kurtosis" function calls, and SciPy's <code>scipy.stats.kurtosis</code> defaults to this behavior.

One reason some authors favor the excess kurtosis is that cumulants are extensive. Formulas related to the extensive property are more naturally expressed in terms of the excess kurtosis. For example, let ''X''1, ..., ''X''''n'' be independent random variables for which the fourth moment exists, and let ''Y'' be the random variable defined by the sum of the ''X''''i''. The excess kurtosis of ''Y'' is

: <math>\operatorname{Kurt}[Y] - 3 = \frac{1}{\left(\sum_{j=1}^n \sigma_j^2\right)^2} \sum_{i=1}^n \sigma_i^4 \left(\operatorname{Kurt}[X_i] - 3\right),</math>

where <math>\sigma_i</math> is the standard deviation of <math>X_i</math>. In particular, if all of the ''X''''i'' have the same variance, this simplifies to

: <math>\operatorname{Kurt}[Y] - 3 = \frac{1}{n^2} \sum_{i=1}^n \left(\operatorname{Kurt}[X_i] - 3\right).</math>

The reason not to subtract 3 is that the bare fourth moment generalizes better to multivariate distributions, especially when independence is not assumed. The cokurtosis between pairs of variables is an order-four tensor. For a bivariate normal distribution, the cokurtosis tensor has off-diagonal terms that are in general neither 0 nor 3, so attempting to "correct" for an excess becomes confusing. It is true, however, that the joint cumulants of degree greater than two of any multivariate normal distribution are zero.

For two commuting random variables, ''X'' and ''Y'', not necessarily independent, the kurtosis of the sum, ''X'' + ''Y'', is

:: <math>\operatorname{Kurt}[X+Y] = \frac{1}{\sigma_{X+Y}^4} \Big( \sigma_X^4 \operatorname{Kurt}[X] + 4 \sigma_X^3 \sigma_Y \operatorname{Cokurt}[X,X,X,Y] + 6 \sigma_X^2 \sigma_Y^2 \operatorname{Cokurt}[X,X,Y,Y] + 4 \sigma_X \sigma_Y^3 \operatorname{Cokurt}[X,Y,Y,Y] + \sigma_Y^4 \operatorname{Kurt}[Y] \Big),</math>

where Cokurt denotes the standardized mixed fourth central moments. Note that the binomial coefficients 1, 4, 6, 4, 1 appear in the above equation.

Excerpt source: Wikipedia, the free encyclopedia.
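The extensive property for sums of independent variables described above can be verified exactly on small discrete distributions. The sketch below rests on one assumption stated plainly: it uses the fact that the fourth cumulant (excess kurtosis times variance squared) is additive under independence, so the excess kurtosis of the sum is the sum of the fourth cumulants divided by the squared total variance. The helper names and the three example distributions are hypothetical.

```python
from itertools import product

def moments(pmf):
    """Return (variance, excess kurtosis) of a discrete pmf given as {value: prob}."""
    mean = sum(p * v for v, p in pmf.items())
    var = sum(p * (v - mean) ** 2 for v, p in pmf.items())
    m4 = sum(p * (v - mean) ** 4 for v, p in pmf.items())
    return var, m4 / var ** 2 - 3.0

def convolve(pmf_a, pmf_b):
    """pmf of A + B when A and B are independent."""
    out = {}
    for (a, pa), (b, pb) in product(pmf_a.items(), pmf_b.items()):
        out[a + b] = out.get(a + b, 0.0) + pa * pb
    return out

# Hypothetical example distributions with different variances.
coin = {0: 0.5, 1: 0.5}
die = {k: 1 / 6 for k in range(1, 7)}
skewed = {0: 0.7, 1: 0.2, 5: 0.1}
parts = [coin, die, skewed]

# Direct route: build the exact distribution of the sum, then measure it.
total = parts[0]
for pmf in parts[1:]:
    total = convolve(total, pmf)
_, direct = moments(total)

# Formula route: sum of sigma_i^4 * excess_i over (sum of sigma_j^2)^2.
stats = [moments(pmf) for pmf in parts]
var_sum = sum(var for var, _ in stats)
via_formula = sum(var ** 2 * excess for var, excess in stats) / var_sum ** 2

print(f"direct      = {direct:.12f}")
print(f"via formula = {via_formula:.12f}")
```

The two printed values should agree up to floating-point rounding. The check relies on independence; for dependent variables the cokurtosis terms no longer factor and the general formula for ''X'' + ''Y'' is needed instead.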
|