|
Winsorising or Winsorisation is the transformation of statistics by limiting extreme values in the statistical data to reduce the effect of possibly spurious outliers. It is named after the engineer-turned-biostatistician Charles P. Winsor (1895–1951). The effect is the same as clipping in signal processing.

The distribution of many statistics can be heavily influenced by outliers. A typical strategy is to set all outliers to a specified percentile of the data; for example, a 90% Winsorisation sets all data below the 5th percentile to the 5th percentile, and all data above the 95th percentile to the 95th percentile. Winsorised estimators are usually more robust to outliers than their standard forms, although alternatives such as trimming achieve a similar effect.

== Example ==
Consider the data set consisting of:
:
The 5th percentile lies between −40 and −5, while the 95th percentile lies between 101 and 1053. (Values shown in bold.) Then a 90% Winsorisation would result in the following:
:
Python can winsorise data using the NumPy and SciPy libraries.

Source: Wikipedia, the free encyclopedia.
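The rank-based rule used in the example can be sketched in plain Python. This is a minimal illustration, not a library implementation: the function name `winsorise` and its parameters are hypothetical, and the sample values are an assumption, chosen so that the extremes match those mentioned in the example (−40, −5, 101 and 1053). Library routines such as `scipy.stats.mstats.winsorize` in SciPy provide comparable functionality.

```python
def winsorise(values, lower_frac=0.05, upper_frac=0.05):
    """Rank-based Winsorisation (hypothetical helper, for illustration only).

    The k smallest and k largest values are replaced by the nearest
    retained value, where k = floor(fraction * n) for each tail.
    """
    n = len(values)
    if n == 0:
        return []
    k_low = int(lower_frac * n)    # number of values replaced in the lower tail
    k_high = int(upper_frac * n)   # number of values replaced in the upper tail
    if k_low + k_high >= n:
        raise ValueError("the two fractions leave no values to retain")

    ordered = sorted(values)
    low = ordered[k_low]             # smallest retained value
    high = ordered[n - 1 - k_high]   # largest retained value

    # Clip every value into [low, high], preserving the original order.
    return [min(max(v, low), high) for v in values]


# Assumed sample of 20 values (not taken from the text above).
data = [92, 19, 101, 58, 1053, 91, 26, 78, 10, 13,
        -40, 101, 86, 85, 15, 89, 89, 28, -5, 41]

winsorised = winsorise(data)  # 5% in each tail, i.e. a 90% Winsorisation
print(winsorised)
print("mean before:", sum(data) / len(data))
print("mean after: ", sum(winsorised) / len(winsorised))
```

With 20 values and 5% limits, one value is replaced at each end: −40 becomes −5 and 1053 becomes 101, while every other value is unchanged. The mean of this sample falls from 101.5 to about 55.65, which illustrates how strongly a single extreme value can influence an unwinsorised statistic.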
|