In probability theory and statistics, Wallenius' noncentral hypergeometric distribution (named after Kenneth Ted Wallenius) is a generalization of the hypergeometric distribution in which items are sampled with bias.

This distribution can be illustrated as an urn model with bias. Assume, for example, that an urn contains ''m''<sub>1</sub> red balls and ''m''<sub>2</sub> white balls, totalling ''N'' = ''m''<sub>1</sub> + ''m''<sub>2</sub> balls. Each red ball has the weight ω<sub>1</sub> and each white ball has the weight ω<sub>2</sub>. The odds ratio is defined as ω = ω<sub>1</sub> / ω<sub>2</sub>. We now draw ''n'' balls, one by one, in such a way that the probability of taking a particular ball at a particular draw is equal to its share of the total weight of all balls that lie in the urn at that moment. The number of red balls ''x''<sub>1</sub> obtained in this experiment is a random variable with Wallenius' noncentral hypergeometric distribution.

The matter is complicated by the fact that there is more than one noncentral hypergeometric distribution. Wallenius' noncentral hypergeometric distribution is obtained if balls are sampled one by one in such a way that there is competition between the balls. Fisher's noncentral hypergeometric distribution is obtained if the balls are sampled simultaneously or independently of each other. Unfortunately, both distributions are known in the literature as "the" noncentral hypergeometric distribution, so it is important to state which distribution is meant when using this name.

The two distributions are both equal to the (central) hypergeometric distribution when the odds ratio is 1.

It is far from obvious why these two distributions are different. See the Wikipedia entry on noncentral hypergeometric distributions for a more detailed explanation of the difference between these two probability distributions.
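The urn experiment described above can be simulated directly. The following is a minimal sketch, not a reference implementation: the function name <code>draw_wallenius</code> and its parameters are hypothetical, and it assumes that only the odds ratio ω matters, so each white ball is given weight 1 and each red ball weight ω.

```python
import random


def draw_wallenius(n, m1, m2, omega, rng=random):
    """Simulate one run of the biased urn experiment.

    n      -- number of balls drawn, one by one, without replacement
    m1, m2 -- initial numbers of red and white balls
    omega  -- odds ratio (weight of a red ball / weight of a white ball)
    rng    -- any object with a random() method (assumption for testability)

    Returns x1, the number of red balls drawn.
    """
    if omega <= 0:
        raise ValueError("omega must be positive")
    if not 0 <= n <= m1 + m2:
        raise ValueError("n must lie between 0 and m1 + m2")

    red_left, white_left = m1, m2
    x1 = 0
    for _ in range(n):
        # Total weight of each colour still in the urn at this moment.
        red_weight = red_left * omega
        total_weight = red_weight + white_left
        # A red ball is taken with probability red_weight / total_weight.
        if rng.random() * total_weight < red_weight:
            red_left -= 1
            x1 += 1
        else:
            white_left -= 1
    return x1
```

Repeating the call many times and tabulating the returned values gives an empirical estimate of the distribution of ''x''<sub>1</sub>; passing a seeded <code>random.Random</code> instance as <code>rng</code> makes the runs reproducible.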
==Univariate distribution==
Wallenius' distribution is particularly complicated because each ball has a probability of being taken that depends not only on its weight, but also on the total weight of its competitors. The weight of the competing balls in turn depends on the outcomes of all preceding draws.

This recursive dependency gives rise to a difference equation with a solution that is given in open form by the integral in the expression of the probability mass function in the table above.

Closed form expressions for the probability mass function exist (Lyons, 1980), but they are not very useful for practical calculations because of extreme numerical instability, except in degenerate cases.

Several other calculation methods are used, including recursion, Taylor expansion and numerical integration (Fog, 2007, 2008).

The most reliable calculation method is recursive calculation of f(''x'',''n'') from f(''x'',''n''−1) and f(''x''−1,''n''−1) using the recursion formula given below under properties. The probabilities of all (''x'',''n'') combinations on all possible trajectories leading to the desired point are calculated, starting with f(0,0) = 1 as shown on the figure to the right. The total number of probabilities to calculate is ''n''(''x''+1) − ''x''<sup>2</sup>. Other calculation methods must be used when ''n'' and ''x'' are so large that this method is too inefficient.

The probability that all balls have the same color is easier to calculate. See the formula below under multivariate distribution.

No exact formula for the mean is known (short of complete enumeration of all probabilities). The equation given above is reasonably accurate. This equation can be solved for μ by Newton-Raphson iteration. The same equation can be used for estimating the odds from an experimentally obtained value of the mean.
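The recursive calculation described above can be sketched as follows. This is an illustration under stated assumptions rather than the method of any particular library: the function name <code>wallenius_pmf_table</code> is hypothetical, the step probabilities are derived directly from the urn model (white balls have weight 1, red balls weight ω), and for simplicity the sketch propagates the whole distribution forward from f(0,0) = 1 instead of restricting itself to the trajectories that lead to a single target ''x''.

```python
def wallenius_pmf_table(n, m1, m2, omega):
    """Return a dict mapping x to P(X = x) after n draws.

    Implements f(x, k) from f(x, k-1) and f(x-1, k-1) by pushing each
    probability forward one draw at a time, starting with f(0, 0) = 1.
    """
    if omega <= 0:
        raise ValueError("omega must be positive")
    if not 0 <= n <= m1 + m2:
        raise ValueError("n must lie between 0 and m1 + m2")

    probs = {0: 1.0}  # f(0, 0) = 1
    for k in range(1, n + 1):
        nxt = {}
        for x, p in probs.items():
            # State before draw k: x red and (k - 1 - x) white balls taken.
            red_left = m1 - x
            white_left = m2 - (k - 1 - x)
            red_weight = red_left * omega
            total_weight = red_weight + white_left
            if red_left > 0:
                # Draw k is red: contributes to f(x + 1, k).
                nxt[x + 1] = nxt.get(x + 1, 0.0) + p * red_weight / total_weight
            if white_left > 0:
                # Draw k is white: contributes to f(x, k).
                nxt[x] = nxt.get(x, 0.0) + p * white_left / total_weight
        probs = nxt
    return probs
```

For example, <code>wallenius_pmf_table(n, m1, m2, omega)[x]</code> gives f(''x'',''n''), and summing ''x''·f(''x'',''n'') over the returned table gives the mean by complete enumeration. The cost grows with the product of ''n'' and the number of reachable values of ''x'', which is why other methods are preferred for large arguments.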