In statistical analysis of binary classification, the F1 score (also F-score or F-measure) is a measure of a test's accuracy. It considers both the precision ''p'' and the recall ''r'' of the test to compute the score: ''p'' is the number of correct positive results divided by the number of all positive results returned, and ''r'' is the number of correct positive results divided by the number of positive results that should have been returned. The F1 score can be interpreted as a weighted average of the precision and recall, where an F1 score reaches its best value at 1 and its worst at 0.

The traditional F-measure or balanced F-score (F1 score) is the harmonic mean of precision and recall:
:<math>F_1 = 2 \cdot \frac{\mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}.</math>

The general formula for positive real β is:
:<math>F_\beta = (1 + \beta^2) \cdot \frac{\mathrm{precision} \cdot \mathrm{recall}}{(\beta^2 \cdot \mathrm{precision}) + \mathrm{recall}}.</math>

The formula in terms of Type I and type II errors:
:<math>F_\beta = \frac{(1 + \beta^2) \cdot \mathrm{true\ positive}}{(1 + \beta^2) \cdot \mathrm{true\ positive} + \beta^2 \cdot \mathrm{false\ negative} + \mathrm{false\ positive}}.</math>

Two other commonly used F measures are the <math>F_2</math> measure, which weights recall higher than precision, and the <math>F_{0.5}</math> measure, which puts more emphasis on precision than recall.

The F-measure was derived so that <math>F_\beta</math> "measures the effectiveness of retrieval with respect to a user who attaches β times as much importance to recall as precision". It is based on Van Rijsbergen's effectiveness measure
:<math>E = 1 - \left(\frac{\alpha}{p} + \frac{1 - \alpha}{r}\right)^{-1}.</math>
Their relationship is <math>F_\beta = 1 - E</math> where <math>\alpha = \frac{1}{1 + \beta^2}</math>.

== Diagnostic Testing ==
This is related to the field of binary classification, where recall is often termed sensitivity. There are several reasons that the F1 score can be criticized in particular circumstances.
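For readers who want to check the formulas above numerically, the following is a minimal Python sketch that computes precision, recall, and the general <math>F_\beta</math> score directly from true-positive, false-positive, and false-negative counts. The function names (precision_recall, f_beta) and the example counts are illustrative assumptions rather than any standard library API, and the sketch assumes the denominators are non-zero.

<syntaxhighlight lang="python">
def precision_recall(tp, fp, fn):
    """Precision and recall from true-positive, false-positive, and false-negative counts.

    Assumes tp + fp > 0 and tp + fn > 0 (no zero-division guard in this sketch).
    """
    precision = tp / (tp + fp)   # correct positives / all returned positives
    recall = tp / (tp + fn)      # correct positives / positives that should have been returned
    return precision, recall


def f_beta(tp, fp, fn, beta=1.0):
    """F_beta score; beta = 1 gives the balanced F1 (harmonic mean of precision and recall)."""
    p, r = precision_recall(tp, fp, fn)
    return (1 + beta**2) * p * r / (beta**2 * p + r)


# Hypothetical counts: 8 true positives, 2 false positives, 4 false negatives.
tp, fp, fn = 8, 2, 4
p, r = precision_recall(tp, fp, fn)    # precision = 0.8, recall ≈ 0.667
print(f_beta(tp, fp, fn, beta=1.0))    # F1   ≈ 0.727
print(f_beta(tp, fp, fn, beta=2.0))    # F2   ≈ 0.690 (weights recall higher)
print(f_beta(tp, fp, fn, beta=0.5))    # F0.5 ≈ 0.769 (weights precision higher)
</syntaxhighlight>

The same F1 value follows from the Type I/Type II error form: <math>\frac{(1 + 1)\cdot 8}{(1 + 1)\cdot 8 + 1\cdot 4 + 2} = \frac{16}{22} \approx 0.727</math>.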