|
The IEEE Standard for Floating-Point Arithmetic (IEEE 754) is a technical standard for floating-point computation established in 1985 by the Institute of Electrical and Electronics Engineers (IEEE). Many hardware floating-point units use the IEEE 754 standard. The standard addressed many problems found in the diverse floating-point implementations that made them difficult to use reliably and portably.

The current version, IEEE 754-2008, published in August 2008, includes nearly all of the original IEEE 754-1985 standard and the IEEE Standard for Radix-Independent Floating-Point Arithmetic (IEEE 854-1987). The international standard ISO/IEC/IEEE 60559:2011 (with content identical to IEEE 754-2008) has been approved for adoption through JTC1/SC 25 under the ISO/IEEE PSDO Agreement〔(FW: ISO/IEC/IEEE 60559 (IEEE Std 754-2008))〕 and published.〔(ISO/IEC/IEEE 60559:2011 - Information technology - Microprocessor Systems - Floating-Point arithmetic)〕

The standard defines
* ''arithmetic formats:'' sets of binary and decimal floating-point data, which consist of finite numbers (including signed zeros and subnormal numbers), infinities, and special "not a number" values (NaNs)
* ''interchange formats:'' encodings (bit strings) that may be used to exchange floating-point data in an efficient and compact form
* ''rounding rules:'' properties to be satisfied when rounding numbers during arithmetic and conversions
* ''operations:'' arithmetic and other operations on arithmetic formats
* ''exception handling:'' indications of exceptional conditions (such as division by zero, overflow, ''etc.'')

The standard also includes extensive recommendations for advanced exception handling, additional operations (such as trigonometric functions), expression evaluation, and for achieving reproducible results.

The standard is derived from and replaces IEEE 754-1985, the previous version, following a seven-year revision process chaired by Dan Zuras and edited by Mike Cowlishaw.
The binary formats in the original standard are included in the new standard along with three new basic formats (one binary and two decimal). To conform to the current standard, an implementation must implement at least one of the basic formats as both an arithmetic format and an interchange format.

In keeping with IEEE rules, the standard is, as of September 2015, being revised to incorporate various clarifications and errata. The revision committee is chaired by David Hough and the editor is Mike Cowlishaw. Details are on the revision website.

== Formats ==
An IEEE 754 ''format'' is a "set of representations of numerical values and symbols". A format may also include how the set is encoded.

A format comprises:
* Finite numbers, which may be either base 2 (binary) or base 10 (decimal). Each finite number is described by three integers: ''s'' = a ''sign'' (zero or one), ''c'' = a ''significand'' (or 'coefficient'), ''q'' = an ''exponent''. The numerical value of a finite number is <math>(-1)^s \times c \times b^q</math>, where ''b'' is the base (2 or 10), also called the ''radix''. For example, if the base is 10, the sign is 1 (indicating negative), the significand is 12345, and the exponent is −3, then the value of the number is <math>(-1)^1 \times 12345 \times 10^{-3} = -12.345</math>.
* Two infinities: +∞ and −∞.
* Two kinds of NaN: a quiet NaN (qNaN) and a signaling NaN (sNaN). A NaN may carry a ''payload'' that is intended for diagnostic information indicating the source of the NaN. The sign of a NaN has no meaning, but it may be predictable in some circumstances.
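As a rough illustration of the three kinds of members listed above, the following Python sketch evaluates the value formula for a finite number exactly and then constructs infinities and NaNs. The helper name `finite_value` is made up for this example and is not part of the standard. Python's `decimal` module is used only because its tuple constructor accepts the same (sign, digits, exponent) triple. It is assumed here to be a reasonable stand-in for a decimal arithmetic format, not a certified IEEE 754 implementation.

```python
from decimal import Decimal
from fractions import Fraction


def finite_value(s, c, q, b=10):
    """Exact value (-1)^s * c * b^q of a finite number (hypothetical helper)."""
    return (-1) ** s * c * Fraction(b) ** q


# The worked example from the text: base 10, sign 1, significand 12345, exponent -3.
value = finite_value(s=1, c=12345, q=-3)
print(value == Fraction("-12.345"))  # the exact rational equals -12.345

# The same triple passed to Python's decimal type as (sign, digits, exponent).
as_decimal = Decimal((1, (1, 2, 3, 4, 5), -3))
print(as_decimal)

# Infinities and the two kinds of NaN, as modelled by the decimal module.
pos_inf = Decimal("Infinity")
neg_inf = Decimal("-Infinity")
quiet_nan = Decimal("NaN")
signaling_nan = Decimal("sNaN")
print(pos_inf.is_infinite(), quiet_nan.is_qnan(), signaling_nan.is_snan())
```

Using `Fraction` keeps the arithmetic exact, so the check does not depend on how a binary `float` would round −12.345.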
The possible finite values that can be represented in a format are determined by the base ''b'', the number of digits in the significand (precision ''p''), and the exponent parameter ''emax'':
* ''c'' must be an integer in the range zero through <math>b^p - 1</math> (''e.g.'', if ''b''=10 and ''p''=7 then ''c'' is 0 through 9999999)
* ''q'' must be an integer such that 1−''emax'' ≤ ''q''+''p''−1 ≤ ''emax'' (''e.g.'', if ''p''=7 and ''emax''=96 then ''q'' is −101 through 90).

Hence (for the example parameters) the smallest non-zero positive number that can be represented is <math>1 \times 10^{-101}</math> and the largest is <math>9999999 \times 10^{90}</math> (<math>9.999999 \times 10^{96}</math>), and the full range of numbers is <math>-9.999999 \times 10^{96}</math> through <math>9.999999 \times 10^{96}</math>. The numbers <math>-b^{1-emax}</math> and <math>b^{1-emax}</math> (here, <math>-1 \times 10^{-95}</math> and <math>1 \times 10^{-95}</math>) are the smallest (in magnitude) ''normal numbers''; non-zero numbers between these smallest numbers are called subnormal numbers.

Zero values are finite values with significand 0. These are signed zeros: the sign bit specifies whether a zero is +0 (positive zero) or −0 (negative zero).

Excerpted from: Wikipedia, the free encyclopedia.
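The range limits described above can be reproduced with a short Python sketch. The function name `format_limits` and its dictionary keys are invented for this illustration. The cross-check against `decimal.Context` assumes that a context with `prec=7`, `Emax=96`, `Emin=-95` models the example parameters closely enough. The signed-zero lines use Python's built-in `float`, which is a binary format on typical platforms.

```python
import math
from decimal import Context
from fractions import Fraction


def format_limits(b, p, emax):
    """Limits of a format with radix b, precision p and exponent bound emax."""
    c_max = b**p - 1              # largest significand: b^p - 1
    q_min = (1 - emax) - (p - 1)  # from 1 - emax <= q + p - 1
    q_max = emax - (p - 1)        # from q + p - 1 <= emax
    return {
        "c_max": c_max,
        "q_min": q_min,
        "q_max": q_max,
        "smallest_positive": Fraction(b) ** q_min,    # 1 * b^q_min (subnormal)
        "largest": c_max * Fraction(b) ** q_max,      # c_max * b^q_max
        "smallest_normal": Fraction(b) ** (1 - emax),  # b^(1 - emax)
    }


# Example parameters from the text: b = 10, p = 7, emax = 96.
limits = format_limits(b=10, p=7, emax=96)
print(limits["q_min"], limits["q_max"])  # -101 90

# Cross-check the exponent bounds against Python's decimal module.
ctx = Context(prec=7, Emax=96, Emin=-95)
print(ctx.Etiny(), ctx.Etop())  # -101 90

# Signed zeros compare equal, but the sign is still observable.
neg_zero = -0.0
print(neg_zero == 0.0)                # True
print(math.copysign(1.0, neg_zero))   # -1.0
```

Any non-zero value whose magnitude lies below `limits["smallest_normal"]` is a subnormal number in this example format.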
|