Question: Floating point equality in Python and in general

Stack Overflow user

Asked 2010-06-16 05:15:41

Answers: 8 | Views: 20.3K | Followers: 0 | Votes: 16

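I have a piece of code that behaves differently depending on whether I look up the conversion factors in a dictionary or use them directly.

The code below prints 1.0 == 1.0 -> False

But if you replace factors[units_from] with 10.0 and factors[units_to  ] with 1.0 / 2.54, it prints 1.0 == 1.0 -> True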

#!/usr/bin/env python

base = 'cm'
factors = {
    'cm'        : 1.0,
    'mm'        : 10.0,
    'm'         : 0.01,
    'km'        : 1.0e-5,
    'in'        : 1.0 / 2.54,
    'ft'        : 1.0 / 2.54 / 12.0,
    'yd'        : 1.0 / 2.54 / 12.0 / 3.0,
    'mile'      : 1.0 / 2.54 / 12.0 / 5280,
    'lightyear' : 1.0 / 2.54 / 12.0 / 5280 / 5.87849981e12,
}

# convert 25.4 mm to inches
val = 25.4
units_from = 'mm'
units_to = 'in'

base_value = val / factors[units_from]
ret = base_value * factors[units_to  ]
print ret, '==', 1.0, '->', ret == 1.0

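First off, let me say that I am pretty sure I know what is going on here. I have seen it before in C, just never in Python, but since Python is implemented in C we are seeing it here too.

I know that floating point numbers can change value when going from a CPU register to cache and back. I know that comparing what should be two equal variables will return false if one of them was paged out while the other stayed resident in a register.

Questions

What is the best way to avoid problems like this?
Am I doing something wrong?

Side note

This is obviously part of a stripped-down example, but what I am trying to do is come up with classes for length, volume, etc. that can be compared against other objects of the same class that use different units.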

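Rhetorical questions

If this is a potentially dangerous problem, since it makes programs behave non-deterministically, should compilers warn or raise an error when they detect that you are checking floats for equality?

Should compilers support an option to replace float equality checks with a "close enough" comparison?

Do compilers already do this, and I just can't find the information?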

python

floating-point

equality

8 Answers

Stack Overflow user

Accepted answer

Posted 2010-06-18 01:42:39

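Thanks for your responses. Most were very good and provided good links, so I'll just say that and answer my own question.

Caspin posted this link.

He also mentioned that Google Test uses ULP comparison, and when I looked at the Google code I saw that it cites exactly the same cygnus-software link.

I wound up implementing some of the algorithms in C as a Python extension, and later found that I could do it in pure Python as well. The code is posted below.

In the end, I will probably just add ULP differences to my bag of tricks.

It was interesting to see how many floating point values lie between what should be two equal numbers that never left memory. One of the articles or the Google code I read said that 4 was a good number... but here I was able to reach 10.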

>>> f1 = 25.4
>>> f2 = f1
>>> 
>>> for i in xrange(1, 11):
...     f2 /= 10.0          # to cm
...     f2 *= (1.0 / 2.54)  # to in
...     f2 *= 25.4          # back to mm
...     print 'after %2d loops there are %2d doubles between them' % (i, dulpdiff(f1, f2))
... 
after  1 loops there are  1 doubles between them
after  2 loops there are  2 doubles between them
after  3 loops there are  3 doubles between them
after  4 loops there are  4 doubles between them
after  5 loops there are  6 doubles between them
after  6 loops there are  7 doubles between them
after  7 loops there are  8 doubles between them
after  8 loops there are 10 doubles between them
after  9 loops there are 10 doubles between them
after 10 loops there are 10 doubles between them

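It is also interesting to see how many floating point values lie between two equal numbers when one of them is written out as a string and read back in.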

>>> # 0 degrees Fahrenheit is -32 / 1.8 degrees Celsius
... f = -32 / 1.8
>>> s = str(f)
>>> s
'-17.7777777778'
>>> # floats between them...
... fulpdiff(f, float(s))
0
>>> # doubles between them...
... dulpdiff(f, float(s))
6255L
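
The gap in the session above comes from Python 2's str(), which keeps only 12 significant digits. Here is a minimal sketch of the same effect, written in Python 3 syntax (an assumption; the session above is Python 2). It reproduces the truncation with '%.12g' and contrasts it with repr(), which round-trips a double exactly. The variable names are illustrative only.

```python
f = -32 / 1.8

# Python 2's str() formatted floats with 12 significant digits;
# '%.12g' reproduces that formatting on any Python version.
short = '%.12g' % f
full = repr(f)

lossy = float(short)      # reads back a nearby, but different, double
lossless = float(full)    # reads back exactly the same double

print(short, lossy == f)
print(full, lossless == f)
```

If a value has to survive a trip through text, writing it with repr() (or at least 17 significant digits) avoids the drift entirely.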

import struct
from functools import partial

# (c) 2010 Eric L. Frederich
#
# Python implementation of algorithms detailed here...
# from http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.htm

def c_mem_cast(x, f=None, t=None):
    '''
    do a c-style memory cast

    In Python...

    x = 12.34
    y = c_mem_cast(x, 'd', 'q')

    ... should be equivalent to the following in c...

    double    x = 12.34;
    long long y = *(long long*)&x;
    '''
    # pack x using format f, then reinterpret the same bytes using format t
    return struct.unpack(t, struct.pack(f, x))[0]

# 'q' (long long) is 8 bytes everywhere; 'l' is only 4 bytes on some platforms
dbl_to_lng = partial(c_mem_cast, f='d', t='q')
lng_to_dbl = partial(c_mem_cast, f='q', t='d')
flt_to_int = partial(c_mem_cast, f='f', t='i')
int_to_flt = partial(c_mem_cast, f='i', t='f')

def ulp_diff_maker(converter, negative_zero):
    '''
    Getting the ulp difference of floats and doubles is similar.
    The only differences are the offset and the converter.
    '''
    def the_diff(a, b):

        # Make ai lexicographically ordered as a two's-complement int
        ai = converter(a)
        if ai < 0:
            ai = negative_zero - ai

        # Make bi lexicographically ordered as a two's-complement int
        bi = converter(b)
        if bi < 0:
            bi = negative_zero - bi

        return abs(ai - bi)

    return the_diff

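# The offset is the signed integer value of -0.0's bit pattern, so it must be
# negative; a positive offset breaks comparisons of values with opposite signs.
# double ULP difference
dulpdiff = ulp_diff_maker(dbl_to_lng, -0x8000000000000000)
# float  ULP difference
fulpdiff = ulp_diff_maker(flt_to_int, -0x80000000        )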

# default to double ULP difference
ulpdiff = dulpdiff
ulpdiff.__doc__ = '''
Get the number of doubles between two doubles.
'''

4 votes

Stack Overflow user

Posted 2010-06-16 06:59:33

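As has been shown, comparing two floats (or doubles, etc.) can be problematic. Generally, instead of comparing for exact equality, they should be checked against an error bound. If they are within the error bound, they are considered equal.

That is much easier said than done. The nature of floating point makes a fixed error bound worthless. A small error bound (like 2*float_epsilon) works well when the values are near 0.0, but fails when the values are near 1000. An error bound suitable for values as large as 1,000,000.0 is far too lax for values near 0.0.

The best solution is to know the domain of your math and pick an appropriate error bound on a case-by-case basis.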
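
A minimal sketch of the trade-off described above, in Python 3 syntax (an assumption; the question's code is Python 2). The helper abs_close and the chosen tolerances are illustrative, not part of the original answer. math.isclose is the standard-library relative comparison (Python 3.5+).

```python
import math

def abs_close(a, b, tol):
    """Compare with a fixed absolute error bound (illustrative helper)."""
    return abs(a - b) <= tol

tiny = 2 * 2.0 ** -52          # roughly 2 * epsilon for doubles

# Adjacent doubles near 1000 are about 1.1e-13 apart, so a bound of
# 2 * epsilon rejects values that differ only by one rounding step.
x = 1000.0
y = 1000.0 + 2.0 ** -43        # the next double above 1000.0
too_strict = abs_close(x, y, tiny)

# A bound sized for large values accepts small values that differ by 2x.
too_lax = abs_close(1e-6, 2e-6, 1e-3)

# A relative bound scales with the magnitude of the operands.
rel_near_1000 = math.isclose(x, y, rel_tol=1e-9)
rel_small = math.isclose(1e-6, 2e-6, rel_tol=1e-9)

# Applied to the conversion from the question:
ret = 25.4 / 10.0 * (1.0 / 2.54)
print(too_strict, too_lax, rel_near_1000, rel_small)
print(math.isclose(ret, 1.0, rel_tol=1e-9))
```

A purely relative bound never matches against exactly 0.0 unless both values are zero, which is why math.isclose also takes an abs_tol argument; choosing it still requires knowing the domain.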

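When this is impractical or you are being lazy, Units in the Last Place (ULPs) are a very novel and robust solution. The full details are quite involved; you can read more here.

The basic idea is this: a floating point number has two pieces, a mantissa and an exponent. Generally, rounding errors only change the mantissa by a few steps. When the value is near 1.0, those steps are exactly float_epsilon. When the value is nearer to 1,000,000, the steps are far larger (0.0625 for a single-precision float).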
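
To make the step sizes concrete, here is a small sketch in Python 3 syntax. The function spacing is a hypothetical helper, not part of any library; it derives the gap from the binary exponent and assumes a normal, finite, non-zero input.

```python
import math
import sys

def spacing(x, mantissa_bits):
    """Gap between abs(x) and the next larger representable value.

    mantissa_bits counts the implicit leading bit: 53 for IEEE-754
    doubles, 24 for single-precision floats. Only valid for normal,
    finite, non-zero x.
    """
    # abs(x) == m * 2**exponent with 0.5 <= m < 1
    _, exponent = math.frexp(abs(x))
    return math.ldexp(1.0, exponent - mantissa_bits)

for value in (1.0, 1000.0, 1000000.0):
    print(value, spacing(value, 53), spacing(value, 24))

# At 1.0 the double spacing is exactly the machine epsilon.
print(spacing(1.0, 53) == sys.float_info.epsilon)
```

The gap grows by a factor of two each time the value crosses a power of two, which is why a single absolute tolerance cannot fit all magnitudes.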

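Google Test uses ULPs to compare floating point numbers. They chose a default of 4 ULPs for two floating point numbers to compare equal. You could also use their code as a reference to build your own ULP-style float comparator.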
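
A minimal sketch of such a comparator for doubles, in Python 3 syntax. This is not Google Test's code. The names ordered_bits, ulp_distance and almost_equal_ulps are hypothetical, and the default of 4 ULPs is taken from the paragraph above. Infinities get no special treatment here.

```python
import math
import struct

def ordered_bits(x):
    """Map a double to an integer whose ordering matches the doubles'."""
    bits = struct.unpack('<q', struct.pack('<d', x))[0]
    # Negative doubles have the sign bit set and sort in reverse as raw
    # integers; flip them so that -0.0 maps to 0 and order is preserved.
    return bits if bits >= 0 else -(1 << 63) - bits

def ulp_distance(a, b):
    """Number of representable steps between two doubles."""
    return abs(ordered_bits(a) - ordered_bits(b))

def almost_equal_ulps(a, b, max_ulps=4):
    """True when a and b are at most max_ulps steps apart."""
    if math.isnan(a) or math.isnan(b):
        return False          # NaN never compares equal to anything
    return ulp_distance(a, b) <= max_ulps

# The conversion from the question: exact equality versus ULP comparison.
ret = 25.4 / 10.0 * (1.0 / 2.54)
print(ret == 1.0, ulp_distance(ret, 1.0), almost_equal_ulps(ret, 1.0))
```

The tolerance is counted in representable steps rather than in absolute units, so the same max_ulps works near 1.0 and near 1,000,000. It is less useful near zero, where even tiny values are many steps apart.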

8 votes

Stack Overflow user

Posted 2010-06-16 05:21:35

The difference is that if you replace factors[units_to  ] with 1.0 / 2.54, you are doing:

(base_value * 1.0) / 2.54

With the dictionary, you are doing:

base_value * (1.0 / 2.54)

The order of rounding matters. This is easier to see if you do:

>>> print (((25.4 / 10.0) * 1.0) / 2.54).__repr__()
1.0
>>> print ((25.4 / 10.0) * (1.0 / 2.54)).__repr__()
0.99999999999999989

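Note that there is no non-deterministic or undefined behavior here. There is a standard, IEEE-754, which implementations must respect (which is not to say that they always do).

I don't think there should be an automatic close-enough replacement. That is often an effective way to deal with the problem, but it should be up to the programmer to decide whether and how to use it.

Finally, there are of course options for arbitrary-precision arithmetic, including python-gmp and decimal. Think about whether you really need them, because they do have a significant performance impact.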
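
As a rough illustration of that option, here is a sketch in Python 3 syntax using the standard-library decimal module mentioned above. The fractions module is an addition for comparison and is not mentioned in the answer. The inputs are passed as strings so that they are stored exactly instead of inheriting binary rounding.

```python
from decimal import Decimal
from fractions import Fraction

# Decimal stores 25.4, 10 and 2.54 exactly, so this chain of divisions
# involves no rounding at all.
inches = Decimal('25.4') / Decimal('10') / Decimal('2.54')
print(inches, inches == 1)

# Fraction keeps even the reciprocal factor 1/2.54 exact, as 50/127,
# so multiplying by a precomputed factor stays exact as well.
factor = 1 / Fraction('2.54')
exact = Fraction('25.4') / 10 * factor
print(factor, exact == 1)
```

Decimal still rounds results that do not fit its working precision (1/2.54 has no finite decimal expansion), so a dictionary of precomputed reciprocal factors would remain inexact with Decimal; only the Fraction version is exact for every rational factor.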

There is no problem with moving between regular registers and cache. You may be thinking of the x86's 80-bit extended precision.

6 votes

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/3049101
