numpy - Fast inverse square root on float32 in Python

哈客cc lv.2

Published: 2022-03-17 13:51:48

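I ran some checks on the fast inverse square root method in Python (JupyterLab with Python 3.8.8), and I have come to the conclusion that either I am doing something wrong or there is something about float32 that I do not understand. The issue is that the following paper: matrix67.com/data/InvSqrt.pdf

concludes that the magic number 0x5f375a86 performs slightly better than 0x5f3759df. But with the code below I checked which one is better (because I suspected Python might do something different from C/C++), and I found that 0x5f3759df is better over the range 1 to 300000.

[0x5f3759df vs 0x5f375a86]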

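Can someone explain to me what is going on? Am I overlooking something? I also tried some of the other values listed at https://en.wikipedia.org/wiki/Fast_inverse_square_root
for x64 and x32, and I always reach the same conclusion: 0x5f3759df is the most accurate, which is baffling at this point. I have listed the code I used below, where "np_invsqrt" is a slightly modified version of this code: https://github.com/ajcr/ajcr.github.io/blob/master/_posts/2016-04-01-fast-inverse-square-root-python.md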

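import numpy as np

def np_invsqrt1(x):
    neg_half = -0.5
    half3 = 1.5
    y = np.float32(x)
    # Reinterpret the float32 bits as a 32-bit integer
    i = y.view(np.int32)
    i = np.int32(0x5f3759df) - np.int32(i >> 1)  # quake 32
    # Reinterpret the integer bits as float32: this is the initial guess
    y = i.view(np.float32)
    # One Newton-Raphson step. Note that x (not the float32 copy) is used
    # here, so an np.float64 input promotes this step to float64.
    y = y * (half3 + (neg_half * x * y * y))
    return float(y)

def np_invsqrt2(x):
    neg_half = -0.5
    half3 = 1.5
    y = np.float32(x)
    # Same bit trick as above, with a different magic constant
    i = y.view(np.int32)
    i = np.int32(0x5F375A86) - np.int32(i >> 1)  # lomont 32
    y = i.view(np.float32)
    # One Newton-Raphson step (see the note in np_invsqrt1)
    y = y * (half3 + (neg_half * x * y * y))
    return float(y)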
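
The two functions above differ only in the magic constant. As a minimal sketch (not the code from the linked post), the constant can be passed as a parameter, and every step can be kept in float32, which is closer to what a C `float` version would do. The name `fast_invsqrt` and the parameters `magic` and `newton_steps` are illustrative assumptions.

```python
import numpy as np

def fast_invsqrt(x, magic=0x5F3759DF, newton_steps=1):
    """Approximate 1/sqrt(x) elementwise, keeping every step in float32.

    Hypothetical helper for illustration; expects positive inputs.
    """
    # Always work on a 1-D float32 array, even for a single value.
    x = np.array(x, dtype=np.float32, ndmin=1)
    # Reinterpret the float32 bit patterns as signed 32-bit integers.
    i = x.view(np.int32)
    # Initial guess: magic constant minus half of the bit pattern.
    i = np.int32(magic) - (i >> 1)
    y = i.view(np.float32)
    half_x = np.float32(0.5) * x
    # Optional Newton-Raphson refinement, also in float32.
    for _ in range(newton_steps):
        y = y * (np.float32(1.5) - half_x * y * y)
    return y

raw_guess = fast_invsqrt([4.0], newton_steps=0)   # bit trick only
refined = fast_invsqrt([4.0])                     # bit trick + one Newton step
print(raw_guess, refined, 1 / np.sqrt(4.0))
```

Setting `newton_steps=0` shows the raw guess produced by the bit trick, so the effect of the constant can be inspected separately from the Newton step.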


s = np.arange(1, 300000, 0.1)
c1 = 0  # times np_invsqrt1 (0x5f3759df) is at least as close
c2 = 0  # times np_invsqrt2 (0x5F375A86) is closer
for x in s:
    exact = 1 / np.sqrt(x)
    # Compare absolute errors so the sign of the error does not matter
    err1 = abs(exact - np_invsqrt1(x))
    err2 = abs(exact - np_invsqrt2(x))
    if err1 <= err2:
        c1 += 1
    else:
        c2 += 1
print(c1, c2)
if c1 >= c2:
    print("sq1 is best")
else:
    print("sq2 is best")
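
The loop above counts how often each constant gives the closer result. A different criterion is the worst-case relative error over the range, which, as far as I understand it, is the kind of metric the paper uses to rank constants. The following is only a sketch under that assumption: it is vectorized, keeps the approximation in float32, and reports both criteria side by side. The helper names `invsqrt_one_step` and `relative_error` are illustrative, not taken from the linked code.

```python
import numpy as np

def invsqrt_one_step(x, magic):
    """Bit-trick guess plus one Newton step, all in float32 (illustrative)."""
    x = np.asarray(x, dtype=np.float32)
    # Magic constant minus half of the float32 bit pattern.
    i = np.int32(magic) - (x.view(np.int32) >> 1)
    y = i.view(np.float32)
    return y * (np.float32(1.5) - np.float32(0.5) * x * y * y)

def relative_error(x, magic):
    """Relative error of the approximation against a float64 reference."""
    exact = 1.0 / np.sqrt(x.astype(np.float64))
    approx = invsqrt_one_step(x, magic).astype(np.float64)
    return np.abs(approx - exact) / exact

# Same range as the loop above, rounded to float32 inputs.
x = np.arange(1, 300000, 0.1).astype(np.float32)
err_quake = relative_error(x, 0x5F3759DF)
err_lomont = relative_error(x, 0x5F375A86)

print("max relative error :", err_quake.max(), err_lomont.max())
print("mean relative error:", err_quake.mean(), err_lomont.mean())
print("times closer       :",
      int((err_quake < err_lomont).sum()),
      int((err_lomont < err_quake).sum()))
```

The three printed lines need not agree with each other: a constant can be closer more often and still have a larger worst-case error, so which constant counts as "best" depends on the criterion chosen.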
