[Part 2] Dissecting the imagehash Algorithms [Python]

「On the Making of a Random Image API」

A while back, Shuying used Python to scrape images from Pixiv, about thirty thousand in all. Since the crawler only handled downloading, the collection ended up full of duplicate images and thumbnails. Recently I wanted to build an image API, and these anime images are a perfect fit, so for now this series is called 「On the Making of a Random Image API」, and hence this post.

Series collection -> 「On the Making of a Random Image API」

Tip

If you find what I write helpful, consider buying me a coffee. => Sponsor

Preface

Before I knew it, I'd been slacking off for ages again. Today Shuying will give a systematic introduction to the imagehash Python library, which we'll be using later on in this series.

In the spirit of not reinventing the wheel, Shuying searched around the Chinese-language web; this appears to be the first article to dissect the imagehash library in full.

This is the second post in this series. If you'd like to read the previous one, click through: [Part 1] Classifying Images with OpenCV and Pillow [Python]

Without further ado, let's get to it.

imagehash

An image hashing library written in Python. (from the imagehash GitHub page)

All of imagehash's code is hosted on GitHub, so feel free to read it for yourself.

imagehash is an image hashing library written in Python. It supports the following:

  • average hashing (ahash)
  • perceptual hashing (phash)
  • difference hashing (dhash)
  • wavelet hashing (whash)
  • HSV color hashing (colorhash)
  • crop-resistant hashing

Principles

What is a hash?

A hash function is a method for creating a small digital "fingerprint" from any kind of data. The function compresses a message or other data into a digest, shrinking the amount of data and fixing its format. It scrambles and mixes the data to create a fingerprint called a hash value (hash codes, hash sums, or hashes). The hash value is usually represented by a short string of random-looking letters and digits.[1]

Cryptographic hash functions such as MD5 and SHA-256 can take any kind of data as input and compress it into a digest, shrinking the amount of data and creating a small digital "fingerprint".

What is an image hash (imagehash)?

Image hashing defines a class of functions whose outputs can be compared: they extract features from an image and generate a distinctive (though not strictly unique) fingerprint, and by comparing these fingerprints we can measure how similar two images are.

How does image hashing work?

When data is run through a cryptographic hash function, the output looks random: the hash value is usually represented as a short string of random-looking letters and digits.

The input data behaves rather like a random seed: with the same algorithm, identical data always produces identical results, while different data produces different results, as in this MD5 digest:

MD5("114514")
= c4d038b4bed09fdb1471ef51ec3a32cd

Even the tiniest change to the input causes a drastic change in the hash output:

MD5("114515")
= ec5935f52ea59fbad054b523ccdf9c72

So comparing two hash values really tells us only one thing: whether two files are identical. If the hashes differ, the data differs; if the hashes match, the data is probably the same. (Because hash collisions exist, identical hashes do not guarantee identical data.)
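A quick way to reproduce this avalanche effect with Python's built-in hashlib (the inputs match the example above):

import hashlib

for text in ("114514", "114515"):
    # md5() takes bytes, so encode the string first
    digest = hashlib.md5(text.encode("utf-8")).hexdigest()
    print(f"MD5({text}) = {digest}")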

In imagehash, things are the other way around: we want similar input images to produce similar hashes, and different images to produce clearly different hashes, so that the outputs are actually comparable. By measuring the Hamming distance between two hash strings we can then quantify how similar two images are.
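The library overloads the - operator to return exactly this Hamming distance (as the usage example below shows), but it is easy to compute by hand for two equal-length hex strings; a minimal sketch:

def hamming_distance(hex_a, hex_b):
    # XOR the two values; every 1-bit in the result is a position where they differ
    assert len(hex_a) == len(hex_b)
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")

print(hamming_distance("e0707efefee4f260", "e0707efefee4f261"))  # 1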

The image hash algorithms, i.e. average hash (ahash), perceptual hash (phash), difference hash (dhash), and wavelet hash (whash), analyze only the image's luminance information (no color information).

The color hash algorithm, HSV color hash (colorhash), instead analyzes the image's color distribution and its proportions of black and gray (no position information).

Installation

imagehash is built on PIL, numpy, and scipy.fftpack (used by phash). It installs easily from PyPI:

pip3 install imagehash

pip will automatically install any missing Python dependencies.

Usage

To use the functions in imagehash, you must import both imagehash and PIL.Image:

import imagehash
from PIL import Image

All of the functions can be used directly after the import.

Example (from the GitHub docs):

>>> from PIL import Image
>>> import imagehash
>>> hash = imagehash.average_hash(Image.open('test.png'))
>>> print(hash)
d879f8f89b1bbf
>>> otherhash = imagehash.average_hash(Image.open('other.bmp'))
>>> print(otherhash)
ffff3720200ffff
>>> print(hash == otherhash)
False
>>> print(hash - otherhash)
36
>>> for r in range(1, 30, 5):
...     rothash = imagehash.average_hash(Image.open('test.png').rotate(r))
...     print('Rotation by %d: %d Hamming difference' % (r, hash - rothash))
...
Rotation by 1: 2 Hamming difference
Rotation by 6: 11 Hamming difference
Rotation by 11: 13 Hamming difference
Rotation by 16: 17 Hamming difference
Rotation by 21: 19 Hamming difference
Rotation by 26: 21 Hamming difference
>>>

Functions

The following content comes from the GitHub page; part of the text is Shuying's own translation and annotation, so there may be errors.

Each algorithm can also have its hash size adjusted (or in the case of colorhash, its binbits). Increasing the hash size allows an algorithm to store more detail in its hash, increasing its sensitivity to changes in detail.

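For example, reusing the test.png from the usage example above, a larger hash_size makes the hash longer and more detail-sensitive:

import imagehash
from PIL import Image

img = Image.open('test.png')
print(imagehash.average_hash(img))                # hash_size=8  -> 64-bit hash
print(imagehash.average_hash(img, hash_size=16))  # hash_size=16 -> 256-bit hash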

_binary_array_to_hex

_binary_array_to_hex(arr)

internal function to make a hex string out of a binary array.


@arr: a numpy array

hex_to_hash

hex_to_hash(hexstr)

Convert a stored hash (hex, as retrieved from str(Imagehash)) back to an Imagehash object. Notes: 1. This algorithm assumes all hashes are either bidimensional arrays with dimensions hash_size * hash_size, or one dimensional arrays with dimensions binbits * 14. 2. This algorithm does not work for hash_size < 2.


@hexstr: the hash string
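A typical round trip: serialize a hash with str() and restore it with hex_to_hash (again assuming the test.png from earlier):

import imagehash
from PIL import Image

h = imagehash.average_hash(Image.open('test.png'))
restored = imagehash.hex_to_hash(str(h))
assert h == restored  # the hex round trip is lossless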

hex_to_flathash

hex_to_flathash(hexstr, hashsize)

@hexstr: the hash string
@hashsize: the hash size

old_hex_to_hash

old_hex_to_hash(hexstr, hash_size=8)

Convert a stored hash (hex, as retrieved from str(Imagehash)) back to an Imagehash object. This method should be used for hashes generated by ImageHash up to version 3.7. For hashes generated by newer versions of ImageHash, hex_to_hash should be used instead.


@hexstr: the hash string
@hash_size: optional, the hash size; defaults to 8

average_hash

average_hash(image, hash_size=8, mean=numpy.mean)

Average Hash computation. Implementation follows http://www.hackerfactor.com/blog/index.php?/archives/432-Looks-Like-It.html. Step-by-step explanation: https://web.archive.org/web/20171112054354/https://www.safaribooksonline.com/blog/2013/11/26/image-hashing-with-python/


@image: a PIL.Image object
@hash_size: optional, the hash size; defaults to 8
@mean: optional, the averaging function; defaults to numpy.mean
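The docstring suggests trying numpy.median as an alternative averaging function; swapping it in is a one-liner:

import numpy
import imagehash
from PIL import Image

# compare each pixel against the median luminance instead of the mean
h = imagehash.average_hash(Image.open('test.png'), mean=numpy.median)
print(h)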

phash & phash_simple

phash(image, hash_size=8, highfreq_factor=4)
phash_simple(image, hash_size=8, highfreq_factor=4)

Perceptual Hash computation. Implementation follows http://www.hackerfactor.com/blog/index.php?/archives/432-Looks-Like-It.html


@image: a PIL.Image object
@hash_size: optional, the hash size; defaults to 8
@highfreq_factor: optional, the high-frequency factor; defaults to 4

dhash & dhash_vertical

dhash(image, hash_size=8)
dhash_vertical(image, hash_size=8)

Difference Hash computation. following http://www.hackerfactor.com/blog/index.php?/archives/529-Kind-of-Like-That.html

dhash() computes the hash horizontally, while dhash_vertical() computes it vertically.

@image: a PIL.Image object
@hash_size: optional, the hash size; defaults to 8

whash

whash(image, hash_size = 8, image_scale = None, mode = 'haar', remove_max_haar_ll = True)

Wavelet Hash computation. based on https://www.kaggle.com/c/avito-duplicate-ads-detection/


@image: a PIL.Image object
@hash_size: optional, the hash size; defaults to 8
@image_scale: optional; must be a power of 2 and less than the image size; defaults to the largest power of 2 that fits the input image
@mode: optional, the wavelet mode for the PyWavelets library; defaults to 'haar' (Haar wavelets), can also be 'db4' (Daubechies wavelets)
@remove_max_haar_ll: optional, remove the lowest low-level (LL) frequency using the Haar wavelet; defaults to True

colorhash

colorhash(image, binbits=3)

Color Hash computation. Computes fractions of image in intensity, hue and saturation bins:

  • the first binbits encode the black fraction of the image
  • the next binbits encode the gray fraction of the remaining image (low saturation)
  • the next 6*binbits encode the fraction in 6 bins of saturation, for highly saturated parts of the remaining image
  • the next 6*binbits encode the fraction in 6 bins of saturation, for mildly saturated parts of the remaining image


@image: a PIL.Image object
@binbits: optional, the number of bits used to encode each pixel fraction; defaults to 3
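Usage is the same as for the luminance-based hashes (file names reused from the earlier example):

import imagehash
from PIL import Image

ch = imagehash.colorhash(Image.open('test.png'))
other = imagehash.colorhash(Image.open('other.bmp'))
print(ch - other)  # Hamming distance between the two color hashes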

_find_region

_find_region(remaining_pixels, segmented_pixels)

Finds a region and returns a set of pixel coordinates for it.


@remaining_pixels: a numpy bool array where True means the pixel still remains to be segmented
@segmented_pixels: a set of pixel coordinates that have already been segmented; it is updated as new pixels are added to the returned segment

_find_all_segments

_find_all_segments(pixels, segment_threshold, min_segment_size)

Finds all the regions within an image pixel array, and returns a list of the regions. Note: Slightly different segmentations are produced when using pillow version 6 vs. >=7, due to a change in rounding in the greyscale conversion.


@pixels: an array of pixel luminance values
@segment_threshold: the threshold that separates peaks from valleys
@min_segment_size: the minimum number of pixels in a segment

crop_resistant_hash

crop_resistant_hash(image, hash_func=None, limit_segments=None, segment_threshold=128, min_segment_size=500, segmentation_image_size=300)

Creates a CropResistantHash object, by the algorithm described in the paper "Efficient Cropping-Resistant Robust Image Hashing". DOI 10.1109/ARES.2014.85 This algorithm partitions the image into bright and dark segments, using a watershed-like algorithm, and then does an image hash on each segment. This makes the image much more resistant to cropping than other algorithms, with the paper claiming resistance to up to 50% cropping, while most other algorithms stop at about 5% cropping. Note: Slightly different segmentations are produced when using pillow version 6 vs. >=7, due to a change in rounding in the greyscale conversion. This leads to a slightly different result.

使用"Efficient Cropping-Resistant Robust Image Hashing". DOI 10.1109/ARES.2014.85论文中描述的算法创建一个CropResistantHash对象 这个算法使用分水岭算法将图像分割成亮和暗两部分,再将其每一个图像进行哈希计算。这使得图像比其他算法更抗剪切。 论文声称,此算法将能达到 50%的抗剪切,而其他的算法最多只有 5%。 注意:由于Pillow中灰度缩放转换算法的改变,使用Pillow 6Pillow 7及以上版本所处理的图像可能会有细小差别。这会导致一些轻微的差别。

@image: a PIL.Image object
@hash_func: optional, the hashing function to use; defaults to None
@limit_segments: optional; if you have storage requirements, you can limit the algorithm to the largest segments; defaults to None
@segment_threshold: optional, the brightness threshold between peaks and valleys; this should be a static value, since a threshold drifting between peaks and troughs would break the matching; defaults to 128
@min_segment_size: optional, the minimum number of pixels for a hashed segment; defaults to 500
@segmentation_image_size: optional, the size the image is resized to before segmentation; defaults to 300
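A minimal usage sketch (file name reused from earlier); here phash is passed in as the per-segment hash function:

import imagehash
from PIL import Image

crh = imagehash.crop_resistant_hash(
    Image.open('test.png'),
    hash_func=imagehash.phash,  # the hash applied to each segment
)
print(crh)  # one sub-hash per segment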

Algorithm Walkthrough

Once again we'll use this picture of Tianyi to demonstrate how each algorithm runs. This is the test image, from Pixiv, artist id: 418969, work: #VOCALOIDCHINA Espejo.

ahash

ahash, also known as average hash. It is the simplest of the algorithms and the fastest, but also the least accurate.

The basic idea of ahash: compare every pixel of the image against the average of all pixels; a pixel above the average outputs True, one below outputs False, and the bits are then emitted as the hash.

def average_hash(image, hash_size=8, mean=numpy.mean):
    """
    Average Hash computation
    Implementation follows http://www.hackerfactor.com/blog/index.php?/archives/432-Looks-Like-It.html
    Step by step explanation: https://web.archive.org/web/20171112054354/https://www.safaribooksonline.com/blog/2013/11/26/image-hashing-with-python/
    @image must be a PIL instance.
    @mean how to determine the average luminescence. can try numpy.median instead.
    """
    if hash_size < 2:
        raise ValueError("Hash size must be greater than or equal to 2")

    # reduce size and complexity, then convert to grayscale
    image = image.convert("L").resize((hash_size, hash_size), Image.ANTIALIAS)

    # find average pixel value; 'pixels' is an array of the pixel values, ranging from 0 (black) to 255 (white)
    pixels = numpy.asarray(image)
    avg = mean(pixels)

    # create string of bits
    diff = pixels > avg
    # make a hash
    return ImageHash(diff)

Note: the hash_size argument must be at least 2, otherwise a ValueError is raised.

Step 1: convert the original image to grayscale, resize it, and smooth it

image = image.convert("L").resize((hash_size, hash_size), Image.ANTIALIAS)

image.convert("L")将图片转化为灰度图,如下:


Image.resize((hash_size, hash_size)) resizes the image so that both its width and height equal hash_size (8 by default).

Passing Image.ANTIALIAS as the resampling filter then smooths the result.


That is the first step of the algorithm: shrink the image to reduce the amount of computation.

Step 2: convert the processed image into a numpy array

pixels = numpy.asarray(image)
avg = mean(pixels)

numpy.asarray() reads the luminance value of every pixel of image; the output looks roughly like this:

>>> numpy.asarray(Image.open('0.L.resize.ANTIALIAS.png'))
array([[162, 128, 135,  84,  28,   7,  12,  17],
       [109, 137, 122, 129, 110,  48,  25,  20],
       [ 88, 131, 139, 128, 144, 135, 121,  93],
       [129, 160, 161, 149, 137, 150, 146, 115],
       [179, 189, 177, 130, 127, 179, 133, 112],
       [177, 171, 138,  64,  69, 163,  80,  94],
       [171, 166, 138, 125, 109, 115, 126, 113],
       [105, 130, 128, 114, 106, 114, 115,  93]], dtype=uint8)

Each value represents one pixel, ranges from 0 to 255, and gives that pixel's luminance.

Step 3: compute the hash

numpy.mean() computes the average of the array:

>>> numpy.mean(numpy.asarray(Image.open('0.L.resize.ANTIALIAS.png')))
117.953125

diff = pixels > avg

Every pixel is then compared against that average: greater is True, smaller is False.

array([[ True,  True,  True, False, False, False, False, False],
        [False,  True,  True,  True, False, False, False, False],
        [False,  True,  True,  True,  True,  True,  True, False],
        [ True,  True,  True,  True,  True,  True,  True, False],
        [ True,  True,  True,  True,  True,  True,  True, False],
        [ True,  True,  True, False, False,  True, False, False],
        [ True,  True,  True,  True, False, False,  True, False],
        [False,  True,  True, False, False, False, False, False]])

Finally, these 8×8 bits are grouped four at a time into hex characters, and the output hash is e0707efefee4f260. That is this image's ahash.

>>> print(imagehash.average_hash(Image.open('0.L.resize.ANTIALIAS.png')))
e0707efefee4f260
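Under the hood, str(ImageHash) flattens the boolean array and packs it into hex. A rough sketch of that packing (not the library's exact code, but it reproduces the hash above from the 8×8 array in step 3):

import numpy

def bools_to_hex(diff):
    # join the booleans into a bit string, then print it as zero-padded hex
    bits = ''.join('1' if b else '0' for b in diff.flatten())
    return '{:0{width}x}'.format(int(bits, 2), width=len(bits) // 4)

demo = numpy.array([[True, True, True, False, False, False, False, False]])
print(bools_to_hex(demo))  # 'e0', the first row of the array above
# with diff = pixels > avg from step 3, this returns 'e0707efefee4f260'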

phash & phash_simple

phash, also known as perceptual hash. It is the most accurate of these algorithms because it uses the DCT (discrete cosine transform), which raises precision at the cost of speed.

A discrete cosine transform (DCT) is a Fourier-related transform similar to the discrete Fourier transform, but using only real numbers. A DCT is equivalent to a discrete Fourier transform of roughly twice the length, operating on a real and even function (since the Fourier transform of a real and even function is still real and even), where some variants shift the input and/or output by half a sample. (There are 8 standard DCT types, of which 4 are common.)[2]

def phash(image, hash_size=8, highfreq_factor=4):
    """
    Perceptual Hash computation.
    Implementation follows http://www.hackerfactor.com/blog/index.php?/archives/432-Looks-Like-It.html
    @image must be a PIL instance.
    """
    if hash_size < 2:
        raise ValueError("Hash size must be greater than or equal to 2")

    import scipy.fftpack
    img_size = hash_size * highfreq_factor
    image = image.convert("L").resize((img_size, img_size), Image.ANTIALIAS)
    pixels = numpy.asarray(image)
    dct = scipy.fftpack.dct(scipy.fftpack.dct(pixels, axis=0), axis=1)
    dctlowfreq = dct[:hash_size, :hash_size]
    med = numpy.median(dctlowfreq)
    diff = dctlowfreq > med
    return ImageHash(diff)

def phash_simple(image, hash_size=8, highfreq_factor=4):
    """
    Perceptual Hash computation.
    Implementation follows http://www.hackerfactor.com/blog/index.php?/archives/432-Looks-Like-It.html
    @image must be a PIL instance.
    """
    import scipy.fftpack
    img_size = hash_size * highfreq_factor
    image = image.convert("L").resize((img_size, img_size), Image.ANTIALIAS)
    pixels = numpy.asarray(image)
    dct = scipy.fftpack.dct(pixels)
    dctlowfreq = dct[:hash_size, 1:hash_size+1]
    avg = dctlowfreq.mean()
    diff = dctlowfreq > avg
    return ImageHash(diff)

Note: the hash_size argument must be at least 2, otherwise a ValueError is raised.

Step 1: convert the original image to grayscale, resize it, and smooth it

ahash一样,phash也要将图像变小以便减少计算量。

    import scipy.fftpack
    img_size = hash_size * highfreq_factor
    image = image.convert("L").resize((img_size, img_size), Image.ANTIALIAS)

To perform the DCT, a module called scipy is imported here.

SciPy is a collection of mathematical algorithms and convenience functions built on the NumPy extension of Python. It adds significant power to the interactive Python session by providing the user with high-level commands and classes for manipulating and visualizing data.[3]


Its fftpack subpackage contains the discrete Fourier transform routines.

image = image.convert("L").resize((img_size, img_size), Image.ANTIALIAS)

image.convert("L")将图片转化为灰度图,如下:


Image.resize((img_size, img_size)) resizes the image so that both its width and height equal img_size. For the DCT, 32 works well, so the default is hash_size * highfreq_factor, i.e. 32.

Again, Image.ANTIALIAS is passed as the resampling filter to smooth the result.


Step 2: the discrete cosine transform (DCT)

phash

    pixels = numpy.asarray(image)
    dct = scipy.fftpack.dct(scipy.fftpack.dct(pixels, axis=0), axis=1)
    dctlowfreq = dct[:hash_size, :hash_size]

phash_simple

    pixels = numpy.asarray(image)
    dct = scipy.fftpack.dct(pixels)
    dctlowfreq = dct[:hash_size, 1:hash_size+1]

numpy.asarray() reads the luminance value of every pixel of image; the output looks roughly like this:

array([[ 33, 155, 191, ...,  12,  14,  17],
       [133, 180, 178, ...,  15,  17,  17],
       [186, 178, 172, ...,  21,  19,  18],
       ...,
       [126, 122, 116, ...,  99, 103,  91],
       [103, 101,  98, ..., 102, 100,  97],
       [ 92,  89,  90, ...,  97,  96,  94]], dtype=uint8)

Right after that comes the discrete cosine transform; this is the formula given in the SciPy manual[4]:

$$y_{k}=2 \sum_{n=0}^{N-1} x_{n} \cos \left(\frac{\pi k(2 n+1)}{2 N}\right)$$

Here is where phash and phash_simple differ: the former applies the DCT along each axis of pixels, the image luminance array, giving a true two-dimensional DCT, while the latter applies it only once, hence the "simple" in its name.

phash_simple中,对于 dct 的取值为什么要从0hash_size1hash_size+1,博客中大佬给出的解释是这样的。

Like the Average Hash, compute the mean DCT value (using only the 8x8 DCT low-frequency values and excluding the first term since the DC coefficient can be significantly different from the other values and will throw off the average). Thanks to David Starkweather for the added information about pHash. He wrote: "the dct hash is based on the low 2D DCT coefficients starting at the second from lowest, leaving out the first DC term. This excludes completely flat image information (i.e. solid colors) from being included in the hash description."[5]


dctlowfreq = dct[:hash_size, 1:hash_size+1]

Next the DCT output is trimmed: although the DCT produces a 32×32 matrix, we keep only the 8×8 block in its top-left corner, which represents the lowest-frequency part of the image.

Step 3: compute the hash

    med = numpy.median(dctlowfreq)
    diff = dctlowfreq > med

numpy.median() takes the median of the low-frequency coefficients dctlowfreq; each coefficient is then compared against that median (greater is True, smaller is False), and the resulting difference array diff is emitted as the hash.
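To see the effect end to end, you can compare the phash of an image with that of a shrunken copy (file name reused from earlier); since the low-frequency content barely changes, the Hamming distance stays small:

import imagehash
from PIL import Image

img = Image.open('test.png')
small = img.resize((img.width // 2, img.height // 2))  # a downscaled copy

print(imagehash.phash(img) - imagehash.phash(small))  # small Hamming distance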

dhash

Like aHash and pHash, dHash is pretty simple to implement and is far more accurate than it has any right to be. As an implementation, dHash is nearly identical to aHash but it performs much better.[6]


def dhash(image, hash_size=8):
    """
    Difference Hash computation.
    following http://www.hackerfactor.com/blog/index.php?/archives/529-Kind-of-Like-That.html
    computes differences horizontally
    @image must be a PIL instance.
    """
    # resize(w, h), but numpy.array((h, w))
    if hash_size < 2:
        raise ValueError("Hash size must be greater than or equal to 2")

    image = image.convert("L").resize((hash_size + 1, hash_size), Image.ANTIALIAS)
    pixels = numpy.asarray(image)
    # compute differences between columns
    diff = pixels[:, 1:] > pixels[:, :-1]
    return ImageHash(diff)

def dhash_vertical(image, hash_size=8):
    """
    Difference Hash computation.
    following http://www.hackerfactor.com/blog/index.php?/archives/529-Kind-of-Like-That.html
    computes differences vertically
    @image must be a PIL instance.
    """
    # resize(w, h), but numpy.array((h, w))
    image = image.convert("L").resize((hash_size, hash_size + 1), Image.ANTIALIAS)
    pixels = numpy.asarray(image)
    # compute differences between rows
    diff = pixels[1:, :] > pixels[:-1, :]
    return ImageHash(diff)

Note: the hash_size argument must be at least 2, otherwise a ValueError is raised.

Step 1: convert the original image to grayscale, resize it, and smooth it

The resizing works the same way as in ahash and phash.

Worth noting: by default dhash does not resize straight to 8×8 but to 9×8; the reason is given below.

Step 2: compute the differences

dhash computes the hash horizontally, while dhash_vertical computes it vertically.

As mentioned above, dhash compares gradient differences, which pins down the image's relative gradient direction: nine columns of pixels are compared pairwise to produce eight values per row, which is exactly why we pick the 9×8 size.

    image = image.convert("L").resize((hash_size + 1, hash_size), Image.ANTIALIAS)
    pixels = numpy.asarray(image)

Step 3: generate the hash

If the pixel on the right is brighter than the pixel on its left, output True; otherwise output False:

    diff = pixels[:, 1:] > pixels[:, :-1]
    return ImageHash(diff)
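A toy example makes the nine-to-eight reduction concrete (one row of made-up luminance values):

import numpy

# one row of 9 pixels -> 8 pairwise left/right comparisons
row = numpy.array([[10, 20, 15, 15, 30, 5, 5, 40, 35]])
diff = row[:, 1:] > row[:, :-1]
print(diff)  # [[ True False False  True False False  True False]]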

whash

What is a wavelet?

A wavelet is a wave-like oscillation with an amplitude that begins at zero, increases or decreases, and then returns to zero one or more times. Wavelets are termed a "brief oscillation". A taxonomy of wavelets has been established, based on the number and direction of its pulses. Wavelets are imbued with specific properties that make them useful for signal processing.


For example, a wavelet could be created to have a frequency of Middle C and a short duration of roughly one tenth of a second. If this wavelet were to be convolved with a signal created from the recording of a melody, then the resulting signal would be useful for determining when the Middle C note appeared in the song. Mathematically, a wavelet correlates with a signal if a portion of the signal is similar. Correlation is at the core of many practical wavelet applications.


As a mathematical tool, wavelets can be used to extract information from many different kinds of data, including – but not limited to – audio signals and images. Sets of wavelets are needed to analyze data fully. "Complementary" wavelets decompose a signal without gaps or overlaps so that the decomposition process is mathematically reversible. Thus, sets of complementary wavelets are useful in wavelet based compression/decompression algorithms where it is desirable to recover the original information with minimal loss.[7]


def whash(image, hash_size = 8, image_scale = None, mode = 'haar', remove_max_haar_ll = True):
    """
    Wavelet Hash computation.
    based on https://www.kaggle.com/c/avito-duplicate-ads-detection/
    @image must be a PIL instance.
    @hash_size must be a power of 2 and less than @image_scale.
    @image_scale must be power of 2 and less than image size. By default is equal to max
        power of 2 for an input image.
    @mode (see modes in pywt library):
        'haar' - Haar wavelets, by default
        'db4' - Daubechies wavelets
    @remove_max_haar_ll - remove the lowest low level (LL) frequency using Haar wavelet.
    """
    import pywt
    if image_scale is not None:
        assert image_scale & (image_scale - 1) == 0, "image_scale is not power of 2"
    else:
        image_natural_scale = 2**int(numpy.log2(min(image.size)))
        image_scale = max(image_natural_scale, hash_size)

    ll_max_level = int(numpy.log2(image_scale))

    level = int(numpy.log2(hash_size))
    assert hash_size & (hash_size-1) == 0, "hash_size is not power of 2"
    assert level <= ll_max_level, "hash_size in a wrong range"
    dwt_level = ll_max_level - level

    image = image.convert("L").resize((image_scale, image_scale), Image.ANTIALIAS)
    pixels = numpy.asarray(image) / 255.

    # Remove low level frequency LL(max_ll) if @remove_max_haar_ll using haar filter
    if remove_max_haar_ll:
        coeffs = pywt.wavedec2(pixels, 'haar', level = ll_max_level)
        coeffs = list(coeffs)
        coeffs[0] *= 0
        pixels = pywt.waverec2(coeffs, 'haar')

    # Use LL(K) as freq, where K is log2(@hash_size)
    coeffs = pywt.wavedec2(pixels, mode, level = dwt_level)
    dwt_low = coeffs[0]

    # Subtract median and compute hash
    med = numpy.median(dwt_low)
    diff = dwt_low > med
    return ImageHash(diff)

Wavelets are a popular tool for computational harmonic analysis. They provide localization in both the temporal (or spatial) domain as well as in the frequency domain (Daubechies, 1992). A prominent feature is the ability to perform a multiresolution analysis (S. Mallat, 2008). The wavelet transform of natural signals and images tends to have most of its energy concentrated in a small fraction of the coefficients. This sparse representation property is key to the good performance of wavelets in applications such as data compression and denoising. For example, the wavelet transform is a key component of the JPEG 2000 image compression standard.[9]

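whash delegates the multilevel decomposition to PyWavelets[8]. A minimal sketch of what pywt.wavedec2 returns (a random array stands in for the grayscale pixels):

import numpy
import pywt

pixels = numpy.random.rand(64, 64)  # stand-in for a 64x64 grayscale image
# 3-level Haar decomposition: coeffs[0] is the low-frequency LL band,
# followed by (horizontal, vertical, diagonal) detail tuples per level
coeffs = pywt.wavedec2(pixels, 'haar', level=3)
print(coeffs[0].shape)  # (8, 8): each level halves both dimensions, 64 / 2**3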


colorhash

def colorhash(image, binbits=3):
    """
    Color Hash computation.
    Computes fractions of image in intensity, hue and saturation bins:
    * the first binbits encode the black fraction of the image
    * the next binbits encode the gray fraction of the remaining image (low saturation)
    * the next 6*binbits encode the fraction in 6 bins of saturation, for highly saturated parts of the remaining image
    * the next 6*binbits encode the fraction in 6 bins of saturation, for mildly saturated parts of the remaining image
    @binbits number of bits to use to encode each pixel fractions
    """

    # bin in hsv space:
    intensity = numpy.asarray(image.convert("L")).flatten()
    h, s, v = [numpy.asarray(v).flatten() for v in image.convert("HSV").split()]
    # black bin
    mask_black = intensity < 256 // 8
    frac_black = mask_black.mean()
    # gray bin (low saturation, but not black)
    mask_gray = s < 256 // 3
    frac_gray = numpy.logical_and(~mask_black, mask_gray).mean()
    # two color bins (medium and high saturation, not in the two above)
    mask_colors = numpy.logical_and(~mask_black, ~mask_gray)
    mask_faint_colors = numpy.logical_and(mask_colors, s < 256 * 2 // 3)
    mask_bright_colors = numpy.logical_and(mask_colors, s > 256 * 2 // 3)

    c = max(1, mask_colors.sum())
    # in the color bins, make sub-bins by hue
    hue_bins = numpy.linspace(0, 255, 6+1)
    if mask_faint_colors.any():
        h_faint_counts, _ = numpy.histogram(h[mask_faint_colors], bins=hue_bins)
    else:
        h_faint_counts = numpy.zeros(len(hue_bins) - 1)
    if mask_bright_colors.any():
        h_bright_counts, _ = numpy.histogram(h[mask_bright_colors], bins=hue_bins)
    else:
        h_bright_counts = numpy.zeros(len(hue_bins) - 1)

    # now we have fractions in each category (6*2 + 2 = 14 bins)
    # convert to hash and discretize:
    maxvalue = 2**binbits
    values = [min(maxvalue-1, int(frac_black * maxvalue)), min(maxvalue-1, int(frac_gray * maxvalue))]
    for counts in list(h_faint_counts) + list(h_bright_counts):
        values.append(min(maxvalue-1, int(counts * maxvalue * 1. / c)))
    # print(values)
    bitarray = []
    for v in values:
        bitarray += [v // (2**(binbits-i-1)) % 2**(binbits-i) > 0 for i in range(binbits)]
    return ImageHash(numpy.asarray(bitarray).reshape((-1, binbits)))

-- To be updated...

References

  1. "散列函数 - 维基百科,自由的百科全书
  2. 离散余弦变换 - 维基百科,自由的百科全书
  3. Introduction — SciPy v1.8.0 Manual
  4. scipy.fftpack.dct — SciPy v1.8.0 Manual
  5. Looks Like It - The Hacker Factor Blog
  6. Kind of Like That - The Hacker Factor Blog
  7. Wavelet - Wikipedia
  8. PyWavelets - Wavelet Transforms in Python — PyWavelets Documentation
  9. Journal of Open Source Software: PyWavelets: A Python package for wavelet analysis

To be continued ->

Author: SakuraPuare
Link: https://blog.sakurapuare.com/archives/2021/02/image-hash-algorithm/
License: This post is licensed under CC BY-NC-SA 4.0 CN