Get unique 8 digit number for a given string

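Does anyone know of an algorithm that generates a unique 8 or 9 digit number for a given string? A PHP example would be ideal; failing that, at least the algorithm.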

You can use crc32() and keep only the first $len digits of the result.

<?php 
function crc_string($str, $len){
    return substr(sprintf("%u", crc32($str)),0,$len);
}
echo crc_string('some_string', 8);//65585849
?>

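Edit

After running a collision/reliability test against my answer, it turns out you will very likely get collisions at a length of 8, probably slightly fewer at a length of 9, and fewer still at a length of 10. In my test I ran incrementing values from 0 to 100k and got 26 collisions, the first of which occurred at around 36k.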

<?php 
set_time_limit(0);
header('Content-type: text/html; charset=utf-8');
$time_start = microtime(true);
function crc_string($str, $len){
    return substr(sprintf("%u", crc32($str)),0,$len);
}
echo 'Started, please wait...<br />';
$record = array();
$collisions = 0;
for($i=0; $i<100000;$i++){
    $new = crc_string($i, 8);
    if(in_array($new,$record)){
        $match = array_search($new,$record);
        $took_time = microtime(true) - $time_start;
        echo($new.' has collided for iteration '.$i.' matching against a previous iteration ('.$match.') '.$record[$match]).' (Process time: '.round($took_time,2).'seconds)<br />';
        $collisions++;
    }else{
        $record[]=$new;
    }
    ob_flush();
    flush();
}
echo 'Successfully iterated 100k incrementing values and '.$collisions.' collisions occurred; total processing time: '.round((microtime(true) - $time_start),2).'seconds.';
?>

Test results:

Started, please wait...
38862356 has collided for iteration 36084 matching against a previous iteration (8961) 38862356 (Process time: 165.47seconds)
18911887 has collided for iteration 36887 matching against a previous iteration (8162) 18911887 (Process time: 172.79seconds)
37462269 has collided for iteration 38245 matching against a previous iteration (33214) 37462269 (Process time: 185.81seconds)
20153794 has collided for iteration 38966 matching against a previous iteration (6083) 20153794 (Process time: 192.87seconds)
41429622 has collided for iteration 40329 matching against a previous iteration (24999) 41429622 (Process time: 206.41seconds)
20784356 has collided for iteration 48908 matching against a previous iteration (27095) 20784356 (Process time: 302.75seconds)
39932561 has collided for iteration 51926 matching against a previous iteration (12367) 39932561 (Process time: 340.88seconds)
14372225 has collided for iteration 53032 matching against a previous iteration (13211) 14372225 (Process time: 355.46seconds)
16636457 has collided for iteration 55490 matching against a previous iteration (39250) 16636457 (Process time: 389.44seconds)
23059743 has collided for iteration 63126 matching against a previous iteration (39808) 23059743 (Process time: 504.1seconds)
13627299 has collided for iteration 63877 matching against a previous iteration (21973) 13627299 (Process time: 516.08seconds)
24647738 has collided for iteration 63973 matching against a previous iteration (47328) 24647738 (Process time: 517.62seconds)
14471815 has collided for iteration 71118 matching against a previous iteration (37805) 14471815 (Process time: 641.93seconds)
13253269 has collided for iteration 73602 matching against a previous iteration (33064) 13253269 (Process time: 687.53seconds)
10732050 has collided for iteration 73706 matching against a previous iteration (9197) 10732050 (Process time: 689.44seconds)
18919349 has collided for iteration 80358 matching against a previous iteration (73190) 18919349 (Process time: 819.89seconds)
40795042 has collided for iteration 81875 matching against a previous iteration (31127) 40795042 (Process time: 851.3seconds)
14609922 has collided for iteration 82498 matching against a previous iteration (17366) 14609922 (Process time: 864.29seconds)
20425272 has collided for iteration 83914 matching against a previous iteration (9858) 20425272 (Process time: 894.32seconds)
24790147 has collided for iteration 84519 matching against a previous iteration (9754) 24790147 (Process time: 907.34seconds)
35605337 has collided for iteration 91434 matching against a previous iteration (36127) 35605337 (Process time: 1060.5seconds)
30935494 has collided for iteration 91857 matching against a previous iteration (91704) 30935494 (Process time: 1070.17seconds)
28520037 has collided for iteration 92929 matching against a previous iteration (28847) 28520037 (Process time: 1095.53seconds)
31109474 has collided for iteration 95584 matching against a previous iteration (30349) 31109474 (Process time: 1159.36seconds)
40842617 has collided for iteration 97330 matching against a previous iteration (13609) 40842617 (Process time: 1203.19seconds)
20309913 has collided for iteration 99224 matching against a previous iteration (94210) 20309913 (Process time: 1250.54seconds)
Successfully iterated 100k incrementing values and 26 collisions occurred; total processing time: 1269.98seconds.
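
The run above takes more than twenty minutes, probably in large part because in_array() scans the whole $record list on every iteration. As a rough sketch, the same count can be obtained much faster by using the generated IDs as array keys and checking them with isset(). The function name count_collisions() is made up for this illustration, and no collision count is quoted here because it has not been re-measured.

```php
<?php
function crc_string($str, $len)
{
    return substr(sprintf('%u', crc32($str)), 0, $len);
}

// Hypothetical helper: count how many of the inputs 0..$n-1 produce an ID
// that an earlier input already produced.
function count_collisions(int $n, int $len): int
{
    $seen = [];       // ID => first input that produced it
    $collisions = 0;
    for ($i = 0; $i < $n; $i++) {
        $id = crc_string((string) $i, $len);
        if (isset($seen[$id])) {
            $collisions++;      // ID already taken by an earlier input
        } else {
            $seen[$id] = $i;    // first time this ID appears
        }
    }
    return $collisions;
}

echo count_collisions(100000, 8), " collisions\n";
```

As in the original test, a colliding value is only counted, not stored a second time.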

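The conclusion is that unless you assign values one-for-one from an auto-incrementing counter, you will always run into collisions as your user table fills up, whether at this length or a longer one: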

echo sprintf("%08d",'1');//00000001
echo sprintf("%08d",'2');//00000002
...                      //99999999

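You could work around this by appending another byte to the value whenever a collision occurs, or by including the a-z range the way the md5()/sha() hash functions do, though that would defeat the purpose ;p

Good luck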
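
One way to read "appending another byte on collision" is sketched below. It assumes a simple in-memory registry ($taken) purely for illustration; a real application would more likely rely on a database column with a unique constraint. The name assign_id() and its parameters are hypothetical.

```php
<?php
// Hypothetical helper: give $str a numeric ID, extending the ID with an
// extra suffix when a different string already owns it.
function assign_id(string $str, array &$taken, int $len = 8): string
{
    $base = substr(sprintf('%u', crc32($str)), 0, $len);
    $id = $base;
    $suffix = 0;
    // Keep trying while the candidate ID belongs to some other string.
    while (isset($taken[$id]) && $taken[$id] !== $str) {
        $id = $base . $suffix;   // base0, base1, base2, ...
        $suffix++;
    }
    $taken[$id] = $str;          // remember who owns this ID
    return $id;
}

$taken = [];
echo assign_id('some_string', $taken), "\n";
```

The trade-off is that IDs which collided end up longer than $len digits, so the fixed width is no longer guaranteed.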

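Collisions will happen, yes, but since you haven't said why you need this, I'll assume collisions don't matter.

You could take the md5 hash of the string (which is hexadecimal), convert it to our number system (decimal), and truncate it to the number of digits you need.

This might help you: php: number only hash?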
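
A minimal sketch of that idea follows, assuming a 64-bit PHP build so that the intermediate value stays an exact integer. The function name md5_digits() is invented for this example, and it reduces the value modulo 10^$len rather than cutting off leading digits, so that the result always has a fixed width.

```php
<?php
// Hypothetical helper: derive a fixed-width decimal ID from an md5 digest.
function md5_digits(string $str, int $len = 8): string
{
    // Use only the first 15 hex characters (60 bits) so the value fits
    // in a 64-bit integer without losing precision.
    $value = hexdec(substr(md5($str), 0, 15));
    // Keep the lowest $len decimal digits and left-pad with zeros.
    return str_pad((string) ($value % (10 ** $len)), $len, '0', STR_PAD_LEFT);
}

echo md5_digits('some_string', 8), "\n";
```

Collisions are just as possible here as with crc32(); only the source of the digits changes.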

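There are only 10^9 distinct 9-digit numbers, whereas for each length there are 256^length possible strings (assuming one byte per character).

So, by the pigeonhole principle, you cannot get unique numbers for strings of length 4 or more: 256^4 is already larger than 10^9, so collisions must occur.

As an alternative, you could look at traditional hash functions (which do have collisions) or use unbounded numbers.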
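
The arithmetic behind this can be checked with a few lines of PHP. This is only an illustration of the counting argument, assuming one byte per character and a 64-bit PHP build; strings_of_length() is a name made up for the example.

```php
<?php
// Number of distinct byte strings of exactly $length characters.
function strings_of_length(int $length): int
{
    return 256 ** $length;   // 256 possible values per byte
}

$slots = 10 ** 9;            // 000000000 through 999999999
foreach ([1, 2, 3, 4] as $length) {
    $count = strings_of_length($length);
    printf(
        "length %d: %d strings, %s\n",
        $length,
        $count,
        $count > $slots ? 'collisions unavoidable' : 'unique mapping possible'
    );
}
```

At length 3 there are 16,777,216 strings, which still fit into 10^9 slots; at length 4 there are 4,294,967,296, which do not.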

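As already stated, "uniqueness" is impossible when the number has fewer digits than the strings you want to associate with it have characters.

What you are looking for is a good hash function.

Take a look at the MD6 algorithm. It has a customizable digest length of up to 512 bits, so you could create digests that fit into 8-9 decimal digits. I'm not aware of any PHP implementation; the original implementation language is C.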