*Question updated based on revo's answer
Here is the working script, with a better set of example strings to show my intent:
$strings[] = 'seventy five yards out';
$strings[] = 'sixty yards out';
$strings[] = 'one hundred fifty yards out';
$inputString = 'seventy two yards out';
$inputWords = str_word_count($inputString, 1);
$foundWords = [];
foreach ($strings as $key => $string) {
    $stringWords = str_word_count($string, 1);
    $wordsCount = array_count_values($stringWords);
    $commonWords = array_intersect($inputWords, array_keys($wordsCount));
    if (count($commonWords) > 0) {
        foreach ($commonWords as $commonWord) {
            $foundWords[$key][$commonWord] = $wordsCount[$commonWord];
        }
    }
}
print_r($foundWords);
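How can I get it to print 'seventy five yards out', since that is actually the closest to the input text? I was thinking of dividing the word count to get a percentage, and now I think that might work.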
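The key is to run str_word_count() on each provided string separately. That turns every string into an array of words, and arrays are much easier to work with for what you need.
array_count_values() counts the values of an array, which gives the number of occurrences of each word.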
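To make those two calls concrete, here is a minimal sketch of what each one returns. The `$sample` string is made up for illustration and does not come from the question.

```php
<?php
// Illustrative only: $sample is a made-up string, not from the question.
$sample = 'out yards out';

// Format 1 makes str_word_count() return the words as an array
// instead of just the number of words.
$words = str_word_count($sample, 1);

// Maps each distinct word to the number of times it occurs.
$counts = array_count_values($words);

print_r($words);
print_r($counts);
```

The keys of `$counts` are the distinct words, which is why the script below passes `array_keys($wordsCount)` to `array_intersect()`.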
$strings[] = 'seventy five yards out';
$strings[] = 'sixty yards out';
$strings[] = 'one hundred fifty yards out';
$inputString = 'seventy two yards out';
$inputWords = str_word_count($inputString, 1);
$probabilities = [];
foreach ($strings as $key => $string) {
    $stringWords = str_word_count($string, 1);
    $wordsCount = array_count_values($stringWords);
    $commonWords = array_intersect($inputWords, array_keys($wordsCount));
    if (count($commonWords) > 0) {
        foreach ($commonWords as $commonWord) {
            if (!isset($probabilities[$key])) $probabilities[$key] = 0;
            $probabilities[$key] += $wordsCount[$commonWord];
        }
        $probabilities[$key] /= count($stringWords);
    }
}
arsort($probabilities);
echo $strings[key($probabilities)];
Output:
seventy five yards out
Probabilities, as printed by print_r($probabilities);:
Array
(
[0] => 0.75
[1] => 0.66666666666667
[2] => 0.4
)
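One case the script does not cover is an input that shares no words with any candidate: `$probabilities` stays empty and `key()` returns null. A possible way to handle that is to wrap the same scoring in a function that returns null when nothing matches. This is only a sketch: `closestByWords()` is a hypothetical name, and de-duplicating the input words with `array_unique()` is an added assumption so that a repeated input word is not counted twice.

```php
<?php
// closestByWords() is a hypothetical helper name, not part of the answer above.
function closestByWords(string $input, array $candidates): ?string
{
    // Assumption: each distinct input word should only be scored once.
    $inputWords = array_unique(str_word_count($input, 1));
    $best = null;
    $bestScore = 0.0;

    foreach ($candidates as $candidate) {
        $candidateWords = str_word_count($candidate, 1);
        if ($candidateWords === []) {
            continue; // nothing to compare, and avoids dividing by zero
        }
        $wordsCount = array_count_values($candidateWords);

        // Sum the occurrences of every input word found in the candidate.
        $score = 0;
        foreach (array_intersect($inputWords, array_keys($wordsCount)) as $word) {
            $score += $wordsCount[$word];
        }
        // Normalise by candidate length, as the script above does.
        $score /= count($candidateWords);

        if ($score > $bestScore) {
            $bestScore = $score;
            $best = $candidate;
        }
    }

    return $best; // null when no candidate shares a word with the input
}

$strings = ['seventy five yards out', 'sixty yards out', 'one hundred fifty yards out'];
var_dump(closestByWords('seventy two yards out', $strings));
```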
Something like this should work:
<?php
$g = 'the weather is nice'; // strings to loop through
$n = 'the water is blue';
$b = 'that was a bad movie';
$t = 'hows the weather'; // example input
$test = (str_word_count($t, 1)); // breaks out each word into array
// Comparisons
$comps = array();
// Array sums
$sums = array();
// Search each variable that's been set, as long as it's less than 't'
// A "for" loop will accept letters in addition to numbers, so we'll start with the
// letter "a" and loop through each letter up to "s" (which is one less than "t")
for ($inc = 'a'; $inc < 't'; $inc++) {
    // Now, a variable assigned as $$inc will translate into $a, $b, $c ... $s
    // and if $a, $b, $c, etc, are set...
    if (isset($$inc)) {
        // ... assign them to the $comps array with a key of $$inc
        $comps[$$inc] = str_word_count($$inc, 1);
        // For example, when the "for" loop reaches "f", nothing will be added to the
        // $comps array because $f is not set above.
        // But when it gets to "g" it'll find that $g HAS been set, and that it has a
        // value of "the weather is nice". At this point the $comps array will now look
        // like this:
        // $comps['the weather is nice'] = array('the', 'weather', 'is', 'nice');
        // If you'd like to see this in action (since it might sound a little confusing),
        // remove the # from the beginning of each of the following lines that start with #
        // (there should be 10 total):
#print "<pre>The loop has reached the letter <b>{$inc}</b> for the value of ";
#print "<b>\$inc</b> and has found that <b>\${$inc}</b> HAS been set in the code.\n";
#print "Adding another dollar sign to <b>\$inc</b> has had the following effects:\n";
#print "- <b>\$inc</b> now looks like <b>\$\$inc</b> (from within the written part of the code)\n";
#print "- <b>\$\$inc</b> translates into <b>\${$inc}</b> (the variable that is actually being evaluated)\n";
#print "- <b>\${$inc}</b> evaluates to <b>{$$inc}</b>\n</pre>";
    }
#else {
#    print "<pre>The loop has reached the letter <b>{$inc}</b> for the value of <b>\$inc</b>";
#    print " and has found that <b>\${$inc}</b> has NOT been set in the code, so it's being skipped.\n";
#}
}
// Avoid errors by checking if empty or not
if (!empty($comps)) {
    foreach ($comps as $key => $comp) {
        // Find intersections, if any
        $candidates[$key] = array_intersect($test, $comp);
        // Count the intersections
        $counts[$key] = array_count_values($candidates[$key]);
        // Add up the intersections
        $sums[$key] = array_sum($counts[$key]);
    }
}
$winner = '';
if (!empty($sums)) {
    // Reverse sort $sums, putting the highest value first
    arsort($sums);
    // Flip $sums so we can extract the key
    $flipped = array_flip($sums);
    // Extract the first key off of $sums
    $winner = array_shift($flipped);
}
print $winner;
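The same scoring can probably be written without variable variables by keeping the candidate strings in an array. The sketch below is an alternative arrangement, not the code above: `$candidateStrings` and `$input` are names chosen for illustration, and `array_key_first()` assumes PHP 7.3 or later.

```php
<?php
// Candidate strings held in an array instead of $g, $n, $b.
$candidateStrings = [
    'the weather is nice',
    'the water is blue',
    'that was a bad movie',
];
$input = 'hows the weather'; // example input

$inputWords = str_word_count($input, 1);

// For each candidate, count how many input words also appear in it.
$sums = [];
foreach ($candidateStrings as $candidate) {
    $shared = array_intersect($inputWords, str_word_count($candidate, 1));
    $sums[$candidate] = count($shared);
}

// Highest count first; the array key is the candidate string itself.
arsort($sums);
$winner = array_key_first($sums) ?? '';

print $winner;
```

Because the candidate string is used directly as the key, there is no need to flip the array to recover the winner.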
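At first your question was also asking about the number of occurrences. But since you have clearly taken it a step further, I felt I should offer another solution: the similar_text() function.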
$strings[] = 'sixty yards out';
$strings[] = 'seventy five yards out';
$strings[] = 'one hundred fifty yards out';
$inputString = 'seventy two yards out';
$p = 0;
$k = null;
foreach ($strings as $key => $string) {
    similar_text($inputString, $string, $percent);
    if ($percent > $p) {
        $p = $percent;
        $k = $key;
    }
}
echo !is_null($k) ? $strings[$k] : "";
Output:
seventy five yards out
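To see how far apart the candidates are, rather than only the winner, the percentages can be collected and sorted. This is a small sketch built on the same `similar_text()` call; the `$scores` array is an added name for illustration.

```php
<?php
$strings = [
    'sixty yards out',
    'seventy five yards out',
    'one hundred fifty yards out',
];
$inputString = 'seventy two yards out';

// Collect the similarity percentage for every candidate.
$scores = [];
foreach ($strings as $string) {
    // similar_text() writes the percentage into its third, by-reference argument.
    similar_text($inputString, $string, $percent);
    $scores[$string] = $percent;
}

// Most similar first.
arsort($scores);

foreach ($scores as $string => $percent) {
    printf("%5.1f%%  %s\n", $percent, $string);
}
```

Unlike the word-count approach, `similar_text()` compares characters, so partial words such as the shared "ty" in "seventy" and "sixty" also contribute to the score.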