PHP counter adds an extra increment

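I have a simple program that takes some text from a form as input and matches every word in the text against two lexicons. One lexicon contains a list of positive words, the other a list of negative words. For each positive word match, $posMatchCount is incremented. For each negative word match, $negMatchCount is incremented. A simple comparison follows: if there are more positive matches, the program returns "POSITIVE"; otherwise it returns "NEGATIVE". If the positive count equals the negative count, or there are no positive or negative matches, it returns "NEUTRAL". Here is the full code: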

<?php
include("positive_lexicon.php");
include("negative_lexicon.php");
?>
<html>
<head>
    <title>Output</title>
</head>
<body>
<h1>Output</h1>  
<hr>
<?php

$preprocessedDoc2 = "I love this phone but hate the battery I adore the screen size";

/////////////////////////////////////////////////////////////////////////////////match doc text with POSITIVE sentiment lexicon
$matchedPosWords = NULL;//contains matched words
$posMatchCount = 0;//count of POS matches
$array1 = explode(' ', $preprocessedDoc2);
foreach($array1 as $word){
    if(preg_match("/\s{$word}\s/", $positiveLexicon)){
        $matchedPosWords = $matchedPosWords . $word . " - ";
        $posMatchCount++;
        $posMatch = true; //for subjectivity check
    }
    else{
        $posMatch= false; //for subjectivity check
    }
}
   echo "Matched POSITIVE words: <br><br>";
   echo "<div style=\"background-color:#66FF66\">";
   echo $matchedPosWords . " (Total: {$posMatchCount})";
   echo "</div>";
   echo "<br><br>";
/////////////////////////////////////////////////////////////////////////////////match doc text with NEGATIVE sentiment lexicon   
$matchedNegWords = NULL;//contains matched words
$negMatchCount = 0;//count of NEG matches
$array2 = explode(' ', $preprocessedDoc2);
foreach($array2 as $word2){
    if(preg_match("/\s{$word2}\s/", $negativeLexicon)){
        $matchedNegWords = $matchedNegWords . $word2 . " - ";
        $negMatchCount++;
        $negMatch = true; //for subjectivity check
    }
    else{
        $negMatch = false; //for subjectivity check
    }
}
   echo "Matched NEGATIVE words: <br><br>";
   echo "<div style=\"background-color:#FF5050\">";
   echo $matchedNegWords . " (Total: {$negMatchCount})";
   echo "</div>";
   echo "<br><br>";
/////////////////////////////////////////////////////////////////////////////////comparison between POSITIVE and NEGATIVE words
echo "analyzing document's sentiment ...<br><br>";
function checkPolarity($posWords, $negWords, $posMatch1, $negMatch1){//function to check polarity of doc

    if((($posMatch1==false) && ($negMatch1==false))||($posWords==$negWords)){
        return "<strong>NEUTRAL</strong>"; //if there are no POS or NEG matches, or matches are equal, return NEUTRAL
    }
    if($posWords > $negWords){
        return "<strong>POSITIVE</strong>"; //if count of POS matches is greater than count of NEG matches, return POSITIVE
    }
    else{
        return "<strong>NEGATIVE</strong>"; //if count of NEG matches is greater than count of POS matches, return NEGATIVE
    }

}
$polarity = checkPolarity($posMatchCount, $negMatchCount, $posMatch, $negMatch); //call function to check polarity   
echo "Polarity of the document is: " . $polarity; //display overall polarity
echo "<br><br>";
$polarity = "";

?>
</body>
</html>

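However, it sometimes returns "NEUTRAL" even when the number of positive words is greater than the number of negative words. It also sometimes adds an extra increment. For example, the input string "I love this phone but hate the battery I adore the screen size" returns the following: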

Matched POSITIVE words:
love - adore - - (Total: 3)

Matched NEGATIVE words:
hate - - (Total: 2)

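Even though there are only two positive matches and one negative match, it reports a positive count of 3 and a negative count of 2. I know this problem will be spotted right away on SO, even though I can't seem to find it myself. I'll try my luck.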

As far as I can see, the code itself looks fine. But look at the output you posted:

Matched POSITIVE words:
love - adore - - (Total: 3)

Matched NEGATIVE words:
hate - - (Total: 2)

In both the positive and the negative matches, the last entry is an empty string, and I think that is the error.
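A minimal sketch of how such an empty entry can arise. The lexicon contents and the input string here are assumptions for illustration only, since the real $positiveLexicon and the exact form input are not shown in the question. If the input contains a doubled or trailing space, explode(' ', ...) returns empty strings, and for an empty word the pattern collapses to /\s\s/, which can match wherever the lexicon has two whitespace characters in a row:

```php
<?php
// Assumed lexicon layout, for illustration only: words separated by a
// space and a newline. The real $positiveLexicon is not shown in the post.
$positiveLexicon = " love \n adore \n good ";

// A doubled space and a trailing space both produce empty tokens.
$tokens = explode(' ', "love  adore ");
// $tokens is now ["love", "", "adore", ""]

$count = 0;
foreach ($tokens as $word) {
    // For an empty $word the pattern collapses to /\s\s/, which matches
    // any two consecutive whitespace characters in the lexicon.
    if (preg_match("/\s{$word}\s/", $positiveLexicon)) {
        $count++;
    }
}
echo $count; // 4: two real words plus two empty tokens
```

Whether this is exactly what happens in your case depends on how your lexicon strings are laid out, which is why checking for empty words is worth a try.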

If you like, change the code to the following to debug and check.

echo "Foreach for Positive words started <br/>";
foreach($array1 as $word){
    if(preg_match("/\s{$word}\s/", $positiveLexicon) && trim($word) != "" ){
        echo $word."= <br/>"; // no empty word should ever be printed here
        $matchedPosWords = $matchedPosWords." - ". $word; // dash goes before the word, so the list no longer ends with a dash
        $posMatchCount++;
        $posMatch = true; //for subjectivity check
    }
    else{
        $posMatch= false; //for subjectivity check
    }
}
echo "Foreach for Positive words Ended <br/>";
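Going one step further, here is a rough sketch of how the counting could be restructured so that empty tokens never reach the regex. The helper names matchLexicon and polarity, and the lexicon contents, are hypothetical and not taken from the question. It also derives "no matches" from the counts instead of from $posMatch and $negMatch: in the original loop those flags are overwritten on every iteration, so they only describe the last word, which may be why "NEUTRAL" is sometimes returned even when the positive count is higher.

```php
<?php
// Hypothetical helper, not from the original post: returns the tokens of
// $text that occur as whole words in a whitespace-separated lexicon string.
function matchLexicon(string $text, string $lexicon): array
{
    // Split on runs of whitespace and drop empty tokens.
    $tokens = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);
    $matched = [];
    foreach ($tokens as $word) {
        // preg_quote() stops characters such as "." or "?" from acting as regex syntax.
        // The lookarounds require whitespace or a string boundary on both sides.
        $pattern = '/(?<!\S)' . preg_quote($word, '/') . '(?!\S)/';
        if (preg_match($pattern, $lexicon) === 1) {
            $matched[] = $word;
        }
    }
    return $matched;
}

// Hypothetical helper: equal counts (including 0 and 0) are neutral.
function polarity(int $posCount, int $negCount): string
{
    if ($posCount === $negCount) {
        return 'NEUTRAL';
    }
    return $posCount > $negCount ? 'POSITIVE' : 'NEGATIVE';
}

// Assumed lexicon contents and input, for illustration only.
$positiveLexicon = "good\nlove\nadore\n";
$negativeLexicon = "bad\nhate\n";
$doc = "I love this phone but hate the battery  I adore the screen size ";

$pos = matchLexicon($doc, $positiveLexicon);
$neg = matchLexicon($doc, $negativeLexicon);

echo implode(' - ', $pos) . ' (Total: ' . count($pos) . ")\n";
echo implode(' - ', $neg) . ' (Total: ' . count($neg) . ")\n";
echo polarity(count($pos), count($neg)) . "\n";
```

Using implode() on an array of matches also avoids the stray dash at the start or end of the list.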