MySQL: select rows where all substrings are present in a string

I have strings stored in a table, such as:

1. "the quick brown fox"
2. "the quick brown fox jumps"
3. "the quick brown fox jumps over the lazy dog"
4. "the quick potato does nothing"

Given three input words, I want to return an entry when all three words are found in its string.

I wrote it like this:

WHERE word1 IN stringfield AND word2 IN stringfield AND word3 IN stringfield

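However, I would like to optionally supply additional input words, so that the results are ordered by how many of the input words each entry contains. Every returned entry must still match at least three words.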

For example, the input:

"the", "quick", "brown", "fox", "jumps", "over"

returns:

3.
2. 
1. 

because 3 has the most matches, then 2, then 1. Entry 4 is not selected because it does not contain at least three matches.

Is this possible? Is this the fastest way to do it, or would a join table be better? If so, how would that work? Thanks very much.

First, you would be better off using MySQL's full-text search functionality. It is worth reading the documentation on it.
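As a rough sketch of what the full-text route could look like (the table name `mytable` and the `id` column are assumptions, not taken from the question):

```sql
-- Hypothetical table name `mytable`; a FULLTEXT index is required before MATCH ... AGAINST can be used.
ALTER TABLE mytable ADD FULLTEXT INDEX ft_stringfield (stringfield);

-- Natural language mode: rows matching any of the words, ordered by relevance score.
SELECT id,
       stringfield,
       MATCH(stringfield) AGAINST('quick brown fox jumps over' IN NATURAL LANGUAGE MODE) AS score
FROM mytable
WHERE MATCH(stringfield) AGAINST('quick brown fox jumps over' IN NATURAL LANGUAGE MODE)
ORDER BY score DESC;

-- Boolean mode: the + prefix makes a word mandatory, so all three must be present.
SELECT id, stringfield
FROM mytable
WHERE MATCH(stringfield) AGAINST('+quick +brown +fox' IN BOOLEAN MODE);
```

Some caveats: full-text search ignores stopwords (commonly including "the") and words shorter than the configured minimum token length, and the relevance score is not a count of matched words, so an "at least three of N words" rule is not expressed directly this way.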

I assume you are building your WHERE clause dynamically, so if you have five words, you could construct:

WHERE stringfield LIKE '%word1%' OR
      stringfield LIKE '%word2%' OR
      stringfield LIKE '%word3%' OR
      stringfield LIKE '%word4%' OR
      stringfield LIKE '%word5%' 

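The IN operator does not do what you think it does at all: it tests whether a value equals one of the items in a list, not whether it occurs as a substring.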
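A small sketch to illustrate the difference, using only literal values so it can be run on its own:

```sql
SELECT
    'fox' IN ('the', 'quick', 'fox')      AS in_list,     -- 1: 'fox' equals one of the list items
    'fox' IN ('the quick brown fox')      AS in_string,   -- 0: compared for equality against the whole string
    'the quick brown fox' LIKE '%fox%'    AS like_match;  -- 1: substring test
```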

Since you need at least three matches, ordered by the number of matches, the full query would instead have:

WHERE ((stringfield LIKE '%word1%') +
       (stringfield LIKE '%word2%') +
       (stringfield LIKE '%word3%') +
       (stringfield LIKE '%word4%') +
       (stringfield LIKE '%word5%')
      ) >= 3
ORDER BY ((stringfield LIKE '%word1%') +
          (stringfield LIKE '%word2%') +
          (stringfield LIKE '%word3%') +
          (stringfield LIKE '%word4%') +
          (stringfield LIKE '%word5%')
         ) DESC

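MySQL treats boolean expressions as integers (1 or 0) in a numeric context, which makes counting the matches particularly easy. But, as I said, a full-text index is probably what you really need.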
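To avoid repeating the sum in both WHERE and ORDER BY, one option is to compute it once in a derived table. The sketch below is self-contained: it inlines the four sample strings from the question as a derived table (the aliases `sample`, `scored`, and `matches` are made up for illustration) and uses the six example input words.

```sql
SELECT id, stringfield, matches
FROM (
    SELECT id,
           stringfield,
           -- Each LIKE evaluates to 1 or 0, so the sum is the number of matched words.
           (stringfield LIKE '%the%')   +
           (stringfield LIKE '%quick%') +
           (stringfield LIKE '%brown%') +
           (stringfield LIKE '%fox%')   +
           (stringfield LIKE '%jumps%') +
           (stringfield LIKE '%over%')  AS matches
    FROM (
        SELECT 1 AS id, 'the quick brown fox' AS stringfield
        UNION ALL SELECT 2, 'the quick brown fox jumps'
        UNION ALL SELECT 3, 'the quick brown fox jumps over the lazy dog'
        UNION ALL SELECT 4, 'the quick potato does nothing'
    ) AS sample
) AS scored
WHERE matches >= 3
ORDER BY matches DESC;
-- Returns ids 3, 2, 1 with 6, 5 and 4 matches; id 4 has only 2 and is filtered out.
```

Note that LIKE '%word%' is a substring test, so '%the%' would also match inside a longer word such as "nothing".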

Whenever something starts to look difficult, I make a point of working it through... Here is a solution (the name of the database is 'Test'):

First, create this function:

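DELIMITER $$
CREATE DEFINER = 'root'@'%'
FUNCTION Test.countOccurence (LineTocheck nvarchar(255), criteriaToMatch nvarchar(15))
RETURNS int(11)
DETERMINISTIC  -- same inputs always give the same result; MySQL typically requires such a declaration when binary logging is enabled
BEGIN
  DECLARE Occurences int DEFAULT 0;
  -- Removing every occurrence shortens the string by (occurrences * LENGTH(criteriaToMatch)),
  -- so dividing the length difference by the criterion length gives the occurrence count.
  SELECT
    (LENGTH(LineTocheck) - LENGTH(REPLACE(LineTocheck, criteriaToMatch, ''))) / LENGTH(criteriaToMatch) INTO Occurences;
  RETURN Occurences;
END
$$
DELIMITER ;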

Second, run the query:

-- Pair every row with every criterion and total the occurrences per row
SELECT g.id
    ,g.Description
    ,SUM(countOccurence(g.Description, c.criteria)) AS total_occurrences
FROM Generic g
    CROSS JOIN criteria c
GROUP BY g.Description
    ,g.id
ORDER BY total_occurrences DESC;

Note: the table structures are as follows. For the criteria:

CREATE TABLE Test.criteria (
  id int(11) NOT NULL AUTO_INCREMENT,
  criteria varchar(15) NOT NULL,
  PRIMARY KEY (id)
)
ENGINE = INNODB
AUTO_INCREMENT = 1
CHARACTER SET utf8
COLLATE utf8_general_ci;

For the table to be searched for occurrences:

CREATE TABLE Test.Generic (
  id int(11) NOT NULL AUTO_INCREMENT,
  Description varchar(255) NOT NULL,
        PRIMARY KEY (id)
)
ENGINE = INNODB
AUTO_INCREMENT = 1
CHARACTER SET utf8
COLLATE utf8_general_ci;
SET NAMES 'utf8';
INSERT INTO Test.criteria(id, criteria) VALUES
(1, 'fox');
INSERT INTO Test.criteria(id, criteria) VALUES
(2, 'brown');
INSERT INTO Test.criteria(id, criteria) VALUES
(3, 'over');
SET NAMES 'utf8';
INSERT INTO Test.Generic(id, Description) VALUES
(1, 'the quick brown fox');
INSERT INTO Test.Generic(id, Description) VALUES
(2, 'the quick brown fox jumps');
INSERT INTO Test.Generic(id, Description) VALUES
(3, 'the quick brown fox jumps over the lazy dog');
INSERT INTO Test.Generic(id, Description) VALUES
(4, 'the quick potato does nothing');
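
The query above totals occurrences but does not enforce the "at least three words" rule from the question. A possible variant, sketched here against the same Test.Generic and Test.criteria tables, counts how many distinct criteria each row contains and filters on that count (the alias `words_matched` is made up for illustration, and LIKE is swapped in for the function):

```sql
SELECT g.id,
       g.Description,
       -- LIKE yields 1 or 0 per criterion, so the sum is the number of criteria found in the row.
       SUM(g.Description LIKE CONCAT('%', c.criteria, '%')) AS words_matched
FROM Test.Generic AS g
CROSS JOIN Test.criteria AS c
GROUP BY g.id, g.Description
HAVING words_matched >= 3
ORDER BY words_matched DESC;
```

With the three sample criteria inserted above ('fox', 'brown', 'over'), only row 3 contains all three, so it is the only row this variant returns.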

Use dbForge Studio for MySQL Express (free) to connect to MySQL and run the statements: http://www.devart.com/login.html?returnToUrl=/dbforge/mysql/studio/download.html%3Ffd=dbforgemysqlfree.exe

Test it and let me know.