使用MySQL查询非常慢(发送数据)


Very slow query with MySQL ("Sending Data")

我目前正在使用CodeIgniter框架开发一个PHP/MySQL应用程序。

我有一个相当长的查询,导致一些问题。当将日期范围更改为更长的时间段(例如30天)而不是默认的7天时,就会出现问题。查询时间大幅增加:1/2秒到90秒,但我只能假设这是由于数据大小的增加。

在我粘贴查询之前,下面是对这些表的快速解释:

  • flagged_cases:唯一案例列表(主表)- 352行
  • data_sources:数据源列表,每种情况下使用外键引用该表- 20行
  • matches:一个case的文本匹配行(一对多关系,即一个case,多个match) - 22000行
  • flagged_cases_keywords_hits:大小写id到关键字(和命中数)的映射- 2500行
  • keywords:关键字列表- 121行
  • reviewed_state: id/description for 3个状态,只检查这个查询reviewed_state = 1 - 3行

以下是查询,我意识到它相当大,但我认为一定有索引的潜在问题,不幸的是,我只是没有知识来完全排除故障,所以任何帮助都是感激的。

SELECT    flagged_cases.id, 
          data_source_id, 
          title, 
          fetch_date, 
          publish_date, 
          case_id, 
          case_title, 
          case_link, 
          relevance_score, 
          ( 
                   SELECT   group_concat(match_string_highlighted ORDER BY matches.id SEPARATOR "")
                   FROM     matches 
                   WHERE    flagged_case_id=flagged_cases.id) AS all_matches, 
          reviewed_state_id, 
          ( 
                   SELECT   group_concat(concat(k.keyword, " ", "x", cast(kh.hits AS CHAR), "") SEPARATOR "")
                   FROM     flagged_cases_keywords_hits kh 
                   JOIN     keywords k 
                   ON       kh.keyword_id = k.id 
                   WHERE    kh.flagged_case_id = flagged_cases.id 
                   ORDER BY k.weighting DESC) AS hitcount 
FROM      flagged_cases 
JOIN      data_sources 
ON        flagged_cases.data_source_id = data_sources.id 
JOIN      reviewed_state 
ON        flagged_cases.reviewed_state_id = reviewed_state.id 
LEFT JOIN matches 
ON        flagged_cases.id = matches.flagged_case_id 
WHERE     reviewed_state_id = 1 
AND       data_source_id IN('1', 
                            '3', 
                            '4', 
                            '5', 
                            '6', 
                            '7', 
                            '8', 
                            '9', 
                            '10', 
                            '11', 
                            '12', 
                            '13', 
                            '14', 
                            '15', 
                            '16', 
                            '17', 
                            '18', 
                            '19', 
                            '20') 
AND       fetch_date >= '2015-05-10 00:00:00' 
AND       fetch_date <= '2015-05-17 23:59:59' 
GROUP BY  flagged_cases.id 
ORDER BY  title DESC 
LIMIT     10;

作为做SHOW FULL PROCESSLIST的结果,我可以看到查询停留在"发送数据"状态,从一些研究中我可以看到基本上是MySQL获取和选择数据,所以我只能假设一定有一个缺失的索引或导致这个变慢的东西。

我也得到了查询的EXPLAIN,如下所示:

+----+--------------------+----------------+--------+----------------------------------+-----------------+---------+----------------------------+------+----------------------------------------------+
| id | select_type        | table          | type   | possible_keys                    | key             | key_len | ref                        | rows | Extra                                        |
+----+--------------------+----------------+--------+----------------------------------+-----------------+---------+----------------------------+------+----------------------------------------------+
|  1 | PRIMARY            | reviewed_state | const  | PRIMARY                          | PRIMARY         | 4       | const                      |    1 | Using index; Using temporary; Using filesort |
|  1 | PRIMARY            | data_sources   | range  | PRIMARY                          | PRIMARY         | 4       | NULL                       |   19 | Using where                                  |
|  1 | PRIMARY            | flagged_cases  | ref    | data_source_id,reviewed_state_id | data_source_id  | 4       | proactive.data_sources.id  |   14 | Using where                                  |
|  1 | PRIMARY            | matches        | ref    | flagged_case_id                  | flagged_case_id | 4       | proactive.flagged_cases.id |   32 | Using index                                  |
|  3 | DEPENDENT SUBQUERY | kh             | ref    | flagged_case_id,keyword_id       | flagged_case_id | 5       | func                       |    3 | Using where; Using temporary                 |
|  3 | DEPENDENT SUBQUERY | k              | eq_ref | PRIMARY                          | PRIMARY         | 4       | proactive.kh.keyword_id    |    1 | Using where                                  |
|  2 | DEPENDENT SUBQUERY | matches        | ref    | flagged_case_id                  | flagged_case_id | 4       | func                       |   32 |                                              |
+----+--------------------+----------------+--------+----------------------------------+-----------------+---------+----------------------------+------+----------------------------------------------+

任何帮助/建议/提示非常感谢!:)

您能尝试一下,看看是否有任何好处吗?

选择列表中的子查询将被替换为内联视图,这些视图根据连接到其他表的值分组。

select      flagged_cases.id, 
            data_source_id, 
            title, 
            fetch_date, 
            publish_date, 
            case_id, 
            case_title, 
            case_link, 
            relevance_score, 
            v1.all_matches, 
            reviewed_state_id, 
            v2.hitcount
from        flagged_cases 
       join data_sources 
         on flagged_cases.data_source_id = data_sources.id 
       join reviewed_state 
         on flagged_cases.reviewed_state_id = reviewed_state.id
       join (
                select      group_concat(match_string_highlighted order by matches.id separator "") as all_matches
                from        matches
                group by    flagged_case_id
            ) v1
         on v1.flagged_case_id = flagged_cases.id
       join (
                select      group_concat(concat(k.keyword, " ", "x", cast(kh.hits as char), "") order by k.weighting desc separator "")
                from        flagged_cases_keywords_hits kh 
                       join keywords k 
                         on kh.keyword_id = k.id 
                group by    kh.flagged_case_id
            ) v2
         on v2.flagged_case_id = flagged_cases.id 
  left join matches 
         on flagged_cases.id = matches.flagged_case_id 
where       reviewed_state_id = 1 
        and data_source_id in('1','3','4','5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '20') 
        and fetch_date >= '2015-05-10 00:00:00' 
        and fetch_date <= '2015-05-17 23:59:59' 
group by    flagged_cases.id 
order by    title desc
limit       10;