MySQL:检查使用多个字段的多个表的出现情况


MySQL: check occurrences from multiple tables using multiple fields

我有三个表t1, t2和t2,三个表上的字段名称相同,如

  1. id
  2. 住宅电话
  3. WorkPhone
  4. 手机

我想用t1来检查t2和t3是否出现。我提出了MySQL查询(如下),但当我运行它时,我得到一个空集,因为我知道我实际上在所有三个表中都有重复的电话号码:

SELECT t1.id, t1.HomePhone, t1.WorkPhone, t1.MobilePhone, 
(
SELECT COUNT(*) 
FROM t2 
WHERE t1.HomePhone = t2.HomePhone 
    AND 
        (t2.HomePhone IS NOT NULL OR t2.HomePhone != '')
    UNION 
    SELECT COUNT(*) 
    FROM t2 
    WHERE t1.WorkPhone = t2.WorkPhone 
        AND 
            (t2.WorkPhone IS NOT NULL OR t2.WorkPhone != '')
        UNION 
    SELECT COUNT(*)  
    FROM t2 
    WHERE t1.MobilePhone = t2.MobilePhone 
        AND 
            (t2.MobilePhone IS NOT NULL OR MobilePhone != '')
) AS Countt2,
(
SELECT COUNT(*) 
FROM t3 
WHERE t1.HomePhone = t3.HomePhone 
    AND 
        (t3.HomePhone IS NOT NULL OR t3.HomePhone != '')
    UNION 
    SELECT COUNT(*) 
    FROM t3 
    WHERE t1.WorkPhone = t3.WorkPhone 
        AND 
            (t3.WorkPhone IS NOT NULL OR t3.WorkPhone != '')
        UNION 
    SELECT COUNT(*)  
    FROM t3 
    WHERE t1.MobilePhone = t3.MobilePhone 
        AND 
        (t3.MobilePhone IS NOT NULL  OR MobilePhone != '')
) AS Countt3 
FROM t1 
WHERE 
(t1.HomePhone IS NOT NULL OR t1.HomePhone != '') 
AND (t1.WorkPhone IS NOT NULL OR t1.WorkPhone != '')
AND (t1.MobilePhone IS NOT NULL OR t1.MobilePhone != '')

我在这里可能做错了什么?

前提:Duplicate定义为所有4个字段都匹配。因此,通过合并所有表并"计数>1",您将识别重复项。和groupConcat列出有dup的表。

SELECT count(B.src), B.ID, B.HomePhone, B.WorkPHone, B.MobilePHone, 
    group_Concat(B.SRC) as TablesDupIN 
FROM (
(SELECT 't1'  as SRC, id, HomePhone, WorkPhone, MobilePhone FROM T1)
UNION ALL 
(SELECT 'T2', id, HomePhone, WorkPhone, MobilePhone FROM T2)
UNION ALL 
(SELECT 'T3', id, HomePhone, WorkPhone, MobilePhone FROM T3)) B
GROUP BY ID, HomePhone, WorkPhone, MobilePhone
HAVING count(*) > 1

union返回每个表中的所有记录,但不消除重复记录。just Union将删除重复项。T1 t2 t3放到一个临时表中。因此,我们可以从临时表中按ID、HomePHone、WorkPhone和MobilePhone分组进行选择。假设需要所有4个字段来确定副本。如果不是这种情况,你可以消除字段ID,如果它只是所有的电话必须是相同的。通过对结果进行分组,我们将ID、家庭电话、工作电话和移动电话等行组合在一起。我使用'T1', 'T2'来识别有重复的表,并且重复的表在ID上,以及TalesDupIn中列出的表中的3个电话号码。

的结果类似于

2 4 555-5555 555-5555 555-5555 (t1,t2)
2 5 666-6666 666-6666 666-6666 (t1,t3)
3 7 777-7777 777-7777 777-7777 (t1,t2,t3)
2 8 888-8888 888-8888 888-8888 (t2,t3)

这意味着ID 4在表1和表2中都可以找到ID 5可以在表1和表3中找到。
但是副本是根据匹配ID + 3个电话的所有4个字段确定的。如果ID不是副本的一部分,则可以从选择和分组中删除它,作为group_Concat(ID)添加,并与tablesDupIn结合,将告诉您ID和所涉及的表。根据您打算如何处理结果,这可能是有益的,也可能不是…

您可以使用OR而不是AND

   AND t3.HomePhone IS NOT NULL 
   AND t3.HomePhone != ''
这样的

    AND (t3.HomePhone IS NOT NULL 
         OR t3.HomePhone != '')

>

SELECT 
  COUNT(DISTINCT t1.id)
  COUNT(DISTINCT t2.id)
  COUNT(DISTINCT t3.id)
FROM t1
LEFT JOIN t2 ON
  (t1.HomePhone=t2.HomePhone AND t1.HomePhone IS nOT NULL)
  OR (t1.WorkPhone=t2.WorkPhone AND t1.WorkPhone IS nOT NULL)
  OR (t1.MobilePhone=t2.MobilePhone AND t1.MobilePhone IS nOT NULL)
LEFT JOIN t3 ON
  (t1.HomePhone=t3.HomePhone AND t1.HomePhone IS nOT NULL)
  OR (t1.WorkPhone=t3.WorkPhone AND t1.WorkPhone IS nOT NULL)
  OR (t1.MobilePhone=t3.MobilePhone AND t1.MobilePhone IS nOT NULL)