Binding multiple rows and columns for WHERE IN clause in PostgreSQL

So I want to prepare a query like:

SELECT id FROM users WHERE (branch, cid) IN $1;

and then bind a variable-length set of rows, such as (('a','b'),('c','d')), to it.

In other words, something like:

pg_prepare($users, 'users_query', 'SELECT id FROM users WHERE (branch, cid) IN $1');
$result = pg_execute($users, 'users_query', array("(('a','b'),('c','d'))"));

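The reason I need to keep the two steps separate is that I want to prepare once and then execute many times with as little overhead as possible.

The fact that a sequential scan is used when the table holds only two records is meaningless. For such a tiny set, an index will never be faster than a sequential scan. I built a small example table similar to yours and populated it with a million rows, and the following style of query consistently produces a good plan and fast execution: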

prepare s4 as
select id from users
join (select * from (values ($1,$2),($3,$4)) as v(branch, cid)) as p
using (branch, cid);
explain analyze execute s4('b11','c11','b1234','c1234');
                                                    QUERY PLAN                                                    
------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..16.65 rows=1 width=4) (actual time=0.199..0.234 rows=2 loops=1)
   ->  Values Scan on "*VALUES*"  (cost=0.00..0.03 rows=2 width=64) (actual time=0.002..0.003 rows=2 loops=1)
   ->  Index Scan using u_i on users  (cost=0.00..8.30 rows=1 width=16) (actual time=0.111..0.112 rows=1 loops=2)
         Index Cond: ((users.branch = "*VALUES*".column1) AND (users.cid = "*VALUES*".column2))
 Total runtime: 0.425 ms

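Your real question seems to be how to bind a dynamically determined number of value pairs to the SQL. My PHP is very rusty, and reading the online documentation reminded me how much I dislike it, but I think the following does what you need: it builds SQL of the form shown above, with the value-pair placeholders generated dynamically from the number of values you want to bind. I don't have a PHP environment at hand, so I haven't even checked that it is syntactically correct, but you should be able to get the idea and fix any trivial errors in my example.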

$values = array(
  'a', 'b',
  'c', 'd',
  // etc...
);
$value_placeholders = "";
$sep = "";
// Emit one ($n,$n+1) placeholder pair per (branch, cid) pair, comma-separated.
for ($i = 1; $i <= count($values); $i += 2) {
  $value_placeholders .= $sep . sprintf('($%u,$%u)', $i, $i + 1);
  $sep = ",";
}
$sql =
  'select id from users ' .
  'join (select * from (values ' . $value_placeholders . ') as v(branch, cid)) as p ' .
  'using (branch, cid)';
$result = pg_query_params($dbconn, $sql, $values);
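
Since the goal in the question is to prepare once and execute many times, one possible refinement is to prepare one statement per distinct number of pairs and reuse it afterwards. The following is only a sketch under that assumption and has not been run against a live server; the helper names `build_pairs_sql` and `execute_pairs`, the statement-name scheme, and the `$prepared` cache array are all hypothetical.

```php
<?php
// Hypothetical helper: builds the join-on-VALUES query for $pairCount (branch, cid) pairs.
function build_pairs_sql(int $pairCount): string
{
    if ($pairCount < 1) {
        throw new InvalidArgumentException('at least one pair is required');
    }
    $placeholders = [];
    for ($i = 0; $i < $pairCount; $i++) {
        // Pair number $i uses parameters $(2i+1) and $(2i+2).
        $placeholders[] = sprintf('($%d,$%d)', 2 * $i + 1, 2 * $i + 2);
    }
    return 'select id from users '
         . 'join (select * from (values ' . implode(',', $placeholders) . ') as v(branch, cid)) as p '
         . 'using (branch, cid)';
}

// Hypothetical helper: prepares one statement per pair count on first use, then reuses it.
// $values is the flat list 'branch1', 'cid1', 'branch2', 'cid2', ...
// $prepared is a caller-owned array remembering which statement names already exist
// on this connection.
function execute_pairs($dbconn, array &$prepared, array $values)
{
    if (count($values) % 2 !== 0) {
        throw new InvalidArgumentException('values must come in (branch, cid) pairs');
    }
    $pairCount = intdiv(count($values), 2);
    $name = 'users_by_pairs_' . $pairCount;
    if (!isset($prepared[$name])) {
        if (pg_prepare($dbconn, $name, build_pairs_sql($pairCount)) === false) {
            return false;
        }
        $prepared[$name] = true;
    }
    return pg_execute($dbconn, $name, $values);
}
```

Prepared statements live per connection, so the `$prepared` cache would need to be kept alongside the connection it describes. The number of distinct statements stays small as long as the pair counts seen in practice fall into a few sizes.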

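If you really must have just a single prepared statement (and we will steer entirely clear of talking about premature optimization for someone unwilling to run the query against a realistic data set instead of two records), I think I have an answer: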

create index u_i2 on users ((branch||cid));
prepare sa as select id from users where branch||cid in (select unnest($1::text[]));
explain analyze execute sa(ARRAY['b1c1','b1234c1234']);
                                                      QUERY PLAN                                                      
----------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=12.17..645.78 rows=50000 width=4) (actual time=0.169..0.188 rows=2 loops=1)
   ->  HashAggregate  (cost=0.02..0.03 rows=1 width=32) (actual time=0.018..0.019 rows=2 loops=1)
         ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.010..0.011 rows=2 loops=1)
   ->  Bitmap Heap Scan on users  (cost=12.14..638.25 rows=500 width=16) (actual time=0.082..0.082 rows=1 loops=2)
         Recheck Cond: ((users.branch || users.cid) = (unnest($1)))
         ->  Bitmap Index Scan on u_i2  (cost=0.00..12.02 rows=500 width=0) (actual time=0.078..0.078 rows=1 loops=2)
               Index Cond: ((users.branch || users.cid) = (unnest($1)))
 Total runtime: 0.275 ms

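Note: I could not find a way to get index access on row pairs. However, if you create a functional index on the concatenation of the two fields and then supply a bound array of such concatenations, you get a nice, fast nested-loop index scan.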
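
To call a statement like `sa` from PHP, the concatenated keys have to be passed as one PostgreSQL array literal, because `pg_execute` sends every parameter as a string. Below is a minimal sketch, assuming every branch and cid is a non-null string; `concat_keys` and `to_pg_text_array` are hypothetical helper names, and the database calls are left as comments because they need a live connection.

```php
<?php
// Hypothetical helper: joins each (branch, cid) pair into the same
// branch||cid key that the functional index is built on.
function concat_keys(array $pairs): array
{
    return array_map(fn(array $p): string => $p[0] . $p[1], $pairs);
}

// Hypothetical helper: renders PHP strings as a PostgreSQL text[] literal.
// Every element is double-quoted; backslashes and double quotes inside an
// element are escaped with a backslash.
function to_pg_text_array(array $items): string
{
    $quoted = array_map(
        fn(string $s): string => '"' . str_replace(['\\', '"'], ['\\\\', '\\"'], $s) . '"',
        $items
    );
    return '{' . implode(',', $quoted) . '}';
}

// Usage sketch (assumes an open connection in $dbconn):
// pg_prepare($dbconn, 'sa',
//     'select id from users where branch||cid in (select unnest($1::text[]))');
// $keys = to_pg_text_array(concat_keys([['b1', 'c1'], ['b1234', 'c1234']]));
// $result = pg_execute($dbconn, 'sa', [$keys]);
```

One thing to watch with this approach: plain concatenation can make distinct pairs collide, since ('ab','c') and ('a','bc') both yield 'abc'. If that can happen in your data, adding a separator that never occurs in either column to both the index expression and the generated keys may be worth considering.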