作为一名开发人员,我知道一个优秀的DB人员是物有所值的。我经常发现自己使用效率非常低的方法来解决非关键问题,但在目前的情况下,我需要加快速度,而不是"让它发挥作用"。我甚至不会展示到目前为止我做了什么,因为这太尴尬了,但我们只说我在主PHP while循环中有子查询。很抱歉
我有几个表需要连接在一起,以便向索引引擎(ApacheSOLR)发出json请求。
tbl_contacts
+----+--------------+---------------+
| ID | FirstName | LastName |
+----+--------------+---------------+
| 1 | Joe | Blogs |
+----+--------------+---------------+
| 2 | Jane | Baker |
+----+--------------+---------------+
| 3 | John | Doe |
+----+--------------+---------------+
tbl_attributes_map
+----+--------------+---------------+
| ID | ContactID | AttributeID |
+----+--------------+---------------+
| 1 | 1 | 1 |
+----+--------------+---------------+
| 2 | 1 | 3 |
+----+--------------+---------------+
| 3 | 2 | 2 |
+----+--------------+---------------+
tbl_attributes
+----+---------------+---------------+
| ID | AttributeType | Attribute |
+----+---------------+---------------+
| 1 | Lower | Shoe |
+----+---------------+---------------+
| 2 | Upper | T-Shirt |
+----+---------------+---------------+
| 3 | Upper | Vest |
+----+---------------+---------------+
tbl_notes
+----+---------------+---------------+
| ID | ContactID | Note |
+----+---------------+---------------+
| 1 | 1 | Big feet |
+----+---------------+---------------+
| 2 | 2 | Showoff |
+----+---------------+---------------+
| 3 | 2 | Sweaty |
+----+---------------+---------------+
tbl_appointment
+----+---------------+---------------+--------------------+-------------------+
| ID | ContactID | Location | TimeFrom | TimeTo |
+----+---------------+---------------+--------------------+-------------------+
| 1 | 1 | Big Ben | 2015-12-16 08:00:00|2015-12-16 08:30:00|
+----+---------------+---------------+--------------------+-------------------+
| 1 | 2 | London | 2015-12-17 10:00:00|2015-12-17 11:00:00|
+----+---------------+---------------+--------------------+-------------------+
| 1 | 2 | New York | 2015-12-16 12:00:00|2015-12-16 12:30:00|
+----+---------------+---------------+--------------------+-------------------+
我需要运行一个查询,它本质上允许我打印一个数组结构,例如:
Array(
[FirstName] => Joe
[LastName] => Blogs
[Upper] => Array(
Vest
)
[Lower] => Array(
Shoe
)
[Notes] => Array(
Big Feet
)
[Location] => Array(
Big Ben
)
[ApptFrom] => Array(
2015-12-16 08:00:00
)
[ApptTo] => Array(
2015-12-16 08:30:00
)
如果我可以进入一个阶段,在那里我可以运行查询以获得以下输出,那么我可以运行一个分隔符来内爆我需要为其构建数组的字段。例如:
+----+------------+-----------+---------+---------+-----------------+-------------------+-----------------------------------------+-----------------------------------------+
| ID | FirstName | LastName | Upper | Lower | Notes | Location | ApptFrom | ApptTo |
+----+------------+-----------+---------+---------+-----------------+-------------------+-----------------------------------------+-----------------------------------------+
| 2 | Jane | Baker | T-Shirt | | Show off,Sweaty | London,New York | 2015-12-17 10:00:00,2015-12-16 12:00:00 | 2015-12-17 11:00:00,2015-12-16 12:30:00 |
+----+------------+-----------+---------+---------+-----------------+-------------------+-----------------------------------------+-----------------------------------------+
我的剧本目前有效,但性能很差。浏览80000个联系人大约需要3个小时:-/
提前谢谢。
当我开始索引Elasticsearch(使用Apache Lucene作为SOLR)5000万数据时,我也遇到了完全类似的情况,现在只需要几个小时,我想你可以做以下事情:
- 对查询进行解释,看看查询是否使用了正确的索引
- 尝试使用更多的sub-select而不是join,因为mysql在为数百万数据选择索引时有问题(您可能会使用强制索引)
- 启动多线程进行索引