Add document to Apache Solr via PHP cURL

I don't know what I'm doing wrong. The record is not being added.

Here is my code:
$ch = curl_init("http://127.0.0.1:8983/solr/collection1/update/json?commit=true");
$data = array(
    "add" => array( "doc" => array(
        "id"   => "HW132",
        "name" => "Hello World"
    ))
);
$data_string = json_encode($data);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_POST, TRUE);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-type: application/json'));
curl_setopt($ch, CURLOPT_POSTFIELDS, $data_string);
$response = curl_exec($ch);

This is the response I get from Solr:

{"responseHeader":{"status":0,"QTime":4}}
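When debugging a case like this, it can help to rule out transport-level failures before looking at commit behavior. The following is a minimal sketch, not part of the original post: `solr_response_status` and `solr_exec` are hypothetical helper names, and the sketch assumes the handle was configured with CURLOPT_RETURNTRANSFER as in the code above. Note that a status of 0 only says Solr accepted the request; it does not by itself show that the document is already visible to searches.

```php
<?php
// Hypothetical helper (not from the original post): extracts the status
// code from a Solr JSON response body, or returns null if it cannot be read.
function solr_response_status(string $body): ?int
{
    $decoded = json_decode($body, true);
    if (!is_array($decoded) || !isset($decoded['responseHeader']['status'])) {
        return null;
    }
    return (int) $decoded['responseHeader']['status'];
}

// Hypothetical wrapper: executes a prepared cURL handle and throws on
// transport errors or non-200 HTTP responses instead of failing silently.
// Assumes CURLOPT_RETURNTRANSFER is set, so curl_exec() returns the body.
function solr_exec($ch): string
{
    $response = curl_exec($ch);
    if ($response === false) {
        throw new RuntimeException('cURL error: ' . curl_error($ch));
    }
    $httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    if ($httpCode !== 200) {
        throw new RuntimeException("Solr returned HTTP $httpCode: $response");
    }
    return $response;
}
```

With these in place, `$response = solr_exec($ch);` followed by `solr_response_status($response)` would distinguish "request failed" from "request accepted but not yet committed".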

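Apparently, I needed to ask Apache Solr to commit the document. It does not commit documents automatically, or I don't know how to configure it to do so. Here is the working example. I hope it helps anyone who has the same problem.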

$ch = curl_init("http://127.0.0.1:8983/solr/collection1/update?wt=json");
$data = array(
    "add" => array( 
        "doc" => array(
            "id"   => "HW2212",
            "title" => "Hello World 2"
        ),
        "commitWithin" => 1000,
    ),
);
$data_string = json_encode($data);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_POST, TRUE);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-type: application/json'));
curl_setopt($ch, CURLOPT_POSTFIELDS, $data_string);
$response = curl_exec($ch);
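The key difference in the working example is the `commitWithin` entry inside the `add` command, which asks Solr to commit the document within the given number of milliseconds. As a small sketch, the payload construction could be pulled into a function; `build_add_payload` is a hypothetical name, not something from the original post:

```php
<?php
// Hypothetical helper: builds the JSON body for a single-document "add"
// command with a commitWithin deadline given in milliseconds.
function build_add_payload(array $doc, int $commitWithinMs): string
{
    return json_encode(array(
        'add' => array(
            'doc'          => $doc,
            'commitWithin' => $commitWithinMs,
        ),
    ));
}

// Same document and deadline as in the working example above.
$payload = build_add_payload(
    array('id' => 'HW2212', 'title' => 'Hello World 2'),
    1000
);
// $payload is now:
// {"add":{"doc":{"id":"HW2212","title":"Hello World 2"},"commitWithin":1000}}
```

The resulting string can be passed to CURLOPT_POSTFIELDS exactly as `$data_string` is in the example.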

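It's not clear which version of Solr you are using, 3.x or 4.x (they differ in how they handle commits, but I will cover both). In either case, you can make these changes in the solrconfig.xml file.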

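In 3.x, you can configure autocommit by number of documents, by milliseconds, or both. Once a threshold is reached, Solr commits your changes, so you don't have to do it in your code: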

 <!-- autocommit pending docs if certain criteria are met.  Future versions may expand the available
     criteria -->
    <autoCommit>
      <maxDocs>10000</maxDocs> <!-- maximum uncommitted docs before autocommit triggered -->
      <maxTime>15000</maxTime> <!-- maximum time (in MS) after adding a doc before an autocommit is triggered -->
      <openSearcher>false</openSearcher> <!-- SOLR 4.0.  Optionally don't open a searcher on hard commit.  This is useful to minimize the size of transaction logs that keep track of uncommitted updates. -->
    </autoCommit>

4.x also has a soft commit option. It makes changes available for searching before they are synced to disk:

 <!-- SoftAutoCommit
         Perform a 'soft' commit automatically under certain conditions.
         This commit avoids ensuring that data is synched to disk.
         maxDocs - Maximum number of documents to add since the last
                   soft commit before automatically triggering a new soft commit.
         maxTime - Maximum amount of time in ms that is allowed to pass
                   since a document was added before automatically
                   triggering a new soft commit.
      -->
     <autoSoftCommit>
       <maxTime>1000</maxTime>
     </autoSoftCommit>
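For comparison with the config-driven settings, a commit can also be requested per call through URL parameters on the update handler, as the question does with `commit=true`. The sketch below is only an illustration: `build_update_url` and its `$commitMode` values are hypothetical names, the base URL is the one from the question, and `softCommit=true` applies to 4.x only.

```php
<?php
// Hypothetical helper: builds an update URL that requests no commit,
// a hard commit (commit=true), or a soft commit (softCommit=true, Solr 4.x).
function build_update_url(string $baseUrl, string $commitMode = 'none'): string
{
    $params = array('wt' => 'json');
    if ($commitMode === 'hard') {
        $params['commit'] = 'true';
    } elseif ($commitMode === 'soft') {
        $params['softCommit'] = 'true';
    }
    // Any other value leaves the commit to autoCommit / autoSoftCommit.
    return $baseUrl . '?' . http_build_query($params, '', '&');
}

$base = 'http://127.0.0.1:8983/solr/collection1/update';
$softUrl = build_update_url($base, 'soft');
// $softUrl is now:
// http://127.0.0.1:8983/solr/collection1/update?wt=json&softCommit=true
```

Passing `'none'` produces a URL with no commit parameter at all, which is the case where the solrconfig.xml thresholds decide when changes become visible.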

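I have found that thinking through and implementing these settings in solrconfig.xml, rather than relying on commits at the application code level, gives more predictable results.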

A more complete discussion of Solr commits can be found here:

http://wiki.apache.org/solr/SolrConfigXml#Update_Handler_Section
http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

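I was trying to post an XML file, and I don't know why the solution below works. The documentation says I should use '@' + the file path to upload a file, but that did not work. So I did this: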

<?php
$url = 'http://localhost:8080/solr/update/?commit=true';
$file = realpath('/home/fabio/target_file.xml');
$header = array(
    "Content-Type: text/xml",
);
$post = file_get_contents($file);
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_POST, TRUE);
curl_setopt($ch, CURLOPT_POSTFIELDS, $post);
curl_setopt($ch, CURLOPT_VERBOSE, TRUE); 
echo curl_exec($ch);
curl_close($ch);

I get this OK (200) status:

*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> POST /solr/update/?commit=true HTTP/1.1
Host: localhost:8080
Accept: */*
Content-Type: text/xml
Content-Length: 3502
Expect: 100-continue
< HTTP/1.1 100 Continue
* We are completely uploaded and fine
< HTTP/1.1 200 OK
< Server: Apache-Coyote/1.1
< Content-Type: application/xml;charset=UTF-8
< Transfer-Encoding: chunked
< Date: Tue, 19 Jan 2016 16:52:19 GMT
< 
* Connection #0 to host localhost left intact
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">269</int></lst>
</response>
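A likely explanation for why this works (an assumption, not something confirmed in the post): with '@' + file path, cURL sends the file as a multipart/form-data form upload, whereas here the XML is sent as the raw request body with a text/xml content type. In addition, PHP 5.5 deprecated the '@' prefix in favor of CURLFile, and PHP 7 removed it. The sketch below repeats the raw-body approach with explicit error handling; `read_xml_body` and `post_xml_to_solr` are hypothetical names.

```php
<?php
// Hypothetical helper: reads the XML file and fails loudly if it is
// missing or unreadable, instead of posting an empty body.
function read_xml_body(string $path): string
{
    $body = @file_get_contents($path);
    if ($body === false) {
        throw new RuntimeException("Cannot read XML file: $path");
    }
    return $body;
}

// Hypothetical helper: posts the file contents as the raw request body.
// Passing a string (not an array) to CURLOPT_POSTFIELDS keeps cURL from
// switching to multipart/form-data encoding.
function post_xml_to_solr(string $url, string $path): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, array(
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_POST           => true,
        CURLOPT_HTTPHEADER     => array('Content-Type: text/xml'),
        CURLOPT_POSTFIELDS     => read_xml_body($path),
    ));
    $response = curl_exec($ch);
    if ($response === false) {
        throw new RuntimeException('cURL error: ' . curl_error($ch));
    }
    return $response;
}
```

A call such as `post_xml_to_solr('http://localhost:8080/solr/update/?commit=true', '/home/fabio/target_file.xml')` would then mirror the request shown in the verbose log.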