I have the following two variables containing HTML code:
$var1= Profile photo uploaded<div class="comment_attach_image">
<a class="group1 cboxElement"
href="http://52.1.47.143/file/attachment/2015/03/a4ea5532b83a56bbbae2fffc80de4fee.png" >
<img src="http://52.1.47.143/file/attachment/2015/03/a4ea5532b83a56bbbae2fffc80de4fee.png" height="150px" width="150px" />
</a>
<a class="comment_attach_image_link_dwl" href="http://52.1.47.143/feed/download/year_2015/month_03/file_a4ea5532b83a56bbbae2fffc80de4fee.png" >Download</a>
</div>
$var2 = PDF file added<div class="comment_attach_file">
<a class="comment_attach_file_link" href="http://52.1.47.143/feed/download/year_2015/month_03/file_1b87d4420c693f2bbdf738cbf2457d89.pdf" >1b87d4420c693f2bbdf738cbf2457d89.pdf</a>
<a class="comment_attach_file_link_dwl" href="http://52.1.47.143/feed/download/year_2015/month_03/file_1b87d4420c693f2bbdf738cbf2457d89.pdf" >Download</a>
</div>
I only want to extract the URLs from these two variables. The result I want is:
$new_var1 = http://52.1.47.143/file/attachment/2015/03/a4ea5532b83a56bbbae2fffc80de4fee.png;
$new_var2 = http://52.1.47.143/feed/download/year_2015/month_03/file_1b87d4420c693f2bbdf738cbf2457d89.pdf ;
How can I do this in PHP in an efficient, smarter way?
Or do it the PHP way (yes... j/k):
<?php
$var1 = 'Profile photo uploaded<div class="comment_attach_image">
<a class="group1 cboxElement"
href="http://52.1.47.143/file/attachment/2015/03/a4ea5532b83a56bbbae2fffc80de4fee.png" >
<img src="http://52.1.47.143/file/attachment/2015/03/a4ea5532b83a56bbbae2fffc80de4fee.png" height="150px" width="150px" />
</a>
<a class="comment_attach_image_link_dwl" href="http://52.1.47.143/feed/download/year_2015/month_03/file_a4ea5532b83a56bbbae2fffc80de4fee.png" >Download</a>
</div>';
$var2 = 'PDF file added<div class="comment_attach_file">
<a class="comment_attach_file_link" href="http://52.1.47.143/feed/download/year_2015/month_03/file_1b87d4420c693f2bbdf738cbf2457d89.pdf" >1b87d4420c693f2bbdf738cbf2457d89.pdf</a>
<a class="comment_attach_file_link_dwl" href="http://52.1.47.143/feed/download/year_2015/month_03/file_1b87d4420c693f2bbdf738cbf2457d89.pdf" >Download</a>
</div>';
$url_regex = '/(href|src)="(.*?)"/';
preg_match_all($url_regex, $var1, $matches);
var_dump($matches);
preg_match_all($url_regex, $var2, $matches);
var_dump($matches);
This will produce:
array(3) {
[0]=>
array(3) {
[0]=>
string(86) "href="http://52.1.47.143/file/attachment/2015/03/a4ea5532b83a56bbbae2fffc80de4fee.png""
[1]=>
string(85) "src="http://52.1.47.143/file/attachment/2015/03/a4ea5532b83a56bbbae2fffc80de4fee.png""
[2]=>
string(100) "href="http://52.1.47.143/feed/download/year_2015/month_03/file_a4ea5532b83a56bbbae2fffc80de4fee.png""
}
[1]=>
array(3) {
[0]=>
string(4) "href"
[1]=>
string(3) "src"
[2]=>
string(4) "href"
}
[2]=>
array(3) {
[0]=>
string(79) "http://52.1.47.143/file/attachment/2015/03/a4ea5532b83a56bbbae2fffc80de4fee.png"
[1]=>
string(79) "http://52.1.47.143/file/attachment/2015/03/a4ea5532b83a56bbbae2fffc80de4fee.png"
[2]=>
string(93) "http://52.1.47.143/feed/download/year_2015/month_03/file_a4ea5532b83a56bbbae2fffc80de4fee.png"
}
}
array(3) {
[0]=>
array(2) {
[0]=>
string(100) "href="http://52.1.47.143/feed/download/year_2015/month_03/file_1b87d4420c693f2bbdf738cbf2457d89.pdf""
[1]=>
string(100) "href="http://52.1.47.143/feed/download/year_2015/month_03/file_1b87d4420c693f2bbdf738cbf2457d89.pdf""
}
[1]=>
array(2) {
[0]=>
string(4) "href"
[1]=>
string(4) "href"
}
[2]=>
array(2) {
[0]=>
string(93) "http://52.1.47.143/feed/download/year_2015/month_03/file_1b87d4420c693f2bbdf738cbf2457d89.pdf"
[1]=>
string(93) "http://52.1.47.143/feed/download/year_2015/month_03/file_1b87d4420c693f2bbdf738cbf2457d89.pdf"
}
}
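See the preg_match_all documentation for what the result array contains: index 0 holds the full matches, index 1 the attribute names, and index 2 the URLs. If you really only need the first matching URL, use preg_match instead. It takes the same arguments as preg_match_all, but $matches then holds only the first match.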
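As a rough sketch of the preg_match variant described above, the snippet below pulls the first href/src value out of each variable. The helper name first_url is hypothetical (it is not part of the original answer), the nullable return type assumes PHP 7.1 or later, and the pattern assumes attributes are always written with double quotes, as they are in the sample markup.

```php
<?php
// Hypothetical helper, not from the original answer: returns the first
// href or src value found in an HTML fragment, or null if none is found.
// Assumes attribute values are wrapped in double quotes.
function first_url(string $html): ?string
{
    // (?:...) is a non-capturing group, so the URL lands in $m[1].
    if (preg_match('/(?:href|src)="(.*?)"/', $html, $m) === 1) {
        return $m[1];
    }
    return null;
}

$var1 = 'Profile photo uploaded<div class="comment_attach_image">
<a class="group1 cboxElement"
href="http://52.1.47.143/file/attachment/2015/03/a4ea5532b83a56bbbae2fffc80de4fee.png" >
<img src="http://52.1.47.143/file/attachment/2015/03/a4ea5532b83a56bbbae2fffc80de4fee.png" height="150px" width="150px" />
</a>
<a class="comment_attach_image_link_dwl" href="http://52.1.47.143/feed/download/year_2015/month_03/file_a4ea5532b83a56bbbae2fffc80de4fee.png" >Download</a>
</div>';

$var2 = 'PDF file added<div class="comment_attach_file">
<a class="comment_attach_file_link" href="http://52.1.47.143/feed/download/year_2015/month_03/file_1b87d4420c693f2bbdf738cbf2457d89.pdf" >1b87d4420c693f2bbdf738cbf2457d89.pdf</a>
<a class="comment_attach_file_link_dwl" href="http://52.1.47.143/feed/download/year_2015/month_03/file_1b87d4420c693f2bbdf738cbf2457d89.pdf" >Download</a>
</div>';

$new_var1 = first_url($var1);
$new_var2 = first_url($var2);

echo $new_var1, PHP_EOL;
echo $new_var2, PHP_EOL;
```

With the sample markup, the first match in each variable happens to be the URL the question asks for. If the order of the tags changed, the pattern would need to be narrowed, for example by matching on the link's class.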
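If you are trying to parse a DOM, JavaScript would be a better choice. However, if you insist on using PHP, try downloading the HTML parser called Simple HTML DOM. Their website has good documentation, but for what you are trying to do, I would use the following: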
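// Load the library (the download ships as simple_html_dom.php)
require_once 'simple_html_dom.php';
// Get the contents of your page; for markup already held in a string,
// use str_get_html($var1) instead of file_get_html()
$html = file_get_html('http://linkto.com/yourfile.html');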
// Find all links this way
foreach($html->find('a') as $element) {
echo $element->href.'<br>';
}
// To get just the two URLs from the question, target each anchor
// by its class name and read the href of the first match
$new_var1 = $html->find('a[class=group1 cboxElement]', 0)->href;
$new_var2 = $html->find('a[class=comment_attach_file_link_dwl]', 0)->href;
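If installing a third-party parser is not an option, the same idea can be sketched with PHP's built-in DOMDocument class instead of Simple HTML DOM. This is a swapped-in technique, not what the answer above uses. It assumes the dom extension is enabled (it usually is), and the helper name extract_hrefs and the short sample string are made up for illustration.

```php
<?php
// Hypothetical helper, not from the original answer: returns the href of
// every <a> element in an HTML fragment, in document order, without duplicates.
function extract_hrefs(string $html): array
{
    $doc = new DOMDocument();

    // Collect libxml warnings internally instead of emitting PHP warnings
    // for imperfect fragment markup, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $hrefs = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        if ($anchor->hasAttribute('href')) {
            $hrefs[] = $anchor->getAttribute('href');
        }
    }

    // array_unique keeps the first occurrence; array_values reindexes from 0.
    return array_values(array_unique($hrefs));
}

// Shortened, made-up sample in the same shape as $var2 from the question.
$sample = 'PDF file added<div class="comment_attach_file">
<a class="comment_attach_file_link" href="http://example.com/file_1.pdf" >file_1.pdf</a>
<a class="comment_attach_file_link_dwl" href="http://example.com/file_1.pdf" >Download</a>
</div>';

$hrefs = extract_hrefs($sample);
$new_var = $hrefs[0] ?? null; // first link, or null if there are none

echo $new_var, PHP_EOL;
```

Unlike the regular expression, this reads only the href attributes of anchor tags, so the img src in $var1 would be skipped and attribute quoting style would not matter.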