Get content between tags, including inner tags

I have this content:

<html>
<body>
    <div class="another div">
        other content
    </div>
    <div class="fck_detail width_common">
        <p class="Normal">
            Some text 1.
        </p>
        <p class="Normal">
            Some text 2.
        </p>
        <div style="text-align:center;">
            <div class="embed-container">
                <div id="video-18574" data-component="true" data-component-type="video" data-component-value="18574" data-component-typevideo="2"></div>
            </div>
        </div>
        <p class="Normal">
            Some text 3.
        </p>
        <p class="Normal">
            Some text 4.
        </p>
    </div>
</body>
</html>

I use the following function to get the content of 'div class="fck_detail width_common"':

function get_content_by_tag($content, $tag_and_more, $include_tag = true){
        $p = stripos($content,$tag_and_more,0);
        if($p==false) return "";
        $content=substr($content,$p);
        $p = stripos($content," ",0);
        if(abs($p)==0) return "";
        $open_tag = substr($content,0,$p);
        $close_tag = substr($open_tag,0,1)."/".substr($open_tag,1).">";
        $count_inner_tag = 0;
        $p_open_inner_tag = 1; 
        $p_close_inner_tag = 0;
        $count=1;
        do{
            $p_open_inner_tag = stripos($content,$open_tag,$p_open_inner_tag);
            $p_close_inner_tag = stripos($content,$close_tag,$p_close_inner_tag);
            $count++;
            if($p_close_inner_tag!=false) $p = $p_close_inner_tag;
            if($p_open_inner_tag!=false){
                if(abs($p_open_inner_tag)<abs($p_close_inner_tag)){
                    $count_inner_tag++;
                    $p_open_inner_tag++;
                }else{
                    $count_inner_tag--;
                    $p_close_inner_tag++;
                }
            }else{
                $count_inner_tag--;
                if($p_close_inner_tag>0) $p_close_inner_tag++;
            }
        }while($count_inner_tag>0);
        if($include_tag)
            return substr($content,0,$p+strlen($close_tag));
        else{
            $content = substr($content,0,$p);
            $p = stripos($content,">",0);
            return substr($content,$p+1);
        }
    }

Then I try:

echo get_content_by_tag($content, '<div class="fck_detail width_common">');

It only returns:

<div class="fck_detail width_common">
    <p class="Normal">
        Some text 1.
    </p>
    <p class="Normal">
        Some text 2.
    </p>
    <div style="text-align:center;">
        <div class="embed-container">
            <div id="video-18574" data-component="true" data-component-type="video" data-component-value="18574" data-component-typevideo="2"></div>
        </div>
    </div>

The paragraphs containing "Some text 3" and "Some text 4" are missing, along with the closing tag of the outer DIV.

Can anyone tell me what is wrong?

One way is to use PHP Simple HTML DOM Parser:

$str = '
<html>
<body>
    <div class="another div">
        other content
    </div>
    <div class="fck_detail width_common">
        <p class="Normal">
            Some text 1.
        </p>
        <p class="Normal">
            Some text 2.
        </p>
        <div style="text-align:center;">
            <div class="embed-container">
                <div id="video-18574" data-component="true" data-component-type="video" data-component-value="18574" data-component-typevideo="2"></div>
            </div>
        </div>
        <p class="Normal">
            Some text 3.
        </p>
        <p class="Normal">
            Some text 4.
        </p>
    </div>
</body>
</html> 
';
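require_once 'simple_html_dom.php'; // the library file; adjust the path to where you saved it
$html = str_get_html($str);
// innertext returns everything inside the matched div, nested tags included
echo $html->find("div[class='fck_detail width_common']",0)->innertext;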
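If adding a third-party library is not an option, PHP's built-in DOM extension can do the same job. The following is only a sketch, assuming the DOM extension is enabled and the class attribute matches exactly; inner_html_by_class() is a hypothetical helper name, not part of any library.

```php
<?php
// Sketch: return the inner HTML of the first <div> whose class attribute
// equals $class exactly. inner_html_by_class() is a hypothetical helper.
// Assumption: $class contains no double quotes (it is pasted into an XPath string).
function inner_html_by_class(string $html, string $class): string
{
    if ($html === '') {
        return '';
    }

    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of printing them;
    // real-world markup is rarely perfectly valid.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query('//div[@class="' . $class . '"]');
    if ($nodes === false || $nodes->length === 0) {
        return '';
    }

    // Serialize each child node so that nested tags are preserved.
    $inner = '';
    foreach ($nodes->item(0)->childNodes as $child) {
        $inner .= $doc->saveHTML($child);
    }
    return $inner;
}
```

With the markup from the question in $content, calling inner_html_by_class($content, 'fck_detail width_common') should return all four paragraphs plus the nested video DIVs. Because the parser tracks nesting itself, there is no tag counting to get wrong.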

Try using this library: http://sourceforge.net/projects/simplehtmldom/

You can get the data like this:

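$url = "www.yoururl.html"; // placeholder URL
// file_get_html() already returns a simple_html_dom object,
// so a separate "new simple_html_dom()" is not needed
$html = file_get_html($url);
$data = $html->find('.fck_detail',0);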

$html = str_get_html($str);
$data = $html->find("div[class='fck_detail width_common']",0)->innertext;
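As for why the function in the question stops early: tracing it on the sample suggests that $count_inner_tag starts at 0, so the outer DIV is never counted and the loop exits one closing tag too soon. The check `$p==false` also treats a match at position 0 as "not found". If you would rather keep a string-scanning approach, here is a minimal sketch of the same depth-counting idea. get_element_by_open_tag() is a hypothetical name, and the code assumes well-formed markup, a non-void tag such as div, no self-closing form of that tag, and no '>' inside attribute values.

```php
<?php
// Sketch of a depth-counting scan; get_element_by_open_tag() is a hypothetical name.
// $openTagFull is the complete opening tag, e.g. '<div class="fck_detail width_common">'.
function get_element_by_open_tag(string $content, string $openTagFull, bool $includeTag = true): string
{
    $start = stripos($content, $openTagFull);
    if ($start === false) { // strict comparison: position 0 is a valid match
        return '';
    }
    // Extract the tag name, e.g. "div" from '<div class="x">'.
    if (!preg_match('/^<([a-z][a-z0-9]*)/i', $openTagFull, $m)) {
        return '';
    }
    // Matches every opening or closing tag with that name.
    $pattern = '/<(\/?)' . preg_quote($m[1], '/') . '\b[^>]*>/i';

    $depth = 1; // the outer opening tag is already open
    $bodyStart = $start + strlen($openTagFull);
    $offset = $bodyStart;
    while (preg_match($pattern, $content, $t, PREG_OFFSET_CAPTURE, $offset)) {
        $tagPos = $t[0][1];
        $offset = $tagPos + strlen($t[0][0]);
        // A leading slash means a closing tag; anything else opens a new level.
        $depth += ($t[1][0] === '/') ? -1 : 1;
        if ($depth === 0) {
            return $includeTag
                ? substr($content, $start, $offset - $start)
                : substr($content, $bodyStart, $tagPos - $bodyStart);
        }
    }
    return ''; // no matching closing tag was found
}
```

Called as get_element_by_open_tag($content, '<div class="fck_detail width_common">'), this should return the whole DIV including the last two paragraphs. For anything beyond tidy, predictable markup, a real parser such as the library above is the safer choice.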