PHP - Extract several pieces of information from HTML array children

I have an array named $topPaid:

Array
(
    [0] => <li>
        <a class="livelink" href="#%21/content/5664">
            <span title="Relief Terrain Pack v3.2" class="title">Relief Terrain Pack v3.2</span>
            <br><small>
                    Editor Extensions/Terrain
            </small>
            <br></a>
    </li>
    [1] => <li>
        <a class="livelink" href="#%21/content/368">
            <span title="Playmaker" class="title">Playmaker</span>
            <br><small>
                    Editor Extensions/Visual Scripting
            </small>
            <br></a>
    </li>
    [2] => <li>
        <a class="livelink" href="#%21/content/4243">
            <span title="Amplify Motion" class="title">Amplify Motion</span>
            <br><small>
                    Scripting/Effects
            </small>
            <br></a>
    </li>
    [3] => <li>
        <a class="livelink" href="#%21/content/16899">
            <span title="Skele: Character Animation Tools" class="title">Skele: Character Animation Tools</span>
            <br><small>
                    Editor Extensions/Modeling
            </small>
            <br></a>
    </li>
    [4] => <li>
        <a class="livelink" href="#%21/content/19245">
            <span title="SnazzyGrid" class="title">SnazzyGrid</span>
            <br><small>
                    Editor Extensions/Utilities
            </small>
            <br></a>
    </li>
    [5] => <li>
        <a class="livelink" href="#%21/content/19352">
            <span title="Zones, Fields, and Shields" class="title">Zones, Fields, and Shields</span>
            <br><small>
                    Shaders
            </small>
            <br></a>
    </li>
    [6] => <li>
        <a class="livelink" href="#%21/content/18920">
            <span title="PlayerPrefs Elite" class="title">PlayerPrefs Elite</span>
            <br><small>
                    Scripting/Integration
            </small>
            <br></a>
    </li>
    [7] => <li>
        <a class="livelink" href="#%21/content/18358">
            <span title="Bolt" class="title">Bolt</span>
            <br><small>
                    Scripting/Network
            </small>
            <br></a>
    </li>
    [8] => <li>
        <a class="livelink" href="#%21/content/13198">
            <span title="BIG Environment Pack Vol.3" class="title">BIG Environment Pack Vol.3</span>
            <br><small>
                    3D Models/Environments
            </small>
            <br></a>
    </li>
    [9] => <li>
        <a class="livelink" href="#%21/content/23930">
            <span title="VertExmotion" class="title">VertExmotion</span>
            <br><small>
                    Editor Extensions/Animation
            </small>
            <br></a>
    </li>
)

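Now I am trying to pull out the "href" link, the "title", and the "small" text, so that I can display them in a table, using the following code: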

foreach($topPaid as $key => $value)
{
    $xml = simplexml_load_string($key);
    $list = $xml->xpath("//@href");
    $preparedUrls = array();
    foreach($list as $item) {
        $item = parse_url($item);
        $preparedUrls[] = $item['scheme'] . '://' .  $item['host'] . '/';
    }
    print_r($preparedUrls);
}

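But I always get an error about trying to access a member of a non-object. Should I go through each array element line by line and parse the content of each line, or is there a better way to get the information?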

Greetings,

I have not tried your code, but it seems to me that the error is in this line:

$xml = simplexml_load_string($key);

It should be:

$xml = simplexml_load_string($value);

Also, the text you are passing to the function is not valid XML: <br> should be <br />.
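
Both points can be checked with a minimal sketch. It assumes a single hard-coded fragment in the same shape as the entries of $topPaid; the variable names $fragment, $raw, and $fixed are made up for this example. It parses the value rather than the key, and rewrites <br> as <br /> so that SimpleXML accepts the markup:

```php
<?php
// One entry shaped like the elements of $topPaid (sample data for illustration).
$fragment = '<li>
    <a class="livelink" href="#%21/content/368">
        <span title="Playmaker" class="title">Playmaker</span>
        <br><small>
                Editor Extensions/Visual Scripting
        </small>
        <br></a>
</li>';

// Collect parser errors internally instead of emitting warnings.
libxml_use_internal_errors(true);

// As-is, the fragment is not well-formed XML because <br> is never closed.
$raw = simplexml_load_string($fragment);
var_dump($raw === false); // bool(true)

// Self-close the <br> tags, then parse the value (not the key).
$fixed = str_replace('<br>', '<br />', $fragment);
$xml   = simplexml_load_string($fixed);

// $xml is the <li> element; walk down to the pieces we need.
$href  = (string) $xml->a['href'];
$title = (string) $xml->a->span['title'];
$small = trim((string) $xml->a->small);

echo $href, "\n", $title, "\n", $small, "\n";
```

This only works as long as the rest of each fragment is well-formed XML; an unescaped & in a title, for example, would make the parse fail again.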

I solved it by checking each line, as follows:

foreach($topPaid as $key => $value)
{
    foreach(preg_split("/((\r?\n)|(\r\n?))/", $value) as $line)
    {
        $difline = strip_tags($line);
        if(strpos($line, '<a class="livelink" href="#%21/') !== false)
        {
            $link = explode('href="', $line);
            $link = substr($link[1], 0, -2);
            // https://www.assetstore.unity3d.com/en/#!/content/4243
            $link = str_replace('#%21', 'https://www.assetstore.unity3d.com/en/#!', $link);
            //print_r($link);
        }
        else if(strpos($line, '<span title') !== false)
        {
            $title = explode('title=', $line);
            $title = explode('class=', $title[1]);
            $title = trim($title[0], ' "'); // strip the surrounding quotes and trailing space
            //print_r($title);
        }
        else if($difline == $line)
        {
            $type = trim($line); // drop the indentation around the category text
            //print_r($type);
        }
    }
}
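
For comparison, the same three values can also be read with PHP's DOM extension, which tolerates the unclosed <br> tags and does not depend on how the markup is split across lines. This is only a sketch under assumptions: $topPaid is reduced to two sample entries, the helper name extractItem is invented for this example, and the base URL is the one from the comment in the code above.

```php
<?php
// Two sample entries in the same shape as $topPaid (illustrative data).
$topPaid = [
    '<li>
        <a class="livelink" href="#%21/content/5664">
            <span title="Relief Terrain Pack v3.2" class="title">Relief Terrain Pack v3.2</span>
            <br><small>
                    Editor Extensions/Terrain
            </small>
            <br></a>
    </li>',
    '<li>
        <a class="livelink" href="#%21/content/4243">
            <span title="Amplify Motion" class="title">Amplify Motion</span>
            <br><small>
                    Scripting/Effects
            </small>
            <br></a>
    </li>',
];

// Hypothetical helper: returns link, title and type for one <li> fragment,
// or null when the fragment does not have the expected shape.
function extractItem(string $html): ?array
{
    libxml_use_internal_errors(true); // keep HTML parser warnings quiet
    $doc = new DOMDocument();
    $doc->loadHTML($html);            // the HTML parser accepts unclosed <br>
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $a     = $xpath->query('//a[@class="livelink"]')->item(0);
    $span  = $xpath->query('//span[@class="title"]')->item(0);
    $small = $xpath->query('//small')->item(0);

    if ($a === null || $span === null || $small === null) {
        return null;
    }

    return [
        // Same replacement as in the line-by-line version above.
        'link'  => str_replace('#%21', 'https://www.assetstore.unity3d.com/en/#!', $a->getAttribute('href')),
        'title' => $span->getAttribute('title'),
        'type'  => trim($small->textContent),
    ];
}

$rows = array_map('extractItem', $topPaid);
print_r($rows);
```

Each element of $rows is then an array with the keys link, title, and type, which can be looped over to build the table rows.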