PHP - How can I retrieve and process all images and their parent anchor tags from a content block?

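Note: I'm using WordPress, but I don't think that's relevant to the answer, so I've asked here on SO. If I'm wrong, please let me know or move the question.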

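OK, I'm loading blocks of rich content (via WordPress) that often contain a number of images wrapped in anchor tags. I'd like to step through all of them so that I can output each one as an a tag with its related img inside.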

I've already found this handy regex-driven code, which does a good job of giving me the images:

            // Get all of the post content in a variable
            $posttext = $post->post_content;
            //$posttext1 = get_cleaned_excerpt();
            // We will search for the src="" in the post content
            $regular_expression = '~src="[^"]*"~';
            $regular_expression1 = '~<img [^\>]*\ />~';
            // WE will grab all the images from the post in an array $allpics using preg_match_all
            preg_match_all( $regular_expression, $posttext, $allpics );
            // Count the number of images found.
            $NumberOfPics = count($allpics[0]);
            // This time we replace/remove the images from the content
            $only_post_text = preg_replace( $regular_expression1, '' , $posttext);
            /*Only text will be printed*/
            // Check to see if we have at least 1 image
            if ( $NumberOfPics > 0 )
            {
            $this_post_id = get_the_ID();

            for ( $i=0; $i < $NumberOfPics ; $i++ )
            {           $str1=$allpics[0][$i];
            $str1=trim($str1);
            $len=strlen($str1);
            $imgpath=substr_replace(substr($str1,5,$len),"",-1);

            $theImageSrc = $imgpath;
            global $blog_id;
            if (isset($blog_id) && $blog_id > 0) {
                $imageParts = explode('/files/', $theImageSrc);
                if (isset($imageParts[1])) {
                    $theImageSrc = '/blogs.dir/' . $blog_id . '/files/' . $imageParts[1];
                }
            }
            ?>
            <img class="alignleft" src='<?php echo get_bloginfo('template_directory').'/timthumb.php?src=' . $theImageSrc  . '&h=150&w=150'; ?>' height="150" width="150" alt=""/>
            <?php
            } // end for
            } // end if
            ?>

What I'd really like is to wrap that img at the bottom in its related parent a. Any help here would be much appreciated.

An example of the content to be searched might be:

    <h5>
        <a href="http://www.example.com/imagefoo.jpg">
            <img class="size-thumbnail wp-image-4091 alignleft" src="http://www.example.com/imagefoo-150x150.jpg" alt="" width="150" height="150" />
        </a>
    </h5>
    <h5>
        <a href="http://www.example.com/Image-Bar.jpg">
            <img class="wp-image-4087 alignleft" title="Image - Bar" src="http://www.example.com/Image-Bar-150x150.jpg" alt="" width="150" height="150" />
        </a>
    </h5>
    <h5>
        <a href="http://www.example.com/Image-Alphe.jpg">
            <img class="wp-image-4090 alignleft" title="Image-Alpha" src="http://www.example.com/Image-Alpha-150x150.jpg" alt="" width="150" height="150" />
        </a>
    </h5>
    <a href="http://www.example.com/EXAMPLE-image-150.jpg"><img class="size-thumbnail wp-image-4088 alignleft" title="EXAMPLE-image-150" src="http://www.example.com/EXAMPLE-image-150-150x150.jpg" alt="" width="150" height="150" /></a>
    <h5>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</h5>
    <a href="http://www.example.com/insanely-long-permalink-created-as-if-by-a-madman-who-knows-no-bounds-of-shame/" rel="attachment wp-att-2780">
        <img class="alignright size-thumbnail wp-image-2780" title="Exhibition Title: Image Name by Artist Person" src="http://www.example.com/wp-content/uploads/2011/12/ExtraordinaryImage-150x150.jpg" alt="Example UK | Exhibition: Image by Artist Person" width="150" height="150" />
    </a>
    Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.

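Edit: Here is the working code for my needs. It uses XPath and is based on cHao's answer below. (For what it's worth, I found Tizag's page very useful as an XPath primer, along with this EarthInfo page.)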

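            // Get all of the post content in a variable
            $posttext = $post->post_content;
            // Needed for the rel="lightbox[...]" attribute below
            $this_post_id = get_the_ID();
            // loadHTML() is an instance method, so create the document first
            $document = new DOMDocument();
            $document->loadHTML($posttext);
            $xpath = new DOMXPath($document);
            # for each link that has an image inside it, output the link
            # wrapped around a thumbnail of that image.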
            foreach ($xpath->query('//a/img/..') as $link) :

                $img = $link->getElementsByTagName('img')->item(0);
                $link_src = $link->getAttribute('href');
                $link_title = $link->getAttribute('title');
                $img_src = $img->getAttribute('src');

                $theImageSrc = $img_src;
                global $blog_id;
                if (isset($blog_id) && $blog_id > 0) {
                    $imageParts = explode('/files/', $theImageSrc);
                    if (isset($imageParts[1])) {
                        $theImageSrc = '/blogs.dir/' . $blog_id . '/files/' . $imageParts[1];
                    }
                }
                ?>
                <a href="<?php echo $link_src; ?>" rel="lightbox[<?php echo $this_post_id; ?>]" title="<?php if ($link_title) {
                    echo $link_title;
                } else { the_title(); } ?>" class="cboxElement">
                <img class="alignleft" src='<?php echo get_bloginfo('template_directory').'/timthumb.php?src=' . $theImageSrc  . '&h=150&w=150'; ?>' height="150" width="150" alt=""/>
            </a>
            <?php
            endforeach;
            ?>
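
The multisite path rewrite appears in both snippets above, so as a rough sketch it could be pulled out into a small helper. The function name `multisite_image_path` and the sample URLs are hypothetical, not part of the original code, and the logic only mirrors the `explode('/files/', ...)` step shown above.

```php
<?php
/**
 * Hypothetical helper (the name is an assumption, not from the original code):
 * map a ".../files/..." upload URL to its blogs.dir path for a given blog ID.
 */
function multisite_image_path($src, $blog_id)
{
    if ($blog_id > 0) {
        $parts = explode('/files/', $src);
        if (isset($parts[1])) {
            return '/blogs.dir/' . $blog_id . '/files/' . $parts[1];
        }
    }
    // URLs without a /files/ segment are returned unchanged
    return $src;
}

// Hypothetical sample URLs for illustration only
$rewritten = multisite_image_path('http://www.example.com/files/2011/12/pic.jpg', 3);
$untouched = multisite_image_path('http://www.example.com/wp-content/uploads/pic.jpg', 3);

echo $rewritten, "\n";
echo $untouched, "\n";
```

Inside the loop, the call would then be something like `$theImageSrc = multisite_image_path($img_src, $blog_id);`, assuming `$blog_id` has been declared global beforehand.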

You'd be better off not trying to use regular expressions to find the images. They're terrible at parsing HTML.
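
As a small illustration, here is a sketch of two ways the `src="..."` pattern from the question can go wrong. The sample markup is made up for the example and is not from the question's content.

```php
<?php
// Hypothetical sample markup: a script tag plus an image using single-quoted attributes.
$html = '<script src="app.js"></script>'
      . "<a href='big.jpg'><img src='small.jpg' alt=''></a>";

// The pattern from the question: any double-quoted src attribute, on any tag.
preg_match_all('~src="[^"]*"~', $html, $matches);

// The script's src is picked up, and the single-quoted image src is missed.
print_r($matches[0]);
```

The pattern has no idea which element an attribute belongs to, so it can't tell you which anchor (if any) wraps a given image either.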

Instead, look at the DOMDocument and DOMXPath classes.

    // loadHTML() is an instance method, so create the document first
    $document = new DOMDocument();
    $document->loadHTML($posttext);
    $xpath = new DOMXPath($document);
    # for each link that has an image as a direct child, set its href
    # based on the image's src.
    foreach ($xpath->query('//a[img]') as $link) {
        $img = $link->getElementsByTagName('img')->item(0);
        $src = $img->getAttribute('src');
        # do your mangling of $src here, resulting in $href.
        # for example, strip a "-150x150" size suffix before the extension...
        $href = preg_replace('/-\d+x\d+(?=\.[^.]*$)/', '', $src);
        $link->setAttribute('href', $href);
    }
    $fixed_html = $document->saveHTML();
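
If you'd rather collect the link/image pairs and build your own markup, something along these lines should work. This is a minimal, self-contained sketch of the same DOMDocument/DOMXPath approach: the sample `$posttext` is a stand-in for your post content, the `$pairs` array is just an illustrative name, and the XPath `//a[img]` only matches anchors that have an `img` as a direct child.

```php
<?php
// Hypothetical sample content standing in for $post->post_content.
$posttext = '<h5><a href="http://www.example.com/imagefoo.jpg">'
          . '<img src="http://www.example.com/imagefoo-150x150.jpg" alt="" /></a></h5>'
          . '<p><img src="http://www.example.com/unlinked.jpg" alt="" /></p>';

$document = new DOMDocument();
// Collect parser warnings internally instead of printing them;
// real-world post content is rarely perfectly valid HTML.
libxml_use_internal_errors(true);
$document->loadHTML($posttext);
libxml_clear_errors();

$xpath = new DOMXPath($document);
$pairs = array();

// Select every anchor that has an img as a direct child.
foreach ($xpath->query('//a[img]') as $link) {
    $img = $link->getElementsByTagName('img')->item(0);
    $pairs[] = array(
        'href' => $link->getAttribute('href'),
        'src'  => $img->getAttribute('src'),
    );
}

// Only the linked image is collected; the unlinked one is skipped.
print_r($pairs);
```

From there you can loop over `$pairs` and echo whatever `<a>`/`<img>` markup you need, escaping the values on output.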