How can I extract ul tag content and p tag content inside a <dd> tag using regular expressions in PHP?

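I want to extract content from a <dd> tag, specifically the p tag content and the ul tag content. I tried using preg_match_all in PHP to get everything inside the <dd>, but got nothing. Here is my HTML: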

<dd style="display: block;">
                                    <p>Lightweight, comfy and cool - the dressy shirt he won't mind wearing!</p>
                                    <ul>
                                        <li>Made of 100% cotton</li>                        
                                        <li>Specially treated for a soft feel</li>                      
                                        <li>Classically styled with a pointed collar and button front</li>                      
                                        <li>Chest pocket; curved shirttail hem</li>                     
                                        <li>Canvas taping at inner neck</li>                        
                                        <li>Imported</li>                       
                                    </ul>

                                    <div id="BVSecondaryCustomerRatings" style="display:none;margin-left: 15px" class="BVBrowserWebkit"> <div class="BVRRRootElement">
<div class="BVRRRatingSummary BVRRSecondaryRatingSummary">
<div class="BVRRRatingSummary BVRRPrimaryRatingSummary"><div class="BVRRRatingSummaryStyle2"><div class="BVRRRatingSummaryNoReviews"> <div id="BVRRRatingSummaryNoReviewsWriteImageLinkID" class="BVRRRatingSummaryLink BVRRRatingSummaryNoReviewsWriteImageLink">
<a name="BV_TrackingTag_Rating_Summary_2_WriteReview_I2613L0022" target="BVFrame" href="http://reviews.childrensplace.com/4154/I2613L0022/writereview.htm?format=embedded&amp;campaignid=BV_RATING_SUMMARY_ZERO_REVIEWS&amp;sessionparams=__BVSESSIONPARAMS__&amp;return=http%3A%2F%2Fwww.childrensplace.com%2Fwebapp%2Fwcs%2Fstores%2Fservlet%2Fproduct_10001_10001_-1_1005476_827676_26601%257C72469%257C813599_boy%257Coutfits%257Cplaid%2520patrol_boy&amp;innerreturn=http%3A%2F%2Freviews.childrensplace.com%2F4154%2FI2613L0022%2Freviews.htm%3Fformat%3Dembedded&amp;user=__USERID__&amp;authsourcetype=__AUTHTYPE__&amp;submissionparams=__BVSUBMISSIONPARAMETERS__&amp;submissionurl=http%3A%2F%2Fwww.childrensplace.com%2Fwebapp%2Fwcs%2Fstores%2Fservlet%2FTCPCheckUserAuthenticationCmd%3FlangId%3D-1%26catalogId%3D10001%26storeId%3D10001"> <img src="http://reviews.childrensplace.com/static/4154/translucent.gif" alt="Write a review">
</a> </div>
<div id="BVRRRatingSummaryLinkWriteFirstID" class="BVRRRatingSummaryLink BVRRRatingSummaryLinkWriteFirst">
<span class="BVRRRatingSummaryLinkWriteFirstPrefix">Be the first to review this item.</span>
<a name="BV_TrackingTag_Rating_Summary_2_SocialBookmarkKaboodle_I2613L0022" target="_blank" class="BVRRSocialBookmarkingSharingLink BVRRSocialBookmarkingSharingLinkKaboodle" onclick="this.href=bvReplaceTokensInSocialURL(this.href);window.open(this.href,'','left=0,top=0,width=795,height=700,toolbar=1,location=0,resizable=1,scrollbars=1'); return false;" onfocus="this.href=bvReplaceTokensInSocialURL(this.href);" rel="nofollow" href="http://reviews.childrensplace.com/4154/share.htm?site=Kaboodle&amp;url=http%3A%2F%2Fwww.childrensplace.com%2Fwebapp%2Fwcs%2Fstores%2Fservlet%2Fproduct_10001_10001_-1_1005476&amp;title=__TITLE__&amp;robot=__ROBOT__&amp;image=http%3A%2F%2Fcontent.childrensplace.com%2Fwww%2Fb%2FTCP%2Fimages%2Fstyles%2F188410_m.jpg" onmouseover="this.href=bvReplaceTokensInSocialURL(this.href);"><img width="16" height="16" class="BVRRSocialBookmarkLinkImage" src="http://reviews.childrensplace.com/static/4154/link-kaboodle.gif" alt="Kaboodle" title="Add To Kaboodle"></a>
</div></div></div></div> </div>
</div>
                                    <p class="TCP-Phrase">Big Fashion, Little Prices</p>
                                    <div id="product_social_icons" style="height: 20px;">




                                            <div class="social_icon current_social">
                                                <div class="twitter"><iframe scrolling="no" frameborder="0" allowtransparency="true" src="http://platform.twitter.com/widgets/tweet_button.1336551279.html#_=1336767195241&amp;count=horizontal&amp;id=twitter-widget-0&amp;lang=en&amp;original_referer=http://www.childrensplace.com/webapp/wcs/stores/servlet/product_10001_10001_-1_1005476&amp;size=m&amp;text=The Childrens Place - plaid shirt&amp;url=http://www.childrensplace.com/webapp/wcs/stores/servlet/product_10001_10001_-1_1005476" class="twitter-share-button twitter-count-horizontal" style="height: 20px; width: 90px;" title="Twitter Tweet Button"></iframe></div>
                                                <div class="pinterest" id="pin_it">
                                                    <iframe scrolling="no" frameborder="0" src="http://pinit-cdn.pinterest.com/pinit.html?url=http://www.childrensplace.com/webapp/wcs/stores/servlet/product_10001_10001_-1_1005476&amp;media=//content.childrensplace.com/www/b/TCP/images/cloudzoom/p/188410_p.jpg&amp;description=plaid shirt&amp;layout=horizontal" style="border: medium none; width: 90px; height: 20px;"></iframe>
                                                </div>
                                                <div class="fb-like-btn" id="fb-root">
                                                    <script src="//connect.facebook.net/en_US/all.js#xfbml=1"></script>
                                                    <fb:like layout="button_count" show_faces="false" width="90" action="like" font="arial" colorscheme="light" fb-xfbml-state="rendered" class="fb_edge_widget_with_comment fb_iframe_widget"><span style="height: 20px; width: 76px;"><iframe id="f111d3371c" name="f5f7b234c" scrolling="no" style="border: none; overflow: hidden; height: 20px; width: 76px;" title="Like this content on Facebook." class="fb_ltr" src="http://www.facebook.com/plugins/like.php?api_key=&amp;locale=en_US&amp;sdk=joey&amp;channel_url=http%3A%2F%2Fstatic.ak.facebook.com%2Fconnect%2Fxd_arbiter.php%3Fversion%3D23%23cb%3Df11898a314%26origin%3Dhttp%253A%252F%252Fwww.childrensplace.com%252Ff210aed7%26domain%3Dwww.childrensplace.com%26relation%3Dparent.parent&amp;href=http%3A%2F%2Fwww.childrensplace.com%2Fwebapp%2Fwcs%2Fstores%2Fservlet%2Fproduct_10001_10001_-1_1005476_827676_26601%257C72469%257C813599_boy%257Coutfits%257Cplaid%2520patrol_boy&amp;node_type=link&amp;width=90&amp;font=arial&amp;layout=button_count&amp;colorscheme=light&amp;action=like&amp;show_faces=false&amp;extended_social_context=false"></iframe></span></fb:like></div>
                                            </div>

                                    </div>
                                </dd>

I have searched a lot to figure this out. I tried DOM parsing, but the client requires regex parsing rather than…

This answer will not tell you that what you are doing is unethical:

$pattern = "/<dd.*?>.*?<p>(.*?)<\/p>.*?<ul>(.*?)<\/ul>/s";
if (preg_match($pattern, $html, $matches)) {
    echo "P-tag content: ".$matches[1];
    echo "<br>";
    echo "UL-tag content: ".$matches[2];
}

I tested it with the HTML you posted and it worked.
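If the individual list items are needed rather than the raw inner HTML of the ul, a second regex pass over the captured group is one option. The following is a minimal sketch, not part of the original answer: the `$html` value is a shortened, hypothetical stand-in for the markup in the question, and `~` is used as the pattern delimiter so the slashes in closing tags need no escaping.

```php
<?php
// Hypothetical, shortened stand-in for the HTML posted in the question.
$html = <<<'HTML'
<dd style="display: block;">
    <p>Lightweight, comfy and cool</p>
    <ul>
        <li>Made of 100% cotton</li>
        <li>Imported</li>
    </ul>
    <p class="TCP-Phrase">Big Fashion, Little Prices</p>
</dd>
HTML;

// First pass: capture the first attribute-less <p> and the first <ul> inside <dd>.
// The "s" modifier lets "." match newlines.
$pattern = '~<dd\b[^>]*>.*?<p>(.*?)</p>.*?<ul>(.*?)</ul>~s';

$paragraph = null;
$items = [];
if (preg_match($pattern, $html, $matches)) {
    $paragraph = trim($matches[1]);
    // Second pass: pull the body of each <li> out of the captured <ul> content.
    preg_match_all('~<li\b[^>]*>(.*?)</li>~s', $matches[2], $liMatches);
    $items = array_map('trim', $liMatches[1]);
}

echo $paragraph, PHP_EOL;
print_r($items);
```

Like the pattern above, this only matches a `<p>` with no attributes, so the `TCP-Phrase` paragraph is skipped, and it assumes the `<ul>` is not nested inside another list.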

Do not parse HTML with regular expressions; it is the wrong tool. Try SimpleXML instead, and if that is too much for you, try QueryPath: http://querypath.org/
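For comparison, here is a rough sketch of the parser-based route. It uses PHP's bundled DOM extension (`DOMDocument` and `DOMXPath`) rather than SimpleXML or QueryPath, which is a substitution on my part, and the `$html` value is again a shortened, hypothetical stand-in for the markup in the question.

```php
<?php
// Hypothetical, shortened stand-in for the HTML posted in the question.
$html = <<<'HTML'
<dd style="display: block;">
    <p>Lightweight, comfy and cool</p>
    <ul>
        <li>Made of 100% cotton</li>
        <li>Imported</li>
    </ul>
    <p class="TCP-Phrase">Big Fashion, Little Prices</p>
</dd>
HTML;

// Real-world markup is rarely valid, so collect parser warnings
// internally instead of letting them be printed.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Text of every <p> that is a direct child of a <dd>.
$paragraphs = [];
foreach ($xpath->query('//dd/p') as $p) {
    $paragraphs[] = trim($p->textContent);
}

// Text of every <li> in a <ul> that is a direct child of a <dd>.
$items = [];
foreach ($xpath->query('//dd/ul/li') as $li) {
    $items[] = trim($li->textContent);
}

print_r($paragraphs);
print_r($items);
```

Because the XPath expressions select direct children of the `<dd>`, paragraphs buried inside the rating and social widgets would be ignored, while both top-level paragraphs are returned regardless of their attributes.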
