Symfony 2 DOM Crawler: getting text that is not inside a tag

I am scraping a page that contains the following markup:

<br/>
<td class="PropertyBody">
<b>Category:</b>
 Miscellanea: Soft Skill
<br>
<b>Owner:</b>
<a href="mailto:">blabla</a>
<br>
<b>Location:</b>
 bla bla
<br>
<b>Duration:</b>
 6:00
<br>
<b>Max attendees:</b>
 15
<br>
<b>Start at:</b>
 7/19/2012 10:00:00 AM
<br>
<b>Your status:</b>
<br>
</td>

How can I extract '7/19/2012 10:00:00 AM' from this markup with the Symfony Crawler? The following call returns only 'Start at:':

$crawler->filter('.PropertyBody > b')->eq(5)->text();

Thanks, I solved it this way:

$bigPiece = $crawler->filter('.PropertyBody')->text();
// getting CATEGORY
$pos = strpos($bigPiece, ':') + 1;
$pos2 = strpos($bigPiece, 'Owner:');
$category = trim(substr($bigPiece, $pos, $pos2 - $pos));
$this->category = $category;
// getting OWNER
$pos = strpos($bigPiece, 'Owner:') + 6;
$pos2 = strpos($bigPiece, 'Location:');
$owner = trim(substr($bigPiece, $pos, $pos2 - $pos));
$training->setOwner($owner);
// getting LOCATION
$pos = strpos($bigPiece, 'Location:') + 9;
$pos2 = strpos($bigPiece, 'Duration:');
$location = trim(substr($bigPiece, $pos, $pos2 - $pos));
$training->setLocation($location);
// getting DURATION
$pos = strpos($bigPiece, 'Duration:') + 9;
$pos2 = strpos($bigPiece, 'Max attendees:');
$duration = trim(substr($bigPiece, $pos, $pos2 - $pos));
$training->setDuration($duration);
// getting MAXATTENDEES
$pos = strpos($bigPiece, 'Max attendees:') + 14;
$pos2 = strpos($bigPiece, 'Start at:');
$maxattendees = trim(substr($bigPiece, $pos, $pos2 - $pos));
$training->setMaxattendies($maxattendees);
// getting START AT
$pos = strpos($bigPiece, 'Start at:') + 9;
$pos2 = strpos($bigPiece, 'Your status:');
$start = trim(substr($bigPiece, $pos, $pos2 - $pos));
$training->setStarts($start);
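
If the markup cannot be changed, another option is to read the text that follows each <b> label directly, instead of slicing the flattened string by offsets. The following is only a minimal sketch under a few assumptions: it uses PHP's built-in DOM extension (DOMDocument and DOMXPath) rather than the Crawler itself, extractFields() is a made-up helper name, and the sample HTML is wrapped in a table so that the parser keeps the <td> in place.

```php
<?php
// Sketch only: extractFields() is a hypothetical helper, not part of Symfony.
// It maps each <b> label inside td.PropertyBody to the text that follows it.
function extractFields(string $html): array
{
    $doc = new DOMDocument();
    // Collect parser warnings about loose HTML instead of printing them.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $fields = [];

    foreach ($xpath->query('//td[@class="PropertyBody"]/b') as $label) {
        $value = '';
        // Gather sibling text (including text inside <a>) up to the next <br>.
        for ($node = $label->nextSibling; $node !== null; $node = $node->nextSibling) {
            if ($node->nodeName === 'br') {
                break;
            }
            $value .= $node->textContent;
        }
        // Use the label without its trailing colon as the array key.
        $fields[rtrim(trim($label->textContent), ':')] = trim($value);
    }

    return $fields;
}

// Usage with a shortened version of the markup from the question.
$html = '<table><tr><td class="PropertyBody">'
    . '<b>Owner:</b> <a href="mailto:">blabla</a><br>'
    . '<b>Start at:</b> 7/19/2012 10:00:00 AM<br>'
    . '<b>Your status:</b><br>'
    . '</td></tr></table>';

$fields = extractFields($html);
echo $fields['Start at'], PHP_EOL;
```

Because each value is read up to the next <br>, this does not depend on the order of the labels or on hard-coded label lengths such as +9 or +14.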

If you need to test this specific case but cannot add markup (the page is not under your control), you should probably consider using PHPUnit's assertContains():

// Take the text of the whole cell; the date is not inside any <b> element.
$text = $crawler->filter('.PropertyBody')->text();
$this->assertContains('7/19/2012 10:00:00 AM', $text);

Add a span tag, something like this:

<b>Start at:</b>
<span class="wantthis">7/19/2012 10:00:00 AM</span>

Then select it with:

$crawler->filter('.wantthis')->text();