Regular Expression Help... apostrophes within words

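I am trying to write a regular expression that matches the first 8 words of a string (including any punctuation at the end of a word), but I run into problems when a word contains an apostrophe or single-quote character. My current regex is:

/(\b[\w,']+[.?'!\"]*\s){8}/

And my example string is: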

Went for Valentine's day, food was about a B, filet mignon was served chopped up

Currently, the match I get is:

s day, food was about a B, filet

But I would like it to be:

Went for Valentine's day, food was about a

I tried adding ' to my character class [\w,'], but it does not work properly. Any help would be appreciated.

Thanks!

While this can probably be done with a single regex, it can at least be done with preg_split:

$string = "Went for Valentine's day, food was about a B, filet mignon was served chopped up";
# Split on runs of whitespace, so apostrophes and punctuation stay attached to their words.
$words = preg_split("/\s+/", $string);
# If there are more than eight words, only take the first eight elements of $words.
if (count($words) > 8)
{
  $words = array_slice($words, 0, 8);
}
echo implode(" ", $words) . "\n";

This produces the following output:

Went for Valentine's day, food was about a
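
For comparison, the same result can be reached with a single pattern instead of splitting. The following is only a sketch: it makes the same assumption as the preg_split approach, namely that a "word" is any run of non-whitespace characters, and the function name firstWords is made up for illustration.

```php
<?php
// Hypothetical helper: return the first $n whitespace-separated words of $text.
// A "word" is any run of non-whitespace characters, so apostrophes and
// trailing punctuation stay attached to the word they belong to.
function firstWords(string $text, int $n = 8): string
{
    if ($n < 1) {
        return '';
    }
    // Up to $n - 1 "word + whitespace" groups, followed by one final word.
    $pattern = '/^\s*((?:\S+\s+){0,' . ($n - 1) . '}\S+)/';
    if (preg_match($pattern, $text, $m) !== 1) {
        return ''; // empty or whitespace-only input
    }
    return $m[1];
}

$string = "Went for Valentine's day, food was about a B, filet mignon was served chopped up";
echo firstWords($string), "\n";
```

Unlike the implode version, this keeps the original spacing between words, which may or may not be what you want.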

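This basically treats the apostrophe as a word character:

\b(\w|')+\b

If you want to include the apostrophe in the regular expression, the correct way to do it is:

[\w']+

You can then add word boundaries (\b) as needed.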
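
As a quick check of the character-class form, the sketch below runs it over a shortened version of the example sentence from the question. The variable names are only illustrative.

```php
<?php
// Apostrophe inside the character class, with a word boundary on each side.
$text = "Went for Valentine's day, food was about a B";
preg_match_all("/\b[\w']+\b/", $text, $matches);

// $matches[0] holds the full matches, one entry per word.
// "Valentine's" comes back as a single word; the comma after "day" is not included.
print_r($matches[0]);
```

Note that this pattern drops trailing punctuation, so it finds the words themselves rather than the "word plus punctuation" tokens the question asks for.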

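$text = "Knock, knock. Who's there? r2d2!";
// An apostrophe only counts as part of a word when it sits between two word characters.
$pattern = "/(?:\w'\w|\w)+/";
// preg_match_all returns the number of matches; the matched words end up in $matches[0].
$words = preg_match_all($pattern, $text, $matches);
var_dump($matches);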
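
To see how this pattern differs from the plain character class [\w']+, here is a small comparison. The sample text is invented for illustration; it has quotes around one word and an apostrophe inside another.

```php
<?php
// Illustrative input: quotes around a word, and an apostrophe inside a word.
$text = "'quoted' isn't";

// Apostrophe allowed anywhere in the token.
preg_match_all("/[\w']+/", $text, $loose);
// Apostrophe allowed only between two word characters.
preg_match_all("/(?:\w'\w|\w)+/", $text, $strict);

print_r($loose[0]);  // 'quoted' (quotes included) and isn't
print_r($strict[0]); // quoted (quotes stripped) and isn't
```

The stricter pattern is a reasonable choice when surrounding quotation marks should not be counted as part of the word.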