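I'm trying to write a regular expression that matches the first 8 words of a string (including any trailing punctuation), but I run into trouble when a word contains an apostrophe or single-quote character. My current regex is:
/(\b[\w,']+[.?'!\"]*\s){8}/
and my example string is: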
Went for Valentine's day, food was about a B, filet mignon was served chopped up
Currently, the match I get is:
s day, food was about a B, filet
but I would like it to be:
Went for Valentine's day, food was about a
I tried adding ' to my character class, [\w,'], but it doesn't work properly. Any help would be appreciated.
Thanks!
Although this can be done with a single regex, it is at least as easy to do with preg_split:
$string = "Went for Valentine's day, food was about a B, filet mignon was served chopped up";
// Split on runs of whitespace so apostrophes and punctuation stay attached to their words.
$words = preg_split("/\s+/", $string);
// If there are more than eight words, only take the first eight elements of $words.
if (count($words) > 8)
{
    $words = array_slice($words, 0, 8);
}
echo implode(" ", $words) . "\n";
This produces the following output:
Went for Valentine's day, food was about a
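As a variation on the snippet above, here is a minimal sketch that uses the optional limit argument of preg_split so the string is not split any further than needed. The variable names $parts and $firstEight are illustrative only, and the trim() call assumes leading or trailing whitespace should be ignored.

```php
<?php
$string = "Went for Valentine's day, food was about a B, filet mignon was served chopped up";

// A limit of 9 yields at most eight words plus one final element
// that holds the unsplit remainder of the string.
$parts = preg_split('/\s+/', trim($string), 9);

// Keep only the first eight elements; shorter inputs pass through unchanged.
$firstEight = implode(' ', array_slice($parts, 0, 8));

echo $firstEight, "\n";
```

With the example string this prints the same eight words as the version above.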
This basically treats the apostrophe as a word character:
\b(\w|')+\b
If you want to include the apostrophe in the regular expression, the proper way to do it is:
[\w']+
You can then add word boundaries (\b) as needed.
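To connect this back to the original question, here is a rough sketch that puts the apostrophe into the character class and then takes up to eight words from the start of the string. The helper name first_words is hypothetical, and the pattern assumes that words are separated by whitespace and that the only trailing punctuation to keep is . ? ! and the double quote.

```php
<?php
// Hypothetical helper: returns the first $n whitespace-separated words of $text,
// keeping apostrophes, commas, and trailing punctuation attached to each word.
function first_words(string $text, int $n = 8): string
{
    // As seen by the regex engine:
    //   [\w',]+    word characters, apostrophes, and commas
    //   [.?!"]*    optional trailing punctuation
    //   (?:\s+|$)  whitespace, or the end of the string, after each word
    $pattern = '/^(?:[\w\',]+[.?!"]*(?:\s+|$)){1,' . $n . '}/';

    // rtrim() drops the whitespace matched after the last word.
    return preg_match($pattern, $text, $m) ? rtrim($m[0]) : '';
}

$string = "Went for Valentine's day, food was about a B, filet mignon was served chopped up";
echo first_words($string), "\n";
```

The quantifier {1,8} rather than {8} is a deliberate choice here so that strings with fewer than eight words still match. The pattern is written in a single-quoted PHP string so that only the apostrophe needs escaping.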
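$text = "Knock, knock. Who's there? r2d2!";
// An apostrophe counts as part of a word only when it sits between two word characters.
$pattern = "/(?:\w'\w|\w)+/";
// preg_match_all() returns the number of matches; the matched words are stored in $matches.
$count = preg_match_all($pattern, $text, $matches);
var_dump($matches);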