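I fetch content from a URL and store it in a database. Before storing it, I convert everything to plain text, using the code below.
I want to add an extra blank line after every paragraph.
I tried, but the result is not what I expected.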
$string = "<p>As a result, his move to Microsoft has raised many questions about the Redmond-based company's plans in the PC gaming space.</p><p>With the Xbox One launch looming, Microsoft has greatly de-emphasized PC gaming of late. </p><p>Holtman's hiring could signal a renewed emphasis on the computer, though.</p>
\"It looks to me like someone from Valve, who has no peer in the gaming space when it comes to really strong B-to-C (business-to-consumer) relationships, may be a sign that the space is rising in importance,\" said John Taylor, managing director of Arcadia Investment Corp.";
$search = array('@<script[^>]*?>.*?</script>@si', // Strip out javascript
'@<[\/\!]*?[^<>]*?>@si', // Strip out HTML tags
'@<style[^>]*?>.*?</style>@siU', // Strip style tags properly
'@<![\s\S]*?--[ \t\n\r]*>@' // Strip multi-line comments including CDATA
);
// strip scripts, tags, styles and comments
$string = preg_replace($search, '', $string);
// remove excess whitespace:
// look for one or more spaces and replace them with a single space
$string = preg_replace('/ +/', ' ', $string);
// check for instances of two line breaks in a row
// and then change them to a total of two line breaks
$string = preg_replace('/(?:(?:\r\n|\r|\n)\s*){2}/s', "\r\n\r\n", $string);
file_put_contents('testing.txt', $string);
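You didn't show the output you want, but I think you are after something like this:
<p>Text</p>\r\n
<p>Another text</p>\r\n
Don't use heavy regular expressions for this. Just explode on </p> and add the extra line break: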
$array = explode('</p>', $string);
$new_string = '';
$temp = count($array);
foreach ($array as $key => $paragraph)
{
    // every piece except the last one was followed by a </p> in the input
    if ($key !== $temp - 1)
        $new_string .= $paragraph . "</p>\r\n";
    else
        $new_string .= $paragraph;
}
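The $new_string variable should contain what you want; let me know if I got it right. It adds "\r\n" after each </p>.
The regular expression you use to add the extra line breaks has an error. The correct version is: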
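The explode approach above can be wrapped in a small function and combined with tag stripping. This is only a sketch under assumptions: the function name add_breaks_after_paragraphs and its $eol parameter are hypothetical and not part of the original post, and it assumes the line breaks are inserted before the tags are stripped, because once the </p> tags are gone there is nothing left to split on.

```php
<?php
// Hypothetical helper (not from the original post): append $eol after every
// closing </p> tag. Splitting on '</p>' means the last piece is whatever
// follows the final </p>, so it is copied through unchanged.
function add_breaks_after_paragraphs(string $html, string $eol = "\r\n"): string
{
    $parts = explode('</p>', $html);
    $last  = count($parts) - 1;
    $out   = '';
    foreach ($parts as $i => $part) {
        $out .= ($i !== $last) ? $part . '</p>' . $eol : $part;
    }
    return $out;
}

// Insert the breaks first, then strip the tags with the built-in strip_tags().
$html  = '<p>One.</p><p>Two.</p>';
$plain = strip_tags(add_breaks_after_paragraphs($html, "\r\n\r\n"));
echo $plain;
```

Passing "\r\n\r\n" as $eol leaves one empty line between paragraphs in the plain text; the default "\r\n" only starts a new line.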
$string = preg_replace('/(?:(?:\r\n|\r|\n)\s*){1,}/s', "\r\n\r\n", $string);
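The difference is this: {2} (as in your code) means the replacement only happens where there are already two line breaks, because the sub-expression (?:\r\n|\r|\n)\s* has to match twice.
Changing {2} to {1,} makes the pattern match a run of one or more line breaks, so the text ends up with two line breaks there regardless of how many it had before.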
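A quick way to see the difference between the two quantifiers is to run both patterns on short test strings. The variable names and sample strings below are made up for illustration; only the two patterns come from the thread.

```php
<?php
// Sample inputs (hypothetical): one line break vs. two line breaks.
$single = "first\nsecond";
$double = "first\n\nsecond";

$two       = '/(?:(?:\r\n|\r|\n)\s*){2}/s';   // pattern from the question
$oneOrMore = '/(?:(?:\r\n|\r|\n)\s*){1,}/s';  // pattern from this answer

// {2} needs two line breaks, so a single one is left untouched.
$a = preg_replace($two, "\r\n\r\n", $single);
// {1,} also matches a single line break and expands it to two.
$b = preg_replace($oneOrMore, "\r\n\r\n", $single);
// With two line breaks already present, both patterns normalise them.
$c = preg_replace($two, "\r\n\r\n", $double);
$d = preg_replace($oneOrMore, "\r\n\r\n", $double);

var_dump($a === $single);                // bool(true)
var_dump($b === "first\r\n\r\nsecond");  // bool(true)
var_dump($c === $d);                     // bool(true)
```

Note that neither pattern matches anything if the text contains no line breaks at all, which is the case for paragraphs written back to back as </p><p>.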