This code works on Linux, but the pattern does not match on Windows:
if ( preg_match ( "~<meta name='date' content='(.*)'>\n<meta name='time' content='(.*)'>\n<meta name='venue' content='(.*)'>\n~", file_get_contents($filename), $matches) )
...
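I think the line-ending encoding is the problem. How should I change the pattern so that it is independent of the line-ending encoding?
Windows line endings are:
"\r\n"
The simplest solution is: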
if (preg_match("~<meta name='date' content='(.*)'>\n<meta name='time' content='(.*)'>\n<meta name='venue' content='(.*)'>\n~", file_get_contents($filename), $matches)
||
preg_match("~<meta name='date' content='(.*)'>\r\n<meta name='time' content='(.*)'>\r\n<meta name='venue' content='(.*)'>\r\n~", file_get_contents($filename), $matches))
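A more correct solution is probably to make the carriage return optional, so the same pattern matches both "\n" and "\r\n":
if (preg_match("~<meta name='date' content='(.*)'>[\r]?\n<meta name='time' content='(.*)'>[\r]?\n<meta name='venue' content='(.*)'>[\r]?\n~", file_get_contents($filename), $matches))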
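As a variation on the pattern above, PCRE also offers the \R escape, which matches any newline sequence ("\r\n", "\n", or "\r", among others). The following is only a minimal sketch: the sample input and its values are made up for illustration, and swapping (.*) for ([^']*) is my own assumption to keep each capture inside its attribute, not something from the question.

```php
<?php
// Hypothetical sample input with mixed line endings (not from the original post).
$html = "<meta name='date' content='2020-01-01'>\r\n"
      . "<meta name='time' content='19:30'>\n"
      . "<meta name='venue' content='Town Hall'>\r\n";

// \R matches any newline sequence, so the pattern no longer depends on the platform.
// [^']* (an assumption, replacing .*) stops each capture at the closing quote.
$pattern = "~<meta name='date' content='([^']*)'>\\R"
         . "<meta name='time' content='([^']*)'>\\R"
         . "<meta name='venue' content='([^']*)'>\\R~";

if (preg_match($pattern, $html, $matches)) {
    // $matches[1..3] hold the date, time, and venue values.
    echo $matches[1], ' ', $matches[2], ' ', $matches[3], "\n";
} else {
    echo "Fail\n";
}
```

Whether \R or [\r]?\n reads better is a matter of taste; both accept Unix and Windows line endings.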
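That said, you should probably use a different approach for handling HTML and XML. There are parsers built specifically for this.
For example, http://docs.php.net/manual/en/domdocument.loadhtml.php or http://php.net/manual/en/book.xml.php
By the way, I haven't actually tested the regex patterns above, but IIRC they work. Regex isn't something I use often.
Edit: it looks like they work: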
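To illustrate the parser route, here is a rough sketch using DOMDocument::loadHTML. It assumes the dom extension is available and that the three meta tags sit in an ordinary HTML document; the sample markup and values are hypothetical. Because the parser deals with whitespace itself, line endings stop mattering.

```php
<?php
// Hypothetical sample document with mixed line endings (not from the original post).
$html = "<html><head>\r\n"
      . "<meta name='date' content='2020-01-01'>\r\n"
      . "<meta name='time' content='19:30'>\n"
      . "<meta name='venue' content='Town Hall'>\r\n"
      . "</head><body></body></html>";

// Collect libxml warnings internally instead of printing them for imperfect HTML.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Pick out the meta tags of interest by their name attribute.
$meta = [];
foreach ($doc->getElementsByTagName('meta') as $tag) {
    $name = $tag->getAttribute('name');
    if (in_array($name, ['date', 'time', 'venue'], true)) {
        $meta[$name] = $tag->getAttribute('content');
    }
}

print_r($meta);
```

With a real file, $html would come from file_get_contents($filename) as in the question.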
$file = "iorahgjajgasjgasjgasjgjaagaspokadsfgals<meta name='date' content='(.*)'>\n<meta name='time' content='(.*)'>\n<meta name='venue' content='(.*)'>\niorahgjajgasjgasjgasjgjaagaspokadsfgals";
if (preg_match("~<meta name='date' content='(.*)'>\n<meta name='time' content='(.*)'>\n<meta name='venue' content='(.*)'>\n~", $file, $matches)
|| preg_match("~<meta name='date' content='(.*)'>\r\n<meta name='time' content='(.*)'>\r\n<meta name='venue' content='(.*)'>\r\n~", $file, $matches)) {
echo "Success";
}
else {
echo "Fail";
}
$file = "iorahgjajgasjgasjgasjgjaagaspokadsfgals<meta name='date' content='(.*)'>\r\n<meta name='time' content='(.*)'>\n<meta name='venue' content='(.*)'>\r\niorahgjajgasjgasjgasjgjaagaspokadsfgals";
if (preg_match("~<meta name='date' content='(.*)'>[\r]?\n<meta name='time' content='(.*)'>[\r]?\n<meta name='venue' content='(.*)'>[\r]?\n~", $file, $matches)) {
echo "Success";
}
else {
echo "Fail";
}