I have a fairly large and very messy data file from which I want to filter out the useful data. Its structure looks like this:
!bla bla
more bla
some useless data
something interesting
something interesting
something interesting
some useless data
something interesting
something interesting
some useless data
bla bla
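My plan is to read the file with file_get_contents(), then use str_replace() to replace some of the data and use the replacements as markers. Next, I try to delete the useless data from the beginning of the file up to marker1, then from marker2 to marker3, and then from marker4 to the end of the file, so that only the useful data remains in the output (for now I am not sure whether I need the markers in the data). I tried using strstr(), but could not get it to work.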
!bla bla
more bla
some useless data
==marker1==
something interesting
something interesting
something interesting
==marker2==
some useless data
==marker3==
something interesting
something interesting
==marker4==
some useless data
bla bla
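For reference, the strstr() idea described above can be made to work without regular expressions. What follows is only a minimal sketch under stated assumptions: the helper name extractWithStrstr() is hypothetical (it does not come from the original post), each marker is assumed to occur once, and the sample string is a shortened stand-in for the real file contents.

```php
<?php
// Hypothetical helper (not from the original post): returns the text between
// $start and $end, or null if either marker is missing.
function extractWithStrstr(string $text, string $start, string $end): ?string
{
    // strstr() returns the text from the first occurrence of $start onward.
    $tail = strstr($text, $start);
    if ($tail === false) {
        return null;
    }

    // Drop the start marker itself.
    $tail = substr($tail, strlen($start));

    // With the third argument set to true, strstr() returns the part
    // that comes before the first occurrence of $end.
    $body = strstr($tail, $end, true);
    if ($body === false) {
        return null;
    }

    return trim($body);
}

// Shortened sample in the same shape as the marked-up data above.
$data = "some useless data\n==marker1==\nsomething interesting\n"
      . "something interesting\n==marker2==\nsome useless data";

$useful = extractWithStrstr($data, '==marker1==', '==marker2==');
```

The null return makes a missing marker explicit, so the caller can tell an empty block apart from a failed lookup.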
I will then use explode() to transfer the resulting useful data into my database.
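The explode() step might look roughly like the sketch below. It assumes one record per line, which the post does not confirm; the helper name blockToRows() is hypothetical, and the actual database insert is left out because the schema is not given.

```php
<?php
// Hypothetical helper (not from the original post): splits an extracted
// block into one trimmed, non-empty string per line.
function blockToRows(string $block): array
{
    // Normalize Windows and old Mac line endings so explode() sees only "\n".
    $normalized = str_replace(["\r\n", "\r"], "\n", trim($block));

    // Split into lines and strip surrounding whitespace from each one.
    $lines = array_map('trim', explode("\n", $normalized));

    // Drop empty lines; array_values() re-indexes the result from 0.
    return array_values(array_filter(
        $lines,
        static function (string $line): bool {
            return $line !== '';
        }
    ));
}

$rows = blockToRows("something interesting\r\n  something interesting \n\n");
// Each element of $rows could then be bound to a prepared statement.
```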
Edit: This is how I solved it.
preg_match('/(==marker1==)(.*?)(==marker2==)/s', $input, $marker1to2);
$marker1to2 = trim($marker1to2[2]);
$marker1to2 = preg_replace('/something /', '==marker1== something ', $marker1to2, 1);
echo $marker1to2;
You need a regular expression:
$data = "!bla bla
more bla
some useless data
==marker1==
something interesting
something interesting
something interesting
==marker2==
some useless data
==marker3==
something interesting
something interesting
==marker4==
some useless data
bla bla";
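// The s modifier lets . match newlines; the lazy .*? stops at the first closing marker.
preg_match("/(==marker1==)(.*?)(==marker2==)/s", $data, $marker1to2);
// Group 2 holds the text between the markers.
$marker1to2 = trim($marker1to2[2]);
preg_match("/(==marker3==)(.*?)(==marker4==)/s", $data, $marker3to4);
$marker3to4 = trim($marker3to4[2]);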
echo "Marker 1 to 2:\n$marker1to2\n\n";
echo "Marker 3 to 4:\n$marker3to4\n\n";
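Output:
Marker 1 to 2:
something interesting
something interesting
something interesting

Marker 3 to 4:
something interesting
something interesting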
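If a marker is missing, preg_match() returns 0 and reading index 2 of the match array raises a warning. One way to guard against that, and to avoid repeating the pattern for every marker pair, is a small wrapper. This is a sketch only: the function name extractBetween() is hypothetical, and the sample string is a shortened stand-in for $data.

```php
<?php
// Hypothetical helper (not from the original answer): returns the trimmed
// text between two markers, or null when the pair is not found.
function extractBetween(string $text, string $start, string $end): ?string
{
    // preg_quote() escapes regex metacharacters in the markers;
    // the second argument also escapes the "/" delimiter.
    $pattern = '/' . preg_quote($start, '/') . '(.*?)' . preg_quote($end, '/') . '/s';

    // preg_match() returns 1 on a match, 0 on no match, false on error.
    if (preg_match($pattern, $text, $matches) !== 1) {
        return null;
    }

    return trim($matches[1]);
}

// Shortened sample in the same shape as $data above.
$sample = "bla\n==marker1==\nsomething interesting\n==marker2==\nuseless\n"
        . "==marker3==\nsomething interesting\nsomething interesting\n==marker4==\nbla";

$first  = extractBetween($sample, '==marker1==', '==marker2==');
$second = extractBetween($sample, '==marker3==', '==marker4==');
```

Passing the markers through preg_quote() matters once they contain characters such as ".", "(" or "/", which would otherwise be read as part of the pattern.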