Regex for numbers on multiple lines in PHP

I have a file that looks like this (yes, the line breaks are correct):

39                                              9
30 30 30 31 34 30 30 32 33 32 36 30 31 38 0D 0A 00014002326018..
39 30 30 30 31 34 30 30 32 33 32 36 30 35 34 0D 900014002326054.
0A                                              .
39 30 30 30 31 34 30 30 32 33 32 36 30 39 31 0D 900014002326091.
0A                                              .
39 30 30 30 31 34 30 30 32 33 32 36 31 36 33 0D 900014002326163.
0A                                              .
39                                              9
30 30 30 31 34 30 30 32 33                      000140023
32 36 32 30 30 0D 0A                            26200..
39                                              9
30 30 30 31 34 30 30 32 33 32 36 32 30 30 0D 0A 00014002326200..
39 30 30 30 31 34 30 30 32 33 32 36 31 32 32 0D 900014002326122.
0A                                              .
39                                              9
30 30 30 31 34 30 30 32 33                      000140023
32 36 31 35 34 0D 0A                            26154..
39 30 30 30 31 34 30 30 32 33                   9000140023
32 36 31 33 31 0D 0A                            26131..
39                                              9
30 30 30 31 34 30 30 32 33                      000140023
32 36 31 30 34 0D 0A                            26104..
39 30 30 30 31 34 30 30 32 33 32 36 30 39 30 0D 900014002326090.
0A                                              .
39 30 30 30 31 34 30 30 32 33 32 36 31 39 37 0D 900014002326197.
0A                                              .
39                                              9
30 30 30 31 34 30 30 32 33 32 36 32 30 38 0D 0A 00014002326208..
39 30 30 30 31 34 30 30 32 33                   9000140023
32 36 31 31 35 0D 0A                            26115..
39                                              9
30 30 30 31 34 30 30 32 33                      000140023
32 36 31 36 34 0D 0A                            26164..
39                                              9
30 30 30 31 34 30 30 32 33                      000140023
32 36 30 31 36 0D 0A 39 30 30 30 31 34 30 30 32 26016..900014002
33                                              3
32 36 32 34 36 0D 0A                            26246..
39                                              9
30 30 30 31 34 30 30 32 33                      000140023
32 36 32 34 36 0D 0A                            26246..
39                                              9
30 30 30 31 34 30 30 32 33                      000140023
32 36 30 37 39 0D 0A                            26079..
39                                              9
30 30 30 31 34 30 30 32 33                      000140023
32 36 31 32 30 0D 0A                            26120..
39                                              9
30 30 30 31 34 30 30 32 33 32 36 32 32 38 0D 0A 00014002326228..
39 30 30 30 31 34 30 30 32 33                   9000140023
32 36 31 38 36 0D 0A                            26186..

I have this code to grab the EID tags (the numbers starting with 9000), but I can't figure out how to make it work across multiple lines.

$data = file_get_contents('tags.txt');
$pattern = "/(\d{15})/i";
preg_match_all($pattern, $data, $tags);
$count = 0;
foreach ( $tags[0] as $tag ){
    echo $tag . '<br />';
    $count++;
}
echo "<br />" . $count . " total head scanned";

For example, the first and second lines together should return 900014002326018, instead of being skipped.

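I'm not good at regular expressions, so it would be great if you could explain your answer; that way I can learn and stop asking people to write simple regexes for me.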

Edit: the whole number is 15 digits long and starts with 9000.

You can do it like this:

$result = preg_replace('~\R?(?:[0-9A-F]{2}\h+)+~', '', $data);
$result = explode('..', rtrim($result, '.'));

Pattern details:

\R?            # optional newline sequence
(?:            # open a non-capturing group
  [0-9A-F]{2}  # two hexadecimal characters
  \h+          # horizontal whitespace characters (spaces or tabs)
)+             # repeat the non-capturing group one or more times

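After the replacement, the only thing left to remove is the pairs of dots (each CRLF shows up as two dots). Once the trailing dots are trimmed, you can use the remaining pairs as the delimiter to explode the string into an array.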
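As a quick check of the replacement described above, here is a minimal sketch that runs it on a four-line excerpt of the dump. The `dumpLine()` helper is hypothetical (it is not part of the answer); it only rebuilds each line as a hex column padded to 48 characters followed by the ASCII column, so the spacing does not have to be typed by hand.

```php
<?php
// Hypothetical helper: one dump line = hex column padded to 48 chars + ASCII column.
function dumpLine(string $hex, string $ascii): string
{
    return str_pad($hex, 48) . $ascii;
}

// First four lines of the dump from the question.
$data = implode("\n", [
    dumpLine('39', '9'),
    dumpLine('30 30 30 31 34 30 30 32 33 32 36 30 31 38 0D 0A', '00014002326018..'),
    dumpLine('39 30 30 30 31 34 30 30 32 33 32 36 30 35 34 0D', '900014002326054.'),
    dumpLine('0A', '.'),
]);

// Remove every run of "hex pair + horizontal whitespace", with its leading newline.
$stripped = preg_replace('~\R?(?:[0-9A-F]{2}\h+)+~', '', $data);

// Only the ASCII column is left: tags separated by ".." (the CR LF bytes).
$result = explode('..', rtrim($stripped, '.'));

print_r($result); // 900014002326018 and 900014002326054
```

The digits in the ASCII column are never followed by a space or tab, which is why the hex-pair pattern leaves them alone.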

Another way

Since you know that the part with the integers (and dots) is always preceded by 48 characters, you can also use this pattern:

$result = preg_replace('~(?:^|\R).{48}~', '', $data);
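The fixed-width pattern above can be combined with a sanity check on the result. The sketch below is an illustration rather than part of the answer: the `dumpLine()` helper and the `preg_grep()` validation step are assumptions added here, the latter based on the question's statement that every tag is 15 digits long and starts with 9000.

```php
<?php
// Hypothetical helper: one dump line = hex column padded to 48 chars + ASCII column.
function dumpLine(string $hex, string $ascii): string
{
    return str_pad($hex, 48) . $ascii;
}

$data = implode("\n", [
    dumpLine('39', '9'),
    dumpLine('30 30 30 31 34 30 30 32 33 32 36 30 31 38 0D 0A', '00014002326018..'),
    dumpLine('39 30 30 30 31 34 30 30 32 33 32 36 30 35 34 0D', '900014002326054.'),
    dumpLine('0A', '.'),
]);

// Drop the first 48 characters of every line, together with the newline before them.
$stripped = preg_replace('~(?:^|\R).{48}~', '', $data);
$tags = explode('..', rtrim($stripped, '.'));

// Assumed validation step: keep only 15-digit values that start with 9000.
$valid = array_values(preg_grep('/^9000\d{11}$/', $tags));

print_r($valid);
```

If `$valid` ends up shorter than `$tags`, some line in the file probably did not follow the 48-character layout.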

Another way without regex

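The idea is to read the file line by line. Since the part before the content always has the same length (16*3 characters -> 48 characters), extract the substring that holds the integers and concatenate it into the temporary variable $data.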

// Note: auto_detect_line_endings is deprecated as of PHP 8.1.
ini_set("auto_detect_line_endings", true);
$data = '';
$handle = @fopen("tags.txt", "r");
if ($handle) {
    while (($buffer = fgets($handle, 128)) !== false) {
        $data .= substr($buffer, 48, -1);
    }
    if (!feof($handle)) {
        echo "Error: fgets() has failed\n";
    }
    fclose($handle);
} else {
    echo "Error opening the file\n";
}
$result = explode ('..', rtrim($data, '.'));

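Note: if the file is in Windows format (lines ending with \r\n), you must change the third argument of substr() to -2. If you are interested in how to detect the newline type, you can take a look at this post.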
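One way to avoid choosing between -1 and -2 is to trim the line ending before cutting the substring. This is a sketch of that variation, not the answer's own code: `readAsciiColumn()` and `dumpLine()` are hypothetical names, and the temporary file with Windows line endings exists only to make the example self-contained.

```php
<?php
// Hypothetical helper: one dump line = hex column padded to 48 chars + ASCII column.
function dumpLine(string $hex, string $ascii): string
{
    return str_pad($hex, 48) . $ascii;
}

// Read the ASCII column of every line, whether lines end in "\n" or "\r\n".
function readAsciiColumn(string $path): string
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    $data = '';
    while (($buffer = fgets($handle)) !== false) {
        // Strip the line ending first, then skip the 48-character hex column.
        $data .= substr(rtrim($buffer, "\r\n"), 48);
    }
    fclose($handle);
    return $data;
}

// Temporary sample file with Windows line endings (an assumption for this demo).
$path = tempnam(sys_get_temp_dir(), 'eid');
file_put_contents($path, implode("\r\n", [
    dumpLine('39', '9'),
    dumpLine('30 30 30 31 34 30 30 32 33 32 36 30 31 38 0D 0A', '00014002326018..'),
    dumpLine('39 30 30 30 31 34 30 30 32 33 32 36 30 35 34 0D', '900014002326054.'),
    dumpLine('0A', '.'),
]) . "\r\n");

$tags = explode('..', rtrim(readAsciiColumn($path), '.'));
unlink($path);

print_r($tags);
```

Because `rtrim()` is given only "\r\n" as its character list, the spaces and dots that belong to the dump itself are left untouched.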

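I don't think this is even possible with a single regular expression, but if you tackle the problem step by step, your code will be more readable and maintainable.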

This works, and it shouldn't be hard to figure out how:

$eid_tag_src = <<<END_EID_TAGS
39                                              9
30 30 30 31 34 30 30 32 33 32 36 30 31 38 0D 0A 00014002326018..
39 30 30 30 31 34 30 30 32 33 32 36 30 35 34 0D 900014002326054.
  :
 etc.
  :
39 30 30 30 31 34 30 30 32 33                   9000140023
32 36 31 38 36 0D 0A                            26186..
END_EID_TAGS;
/* Remove hex data from first 48 characters of each line */
$eid_tag_src = preg_replace('/^.{48}/m','',$eid_tag_src);
/* Remove all white space */
$eid_tag_src = preg_replace('/\s+/','',$eid_tag_src);
/* Replace dots (CRLF) with spaces */
$eid_tag_src = str_replace('..',' ',$eid_tag_src);
/* Convert to array of EID tags */
$eid_tags = explode(' ',trim($eid_tag_src));
print_r($eid_tags);

Here is the output:

Array
(
    [0] => 900014002326018
    [1] => 900014002326054
    [2] => 900014002326091
    [3] => 900014002326163
    [4] => 900014002326200
    [5] => 900014002326200
    [6] => 900014002326122
    [7] => 900014002326154
    [8] => 900014002326131
    [9] => 900014002326104
    [10] => 900014002326090
    [11] => 900014002326197
    [12] => 900014002326208
    [13] => 900014002326115
    [14] => 900014002326164
    [15] => 900014002326016
    [16] => 900014002326246
    [17] => 900014002326246
    [18] => 900014002326079
    [19] => 900014002326120
    [20] => 900014002326228
    [21] => 900014002326186
)

Here is an approach that grabs the digits directly (no replacement needed):

RegEx: /(?:^.{48}|\.)([0-9]+\.?)/m - Demo

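Meaning (in plain English): grab a run of digits followed by an optional dot, provided it is preceded either by 48 characters counted from the start of the line, or by a dot (the special case).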

Your code could look like this:

$pattern = '/(?:^.{48}|\.)([0-9]+\.?)/m';
preg_match_all($pattern, $data, $tags);
//join all the bits belonging to the number
$data=implode("", $tags[1]); 
//count the dots to have a correct count of the numbers grabbed
//since each number was grabbed with an ending dot initially
$count=substr_count($data, ".");
//replace the dots with a html <br> tag (avoiding a split and a foreach loop)
$tags=str_replace('.', "<br>", $data); 
print $tags . "<br>" . $count . " total scanned";

See the code running live at http://3v4l.org/Z4EhI
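To see what the capture group actually collects, including the "special case" where a new tag starts right after the dots on the same line, here is a small sketch on five lines taken from the dump. The `dumpLine()` helper is hypothetical and only rebuilds the 48-character hex column; splitting into an array at the end is an assumption added for illustration, since the answer's code prints the tags instead.

```php
<?php
// Hypothetical helper: one dump line = hex column padded to 48 chars + ASCII column.
function dumpLine(string $hex, string $ascii): string
{
    return str_pad($hex, 48) . $ascii;
}

// Excerpt where one tag ends and the next begins on the same line.
$data = implode("\n", [
    dumpLine('39', '9'),
    dumpLine('30 30 30 31 34 30 30 32 33', '000140023'),
    dumpLine('32 36 30 31 36 0D 0A 39 30 30 30 31 34 30 30 32', '26016..900014002'),
    dumpLine('33', '3'),
    dumpLine('32 36 32 34 36 0D 0A', '26246..'),
]);

preg_match_all('/(?:^.{48}|\.)([0-9]+\.?)/m', $data, $matches);

// $matches[1] holds the fragments, e.g. "9", "000140023", "26016.", "900014002", ...
$joined = implode('', $matches[1]);

// Each complete tag was captured with exactly one trailing dot.
$count = substr_count($joined, '.');
$tags  = explode('.', rtrim($joined, '.'));

print_r($tags);
echo $count . " total scanned\n";
```

The fragment "900014002" is the one matched through the `\.` alternative: it follows the second dot of "26016.." instead of sitting at column 48.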