Excluding a string and limiting the match length in a regex

Here is my regex. I want to:

  1. Match only patterns that are 8-14 characters long.
  2. Exclude 123456789 from the regex matches.

Regex:

^(?=.{8,14})b$\(?(?:(?:0(?:0|11)\)?[\s-]?\(?|\+)44\)?[\s-]?\(?(?:0\)?[\s-]?\(?)?|0)(?:\d{2}\)?[\s-]?\d{4}[\s-]?\d{4}|\d{3}\)?[\s-]?\d{3}[\s-]?\d{3,4}|\d{4}\)?[\s-]?(?:\d{5}|\d{3}[\s-]?\d{3})|\d{5}\)?[\s-]?\d{4,5}|8(?:00[\s-]?11[\s-]?11|45[\s-]?46[\s-]?4\d))(?:(?:[\s-]?(?:x|ext\.?\s?|\#)\d+)?)$^|^2(?:0[01378]|3[0189]|4[017]|8[0-46-9]|9[012])\d{7}|1(?:(?:1(?:3[0-48]|[46][0-4]|5[012789]|7[0-49]|8[01349])|21[0-7]|31[0-8]|[459]1\d|61[0-46-9]))\d{6}|1(?:2(?:0[024-9]|2[3-9]|3[3-79]|4[1-689]|[58][02-9]|6[0-4789]|7[013-9]|9\d)|3(?:0\d|[25][02-9]|3[02-579]|[468][0-46-9]|7[1235679]|9[24578])|4(?:0[03-9]|2[02-5789]|[37]\d|4[02-69]|5[0-8]|[69][0-79]|8[0-5789])|5(?:0[1235-9]|2[024-9]|3[0145689]|4[02-9]|5[03-9]|6\d|7[0-35-9]|8[0-468]|9[0-5789])|6(?:0[034689]|2[0-689]|[38][013-9]|4[1-467]|5[0-69]|6[13-9]|7[0-8]|9[0124578])|7(?:0[0246-9]|2\d|3[023678]|4[03-9]|5[0-46-9]|6[013-9]|7[0-35-9]|8[024-9]|9[02-9])|8(?:0[35-9]|2[1-5789]|3[02-578]|4[0-578]|5[124-9]|6[2-69]|7\d|8[02-9]|9[02569])|9(?:0[02-589]|2[02-689]|3[1-5789]|4[2-9]|5[0-579]|6[234789]|7[0124578]|8\d|9[2-57]))\d{6}|1(?:2(?:0(?:46[1-4]|87[2-9])|545[1-79]|76(?:2\d|3[1-8]|6[1-6])|9(?:7(?:2[0-4]|3[2-5])|8(?:2[2-8]|7[0-4789]|8[345])))|3(?:638[2-5]|647[23]|8(?:47[04-9]|64[015789]))|4(?:044[1-7]|20(?:2[23]|8\d)|6(?:0(?:30|5[2-57]|6[1-8]|7[2-8])|140)|8(?:052|87[123]))|5(?:24(?:3[2-79]|6\d)|276\d|6(?:26[06-9]|686))|6(?:06(?:4\d|7[4-79])|295[567]|35[34]\d|47(?:24|61)|59(?:5[08]|6[67]|74)|955[0-4])|7(?:26(?:6[13-9]|7[0-7])|442\d|50(?:2[0-3]|[3-68]2|76))|8(?:27[56]\d|37(?:5[2-5]|8[239])|84(?:3[2-58]))|9(?:0(?:0(?:6[1-8]|85)|52\d)|3583|4(?:66[1-8]|9(?:2[01]|81))|63(?:23|3[1-4])|9561))\d{3}|176888[234678]\d{2}|16977[23]\d{3}|7(?:[1-4]\d\d|5(?:0[0-8]|[13-9]\d|2[0-35-9])|624|7(?:0[1-9]|[1-7]\d|8[02-9]|9[0-689])|8(?:[014-9]\d|[23][0-8])|9(?:[04-9]\d|1[02-9]|2[0-35-9]|3[0-689]))\d{6}|76(?:0[012]|2[356]|4[0134]|5[49]|6[0-369]|77|81|9[39])\d{6}|80(?:0\d{6,7}|8\d{7})|500\d{6}|(?:87[123]|9(?:[01]\d|8[0-3]))\d{7}|8(?:4[2-5]|70)\d{7}|70\d{8}|56\d{8}|(?:3[0347]|55)\d{8}|8(?:001111|45464\d)$|(?:\((\+?\d+)?\)|(\+\d{0,3}))? ?\d{2,3}([-\.]?\d{2,3} ?){3,4}

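This works, but the problem is that it also matches numbers longer than 14 characters.

Below is the regex I wrote to exclude 123456789 from the matches: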

^(?=.{8,14})b$^(?!123456789)\(?(?:(?:0(?:0|11)\)?[\s-]?\(?|\+)44\)?[\s-]?\(?(?:0\)?[\s-]?\(?)?|0)(?:\d{2}\)?[\s-]?\d{4}[\s-]?\d{4}|\d{3}\)?[\s-]?\d{3}[\s-]?\d{3,4}|\d{4}\)?[\s-]?(?:\d{5}|\d{3}[\s-]?\d{3})|\d{5}\)?[\s-]?\d{4,5}|8(?:00[\s-]?11[\s-]?11|45[\s-]?46[\s-]?4\d))(?:(?:[\s-]?(?:x|ext\.?\s?|\#)\d+)?)$^|^2(?:0[01378]|3[0189]|4[017]|8[0-46-9]|9[012])\d{7}|1(?:(?:1(?:3[0-48]|[46][0-4]|5[012789]|7[0-49]|8[01349])|21[0-7]|31[0-8]|[459]1\d|61[0-46-9]))\d{6}|1(?:2(?:0[024-9]|2[3-9]|3[3-79]|4[1-689]|[58][02-9]|6[0-4789]|7[013-9]|9\d)|3(?:0\d|[25][02-9]|3[02-579]|[468][0-46-9]|7[1235679]|9[24578])|4(?:0[03-9]|2[02-5789]|[37]\d|4[02-69]|5[0-8]|[69][0-79]|8[0-5789])|5(?:0[1235-9]|2[024-9]|3[0145689]|4[02-9]|5[03-9]|6\d|7[0-35-9]|8[0-468]|9[0-5789])|6(?:0[034689]|2[0-689]|[38][013-9]|4[1-467]|5[0-69]|6[13-9]|7[0-8]|9[0124578])|7(?:0[0246-9]|2\d|3[023678]|4[03-9]|5[0-46-9]|6[013-9]|7[0-35-9]|8[024-9]|9[02-9])|8(?:0[35-9]|2[1-5789]|3[02-578]|4[0-578]|5[124-9]|6[2-69]|7\d|8[02-9]|9[02569])|9(?:0[02-589]|2[02-689]|3[1-5789]|4[2-9]|5[0-579]|6[234789]|7[0124578]|8\d|9[2-57]))\d{6}|1(?:2(?:0(?:46[1-4]|87[2-9])|545[1-79]|76(?:2\d|3[1-8]|6[1-6])|9(?:7(?:2[0-4]|3[2-5])|8(?:2[2-8]|7[0-4789]|8[345])))|3(?:638[2-5]|647[23]|8(?:47[04-9]|64[015789]))|4(?:044[1-7]|20(?:2[23]|8\d)|6(?:0(?:30|5[2-57]|6[1-8]|7[2-8])|140)|8(?:052|87[123]))|5(?:24(?:3[2-79]|6\d)|276\d|6(?:26[06-9]|686))|6(?:06(?:4\d|7[4-79])|295[567]|35[34]\d|47(?:24|61)|59(?:5[08]|6[67]|74)|955[0-4])|7(?:26(?:6[13-9]|7[0-7])|442\d|50(?:2[0-3]|[3-68]2|76))|8(?:27[56]\d|37(?:5[2-5]|8[239])|84(?:3[2-58]))|9(?:0(?:0(?:6[1-8]|85)|52\d)|3583|4(?:66[1-8]|9(?:2[01]|81))|63(?:23|3[1-4])|9561))\d{3}|176888[234678]\d{2}|16977[23]\d{3}|7(?:[1-4]\d\d|5(?:0[0-8]|[13-9]\d|2[0-35-9])|624|7(?:0[1-9]|[1-7]\d|8[02-9]|9[0-689])|8(?:[014-9]\d|[23][0-8])|9(?:[04-9]\d|1[02-9]|2[0-35-9]|3[0-689]))\d{6}|76(?:0[012]|2[356]|4[0134]|5[49]|6[0-369]|77|81|9[39])\d{6}|80(?:0\d{6,7}|8\d{7})|500\d{6}|(?:87[123]|9(?:[01]\d|8[0-3]))\d{7}|8(?:4[2-5]|70)\d{7}|70\d{8}|56\d{8}|(?:3[0347]|55)\d{8}|8(?:001111|45464\d)$|(?:\((\+?\d+)?\)|(\+\d{0,3}))? ?\d{2,3}([-\.]?\d{2,3} ?){3,4}

But this does not exclude it. What am I doing wrong?

UPDATE1

sample input : "I will call you on 2034561278 number"
sample output: "I will call you on ********** number"

sample input : "I will call you on 20345612781234567 number"
sample output: "I will call you on 20345612781234567 number" (length > 14)

UPDATE2

    $text = 'I will call you on 20345612781234567 number';
    $resultSet = array();
    $pattern = '/^(?=.{8,14})b$\(?(?:(?:0(?:0|11)\)?[\s-]?\(?|\+)44\)?[\s-]?\(?(?:0\)?[\s-]?\(?)?|0)(?:\d{2}\)?[\s-]?\d{4}[\s-]?\d{4}|\d{3}\)?[\s-]?\d{3}[\s-]?\d{3,4}|\d{4}\)?[\s-]?(?:\d{5}|\d{3}[\s-]?\d{3})|\d{5}\)?[\s-]?\d{4,5}|8(?:00[\s-]?11[\s-]?11|45[\s-]?46[\s-]?4\d))(?:(?:[\s-]?(?:x|ext\.?\s?|\#)\d+)?)$^|^2(?:0[01378]|3[0189]|4[017]|8[0-46-9]|9[012])\d{7}|1(?:(?:1(?:3[0-48]|[46][0-4]|5[012789]|7[0-49]|8[01349])|21[0-7]|31[0-8]|[459]1\d|61[0-46-9]))\d{6}|1(?:2(?:0[024-9]|2[3-9]|3[3-79]|4[1-689]|[58][02-9]|6[0-4789]|7[013-9]|9\d)|3(?:0\d|[25][02-9]|3[02-579]|[468][0-46-9]|7[1235679]|9[24578])|4(?:0[03-9]|2[02-5789]|[37]\d|4[02-69]|5[0-8]|[69][0-79]|8[0-5789])|5(?:0[1235-9]|2[024-9]|3[0145689]|4[02-9]|5[03-9]|6\d|7[0-35-9]|8[0-468]|9[0-5789])|6(?:0[034689]|2[0-689]|[38][013-9]|4[1-467]|5[0-69]|6[13-9]|7[0-8]|9[0124578])|7(?:0[0246-9]|2\d|3[023678]|4[03-9]|5[0-46-9]|6[013-9]|7[0-35-9]|8[024-9]|9[02-9])|8(?:0[35-9]|2[1-5789]|3[02-578]|4[0-578]|5[124-9]|6[2-69]|7\d|8[02-9]|9[02569])|9(?:0[02-589]|2[02-689]|3[1-5789]|4[2-9]|5[0-579]|6[234789]|7[0124578]|8\d|9[2-57]))\d{6}|1(?:2(?:0(?:46[1-4]|87[2-9])|545[1-79]|76(?:2\d|3[1-8]|6[1-6])|9(?:7(?:2[0-4]|3[2-5])|8(?:2[2-8]|7[0-4789]|8[345])))|3(?:638[2-5]|647[23]|8(?:47[04-9]|64[015789]))|4(?:044[1-7]|20(?:2[23]|8\d)|6(?:0(?:30|5[2-57]|6[1-8]|7[2-8])|140)|8(?:052|87[123]))|5(?:24(?:3[2-79]|6\d)|276\d|6(?:26[06-9]|686))|6(?:06(?:4\d|7[4-79])|295[567]|35[34]\d|47(?:24|61)|59(?:5[08]|6[67]|74)|955[0-4])|7(?:26(?:6[13-9]|7[0-7])|442\d|50(?:2[0-3]|[3-68]2|76))|8(?:27[56]\d|37(?:5[2-5]|8[239])|84(?:3[2-58]))|9(?:0(?:0(?:6[1-8]|85)|52\d)|3583|4(?:66[1-8]|9(?:2[01]|81))|63(?:23|3[1-4])|9561))\d{3}|176888[234678]\d{2}|16977[23]\d{3}|7(?:[1-4]\d\d|5(?:0[0-8]|[13-9]\d|2[0-35-9])|624|7(?:0[1-9]|[1-7]\d|8[02-9]|9[0-689])|8(?:[014-9]\d|[23][0-8])|9(?:[04-9]\d|1[02-9]|2[0-35-9]|3[0-689]))\d{6}|76(?:0[012]|2[356]|4[0134]|5[49]|6[0-369]|77|81|9[39])\d{6}|80(?:0\d{6,7}|8\d{7})|500\d{6}|(?:87[123]|9(?:[01]\d|8[0-3]))\d{7}|8(?:4[2-5]|70)\d{7}|70\d{8}|56\d{8}|(?:3[0347]|55)\d{8}|8(?:001111|45464\d)$|(?:\((\+?\d+)?\)|(\+\d{0,3}))? ?\d{2,3}([-\.]?\d{2,3} ?){3,4}/';
    preg_match_all($pattern, $text, $matches, PREG_OFFSET_CAPTURE);
    $this->pushToResultSet($matches);
    return $resultSet;

The best approach I can think of is to use preg_replace_callback():

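$str = preg_replace_callback('/\b\d{8,14}\b/', function($match) {
    // Strict comparison: with ==, PHP compares numeric strings as numbers,
    // so '0123456789' would also be treated as equal to '123456789'.
    if ($match[0] === '123456789') {
        return $match[0];
    } else {
        return str_repeat('*', strlen($match[0]));
    }
}, $str);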

Update:
if (preg_match_all('/\b\d{8,14}\b/', $str, $matches)) {
    foreach ($matches[0] as $match) {
        // Strict comparison avoids numeric-string coercion.
        if ($match !== '123456789') {
            $this->pushToResultSet($match);
        }
    }
}
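
As for what goes wrong in the original pattern: `^(?=.{8,14})` only asserts that at least 8 characters follow the start of the string. Nothing inside the lookahead forces the text to end after 14 characters, so the upper bound never restricts anything, and `^` ties the check to the start of the whole sentence rather than to the number. If you would rather keep both rules inside a single pattern, a minimal sketch could look like the one below. It assumes, as the snippets above do, that a "number" is a plain run of digits; `maskNumbers` is a hypothetical helper name, not something from the question.

```php
<?php
// Hypothetical helper: mask every 8-14 digit run except the literal 123456789.
function maskNumbers(string $text): string
{
    // \b ... \b forces the match to cover a whole digit run, so a 17-digit
    // run cannot match partially. (?!123456789\b) rejects exactly that
    // number, while longer numbers that merely start with it still match.
    $pattern = '/\b(?!123456789\b)\d{8,14}\b/';

    return preg_replace_callback($pattern, function (array $match): string {
        return str_repeat('*', strlen($match[0]));
    }, $text);
}

echo maskNumbers('I will call you on 2034561278 number'), PHP_EOL;
// prints "I will call you on ********** number"
echo maskNumbers('I will call you on 20345612781234567 number'), PHP_EOL;
// prints the sentence unchanged, because the digit run is longer than 14
```

The callback version above is arguably easier to read; the lookahead version is mainly useful when the pattern has to be passed to code that accepts only a regex.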

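If you absolutely need to match these with regex, I would suggest an iterative approach with separate checks, running the "cheapest" checks first and moving on to the more exhaustive ones only when needed.

This makes each individual regex easier to read, makes the code easier to read, and ensures you do not run processor-intensive matches against strings that can easily be identified as invalid.

For example, in some weird pseudocode: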

FUNCTION CHECK ($MYNUMBER) {
    IF "123456789" is in $MYNUMBER:
        RETURN $ERROR
    IF LEN $MYNUMBER < 8 or LEN $MYNUMBER > 14:
        RETURN $ERROR
    IF $MYNUMBER not matches $CRAZY_REGEX:
        RETURN $ERROR
    RETURN $NOT_ERROR
}
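
In PHP, the staged check above might be sketched roughly as follows. This is only an illustration: `isMaskableNumber` and `DETAILED_PATTERN` are hypothetical names, the constant is a trivial digits-only stand-in for the long phone-number regex from the question, and the 8-14 range is taken from the question's stated requirement.

```php
<?php
// Hypothetical stand-in for the long phone-number regex from the question.
const DETAILED_PATTERN = '/^\d+$/';

// Returns true only when the candidate passes every check, cheapest first.
function isMaskableNumber(string $candidate): bool
{
    // 1. Cheapest: a plain substring search for the excluded number.
    if (strpos($candidate, '123456789') !== false) {
        return false;
    }

    // 2. Still cheap: the length limits.
    $length = strlen($candidate);
    if ($length < 8 || $length > 14) {
        return false;
    }

    // 3. Most expensive: the detailed regex, reached only by survivors.
    return preg_match(DETAILED_PATTERN, $candidate) === 1;
}

var_dump(isMaskableNumber('2034561278'));        // bool(true)
var_dump(isMaskableNumber('123456789'));         // bool(false)
var_dump(isMaskableNumber('20345612781234567')); // bool(false)
```

Because the first two checks are ordinary string operations, the expensive pattern only ever runs on candidates that already satisfy the simple rules.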