How can I extract ID123 from the following URL strings?
my-domain/product/name-product-ID123.html
my-domain/product/name-product-ID123.html/
my-domain/product/name-product-ID123.html?bla=123&some=456
If the prefix is not ID, it is a random string of length 2 (AB, EF, GH, ...).
Can anyone help me?
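This may not be a job for regular expressions, but for the existing tools in your language of choice. Regular expressions are not a magic wand you wave at every problem that happens to involve strings. You probably want to use existing code that has already been written, tested, and debugged.
PHP: the parse_url function.
Perl: the URI module.
Ruby: the URI module.
.NET: the Uri class.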
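As a rough sketch of the parse_url route in PHP: the helper name extractIdFromPath is hypothetical, and the code assumes the ID is the last hyphen-separated token of a file name ending in .html, which is only inferred from the sample URLs in the question.

```php
<?php
// Hypothetical helper: pull the trailing token (e.g. "ID123") out of a product URL.
// Assumption: the ID is whatever follows the last "-" in the ".html" file name.
function extractIdFromPath(string $url): ?string
{
    // parse_url() drops the query string; for scheme-less input it returns the whole path.
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return null;
    }

    // Remove a trailing slash, then the directory part and the ".html" suffix.
    $name = basename(rtrim($path, '/'), '.html');

    // Keep only what follows the last hyphen.
    $pos = strrpos($name, '-');
    if ($pos === false) {
        return null;
    }
    $id = substr($name, $pos + 1);

    return ($id === '' || $id === false) ? null : $id;
}
```

Because no pattern is involved, this handles the trailing slash and the query string without extra cases, but it does not check that the token actually looks like two letters followed by digits.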
$zeichenkette = "my-domain/product/name-product-ID123.html";
$suchmuster = '/ID[0-9]{3}/';
preg_match($suchmuster, $zeichenkette, $treffer, PREG_OFFSET_CAPTURE, 3);
print_r($treffer);
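This should print the match ID123 (together with its offset in the string, because PREG_OFFSET_CAPTURE is set).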
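Try this:
(?<=product-)ID[0-9]+(?=\.html)
(?<=product-) : positive lookbehind, asserts that ID is preceded by the string product-
ID : matches the characters ID
[0-9]+ : matches a sequence of digits
(?=\.html) : positive lookahead, asserts that the ID is followed by .html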
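A minimal sketch of how this lookaround pattern could be used from PHP. The wrapper name extractIdWithLookaround is hypothetical, and the pattern is used with the dot escaped as \.html.

```php
<?php
// Hypothetical wrapper around the lookaround pattern.
function extractIdWithLookaround(string $url): ?string
{
    // The lookbehind and lookahead are not part of the match, so $m[0] is just the ID.
    $pattern = '/(?<=product-)ID[0-9]+(?=\.html)/';

    return preg_match($pattern, $url, $m) === 1 ? $m[0] : null;
}

$urls = [
    'my-domain/product/name-product-ID123.html',
    'my-domain/product/name-product-ID123.html/',
    'my-domain/product/name-product-ID123.html?bla=123&some=456',
];

foreach ($urls as $url) {
    echo extractIdWithLookaround($url), "\n"; // prints ID123 for each URL
}
```

Since the lookarounds consume nothing, no capture group is needed. Note that this version only matches the literal prefix ID, not the other two-letter prefixes mentioned in the question.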
Here is what I came up with:
(?!-)(ID[0-9]*)(?=\.)
Test: http://regex101.com/r/rP0vI2
If the prefix is not "ID", use:
(?!-)([A-Z]{2}[0-9]*)(?=\.)
Test: http://regex101.com/r/dW8qK0
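For illustration, the two-letter variant could be applied in PHP roughly like this. The function name extractPrefixedId and the sample URLs with AB and GH prefixes are made up for the example; only the pattern comes from the answer.

```php
<?php
// Hypothetical helper using the [A-Z]{2} variant of the pattern.
// Assumption: the rest of the URL contains no other pair of uppercase letters before a dot.
function extractPrefixedId(string $url): ?string
{
    // Group 1 captures two uppercase letters plus any digits that follow, up to a dot.
    $pattern = '/(?!-)([A-Z]{2}[0-9]*)(?=\.)/';

    return preg_match($pattern, $url, $m) === 1 ? $m[1] : null;
}

// Made-up sample URLs covering prefixes other than "ID".
var_dump(extractPrefixedId('my-domain/product/name-product-AB123.html'));
var_dump(extractPrefixedId('my-domain/product/name-product-GH42.html?bla=123&some=456'));
```

Because [0-9]* also accepts zero digits, a bare two-letter token directly before a dot would match as well; changing it to [0-9]+ would be a stricter choice if the IDs always carry digits.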
Short and effective:
<?php
$links = <<< LOB
my-domain/product/name-product-ID123.html
my-domain/product/name-product-ID123.html/
my-domain/product/name-product-ID123.html?bla=123&some=456
LOB;
preg_match_all('/-(ID\d+)\./', $links, $ids, PREG_PATTERN_ORDER);
for ($i = 0; $i < count($ids[1]); $i++) {
echo $ids[1][$i]."\n";
}
/*
ID123
ID123
ID123
*/
?>
Live demo:
http://ideone.com/OqhL6b
Explanation:
Match the character “-” literally «-»
Match the regular expression below and capture its match into backreference number 1 «(ID\d+)»
Match the characters “ID” literally «ID»
Match a single digit 0..9 «\d+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the character “.” literally «\.»