I am trying to extract branch=z9hG4bKlmrltg10b801lgkf0681.1 from the Via: header of a SIP message. Here is the PHP code I tried:
preg_match('/.branch=.* + From:/', $msg, $result)
Here is the value of $msg:
"INVITE sip:3310094@mediastream.voip.cabletel.net:5060 SIP/2.0
Via: SIP/2.0/UDP 192.168.50.240:5060;branch=z9hG4bKlmrltg10b801lgkf0681.1
From: DEATON JEANETTE<sip:9123840782@mediastream.voip.cabletel.net:5060>;tag=SDg7j0c01-959bf958-d8f0f4ea-13c4-50029-140b-4d106390-140b"
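How can I correct the regular expression so that it works?
Please parse the SIP message properly. I find it unlikely that you only want the branch ID; you will almost certainly want other information about the transaction besides the pseudo call ID. SIP messages follow a standardized message format used by several other protocols (including HTTP ;-)), and there are several libraries designed to parse this message format.
To demonstrate how simple and powerful this is, let's first look at the RFC822 message parser classes I wrote a while ago (although they have recently been improved and updated). These can be used to parse emails, and I also have some simple HTTP message parser classes that extend from them: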
<?php
/**
* Class representing the basic RFC822 message format
*
* @author Chris Wright
* @version 1.1
*/
class RFC822Message
{
/**
* @var array Collection of headers from the message
*/
protected $headers = array();
/**
* @var string The message body
*/
protected $body;
/**
* Constructor
*
* @param array $headers Collection of headers from the message
* @param string $body The message body
*/
public function __construct($headers, $body)
{
$this->headers = $headers;
$this->body = $body;
}
/**
* Get the value of a header from the message
*
* @param string $name The name of the header
*
* @return array The value(s) of the header from the request
*/
public function getHeader($name)
{
$name = strtolower(trim($name));
return isset($this->headers[$name]) ? $this->headers[$name] : null;
}
/**
* Get the message body
*
* @return string The message body
*/
public function getBody()
{
return $this->body;
}
}
/**
* Factory which makes RFC822 message objects
*
* @author Chris Wright
* @version 1.1
*/
class RFC822MessageFactory
{
/**
* Create a new RFC822 message object
*
* @param array $headers The request headers
* @param string $body The request body
*/
public function create($headers, $body)
{
return new RFC822Message($headers, $body);
}
}
/**
* Parser which creates RFC822 message objects from strings
*
* @author Chris Wright
* @version 1.2
*/
class RFC822MessageParser
{
/**
* @var RFC822MessageFactory Factory which makes RFC822 message objects
*/
protected $messageFactory;
/**
* Constructor
*
* @param RFC822MessageFactory $messageFactory Factory which makes RFC822 message objects
*/
public function __construct(RFC822MessageFactory $messageFactory)
{
$this->messageFactory = $messageFactory;
}
/**
* Split a message into head and body sections
*
* @param string $message The message string
*
* @return array Head at index 0, body at index 1
*/
protected function splitHeadFromBody($message)
{
$parts = preg_split('/\r?\n\r?\n/', ltrim($message), 2);
return array(
$parts[0],
isset($parts[1]) ? $parts[1] : null
);
}
/**
* Parse the header section into a normalized array
*
* @param string $head The message head section
*
* @return array The parsed headers
*/
protected function parseHeaders($head)
{
$expr =
'!
^
([^()<>@,;:\\"/[\]?={} \t]+) # Header name
[ \t]*:[ \t]*
(
(?:
(?: # First line of value
(?:"(?:[^"\\\\]|\\\\.)*"|\S+) # Quoted string or unquoted token
[ \t]* # LWS
)*
(?: # Folded lines
\r?\n
[ \t]+ # ...must begin with LWS
(?:
(?:"(?:[^"\\\\]|\\\\.)*"|\S+) # ...followed by quoted string or unquoted tokens
[ \t]* # ...and maybe some more LWS
)*
)*
)?
)
\r?$
!smx';
preg_match_all($expr, $head, $matches);
$headers = array();
for ($i = 0; isset($matches[0][$i]); $i++) {
$name = strtolower($matches[1][$i]);
if (!isset($headers[$name])) {
$headers[$name] = array();
}
$value = preg_replace('/\s+("(?:[^"\\\\]|\\\\.)*"|\S+)/s', ' $1', $matches[2][$i]);
$headers[$name][] = $value;
}
return $headers;
}
/**
* Create a message object from a string
*
* @param string $message The message string
*
* @return RFC822Message The parsed message object
*/
public function parseMessage($message)
{
list($head, $body) = $this->splitHeadFromBody($message);
$headers = $this->parseHeaders($head);
return $this->messageFactory->create($headers, $body);
}
}
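If you ignore the horrible regular expression for parsing the headers, there is nothing particularly scary in there :-P But seriously, these classes can be used unmodified to parse the header section of an email, which is the basis of the RFC822 message format.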
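As a rough illustration of the kind of structure the header parsing produces (lowercased header names mapped to arrays of values, with folded lines joined), here is a much simpler sketch. The function name parseHeadersSimple is hypothetical and not part of the classes above, and unlike the big expression it makes no attempt to respect quoted strings.

```php
<?php
/**
 * Hypothetical helper, not part of the classes above: unfold continuation
 * lines, then split each header line at its first colon.
 */
function parseHeadersSimple($head)
{
    // A line break followed by a space or tab continues the previous header.
    $unfolded = preg_replace('/\r?\n[ \t]+/', ' ', $head);

    $headers = array();
    foreach (preg_split('/\r?\n/', $unfolded) as $line) {
        if (strpos($line, ':') === false) {
            continue; // skip blank or malformed lines
        }
        list($name, $value) = explode(':', $line, 2);
        // Header names are case-insensitive; repeated headers are collected.
        $headers[strtolower(trim($name))][] = trim($value);
    }
    return $headers;
}

// Example data: the Via header from the question, folded for illustration.
$head = "Via: SIP/2.0/UDP 192.168.50.240:5060;\r\n"
      . "\tbranch=z9hG4bKlmrltg10b801lgkf0681.1\r\n"
      . "From: DEATON JEANETTE<sip:9123840782@mediastream.voip.cabletel.net:5060>\r\n";

$headers = parseHeadersSimple($head);
print_r($headers['via']);
```

The folded Via header comes back as a single value with the line break replaced by one space, which is what makes a later branch lookup straightforward.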
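SIP is modeled on HTTP, so with a few fairly simple modifications to the HTTP message parsing classes we can easily adapt them to SIP. Let's take a look at those. In these classes I have (more or less) searched for HTTP and replaced it with SIP: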
<?php
/**
* Abstract class representing a SIP message
*
* @author Chris Wright
* @version 1.0
*/
abstract class SIPMessage extends RFC822Message
{
/**
* @var string The message protocol version
*/
protected $version;
/**
* Constructor
*
* @param array $headers Collection of headers from the message
* @param string $body The message body
* @param string $version The message protocol version
*/
public function __construct($headers, $body, $version)
{
parent::__construct($headers, $body);
$this->version = $version;
}
/**
* Get the message protocol version
*
* @return string The message protocol version
*/
public function getVersion()
{
return $this->version;
}
}
/**
* Class representing a SIP request message
*
* @author Chris Wright
* @version 1.0
*/
class SIPRequest extends SIPMessage
{
/**
* @var string The request method
*/
private $method;
/**
* @var string The request URI
*/
private $uri;
/**
* Constructor
*
* @param array $headers The request headers
* @param string $body The request body
* @param string $version The request protocol version
* @param string $method The request method
* @param string $uri The request URI
*/
public function __construct($headers, $body, $version, $method, $uri)
{
parent::__construct($headers, $body, $version);
$this->method = $method;
$this->uri = $uri;
}
/**
* Get the request method
*
* @return string The request method
*/
public function getMethod()
{
return $this->method;
}
/**
* Get the request URI
*
* @return string The request URI
*/
public function getURI()
{
return $this->uri;
}
}
/**
* Class representing a SIP response message
*
* @author Chris Wright
* @version 1.0
*/
class SIPResponse extends SIPMessage
{
/**
* @var int The response code
*/
private $code;
/**
* @var string The response message
*/
private $message;
/**
* Constructor
*
* @param array $headers The response headers
* @param string $body The response body
* @param string $version The response protocol version
* @param int $code The response code
* @param string $message The response message
*/
public function __construct($headers, $body, $version, $code, $message)
{
parent::__construct($headers, $body, $version);
$this->code = $code;
$this->message = $message;
}
/**
* Get the response code
*
* @return int The response code
*/
public function getCode()
{
return $this->code;
}
/**
* Get the response message
*
* @return string The response message
*/
public function getMessage()
{
return $this->message;
}
}
/**
* Factory which makes SIP request objects
*
* @author Chris Wright
* @version 1.0
*/
class SIPRequestFactory extends RFC822MessageFactory
{
/**
* Create a new SIP request object
*
* The last 3 arguments of this method are only optional to prevent PHP from triggering
* an E_STRICT at compile time. IMO this particular error is itself an error on the part
* of the PHP designers, and I don't feel bad about this workaround, even if it
* does mean the signature is technically wrong. It is the lesser of two evils.
*
* @param array $headers The request headers
* @param string $body The request body
* @param string $version The request protocol version
* @param string $method The request method
* @param string $uri The request URI
*/
public function create($headers, $body, $version = null, $method = null, $uri = null)
{
return new SIPRequest($headers, $body, $version, $method, $uri);
}
}
/**
* Factory which makes SIP response objects
*
* @author Chris Wright
* @version 1.0
*/
class SIPResponseFactory extends RFC822MessageFactory
{
/**
* Create a new SIP response object
*
* The last 3 arguments of this method are only optional to prevent PHP from triggering
* an E_STRICT at compile time. IMO this particular error is itself an error on the part
* of the PHP designers, and I don't feel bad about this workaround, even if it
* does mean the signature is technically wrong. It is the lesser of two evils.
*
* @param array $headers The response headers
* @param string $body The response body
* @param string $version The response protocol version
* @param int $code The response code
* @param string $message The response message
*/
public function create($headers, $body, $version = null, $code = null, $message = null)
{
return new SIPResponse($headers, $body, $version, $code, $message);
}
}
/**
* Parser which creates SIP message objects from strings
*
* @author Chris Wright
* @version 1.0
*/
class SIPMessageParser extends RFC822MessageParser
{
/**
* @var SIPRequestFactory Factory which makes SIP request objects
*/
private $requestFactory;
/**
* @var SIPResponseFactory Factory which makes SIP response objects
*/
private $responseFactory;
/**
* Constructor
*
* @param SIPRequestFactory $requestFactory Factory which makes SIP request objects
* @param SIPResponseFactory $responseFactory Factory which makes SIP response objects
*/
public function __construct(SIPRequestFactory $requestFactory, SIPResponseFactory $responseFactory)
{
$this->requestFactory = $requestFactory;
$this->responseFactory = $responseFactory;
}
/**
* Remove the request line from the message and parse into tokens
*
* @param string $head The message head section
*
* @return array The parsed request line at index 0, the remainder of the message at index 1
*
* @throws \DomainException When the request line of the message is invalid
*/
private function removeAndParseRequestLine($head)
{
// Note: this method forgives a couple of minor standards violations, mostly for benefit
// of some older Polycom phones and for Voispeed, who seem to make stuff up as they go
// along. It also treats the whole line as case-insensitive even though methods are
// officially case-sensitive, because having two different casings of the same verb mean
// different things makes no sense semantically or implementationally.
// Side note, from RFC3261:
// > The SIP-Version string is case-insensitive, but implementations MUST send upper-case
// Wat. Go home Rosenberg, et. al., you're drunk.
$parts = preg_split('/\r?\n/', $head, 2);
$expr =
'@^
(?:
([^\r\n \t]+) [ \t]+ ([^\r\n \t]+) [ \t]+ SIP/(\d+\.\d+) # request
|
SIP/(\d+\.\d+) [ \t]+ (\d+) [ \t]+ ([^\r\n]+) # response
)
$@ix';
if (!preg_match($expr, $parts[0], $match)) {
throw new \DomainException('Request-Line of the message is invalid');
}
if (empty($match[4])) { // request
$requestLine = array(
'method' => strtoupper($match[1]),
'uri' => $match[2],
'version' => $match[3]
);
} else { // response
$requestLine = array(
'version' => $match[4],
'code' => (int) $match[5],
'message' => $match[6]
);
}
return array(
$requestLine,
isset($parts[1]) ? $parts[1] : ''
);
}
/**
* Create the appropriate message object from a string
*
* @param string $message The message string
*
* @return SIPRequest|SIPResponse The parsed message object
*
* @throws \DomainException When the message string is not valid SIP message
*/
public function parseMessage($message)
{
list($head, $body) = $this->splitHeadFromBody($message);
list($requestLine, $head) = $this->removeAndParseRequestLine($head);
$headers = $this->parseHeaders($head);
if (isset($requestLine['uri'])) {
return $this->requestFactory->create(
$headers,
$body,
$requestLine['version'],
$requestLine['method'],
$requestLine['uri']
);
} else {
return $this->responseFactory->create(
$headers,
$body,
$requestLine['version'],
$requestLine['code'],
$requestLine['message']
);
}
}
}
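Seems like an awful lot of code just to extract a single header value, doesn't it? Well, it is. But that is not all it does. It parses the whole message into a data structure that provides easy access to any amount of information, allowing for (more or less) anything the standards can throw at you.
So, let's look at how you would actually use it: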
// First we create a parser object
$messageParser = new SIPMessageParser(
new SIPRequestFactory,
new SIPResponseFactory
);
// Parse the message into an object
try {
$message = $messageParser->parseMessage($msg);
} catch (Exception $e) {
// The message parsing failed, handle the error here
}
// Get the value of the Via: header
$via = $message->getHeader('Via');
// SIP is irritatingly non-specific about the format of branch IDs. This
// expression matches either a quoted string or an unquoted token, which is
// about all that you can say for sure about arbitrary implementations.
$expr = '/branch=(?:"((?:[^"\\\\]|\\\\.)*)"|(.+?)(?:\s|;|$))/i';
// NB: this assumes the message has a single Via: header and a single branch ID.
// In reality this is rarely the case for messages that are received, although
// it is usually the case for messages before they are sent.
if (!preg_match($expr, $via[0], $matches)) {
// The Via: header does not contain a branch ID, handle this error
}
$branchId = !empty($matches[2]) ? $matches[2] : $matches[1];
var_dump($branchId);
See it working
This answer is undoubtedly overkill for the problem at hand. However, I consider it the correct way to approach the problem.
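The usage code above assumes a single Via: header with a single branch ID. As a hedged sketch of how received messages with several Via entries might be handled, the hypothetical helper below (extractBranchIds is not part of the classes above) walks every value returned by getHeader('Via'). It assumes branch values are unquoted tokens and that commas only appear as separators between Via entries. The second array element is made-up example data.

```php
<?php
/**
 * Hypothetical helper: return the branch parameter of every Via entry,
 * topmost first. $viaValues is an array like the one getHeader('Via') returns.
 */
function extractBranchIds(array $viaValues)
{
    $branchIds = array();
    foreach ($viaValues as $value) {
        // One header line may carry several comma-separated Via entries.
        foreach (explode(',', $value) as $entry) {
            // Stop at the next parameter, entry separator, or whitespace.
            if (preg_match('/;\s*branch\s*=\s*([^;,\s]+)/i', $entry, $m)) {
                $branchIds[] = $m[1];
            }
        }
    }
    return $branchIds;
}

$via = array(
    // From the question
    'SIP/2.0/UDP 192.168.50.240:5060;branch=z9hG4bKlmrltg10b801lgkf0681.1',
    // Made-up entries for illustration only
    'SIP/2.0/UDP 192.0.2.1:5060;branch=z9hG4bKexample2;received=192.0.2.1, SIP/2.0/TCP 192.0.2.2;branch=z9hG4bKexample3',
);

$branchIds = extractBranchIds($via);
print_r($branchIds);
```

The first element of the result is the branch of the topmost Via entry, which is normally the one that identifies the transaction for the hop that sent you the message.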
preg_match('/branch=.*/i', $msg, $result);
print_r($result);
will produce something like
Array
(
[0] => branch=z9hG4bKlmrltg10b801lgkf0681.1
)
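One caveat with a pattern like /branch=.*/ is that .* runs to the end of the line, so any parameter that follows the branch is swept into the match. A minimal sketch of a tighter variant, assuming the branch is an unquoted token that ends at the next semicolon, comma, or whitespace (the ";rport" suffix is added here purely for illustration):

```php
<?php
// Illustrative Via line; the ";rport" suffix is an assumption for this example.
$line = 'Via: SIP/2.0/UDP 192.168.50.240:5060;branch=z9hG4bKlmrltg10b801lgkf0681.1;rport';

// Greedy: everything up to the end of the line.
preg_match('/branch=.*/i', $line, $greedy);

// Tight: stop at the next ';', ',' or whitespace and capture only the value.
preg_match('/branch=([^;,\s]+)/i', $line, $tight);

echo $greedy[0], PHP_EOL; // branch=z9hG4bKlmrltg10b801lgkf0681.1;rport
echo $tight[1], PHP_EOL;  // z9hG4bKlmrltg10b801lgkf0681.1
```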
Try this:
$str = "INVITE sip:3310094@mediastream.voip.cabletel.net:5060 SIP/2.0
Via: SIP/2.0/UDP 192.168.50.240:5060;branch=z9hG4bKlmrltg10b801lgkf0681.1
From: DEATON JEANETTE<sip:9123840782@mediastream.voip.cabletel.net:5060>;tag=SDg7j0c01-959bf958-d8f0f4ea-13c4-50029-140b-4d106390-140b";
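// The s modifier lets . cross the line break before From:; the lazy (.*?) plus \s* keeps that line break out of $output[1]
preg_match('/branch=(.*?)\s*From:/is', $str, $output);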
print_r( $output );
Try this regular expression. It checks whether the branch code is followed by a space or a line break. The result you want is stored in $output[1].
$str = "INVITE sip:3310094@mediastream.voip.cabletel.net:5060 SIP/2.0
Via: SIP/2.0/UDP 192.168.50.240:5060;branch=z9hG4bKlmrltg10b801lgkf0681.1 From: DEATON JEANETTE<sip:9123840782@mediastream.voip.cabletel.net:5060>;tag=SDg7j0c01-959bf958-d8f0f4ea-13c4-50029-140b-4d106390-140b";
// The lazy .*? stops at the first space or line break instead of the last one
preg_match('/(branch=.*?)( |\r?\n)/', $str, $output);
print_r( $output ); // $output[1] is what you need
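Example: http://codepad.viper-7.com/Gj0lWD
You can use a lookahead assertion like this:
preg_match_all('/.branch=(.*?)(?=^\S|\Z)/sm', $msg, $matches);
Here, (?=^\S|\Z) asserts either a new line that starts with a non-space character (that is, the start of the next header rather than a folded continuation line) or the end of the subject. That is where the match should end.
Or just match branch= up to the end of the line: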
preg_match_all('/.branch=(.*)/m', $msg, $matches);
This works for headers that are not folded.
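One detail worth noting about the lookahead variant: because the match only ends at the start of the next header, the captured text still carries the trailing line break. A small sketch using the message from the question (written here with \n line breaks as an assumption) that trims the captures:

```php
<?php
// The message from the question, assumed here to use "\n" line breaks.
$msg = "INVITE sip:3310094@mediastream.voip.cabletel.net:5060 SIP/2.0\n"
     . "Via: SIP/2.0/UDP 192.168.50.240:5060;branch=z9hG4bKlmrltg10b801lgkf0681.1\n"
     . "From: DEATON JEANETTE<sip:9123840782@mediastream.voip.cabletel.net:5060>;"
     . "tag=SDg7j0c01-959bf958-d8f0f4ea-13c4-50029-140b-4d106390-140b";

// s: . also matches line breaks; m: ^ matches at the start of every line.
preg_match_all('/.branch=(.*?)(?=^\S|\Z)/sm', $msg, $matches);

// Each capture runs up to the next header, so strip the trailing line break.
$branches = array_map('trim', $matches[1]);
print_r($branches);
```

If the Via header carried further parameters after the branch, they would be part of the capture too, so this variant is best suited to cases where the branch is the last parameter.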
See also: Basic rules of HTTP headers