Match strings that start and end with a particular character in a string using PHP?

Here is my string. The whole JSON response arrives as a single string. The task is to identify the words that follow subdomain and comment.

{item_type:a,custom_domain:"google.com",subdomain:analytics,duration:324.33, id:2892928, comment:goahead,domain_verified:yes}, {item_type:b,custom_domain:"yahoo.com",subdomain:news,comment:awesome,domain_verified:no}, {item_type:c,custom_domain:"amazon.com",subdomain:aws,width:221,image_id:3233,height:13, comment:keep it up,domain_verified:no}, {item_type:d,custom_domain:"facebook.com",subdomain:m,slug:sure,domain_verified:yes}

The output should look like this:

analytics, goahead
news, awesome
aws, keep it up
m, sure

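Simply put, I need the word that starts with ^subdomain: and ends with a comma, and then the word that starts with ^comment: and ends with a comma.

The incoming string contains a lot of data. Each string will contain thousands of subdomains and comments. I have already tried preg_match_all, but I could not get it to work correctly.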

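I see 3 ways to do this (I'm not sure which one performs best, but I would bet on the last, procedural one):

  1. Use json_decode to turn the string into an array, then iterate over it to pick out the data. This only works on valid JSON; the sample string, with its unquoted keys and values, would have to be fixed up first.
  2. Use a regular expression with the pattern /subdomain:(.*?),.*?comment:(.*?),/
  3. Use procedural string functions, for example: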

    $subdomains = [];
    $comments = [];
    $subdomainLen = strlen('subdomain:');
    $commentLen = strlen('comment:');
    $str = '{item_type:a,custom_domain:"google.com",subdomain:analytics,duration:324.33, id:2892928, comment:goahead,domain_verified:yes}, {item_type:b,custom_domain:"yahoo.com",subdomain:news,comment:awesome,domain_verified:no}, {item_type:c,custom_domain:"amazon.com",subdomain:aws,width:221,image_id:3233,height:13, comment:keep it up,domain_verified:no}, {item_type:d,custom_domain:"facebook.com",subdomain:m,slug:sure,domain_verified:yes}';
    // Loop while the 'subdomain:' key is still found (strict check: strpos can return 0)
    while (($subdomainPos = strpos($str, 'subdomain:')) !== false)
    {
        // Drop everything up to and including 'subdomain:'
        $str = substr($str, $subdomainPos + $subdomainLen);
        // The subdomain value runs up to the next comma
        $subdomains[] = substr($str, 0, strpos($str, ','));
        // Look for 'comment:' only inside the current record, before its closing brace,
        // so a record without a comment does not pick up the next record's comment
        $commentPos = strpos($str, 'comment:');
        $recordEnd = strpos($str, '}');
        if ($commentPos !== false && $recordEnd !== false && $commentPos < $recordEnd)
        {
            $str = substr($str, $commentPos + $commentLen);
            $comments[] = substr($str, 0, strpos($str, ','));
        }
    }
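
The regular-expression option from the list above could be sketched roughly as follows. This is only a sketch under some assumptions: every record is wrapped in {...}, values contain no commas or braces, and `extractPairs` is a hypothetical helper name, not something from the question. It matches one record at a time so that a record without a comment cannot borrow the comment of the next record.

```php
<?php
// Hypothetical helper: returns one [subdomain, comment] pair per {...} record.
function extractPairs(string $input): array
{
    $pairs = [];
    // Split the input into records, one per {...} group.
    preg_match_all('/\{([^}]*)\}/', $input, $records);
    foreach ($records[1] as $record) {
        // A value runs from the key up to the next comma or the end of the record.
        $subdomain = preg_match('/subdomain:([^,]*)/', $record, $m) ? trim($m[1]) : null;
        $comment   = preg_match('/comment:([^,]*)/', $record, $m) ? trim($m[1]) : null;
        $pairs[] = [$subdomain, $comment];
    }
    return $pairs;
}

$str = '{item_type:a,custom_domain:"google.com",subdomain:analytics,duration:324.33, id:2892928, comment:goahead,domain_verified:yes}, {item_type:b,custom_domain:"yahoo.com",subdomain:news,comment:awesome,domain_verified:no}, {item_type:c,custom_domain:"amazon.com",subdomain:aws,width:221,image_id:3233,height:13, comment:keep it up,domain_verified:no}, {item_type:d,custom_domain:"facebook.com",subdomain:m,slug:sure,domain_verified:yes}';

// Print one "subdomain, comment" line per record.
foreach (extractPairs($str) as [$subdomain, $comment]) {
    echo $subdomain, ', ', $comment ?? '(no comment)', "\n";
}
```

With the sample string, the last record has no comment key, so its comment comes back as null rather than being guessed.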
    

Given your sample string, you can use the following regular expression to capture all the subdomains:

/(subdomain:)[\w|\s]+,/gm

and:

/(comment:)[\w|\s]+,/gm

to capture the comments.

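If you only want the value of the subdomain or comment, you can strip the "subdomain:" or "comment:" prefix (and the trailing comma) from each match.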
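
In PHP these patterns might be applied as in the sketch below. Two things differ from the patterns as written and are my adaptation, not part of the original answer: the `g` flag is dropped, because PCRE in PHP does not accept it and preg_match_all already returns every match, and the capture group is moved onto the value so that the key and the trailing comma never end up in the result.

```php
<?php
$str = '{item_type:a,custom_domain:"google.com",subdomain:analytics,duration:324.33, id:2892928, comment:goahead,domain_verified:yes}, {item_type:b,custom_domain:"yahoo.com",subdomain:news,comment:awesome,domain_verified:no}, {item_type:c,custom_domain:"amazon.com",subdomain:aws,width:221,image_id:3233,height:13, comment:keep it up,domain_verified:no}, {item_type:d,custom_domain:"facebook.com",subdomain:m,slug:sure,domain_verified:yes}';

// Group 1 holds only the value: word characters and whitespace up to the comma.
preg_match_all('/subdomain:([\w\s]+),/', $str, $subdomainMatches);
preg_match_all('/comment:([\w\s]+),/', $str, $commentMatches);

// Index 1 of the result is the list of group-1 captures.
$subdomains = $subdomainMatches[1];
$comments = $commentMatches[1];

print_r($subdomains);
print_r($comments);
```

Note that the two lists are independent: the sample has four subdomains but only three comments, so the positions do not line up record by record.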

Try this code:

<?php
$string = '{item_type:a,custom_domain:"google.com",subdomain:analytics,duration:324.33, id:2892928, comment:goahead,domain_verified:yes}, {item_type:b,custom_domain:"yahoo.com",subdomain:news,comment:awesome,domain_verified:no}, {item_type:c,custom_domain:"amazon.com",subdomain:aws,width:221,image_id:3233,height:13, comment:keep it up,domain_verified:no}, {item_type:d,custom_domain:"facebook.com",subdomain:m,slug:sure,domain_verified:yes}';
// Strip the braces, then split the whole string into key:value pairs on commas
$v1 = explode(',', str_replace(['{', '}'], '', $string));
$result = array();
foreach ($v1 as $val)
{
    // Split each pair into key and value (limit 2 keeps any colon inside the value)
    $v2 = explode(':', $val, 2);
    $key = trim($v2[0]);
    // Keep the value only for the two keys we care about
    if (isset($v2[1]) && ($key == 'subdomain' || $key == 'comment'))
    {
        $result[] = $v2[1];
    }
}
echo implode(',', $result);
?>

This will output:

analytics,goahead,news,awesome,aws,keep it up,m
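
The output above is one flat list, while the question asks for one line per record. A possible variation on the same explode idea is sketched below; `formatRecords` is a hypothetical helper name, and the sketch assumes that values contain no commas or braces.

```php
<?php
// Hypothetical helper: builds one "subdomain, comment" line per {...} record.
function formatRecords(string $input): array
{
    $lines = [];
    // Each record ends with "}", so split there first.
    foreach (explode('}', $input) as $record) {
        $wanted = [];
        // Drop the leading ", {" and split the record into key:value pairs.
        foreach (explode(',', ltrim($record, ', {')) as $pair) {
            $parts = explode(':', $pair, 2);
            $key = trim($parts[0]);
            if (isset($parts[1]) && ($key === 'subdomain' || $key === 'comment')) {
                $wanted[] = trim($parts[1]);
            }
        }
        // Skip the empty chunk after the last "}" and records with neither key.
        if ($wanted !== []) {
            $lines[] = implode(', ', $wanted);
        }
    }
    return $lines;
}

$string = '{item_type:a,custom_domain:"google.com",subdomain:analytics,duration:324.33, id:2892928, comment:goahead,domain_verified:yes}, {item_type:b,custom_domain:"yahoo.com",subdomain:news,comment:awesome,domain_verified:no}, {item_type:c,custom_domain:"amazon.com",subdomain:aws,width:221,image_id:3233,height:13, comment:keep it up,domain_verified:no}, {item_type:d,custom_domain:"facebook.com",subdomain:m,slug:sure,domain_verified:yes}';

echo implode("\n", formatRecords($string)), "\n";
```

For the sample string the last line contains only the subdomain, because that record has a slug key but no comment key.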