Regex extract variables from [shortcode]

After migrating some content from WordPress to Drupal, I have some shortcodes that I need to convert:

String content:

Irrelevant tekst... [sublimevideo class="sublime" poster="http://video.host.com/_previews/600x450/sbx-60025-00-da-ANA.png" src1="http://video.host.com/_video/H.264/LO/sbx-60025-00-da-ANA.m4v" src2="(hd)http://video.host.com/_video/H.264/HI/sbx-60025-00-da-ANA.m4v" width="560" height="315"] ..more irrelevant text.

I need to find all the variables in the shortcode [sublimevideo ...] and turn them into an array:

Array (
    class => "sublime"
    poster => "http://video.host.com/_previews/600x450/sbx-60025-00-da-ANA.png"
    src1 => "http://video.host.com/_video/H.264/LO/sbx-60025-00-da-ANA.m4v"
    src2 => "(hd)http://video.host.com/_video/H.264/HI/sbx-60025-00-da-ANA.m4v"
    width => "560"
    height => "315"
)

Preferably it should also handle multiple instances of the shortcode.

I guess this can be done with preg_match_all(), but I haven't had any luck.

This will give you what you want.

$data = 'Irrelevant tekst... [sublimevideo class="sublime" poster="http://video.host.com/_previews/600x450/sbx-60025-00-da-ANA.png" src1="http://video.host.com/_video/H.264/LO/sbx-60025-00-da-ANA.m4v" src2="(hd)http://video.host.com/_video/H.264/HI/sbx-60025-00-da-ANA.m4v" width="560" height="315"] ..more irrelevant text.';
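$dat = array();
// Capture everything between "[sublimevideo " and the closing "]"
preg_match('/\[sublimevideo (.+?)\]/', $data, $dat);
$dat = array_pop($dat);
// Attributes are separated by spaces (assumes values contain no spaces)
$dat = explode(" ", $dat);
$params = array();
foreach ($dat as $d){
    // Limit to 2 parts so an "=" inside a value is preserved
    list($opt, $val) = explode("=", $d, 2);
    $params[$opt] = trim($val, '"');
}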
print_r($params);
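
The question also asks for multiple shortcode instances, which a single preg_match() call does not cover. A minimal sketch of one way to do that with preg_match_all(), still assuming that attribute values contain no spaces; the helper name extractSublimeVideos and the sample text are hypothetical:

```php
<?php
// Hypothetical helper: returns one attribute array per [sublimevideo ...] found.
function extractSublimeVideos(string $text): array
{
    $result = array();
    // PREG_SET_ORDER groups the results per match instead of per capture group.
    preg_match_all('/\[sublimevideo (.+?)\]/', $text, $matches, PREG_SET_ORDER);
    foreach ($matches as $match) {
        $params = array();
        foreach (explode(' ', $match[1]) as $pair) {
            if (strpos($pair, '=') === false) {
                continue; // skip empty or malformed fragments
            }
            list($opt, $val) = explode('=', $pair, 2);
            $params[$opt] = trim($val, '"');
        }
        $result[] = $params;
    }
    return $result;
}

$text = 'a [sublimevideo class="sublime" width="560"] b [sublimevideo class="other" width="320"] c';
$videos = extractSublimeVideos($text);
print_r($videos);
```

Each element of $videos is the attribute array for one shortcode, in the order the shortcodes appear in the text.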

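Anticipating the next challenge you will face when processing shortcodes: you can use preg_replace_callback to replace the shortcode with its resulting markup.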

$data = 'Irrelevant tekst... [sublimevideo class="sublime" poster="http://video.host.com/_previews/600x450/sbx-60025-00-da-ANA.png" src1="http://video.host.com/_video/H.264/LO/sbx-60025-00-da-ANA.m4v" src2="(hd)http://video.host.com/_video/H.264/HI/sbx-60025-00-da-ANA.m4v" width="560" height="315"] ..more irrelevant text.';
function processShortCode($matches){
    // parse out the arguments
    $dat = explode(" ", $matches[2]);
    $params = array();
    foreach ($dat as $d){
        list($opt, $val) = explode("=", $d, 2);
        $params[$opt] = trim($val, '"');
    }
    switch($matches[1]){
        case "sublimevideo":
            // here is where you would want to return the resultant markup from the shortcode call.
            return print_r($params, true);
        default:
            // leave unknown shortcodes untouched instead of replacing them with an empty string
            return $matches[0];
    }
}
$data = preg_replace_callback('/\[(\w+) (.+?)\]/', "processShortCode", $data);
echo $data;

You can use the following regex to match the variables:

$regex = '/(\w+)\s*=\s*"(.*?)"/';

I suggest matching the sublimevideo shortcode first and extracting it into a string with the following regex:

$pattern = '/\[sublimevideo(.*?)\]/';

To get the correct array keys, I used the following code:

// $string is string content you specified
preg_match_all($regex, $string, $matches);
$sublimevideo = array();
for ($i = 0; $i < count($matches[1]); $i++)
    $sublimevideo[$matches[1][$i]] = $matches[2][$i];

This returns the following array (the one you asked for):

Array
(
    [class] => sublime
    [poster] => http://video.host.com/_previews/600x450/sbx-60025-00-da-ANA.png
    [src1] => http://video.host.com/_video/H.264/LO/sbx-60025-00-da-ANA.m4v
    [src2] => (hd)http://video.host.com/_video/H.264/HI/sbx-60025-00-da-ANA.m4v
    [width] => 560
    [height] => 315
)
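
Because the attribute regex is applied to the whole string, any name="value" pair outside the shortcode (an HTML attribute, for example) would be picked up as well. A minimal sketch that combines the two patterns, first isolating the shortcode body and then parsing only that part; the function name parseSublimeVideo and the sample text are assumptions made for illustration:

```php
<?php
// Hypothetical helper combining the two patterns from this answer.
function parseSublimeVideo(string $string): array
{
    $pattern = '/\[sublimevideo(.*?)\]/';
    $regex = '/(\w+)\s*=\s*"(.*?)"/';

    // Step 1: isolate the attribute string of the first shortcode.
    if (!preg_match($pattern, $string, $shortcode)) {
        return array();
    }
    // Step 2: match name="value" pairs inside that string only.
    if (!preg_match_all($regex, $shortcode[1], $matches)) {
        return array();
    }
    // $matches[1] holds the names, $matches[2] the values.
    return array_combine($matches[1], $matches[2]);
}

$string = '<a href="#top">up</a> [sublimevideo class="sublime big" width = "560"] text';
$attrs = parseSublimeVideo($string);
print_r($attrs);
```

Unlike splitting on spaces, this keeps values that contain spaces intact. A "]" inside an attribute value would still end the shortcode early, so treat it as a sketch rather than a full parser.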

Here is my take on it. I come from a WordPress background and was trying to recreate the setup for a custom PHP project.

It handles [PHONE], [PHONE abc="123"], and so on.

The only thing it fails on is the WordPress-style enclosing form, [HERE] to [/HERE].
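
One possible way to cover the enclosing form is to make the closing tag an optional part of the pattern and tie it to the opening tag with a backreference. This is only a sketch under that assumption, not what WordPress itself does, and it does not handle nested shortcodes of the same tag; the function name findEnclosingShortcodes is hypothetical:

```php
<?php
// Hypothetical helper: finds [TAG attrs] and [TAG attrs]content[/TAG].
function findEnclosingShortcodes(string $content): array
{
    // 1: tag name, 2: attribute string, 3: enclosed content (if any).
    // \1 is a backreference, so [A]...[/B] is not treated as a pair.
    $pattern = '/\[(\w+)([^\]]*)\](?:(.*?)\[\/\1\])?/s';
    preg_match_all($pattern, $content, $matches, PREG_SET_ORDER);

    $found = array();
    foreach ($matches as $m) {
        $found[] = array(
            'tag' => $m[1],
            'attributes' => trim($m[2]),
            // Group 3 is absent when the shortcode has no closing tag.
            'content' => isset($m[3]) ? $m[3] : null,
        );
    }
    return $found;
}

$found = findEnclosingShortcodes('Call [PHONE abc="123"] or read [HERE]this part[/HERE].');
print_r($found);
```

A null content value marks a self-contained shortcode, while an enclosing one carries the text between its opening and closing tags.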

A function to build up the list of available shortcodes:


// Setup the default global variable
function create_shortcode($tag, $function)
{
    global $shortcodes;
    $shortcodes[$tag] = $function;
}

Define the shortcodes individually, e.g. [IFRAME url="https://www.bbc.co.uk"]:


/**
 * iframe, allows the user to add an iframe to a page with responsive div wrapper
 */
create_shortcode('IFRAME', function($atts) {
    // ... some validation goes here
    // The parameters that can be set in the shortcode
    if (empty($atts['url'])) {
        return false;
    }
    return '
    <div class="embed-responsive embed-responsive-4by3">
      <iframe class="embed-responsive-item" src="' . $atts['url'] . '">
      </iframe>
    </div>';
});

Then, when you want to pass a block of HTML through the shortcode processing, call handle_shortcodes($some_html_with_shortcodes);

function handle_shortcodes($content)
{
    global $shortcodes;
    // Loop through all shortcodes
    foreach($shortcodes as $key => $function){
        $matches = [];
        // Look for shortcodes, returns an array of ALL matches
        preg_match_all("/\[$key([^_^\]].+?)?\]/", $content, $matches, PREG_UNMATCHED_AS_NULL);
        if (!empty($matches))
        {
            $i = 0;
            $full_shortcode = $matches[0];
            $attributes = $matches[1];
            if (!empty($attributes))
            {
                foreach($attributes as $attribute_string) {
                    // Decode the values (e.g. &quot; to "); the cast covers shortcodes without attributes (null)
                    $attribute_string = htmlspecialchars_decode((string) $attribute_string);
                    // Find all the query args, looking for `arg="anything"`
                    preg_match_all('/\w+="[^"]*"/', $attribute_string, $query_args);
                    $params = [];
                    foreach ($query_args[0] as $d) {
                        // Split each pair into attribute name and value
                        list($att, $val) = explode('=', $d, 2);
                        $params[$att] = trim($val, '"');
                    }
                    $content = str_replace($full_shortcode[$i], $function($params), $content);
                    $i++;
                }
            }
        }
    }
    return $content;
}
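
As a point of comparison, the same registry idea can be driven by a single preg_replace_callback() pass instead of one preg_match_all() per tag. This is a simplified sketch, not the code above: the names $registry and render_shortcodes and the PHONE markup are made up for illustration, and unknown tags are left untouched.

```php
<?php
// Hypothetical registry: tag name => callable receiving the attribute array.
$registry = array(
    'PHONE' => function (array $atts) {
        $number = isset($atts['abc']) ? $atts['abc'] : 'n/a';
        return '<span class="phone">' . htmlspecialchars($number) . '</span>';
    },
);

function render_shortcodes(string $content, array $registry): string
{
    return preg_replace_callback(
        '/\[(\w+)([^\]]*)\]/',
        function (array $m) use ($registry) {
            if (!isset($registry[$m[1]])) {
                return $m[0]; // unknown tag: keep the original text
            }
            // Collect name="value" pairs from the attribute string.
            preg_match_all('/(\w+)="([^"]*)"/', $m[2], $pairs, PREG_SET_ORDER);
            $atts = array();
            foreach ($pairs as $pair) {
                $atts[$pair[1]] = $pair[2];
            }
            return (string) $registry[$m[1]]($atts);
        },
        $content
    );
}

$html = render_shortcodes('A [PHONE] B [PHONE abc="123"] C [OTHER x="1"]', $registry);
echo $html, "\n";
```

Passing the registry as an argument avoids the global variable, and replacing inside the callback means each occurrence is substituted exactly once, even when the same shortcode text appears several times.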

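I extracted these examples from working code, so hopefully they are readable and free of any extra functionality that is specific to our setup.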

As described in this answer, I would suggest letting WordPress do the work for you with the get_shortcode_regex() function.

 $pattern = get_shortcode_regex();
 preg_match_all("/$pattern/",$wp_content,$matches);

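This gives you an easy-to-use array containing the various shortcodes and their attributes found in the content. It is not the most obvious array format, so print it and take a look so that you know how to get at the data you need.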