Parse and reformat text with optionally occurring trailing characters

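Because one of the websites we receive data from has awful, inconsistent formatting, I need to parse the following strings and print new strings after replacing or removing one or two substrings.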

$s = array(
    'James Cussen\'s Destructor Bot',
    'Andre Riverra\'s Quick-Runner - San francisco',
    'Kim Smith\'s - NightBot'
);

Desired result:

James Cussen: Destructor Bot
Andre Riverra: Quick-Runner
Kim Smith: NightBot

How can I parse the lines that contain two "-" characters into the corresponding Owner: name format?

My current code:

$bot = '';
$creator = '';
foreach ($s as $parse)
{
    // if string contains '
    if (strpos($parse, '\'') !== false)
    {
        if (substr_count($parse, '-') > 1)
        {
            $arr = explode('\'', $parse);
            $line = trim(substr($arr[1], 1));
        }
        // the separator is sometimes an en dash instead of a hyphen
        if (strpos($parse, '–') !== false)
        {
            $temp = explode('–', $parse);
        }
        else
        {
            $temp = explode('-', $parse);
        }

        $arr = explode('\'', $temp[0]);
        $creator = $arr[0];
        $bot = trim(substr($arr[1], 1));
    }
    echo $creator.':'.$bot;
    echo '<br>';
}

This will most definitely fail in the future because the data arrives in an inconsistent format, but hey, at least it works for now.

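foreach ($s as $entry):
    // Split on the possessive "'s"; the limit of 2 leaves any later "'s" in the bot name.
    list($creator, $bot) = explode('\'s', $entry, 2);
    if (substr($bot, 0, 3) === ' - '):
        // "Owner's - Bot": drop the leading " - ".
        $bot = substr($bot, 3);
    elseif (($pos = strpos($bot, ' - ')) !== false):
        // "Owner's Bot - Location": drop the trailing " - ..." part.
        $bot = substr($bot, 0, $pos);
    endif;
    echo $creator . ': ' . trim($bot) . '<br>';
endforeach;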

For the input strings provided, use the following pattern.

's          # match apostrophe, s, then a space
(?:- )?     # optionally match hyphen, space
(.*?)       # lazily capture zero or more of any character as capture group $1
(?: - .+)?  # optionally match space, hyphen, space, one or more of any character
$           # match the end of the string

Code:

$strings = [
    'James Cussen\'s Destructor Bot',
    'Andre Riverra\'s Quick-Runner - San francisco',
    'Kim Smith\'s - NightBot'
];
foreach ($strings as $string) {
    echo preg_replace(
             "/'s (?:- )?(.*?)(?: - .+)?$/",
             ': $1',
             $string,
             1
         );
    echo "\n";
}

Output:

James Cussen: Destructor Bot
Andre Riverra: Quick-Runner
Kim Smith: NightBot
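
The code in the question also checks for an en dash (–) as the separator, which the pattern above does not cover. A minimal sketch of one way to extend it, assuming the input is valid UTF-8 and that the separator is always surrounded by single spaces; the helper name formatBotLine is hypothetical and not part of the original answer:

```php
<?php
// Hypothetical helper (an assumption for illustration, not from the answer):
// formats one raw entry as "Owner: Bot".
function formatBotLine(string $raw): string
{
    // Same shape as the pattern above, but [-–] accepts either a hyphen or
    // an en dash, and the u flag makes PCRE treat the subject as UTF-8.
    $pattern = '/\'s (?:[-–] )?(.*?)(?: [-–] .+)?$/u';
    $result = preg_replace($pattern, ': $1', $raw, 1);

    // preg_replace() returns null on error (for example invalid UTF-8);
    // fall back to the untouched input in that case.
    return $result ?? $raw;
}

foreach ([
    'James Cussen\'s Destructor Bot',
    'Andre Riverra\'s Quick-Runner – San francisco',
    'Kim Smith\'s – NightBot',
] as $entry) {
    echo formatBotLine($entry), "\n";
}
```

Because the separator must have a space on each side, the hyphen inside Quick-Runner is left alone. Entries that contain no 's at all are returned unchanged, so they can be spotted and handled separately.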