PHP - Find/Replace Text in RTF/txt files

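I'm having a problem finding specific text and replacing it with alternative text. I'm testing the code below with .rtf/.txt files only. I've also made sure the files are writable from my server.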

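It's a hit-and-miss situation, and I'm curious whether my code is wrong or whether this is just a quirk of opening and manipulating files.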

<?php
$filelocation = '/tmp/demo.txt';
$firstname = 'John';
$lastname = 'Smith';
$output = file_get_contents($filelocation);
$output = str_replace('[[FIRSTNAME]]', $firstname, $output);
$output = str_replace('[[LASTNAME]]', $lastname, $output);
$output = str_replace('[[TODAY]]', date('F j, Y'), $output);
// rewrite file
file_put_contents($filelocation, $output);
?>

So, in the demo.txt file I have about a page of text with [[FIRSTNAME]], [[LASTNAME]] and [[TODAY]] scattered around.

The find/replace is hit and miss. So far, [[TODAY]] is always replaced correctly, while the names are not.

Has anyone had this same problem?

(On a side note, I've checked the error logs, and so far no PHP warnings/errors have been returned from opening the file or from writing it.)

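It's hard to say for sure without seeing the contents of demo.txt. My first guess is that the problem may come from using brackets as your placeholder delimiters. I would try changing to something that RTF doesn't use, such as percent signs or asterisks, for example %%FIRSTNAME%% or **FIRSTNAME** (this assumes, of course, that you control the contents of demo.txt).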
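One way to check this guess is to see which placeholders are actually found in the file. The sketch below is only an illustration: `replace_and_report` is a hypothetical helper name, and the sample string stands in for the real contents of demo.txt. It relies on the optional fourth argument of `str_replace`, which receives the number of replacements made.

```php
<?php
// Hypothetical helper: replaces each placeholder and records how many
// times it was actually found, so silent misses become visible.
function replace_and_report(string $text, array $values): array
{
    $counts = [];
    foreach ($values as $placeholder => $value) {
        // The fourth argument of str_replace receives the replacement count.
        $text = str_replace($placeholder, $value, $text, $count);
        $counts[$placeholder] = $count;
    }
    return [$text, $counts];
}

// Made-up sample; in practice this would be file_get_contents($filelocation).
$sample = 'Dear %%FIRSTNAME%% %%LASTNAME%%, today is %%TODAY%%.';
[$output, $counts] = replace_and_report($sample, [
    '%%FIRSTNAME%%' => 'John',
    '%%LASTNAME%%'  => 'Smith',
    '%%TODAY%%'     => date('F j, Y'),
]);

// Report every placeholder that never matched.
foreach ($counts as $placeholder => $count) {
    if ($count === 0) {
        echo "Not found: $placeholder\n";
    }
}
echo $output, "\n";
```

If a placeholder is reported as not found even though you can see it in the editor, the bytes in the file probably differ from what is displayed.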

I have had this problem too. It seems that Microsoft Word inserts formatting codes inside the tags. I wrote a post on my tech blog about how to work around it.

http://tech.humlesite.eu/2017/01/13/using-regular-expression-to-merge-database-content-into-rich-text-format-template-documents/
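As a rough illustration of what I mean, here is a made-up RTF fragment (not real Word output) in which a control word and group braces have ended up inside one tag. A plain `str_replace` cannot match the split tag, while an intact tag is replaced normally, which would explain why [[TODAY]] works and the names do not.

```php
<?php
// Made-up RTF fragment: the \b control word and group braces sit inside the tag.
$rtf = '{\rtf1 {Dear [[FIRST}{\b NAME]] and [[TODAY]]}}';

// Both replacement values are sample data.
$result = str_replace('[[FIRSTNAME]]', 'John', $rtf, $nameCount);
$result = str_replace('[[TODAY]]', 'January 13, 2017', $result, $todayCount);

var_dump($nameCount);  // number of matches for the split tag
var_dump($todayCount); // number of matches for the intact tag
echo $result, "\n";
```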

The PHP example follows:
<?php 
$file = file_get_contents('mergedoc.rtf');
// Temporarily replace the backslash escape characters with a marker...
$mergetext = str_replace("\\", "€€", $file); 
// New seven part regex with default value detection
$regex2 = '/<<((?:€€[a-z0-9]*|\}|\{|\s)*)([a-z0-9.\-\+_æøåÆØÅA-Z]*)((?:€€[a-z0-9]*|\}|\{|\s)*)([a-z0-9.\-\+_æøåÆØÅA-Z]*)((?:€€[a-z0-9]*|\}|\{|\s)*)(?:\s*:(.*?)\s*)?((?:€€[a-z0-9]*|\}|\{|\s)*)>>/';
// Find all the matches in it....
preg_match_all($regex2,$mergetext, $out, PREG_SET_ORDER);
// Let's see the result
var_dump($out); 
foreach ($out as $match) {
    $whole_tag = $match[0]; // The part we actually replace. 
    $start = $match[1]; // The start formatting that has been injected in our tag, if any
    $tag = $match[2]; // The tag word itself. 
    if (($match[4].$match[6]) != "") { //some sec-part tag or default value?
        $end = $match[5]; // The end formatting that might be inserted. 
        if ($end == "") {
            $end = $match[7]; // No end in 5, we try 7. 
        }
    } else {
        $end = $match[3]; // No second tag or default value, we find end in match-3 
    }
    $secPartTag = $match[4]; // Second part of the tag word, present if formatting was inserted inside the word. 
    if ($secPartTag != "") {
        $tag .= $secPartTag; // Put it together with the tag word. 
    }
    $default_value = $match[6]; 
    // Simple selection of what we do with the tag. 
    switch ($tag) {
        case 'COMPANY_NAME': 
            $txt = "MY MERGE COMPANY EXAMPLE LTD"; 
            break; 
        case 'SOMEOTHERTAG':
            $txt = "SOME OTHER TEXT XX"; 
            break; 
        case 'THISHASDEFAULT':
            $txt = ""; 
            break; 
        default:
            $txt = "NOTAG"; 
    }
    if ($txt == "") {
        $txt = $default_value; 
    }
    // Create RTF line breaks in the text, if any. 
    $txt = str_replace(chr(10), chr(10)."\\line", $txt); 
    // Do the replace in the file. 
    $mergetext = str_replace($whole_tag, $start.$txt.$end, $mergetext); 
}
// Put back the escape characters. 
$file = str_replace("€€", "\\", $mergetext);
// Save to file. The .doc extension makes it open in Word by default. 
file_put_contents("ResultDoc.doc", $file); 
?>