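I want to clean up a .txt file and make it more readable by inserting a line break after every four words. The file has already been populated with data from an array.
Here is what I have come up with so far, from looking at this and this: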
<?php
require "simple_html_dom.php";
$html = file_get_html("http://www.lipsum.com/");
$data = array();
$counter = 0;
foreach($html->find("div") as $tr){
$row = array();
foreach($tr->find("div") as $td){
$row[] = $td->plaintext;
}
$data[] = $row;
}
ob_start();
var_dump($data);
$data = preg_replace('~((\w+\s){4})~', '$1' . "\n", $data);
file_put_contents('new_text_file.txt', $data);
//$handlefile = fopen("newfile.txt", "w") or die("Unable to open file!");
//file_put_contents('newfile.txt', $data);
//$output = ob_get_clean();
//$outputFile = "newfile.txt";
//fwrite($handlefile, $output);
//fclose($handlefile);
?>
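As you can see, I created a for loop and two if statements to "count" the spaces between words and insert a line break when the counter variable reaches 3. But it does not work: the data from the website is not written to the text file. If I remove the for loop and the if statements it works, but then of course nothing gets sorted. Any kind of help is appreciated!
Edit: Updated code.
Edit 2: The original question is left above for reference. This is the final working version: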
<?php
require "simple_html_dom.php";
$html = file_get_html("http://www.lipsum.com/");
$data = array();
$counter = 0;
foreach($html->find("div") as $tr){
$row = array();
foreach($tr->find("div") as $td){
$row[] = $td->plaintext;
}
$data[] = $row;
}
ob_start();
var_dump($data);
$handlefile = fopen("newfile.txt", "w") or die("Unable to open file!");
file_put_contents('newfile.txt', $data);
$output = ob_get_clean();
$outputFile = "newfile.txt";
fwrite($handlefile, $output);
fclose($handlefile);
function MakeFileReadable($source , $export) {
$content = file_get_contents($source);
$x = explode (" " , $content);
$newx = "";
$count = 1;
foreach ($x as $word) {
$newx .= $word . " ";
if ($count %12 ==0) $newx .= "\r\n";
$count ++;
}
$fp = fopen($export , 'w'); // write to the path passed in as $export
if (!$fp) die("There is a problem with opening file...");
fwrite($fp , $newx);
fclose($fp);
}
MakeFileReadable("newfile.txt" , "file-export.txt");
?>
Using some array functions:
$string = "one two three four five six seven eight nine ten eleven twelve thirteen fourteen";
$arr = explode (" " , $string);
$lines = array_chunk($arr,4);
foreach($lines as $line)
echo implode (" ", $line)."\r\n";
Result:
one two three four
five six seven eight
nine ten eleven twelve
thirteen fourteen
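The same idea can be wrapped in a reusable function. This is only a minimal sketch, assuming PHP 7 or later (for the type declarations); `wrapWords` and its `$perLine` parameter are hypothetical names, not part of the answer above. It splits on any run of whitespace rather than on a single space, so double spaces or existing line breaks in the input do not produce empty "words".

```php
<?php
// Hypothetical helper (not from the answer): put $perLine words on each line.
function wrapWords(string $text, int $perLine = 4): string
{
    // Split on any run of whitespace; PREG_SPLIT_NO_EMPTY drops empty pieces.
    $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);

    // Turn each chunk of up to $perLine words back into a single line.
    $lines = array_map(
        function (array $chunk): string {
            return implode(' ', $chunk);
        },
        array_chunk($words, $perLine)
    );

    // Join the lines; no trailing newline is added after the last one.
    return implode("\n", $lines);
}

echo wrapWords("one two  three four five\nsix seven eight nine ten"), "\n";
```

Calling it with `$perLine` below 1 is not handled here, since `array_chunk` rejects such sizes.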
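Here is a regular expression solution.
<?php
$test = 'one two three four five six seven eight nine ten eleven twelve thirteen fourteen';
echo preg_replace('~((\w+\s){4})~', '$1' . "\n", $test);
Output:
one two three four
five six seven eight
nine ten eleven twelve
thirteen fourteen
\w is a word character (alphanumeric characters plus "_"; see http://en.wikipedia.org/wiki/Regular_expression). If you only want A-Z, use [a-z] in place of \w together with the i modifier to make it case-insensitive, for example ~(([a-z]+\s){4})~i. \s is a whitespace character, and {4} requires \w+\s to occur 4 times.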
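One thing to watch for with text such as lorem ipsum: \w does not match punctuation, so a word like "amet," followed by a space is not matched by \w+\s and the groups of four can drift out of step. A possible variant, offered only as a sketch and not as the method from this answer, counts runs of non-whitespace (\S+) instead and replaces the whitespace after the fourth word with the line break:

```php
<?php
$text = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod';

// \S+ matches any run of non-whitespace, so "amet," counts as one word.
// The trailing \s+ sits outside the captured group, so the whitespace after
// the fourth word is replaced by the newline instead of being kept.
$wrapped = preg_replace('~((?:\S+\s+){3}\S+)\s+~', '$1' . "\n", $text);

echo $wrapped, "\n";
```

A final group of fewer than four words is left untouched, because the pattern only matches when a fourth word and the whitespace after it are present.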
Based on your code…
$data = preg_replace('~((\w+\s){4})~', '$1' . "\n", $data);
file_put_contents('new_text_file.txt', $data);
See http://php.net/manual/en/function.file-put-contents.php
Per your updated code:
<?php
require "simple_html_dom.php";
$html = file_get_html("http://www.lipsum.com/");
$data = array();
$counter = 0;
foreach($html->find("div") as $tr){
$row = array();
foreach($tr->find("div") as $td){
$row[] = $td->plaintext;
}
$data[] = $row;
}
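// $data is an array of rows (arrays), so join each row before joining the rows.
$text = implode(' ', array_map(function ($row) { return implode(' ', $row); }, $data));
$data = preg_replace('~((\w+\s){4})~', '$1' . "\n", $text);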
file_put_contents('new_text_file.txt', $data, FILE_APPEND | LOCK_EX);
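Since $data in the code above is built as an array of rows, it may help to try the flatten, wrap, and write steps on fixed data before involving the scraper. The following is a rough sketch under stated assumptions: the sample array stands in for the scraped content, and the output path in the system temp directory is hypothetical. FILE_APPEND is left out here so that repeated runs overwrite the file rather than accumulate text.

```php
<?php
// Stand-in for the scraped data: an array of rows, each row an array of strings.
$data = array(
    array('Lorem ipsum dolor', 'sit amet'),
    array('consectetur adipiscing', 'elit sed do'),
);

// Collect every string from the nested array into one flat list.
$flat = array();
array_walk_recursive($data, function ($value) use (&$flat) {
    $flat[] = $value;
});

// Same pattern as above: append a newline after every four "word + whitespace" pairs.
$text = preg_replace('~((\w+\s){4})~', '$1' . "\n", implode(' ', $flat));

// Hypothetical output path in the system temp directory.
$path = sys_get_temp_dir() . '/new_text_file.txt';
file_put_contents($path, $text, LOCK_EX);

echo file_get_contents($path), "\n";
```

Note that this pattern keeps the space before each inserted newline, so every wrapped line ends with a trailing space.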
Final tested code:
<?php
function MakeFileReadable($source , $export) {
$content = file_get_contents($source);
$x = explode (" " , $content);
sort($x);
$newx = "";
$count = 1;
foreach ($x as $word) {
$newx .= $word . " ";
if ($count %4 ==0) $newx .= "\r\n";
$count ++;
}
$fp = fopen($export , 'w');
if (!$fp) die("There is a problem with opening file...");
fwrite($fp , $newx);
fclose($fp);
}
MakeFileReadable("file-input.txt" , "file-export.txt");