Split text into a multidimensional array in PHP

I'm having a hard time with this. I have a file containing the following text:

The Boys in the Boat    Daniel James Brown  067002581X  19.99   5   16.99   9   16.99
Harry Potter and the Cursed Child   J. K. Rowling, Jack Thorne, John Tiffany    1338099133  18.95   25  17.98       0   17.98
Just Mercy  Bryan Stevenson 0812994520  17.50   8   16.25   10  16.25
Me Before You   Jojo Moyes  0670026603  18.95   2   17.50   1   17.25
A Thousand Splendid Suns    Khaled Hosseini 1594489505  19.00   7   15.50   4   14.95
The Wright Brothers David McCullough    1476728742  21.95   3   18.95   3   18.95

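I need to read it into a two-dimensional associative array using a function. Creating the array from scratch is no problem; in fact, I have already done that. The array should look like this: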

$books = array( 
        "The Boys in the Boat" => array (
           "author" => 'Daniel James Brown',
           "isbn" => '067002581X',  
           "hardcover" => 19.99,
            "quantity" => 5,
            "softcover" => 5.99,
            "e-book" => 6.99,
        ),
"Jungle" => array (
           "author" => 'Upton Sinclair',
           "isbn" => '067002581',   
           "hardcover" => 19.99,
            "quantity" => 5,
            "softcover" => 5.99,
            "e-book" => 6.99,
        ),

     );

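I don't know how to write a function that goes through the file line by line and builds the 2D array. I know I have to use explode, but I don't know which delimiter to use: a single space won't work, and there are no other delimiters in the file. Please help, I have spent the whole day on this.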

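This is what I came up with.

Please read the comments in the code, and note that I am assuming the tabs in your sample file were converted to spaces when you posted it.

If the delimiter is a run of spaces, this becomes harder, because the same run of spaces could also appear inside a field (for example, the book title), and the script would then not behave as expected. If that is what you have, you can change "\t" to "    " (four spaces); you need to inspect the file/string to know.

Tabs are fine, but then no text field may contain a tab, or the parsing will fail.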
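One way to handle both cases is sketched below. The helper name `splitFields` is made up for illustration and is not part of the original answer; it assumes a line is tab-separated if it contains any tab, and otherwise falls back to splitting on runs of two or more spaces.

<?php
// Hypothetical helper: split one line into fields.
// Assumption: a line that contains a tab is tab-separated; otherwise
// fields are separated by runs of two or more spaces.
function splitFields(string $line): array
{
    $line = rtrim($line, "\r\n");
    if (strpos($line, "\t") !== false) {
        return explode("\t", $line);
    }
    // Fallback: breaks if a field itself contains two or more spaces in a row
    return preg_split('/ {2,}/', trim($line));
}

$tabbed = "Just Mercy\tBryan Stevenson\t0812994520";
$spaced = "Just Mercy    Bryan Stevenson    0812994520";
var_dump(splitFields($tabbed) === splitFields($spaced)); // bool(true)

The fallback is only a heuristic: if two fields are separated by a single space, they will be merged into one.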

Because the tabs were converted to spaces here, I hard-coded them as \t in the code below.

<?php
# This is basically just exploding the elements 2 times
# and getting the elements in the right order
# This will hold the array with info
$finalResult = array();
# Your input text to be parsed
# I am supposing you have TABS as delimiters and the order never changes
$inputText = <<<EOF
The Boys in the Boat\tDaniel James Brown\t067002581X\t19.99\t5\t16.99\t9\t16.99
Harry Potter and the Cursed Child\tJ. K. Rowling, Jack Thorne, John Tiffany\t1338099133\t18.95\t25\t17.98\t0\t17.98
Just Mercy\tBryan Stevenson\t0812994520\t17.50\t8\t16.25\t10\t16.25
Me Before You\tJojo Moyes\t0670026603\t18.95\t2\t17.50\t1\t17.25
A Thousand Splendid Suns\tKhaled Hosseini\t1594489505\t19.00\t7\t15.50\t4\t14.95
The Wright Brothers\tDavid McCullough\t1476728742\t21.95\t3\t18.95\t3\t18.95
EOF;
# First break the text into an array of lines
# If you are sure this file comes from *nix you can use \n as newline
# If you are sure this comes from Windows you can use \r\n
# Or you can use PHP_EOL, but that uses the current system as the basis for what to use
$textLines = explode("\n", $inputText);
# Now we go through each line getting all data
foreach($textLines as $line) {
    # Get each tab-separated field
    $expLine  = explode("\t", $line);
    # Sanity check
    if (count($expLine) < 8) {
        # The line does not have enough items, deal with error
        echo "Item " . (isset($expLine[0]) ? $expLine[0]." " : "") . "ignored because of errors\n";
        continue;
    }
    # new item
    # I have changed this a bit as it seems you have more fields to get than
    # what shows on your example (2 quantities, for hard and softcovers)
    $finalResult[$expLine[0]] = array(
        "author"      => $expLine[1],
        "isbn"        => $expLine[2],  
        "hardcover"   => $expLine[3],
        "hc-quantity" => $expLine[4],
        "softcover"   => $expLine[5],
        "sc-quantity" => $expLine[6],
        "e-book"      => $expLine[7],
    );
}
# Just show the data structure
var_dump($finalResult);
?>

And this is the result:

array(6) {
  ["The Boys in the Boat"]=>
  array(7) {
    ["author"]=>
    string(18) "Daniel James Brown"
    ["isbn"]=>
    string(10) "067002581X"
    ["hardcover"]=>
    string(5) "19.99"
    ["hc-quantity"]=>
    string(1) "5"
    ["softcover"]=>
    string(5) "16.99"
    ["sc-quantity"]=>
    string(1) "9"
    ["e-book"]=>
    string(5) "16.99"
  }
  ["Harry Potter and the Cursed Child"]=>
  array(7) {
    ["author"]=>
    string(40) "J. K. Rowling, Jack Thorne, John Tiffany"
    ["isbn"]=>
    string(10) "1338099133"
    ["hardcover"]=>
    string(5) "18.95"
    ["hc-quantity"]=>
    string(2) "25"
    ["softcover"]=>
    string(5) "17.98"
    ["sc-quantity"]=>
    string(1) "0"
    ["e-book"]=>
    string(5) "17.98"
  }
  ["Just Mercy"]=>
  array(7) {
    ["author"]=>
    string(15) "Bryan Stevenson"
    ["isbn"]=>
    string(10) "0812994520"
    ["hardcover"]=>
    string(5) "17.50"
    ["hc-quantity"]=>
    string(1) "8"
    ["softcover"]=>
    string(5) "16.25"
    ["sc-quantity"]=>
    string(2) "10"
    ["e-book"]=>
    string(5) "16.25"
  }
  ["Me Before You"]=>
  array(7) {
    ["author"]=>
    string(10) "Jojo Moyes"
    ["isbn"]=>
    string(10) "0670026603"
    ["hardcover"]=>
    string(5) "18.95"
    ["hc-quantity"]=>
    string(1) "2"
    ["softcover"]=>
    string(5) "17.50"
    ["sc-quantity"]=>
    string(1) "1"
    ["e-book"]=>
    string(5) "17.25"
  }
  ["A Thousand Splendid Suns"]=>
  array(7) {
    ["author"]=>
    string(15) "Khaled Hosseini"
    ["isbn"]=>
    string(10) "1594489505"
    ["hardcover"]=>
    string(5) "19.00"
    ["hc-quantity"]=>
    string(1) "7"
    ["softcover"]=>
    string(5) "15.50"
    ["sc-quantity"]=>
    string(1) "4"
    ["e-book"]=>
    string(5) "14.95"
  }
  ["The Wright Brothers"]=>
  array(7) {
    ["author"]=>
    string(16) "David McCullough"
    ["isbn"]=>
    string(10) "1476728742"
    ["hardcover"]=>
    string(5) "21.95"
    ["hc-quantity"]=>
    string(1) "3"
    ["softcover"]=>
    string(5) "18.95"
    ["sc-quantity"]=>
    string(1) "3"
    ["e-book"]=>
    string(5) "18.95"
  }
}
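Since the question asks for a function that reads the file and for numeric prices and quantities, the same idea could be wrapped up roughly as follows. This is a sketch, not the answer's own code: the function name `readBooks` is hypothetical, and it assumes tab-separated fields in the same eight-column order used above.

<?php
// Hypothetical function; assumes eight tab-separated fields per line,
// in the same order as in the code above.
function readBooks(string $path): array
{
    $books = array();
    // Read the file into an array of lines, dropping newlines and empty lines
    $lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($lines === false) {
        return $books; // the file could not be read
    }
    foreach ($lines as $line) {
        $f = explode("\t", $line);
        if (count($f) < 8) {
            continue; // skip malformed lines
        }
        $books[$f[0]] = array(
            "author"      => $f[1],
            "isbn"        => $f[2],         // kept as a string: leading zeros, trailing "X"
            "hardcover"   => (float) $f[3],
            "hc-quantity" => (int) $f[4],
            "softcover"   => (float) $f[5],
            "sc-quantity" => (int) $f[6],
            "e-book"      => (float) $f[7],
        );
    }
    return $books;
}

The casts turn the prices into floats and the quantities into integers, which is closer to the array shown in the question; the ISBN stays a string so that values such as 067002581X survive unchanged.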

Since there are two or more space characters between all of these fields:

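// Create the books array
$books = array();
// Split $text (the file contents as a string) into lines
$arr = explode("\n", $text);
// Split each line into fields and store them
foreach ($arr as $line) {
    // Split on runs of two or more spaces. A plain explode('  ', $line)
    // yields empty elements wherever four or more spaces appear in a row.
    $arr2 = preg_split('/ {2,}/', trim($line));
    if (count($arr2) < 7) {
        continue; // skip blank or malformed lines
    }
    $books[] = array(
        'name'      => trim($arr2[0]),
        'author'    => trim($arr2[1]),
        'isbn'      => trim($arr2[2]),
        'hardcover' => trim($arr2[3]),
        'quantity'  => trim($arr2[4]),
        'softcover' => trim($arr2[5]),
        'e_book'    => trim($arr2[6])
    );
}
// The array that you want
print_r($books);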

PHP allows it, but I did not use the name as the array key because it contains spaces. If you insist, you can change this line

$books[] = array (

to

$books[$arr2[0]] = array (

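Treat each line as a sub-array and each book title as its index. You can use preg_split to split the line on runs of two or more spaces; the first item of the returned array is the book title (the index), and the remaining items are the values of the sub-array. That is, you could write something like this: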

<?php
$book_info_keys = array('author', 'isbn', 'hardcover', 'quantity', 'softcover', 'e-book');
$input_file = 'a.txt';
$result = array();
// Read the file as an array of lines, without trailing newlines or empty lines
foreach (file($input_file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) as $line) {
  // On each line, split items by 2 or more spaces
  $a = preg_split('/ {2,}/', $line);
  // Now we divide the line into two parts: the first item is the book name,
  // the rest is the book info.
  $book_name = $a[0];
  $book_info = array();
  $i = 1; // $a[0] is book name, so the first info is at index 1
  foreach ($book_info_keys as $key) {
    $book_info[$key] = $a[$i++];
  }
  // Now insert the infos into the result array, with book name served as index
  $result[$book_name] = $book_info;
}
var_dump($result);
?>

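Also, keep in mind that the code above was not written for real-world use. Add error checking wherever you can, and never assume that the input file is always valid (as I did in the code above). Some lines may have 8, 9 or more items, while others have only 6, 5 or fewer. In those cases the code above will break.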
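A minimal sketch of such a check is shown below. The helper name `parseBookLine` and the key list are assumptions made for illustration (the eight keys mirror the eight columns of the sample data); the idea is simply to reject any line whose field count does not match the expected keys.

<?php
// Hypothetical helper: returns an associative array for a valid line,
// or null when the number of fields does not match the number of keys.
function parseBookLine(string $line, array $keys): ?array
{
    $fields = preg_split('/ {2,}/', trim($line));
    if (count($fields) !== count($keys)) {
        return null; // too few or too many items on this line
    }
    return array_combine($keys, $fields);
}

// Assumed column names for the eight columns of the sample data
$keys = array('title', 'author', 'isbn', 'hardcover', 'hc-quantity',
              'softcover', 'sc-quantity', 'e-book');

$ok  = parseBookLine("Me Before You   Jojo Moyes  0670026603  18.95   2   17.50   1   17.25", $keys);
$bad = parseBookLine("Just Mercy  Bryan Stevenson 0812994520  17.50   8   16.25   10  16.25", $keys);

var_dump($ok['author']); // string(10) "Jojo Moyes"
var_dump($bad);          // NULL

The second line is copied from the text as posted in the question, where only a single space separates the author from the ISBN, so it yields seven fields instead of eight and is rejected rather than silently mis-assigned.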