How to use pdf2htmlEX to convert a PDF file to an HTML file in PHP


How can I use pdf2htmlEX to convert a PDF file to an HTML file in PHP?

Here is the link:

https://github.com/coolwanglu/pdf2htmlEX

If anyone knows how to do this, please help.

Thanks in advance.

I was working on this problem and got it working. Sharing the code in case it helps someone :)

It requires pdf2htmlEX and pdftocairo to be installed on Linux.
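One way to confirm that both tools are available before attempting a conversion is to look for them on the PATH. This is only a sketch: `find_missing_tools` is a hypothetical helper, not part of pdf2htmlEX or PHP, and it assumes the binaries are installed under their default names.

```php
<?php
// Hypothetical helper: returns the names from $tools that are not found
// as executable files in any directory listed in $pathEnv.
function find_missing_tools(array $tools, string $pathEnv): array
{
    // Split the PATH string and drop empty entries.
    $dirs = array_filter(explode(PATH_SEPARATOR, $pathEnv), 'strlen');
    $missing = array();
    foreach ($tools as $tool) {
        $found = false;
        foreach ($dirs as $dir) {
            $candidate = $dir . DIRECTORY_SEPARATOR . $tool;
            if (is_file($candidate) && is_executable($candidate)) {
                $found = true;
                break;
            }
        }
        if (!$found) {
            $missing[] = $tool;
        }
    }
    return $missing;
}

// Check the two tools this answer depends on.
$missing = find_missing_tools(array('pdf2htmlEX', 'pdftocairo'), (string) getenv('PATH'));
if ($missing) {
    echo 'Missing tools: ' . implode(', ', $missing) . PHP_EOL;
}
```

If the list is not empty, install the missing tools before handling any uploads.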

// Accept only uploads whose file extension is .pdf.
$ext = pathinfo($_FILES['file']['name'], PATHINFO_EXTENSION);
$allowedExt = array('pdf');
if (in_array(strtolower($ext), $allowedExt)) {
    $upload_file = time() . '.pdf';
    if (!file_exists('upload_directory')) {
        mkdir('upload_directory', 0777, true);
    }
    if (move_uploaded_file($_FILES['file']['tmp_name'], 'upload_directory/' . $upload_file)) {
        $unq_no = 1; // Can be from database entry
        $file_name = $upload_file;
        $pdf_path = 'upload_directory/' . $file_name;
        $name = pathinfo($file_name, PATHINFO_FILENAME);
        $save_path = 'upload_directory/' . $unq_no;
        // The recursive mkdir below also creates $save_path
        // (the original used CakePHP's Folder class for that).
        $pdf_thumb_save_path = $save_path . '/' . $name;
        if (!file_exists($pdf_thumb_save_path)) {
            mkdir($pdf_thumb_save_path, 0777, true);
        }
        // Escape every path before it reaches the shell.
        $pdf_arg = escapeshellarg($pdf_path);
        // Lower-case letters in --embed keep CSS (c), fonts (f) and images (i) as separate files.
        shell_exec('pdf2htmlEX --dest-dir ' . escapeshellarg($save_path) . " --embed cfi --fit-width 760 --hdpi 72 $pdf_arg");
        // Render the first page as a PNG thumbnail.
        shell_exec("pdftocairo -png -singlefile $pdf_arg " . escapeshellarg($pdf_thumb_save_path));
        // rename_font.fs is a custom script that is not shown here; replace /path/ with its
        // real location (the original also used /var/app/current/webroot/img/uploads/).
        $rename_script = '/path/rename_font.fs';
        for ($i = 1; $i <= 8; $i++) {
            $font = "$save_path/f$i.woff";
            if (file_exists($font)) {
                shell_exec($rename_script . ' ' . escapeshellarg($font) . " f$i " . escapeshellarg("$save_path/"));
            }
        }

        // $base_url is not defined in this snippet; it is assumed to be the site's base URL,
        // under which uploads/pdfs/html/ serves the converted files.
        $base_folder_path = 'uploads/pdfs/html/' . $unq_no . '/';
        $file_path = $base_url . $base_folder_path . $name . '.html';
        $css_path = $base_url . $base_folder_path . $unq_no . '/';
        chmod($pdf_thumb_save_path, 0777);
        // Prefix each stylesheet link with $css_path so the separate CSS files resolve.
        $current_data = file_get_contents($file_path);
        if ($current_data !== false) {
            $modified_data = str_replace('"stylesheet" href="', '"stylesheet" href="' . $css_path, $current_data);
            file_put_contents($pdf_thumb_save_path . '.html', $modified_data);
        }
    }
}
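shell_exec() does not report whether the tool succeeded, so a failed conversion can go unnoticed. A minimal sketch of one alternative, assuming a Linux shell: build the command from an argument list, escape each argument, and read the exit code through exec(). The helper names `build_command` and `run_command` and the example paths are hypothetical; the pdf2htmlEX options are the ones used in the code above.

```php
<?php
// Hypothetical helper: joins a program name and its arguments into one
// command string, escaping each part for the shell.
function build_command(array $argv): string
{
    return implode(' ', array_map('escapeshellarg', $argv));
}

// Hypothetical helper: runs the command and returns array(exit code, output lines).
// stderr is merged into the output so error messages are captured too.
function run_command(array $argv): array
{
    $output = array();
    $exitCode = 0;
    exec(build_command($argv) . ' 2>&1', $output, $exitCode);
    return array($exitCode, $output);
}

// Placeholder paths; the file name contains a space on purpose.
$cmd = array(
    'pdf2htmlEX', '--dest-dir', 'upload_directory/1', '--embed', 'cfi',
    '--fit-width', '760', '--hdpi', '72', 'upload_directory/my file.pdf',
);
echo build_command($cmd) . PHP_EOL;

// To actually run the conversion and log failures:
// list($code, $lines) = run_command($cmd);
// if ($code !== 0) {
//     error_log("pdf2htmlEX failed ($code): " . implode("\n", $lines));
// }
```

Passing arguments as a list keeps the escaping in one place, which matters once file names come from user uploads.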

Have you looked at the wiki?

For example, see: https://github.com/coolwanglu/pdf2htmlEX/wiki/Quick-Start

I used the Scribd Platform API; it was easy to implement and gave the best results.

Thanks