Good evening, dear Aspose developers.
I need your help with a pesky issue concerning PDF-to-HTML conversion.
Locally (Windows/GlassFish) the application works perfectly fine, but when deployed to WebLogic on our Linux machine, it fails to convert a simple one-page PDF file into HTML. Specifically, all the text content of the file is skipped during conversion: we get an HTML page with all the images but no text whatsoever, and no exception is thrown.
So, for example, on the Windows machine we get this:
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8" />
<title>
</title>
<link rel="stylesheet" type="text/css" href="converted_files/style.css" />
</head>
<body>
<object data="converted_files/img_01.svg" type="image/svg+xml" class="stl_01">
<embed src="converted_files/img_01.svg" type="image/svg+xml" class="stl_01" />
</object>
<div class="stl_02"><span class="stl_03">This is a test pdf file</span></div>
<div class="stl_04"><span class="stl_03">Converted from pdf into html</span></div>
</body>
</html>
By contrast, this is what we get on our Linux machine (all the text content disappears, and this happens with every PDF file we try to convert):
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8" />
<title>
</title>
<link rel="stylesheet" type="text/css" href="converted_files/style.css" />
</head>
<body>
<object data="converted_files/img_01.svg" type="image/svg+xml" class="stl_01">
<embed src="converted_files/img_01.svg" type="image/svg+xml" class="stl_01" />
</object>
</body>
</html>
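Because no exception is thrown, the only way to notice the failure is to inspect the generated HTML. A minimal sketch of such a check, using only the Java standard library, might look like the following. The class name `ConvertedHtmlCheck` and the method `hasVisibleText` are hypothetical names chosen for illustration and are not part of the Aspose API. The regex-based tag stripping is an assumption that suits the simple output shown above, not a general HTML parser.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Aspose): reports whether a converted
// HTML page contains any text outside of tags inside its <body>.
public class ConvertedHtmlCheck {

    // Captures everything between <body ...> and </body>.
    private static final Pattern BODY =
            Pattern.compile("<body[^>]*>(.*?)</body>",
                    Pattern.DOTALL | Pattern.CASE_INSENSITIVE);

    // Matches any single tag such as <object ...>, <embed ... /> or </div>.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    /** Returns true if the body holds non-whitespace text outside of tags. */
    public static boolean hasVisibleText(String html) {
        Matcher m = BODY.matcher(html);
        // Fall back to the whole input if no <body> element is found.
        String body = m.find() ? m.group(1) : html;
        String text = TAG.matcher(body).replaceAll("");
        return !text.trim().isEmpty();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java ConvertedHtmlCheck <converted.html>");
            return;
        }
        String html = new String(
                Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        System.out.println(hasVisibleText(html)
                ? "text present"
                : "no text found: conversion probably dropped the text content");
    }
}
```

Run against the two pages above, this should report text for the Windows output and none for the Linux output. Note that character entities such as `&nbsp;` would count as text in this sketch.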
The details:
- First, we create a Document and embed fonts into it, as recommended here: http://www.aspose.com/docs/display/pdfjava/Embedding+Fonts+in+an+existing+PDF+file
- Second, we optimize the document and convert it to HTML, saving the result to a temporary file.
Thanks in advance for your help.