We want to convert text, image, and HTML files into a single multi-page
PDF, based on the files selected.
// Create Aspose PDF Document object
Aspose.Pdf.Document m_PDFDocument = new Aspose.Pdf.Document();
System.IO.MemoryStream mOutStream = new MemoryStream();
foreach (string FilePath in Filnames)
{
string ext = System.IO.Path.GetExtension(FilePath);
byte[] m_BinaryData = System.IO.File.ReadAllBytes(FilePath);
// Path.GetExtension returns the extension with its leading dot, e.g. ".txt"
if (ext.ToUpper() == ".TXT")
{
Aspose.Pdf.Page mPage = m_PDFDocument.Pages.Add();
System.Text.ASCIIEncoding enc = new System.Text.ASCIIEncoding();
Aspose.Pdf.Text.TextFragment t2 = new Aspose.Pdf.Text.TextFragment(enc.GetString(m_BinaryData));
mPage.Paragraphs.Add(t2);
}
else if (ext.ToUpper() == ".HTML")
{
// Create HTML PDF Document
Aspose.Pdf.Document mHTMLPDFDoc;
HtmlLoadOptions htmlLoadOptions = new HtmlLoadOptions();
System.IO.MemoryStream mStream;
mStream = new MemoryStream(m_BinaryData);
mHTMLPDFDoc = new Aspose.Pdf.Document(mStream, htmlLoadOptions);
foreach (Aspose.Pdf.Page mPDFPage in mHTMLPDFDoc.Pages)
{
m_PDFDocument.Pages.Add(mPDFPage);
}
}
else
{
//Image conversion Code.
}
}
m_PDFDocument.Save(mOutStream);
Issue: The code snippet above uses the new DOM approach. The HTML document is
not converted to PDF properly: it loses its font-related information and shows
junk characters. Please have a look at the attached “Output_Sample.pdf”. As
noted in the observation below, the issue only reproduces when we add the pages
of the Document object that represents the HTML content to the main PDF object,
which contains the PDF representation of different document types such as
text, images, and HTML.
We have also tried creating an HtmlFragment object from the HTML string and
adding it to the page, but it throws an Out of Memory exception for one of our
users' HTML files during conversion to PDF. That same file can be converted to
PDF using the new DOM approach, but with the issue mentioned above.
mPage.Paragraphs.Add(mHTMLFragement);
Observation: If we create the HTML PDF object and call its “Save” method
directly, instead of adding its pages to the main PDF object, the HTML is
converted to PDF properly. Have a look at the code below:
// Create HTML PDF Document
Aspose.Pdf.Document mHTMLPDFDoc;
HtmlLoadOptions htmlLoadOptions = new HtmlLoadOptions();
System.IO.MemoryStream mStream;
mStream = new MemoryStream(m_BinaryData);
mHTMLPDFDoc = new Aspose.Pdf.Document(mStream, htmlLoadOptions);
mHTMLPDFDoc.Save(mOutStream);
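Given this observation, one workaround that may be worth trying is to save the HTML-based document to an intermediate stream first, reload that stream as an ordinary PDF, and only then add its pages to the main PDF object. The sketch below is untested against the problem file. AppendHtmlAsPdf and HtmlAppendSketch are hypothetical names, not part of Aspose.Pdf. That the save/reload round trip keeps the font information intact is an assumption based only on the behaviour described above.

```csharp
using System.IO;
using Aspose.Pdf;

static class HtmlAppendSketch
{
    // Hypothetical helper (not an Aspose.Pdf API): converts HTML bytes to a
    // standalone PDF in memory, reloads it, and appends its pages to target.
    public static void AppendHtmlAsPdf(Document target, byte[] htmlBytes)
    {
        // The streams are intentionally left open: the library may still read
        // from them until the target document is saved (assumption).
        MemoryStream htmlStream = new MemoryStream(htmlBytes);
        MemoryStream intermediate = new MemoryStream();

        // Step 1: convert HTML to PDF on its own, which converts properly
        // according to the observation above.
        Document htmlDoc = new Document(htmlStream, new HtmlLoadOptions());
        htmlDoc.Save(intermediate);

        // Step 2: reload the saved bytes as a plain PDF document.
        intermediate.Position = 0;
        Document reloaded = new Document(intermediate);

        // Step 3: append the reloaded pages to the main document.
        foreach (Page page in reloaded.Pages)
        {
            target.Pages.Add(page);
        }
    }
}
```

In the loop from the first snippet, the HTML branch would then call HtmlAppendSketch.AppendHtmlAsPdf(m_PDFDocument, m_BinaryData) instead of adding the pages of mHTMLPDFDoc directly.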