Hi, we are using Aspose.PDF for Java to extract Arabic text from PDF files.
The problem is that the extracted text comes out garbled (it looks encrypted). Our code:
public String getString() throws Exception {
    com.aspose.pdf.Document pdfDocument = null;
    String extractedText = "";
    try {
        // Load from the file path unless an input stream was supplied.
        if (inputStream == null) {
            pdfDocument = new com.aspose.pdf.Document(this.path);
        } else {
            pdfDocument = new com.aspose.pdf.Document(this.inputStream);
        }
        // Collect the text of every page.
        com.aspose.pdf.TextAbsorber textAbsorber = new com.aspose.pdf.TextAbsorber();
        pdfDocument.getPages().accept(textAbsorber);
        extractedText = textAbsorber.getText();
    } finally {
        // The constructor may have thrown, in which case pdfDocument is still null.
        if (pdfDocument != null) {
            pdfDocument.freeMemory();
            pdfDocument.dispose();
            pdfDocument.close();
        }
    }
    return extractedText;
}
Attached are the result of the text extraction and the sample PDF file.
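In case it helps with diagnosis, the garbling can be characterised with a small JDK-only check such as the one below. This is only a sketch: the class and method names (ExtractedTextCheck, arabicLetterRatio, codePoints) are our own and not part of Aspose, and the sample string in main is a placeholder for the value returned by getString(). A low ratio would suggest that the extracted characters are not Arabic code points at all, while a high ratio with odd-looking output would point towards character ordering or presentation forms instead.

```java
import java.lang.Character.UnicodeBlock;

// Hypothetical helper (not part of Aspose): inspects extracted text using only the JDK.
class ExtractedTextCheck {

    /** Returns the fraction of letters that belong to an Arabic Unicode block (0.0 if there are no letters). */
    static double arabicLetterRatio(String text) {
        int letters = 0;
        int arabic = 0;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (!Character.isLetter(cp)) {
                continue; // skip digits, whitespace, punctuation
            }
            letters++;
            UnicodeBlock block = UnicodeBlock.of(cp);
            if (block == UnicodeBlock.ARABIC
                    || block == UnicodeBlock.ARABIC_SUPPLEMENT
                    || block == UnicodeBlock.ARABIC_PRESENTATION_FORMS_A
                    || block == UnicodeBlock.ARABIC_PRESENTATION_FORMS_B) {
                arabic++;
            }
        }
        return letters == 0 ? 0.0 : (double) arabic / letters;
    }

    /** Formats the first `limit` code points as "U+XXXX" values separated by spaces. */
    static String codePoints(String text, int limit) {
        StringBuilder sb = new StringBuilder();
        int count = 0;
        for (int i = 0; i < text.length() && count < limit; count++) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Placeholder input; in practice this would be the string returned by getString().
        String extracted = "\u0633\u0644\u0627\u0645";
        System.out.println("Arabic letter ratio: " + arabicLetterRatio(extracted));
        System.out.println("First code points:   " + codePoints(extracted, 20));
    }
}
```

Posting the first few code points alongside the attachment should make it easier to tell which of the two cases applies.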
Could you please help us resolve this issue?
Thanks in advance.