Hi,
Extracting text from PDF files uses too much memory. My code looks like this:
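Aspose.Pdf.Text.TextAbsorber textAbsorber = new Aspose.Pdf.Text.TextAbsorber();
string text;
// The using block disposes the document, so no separate Dispose() call is needed.
using (Aspose.Pdf.Document doc = new Aspose.Pdf.Document(fileName))
{
    // Extract the text of all pages in one pass.
    doc.Pages.Accept(textAbsorber);
    text = textAbsorber.Text;
}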
I need to extract the text of each page separately, so I will call doc.Pages[x].Accept(textAbsorber) for every page. Because TextAbsorber is not disposable, I will have to call GC.Collect() after every call.
Do you have a solution or suggestion for my case?
Some files throw an OutOfMemoryException. You can download an example file from the following link: http://www.filedropper.com/mukayeseraporu