Hi all, I am trying to use a TextFragmentAbsorber to get an entire line of text from a PDF.
I am certain that the text we are looking for is all on one line; I have engineered it to be the case since PDF searching across lines seems to be difficult.
I am using a TextAbsorber to get the text of the page I am searching, so that I can try to track down the cause of this issue. The text from that absorber is attached as Aspose_PageText.txt; it was copied from the Text Visualizer in Visual Studio and pasted into a text file for attaching.
The text we are giving to the TextFragmentAbsorber is:
Prepared by: Reviewed by: Reviewed by Accounting:
I am finding that the page text contains the search text, but the TextFragmentAbsorber returns 0 TextFragments.
Code:
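// Requires: using Aspose.Pdf.Text;
// Literal (non-regex) search for the full line of text.
TextFragmentAbsorber Absorber = new TextFragmentAbsorber("Prepared by: Reviewed by: Reviewed by Accounting:");
// TempPDF is the already-loaded PDF document; Pages is 1-based, so this searches the first page.
TempPDF.Pages[1].Accept(Absorber);
FinalTextFragments = Absorber.TextFragments;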
From here, when I check FinalTextFragments.Count I get 0 every time.
Am I doing something incorrectly? Is the string not as-presented? Do I need to provide options? I am not doing a regex search, I am searching for literal text, but do I need to run this as regex anyway?
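On the "is the string not as-presented" question, here is a minimal sketch of how the page text could be inspected for characters that look like ordinary spaces but are not (for example a non-breaking space, U+00A0), since those would make a literal match fail. It uses only the .NET base library and no Aspose API. WhitespaceProbe and Reveal are hypothetical names made up for this illustration, and it is not confirmed that hidden characters are the cause here.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of Aspose.PDF.
static class WhitespaceProbe
{
    // Returns the input with every character outside printable ASCII
    // (space through '~') replaced by a visible \uXXXX escape, so that
    // tabs, line breaks, and look-alike spaces can be seen.
    public static string Reveal(string text)
    {
        var sb = new StringBuilder();
        foreach (char c in text)
        {
            if (c >= ' ' && c < 127)
            {
                sb.Append(c); // plain printable ASCII, keep as-is
            }
            else
            {
                sb.Append("\\u").Append(((int)c).ToString("X4"));
            }
        }
        return sb.ToString();
    }
}
```

Passing the string obtained from the TextAbsorber to WhitespaceProbe.Reveal and comparing the result with the search string character by character would show whether the two really are identical. Text copied through a visualizer and a text file may not preserve such characters, so the check is best run on the in-memory string.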
Any guidance is appreciated. I am in a bit of a time crunch. I have used TextFragmentAbsorbers successfully many times, but I am really not sure why I can't find text that I know is present (I pulled the text from the PDF in the first place, so I know it is in there!).