Channel: Aspose.Pdf Product Family

Not Finding TextFragment using TextFragmentAbsorber

Hi all,  I am trying to use a TextFragmentAbsorber to get an entire line of text from a PDF.
I am certain that the text we are looking for is all on one line; I have engineered it to be the case since PDF searching across lines seems to be difficult.

I am also using a TextAbsorber to extract the text of the page I am searching, so that I can try to track down the cause of this issue. The absorber's output is attached as Aspose_PageText.txt; I copied it from the Text Visualizer in Visual Studio and pasted it into a text file for attaching.

The text we are giving to the TextFragmentAbsorber is:

Prepared by:                                      Reviewed by:                                          Reviewed by Accounting:

I am finding that the page text contains the search text, but the TextFragmentAbsorber returns 0 TextFragments.
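
One way to check whether the search string and the page text really match character for character is to list the whitespace runs in each and compare them. The following is only a diagnostic sketch in plain C#: WhitespaceInspector and DescribeWhitespaceRuns are hypothetical names, not part of Aspose.Pdf, and the input would be whatever string the TextAbsorber returned.

```csharp
using System;
using System.Collections.Generic;

public static class WhitespaceInspector
{
    // Hypothetical diagnostic helper (not an Aspose.Pdf API): reports each run of
    // identical whitespace characters as "U+XXXX xN", so ordinary spaces, tabs,
    // and non-breaking spaces can be told apart.
    public static List<string> DescribeWhitespaceRuns(string text)
    {
        var runs = new List<string>();
        int i = 0;
        while (i < text.Length)
        {
            if (!char.IsWhiteSpace(text[i]))
            {
                i++;
                continue;
            }

            // Consume the whole run of this one whitespace character.
            char c = text[i];
            int start = i;
            while (i < text.Length && text[i] == c)
            {
                i++;
            }
            runs.Add($"U+{(int)c:X4} x{i - start}");
        }
        return runs;
    }
}
```

Running this on both the search string and the matching line of the page text would show whether the gaps are made of the same characters and the same counts. It says nothing about how the text is stored inside the PDF itself.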

Code:
TextFragmentAbsorber Absorber = new TextFragmentAbsorber("Prepared by:                                      Reviewed by:                                          Reviewed by Accounting:");
TempPDF.Pages[1].Accept(Absorber);
FinalTextFragments = Absorber.TextFragments;


From here, when I check FinalTextFragments.Count I get 0 every time.

Am I doing something incorrectly?  Is the string not as-presented?  Do I need to provide options?  I am not doing a regex search, I am searching for literal text, but do I need to run this as regex anyway?
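
To try the regex route, the literal phrase could be converted into a pattern that accepts any amount of whitespace between the words. This is a minimal sketch under the assumption that the long runs of spaces are what prevents the literal match; SearchPatternHelper and ToWhitespaceTolerantPattern are hypothetical names, and whether this resolves the zero-result case is untested.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class SearchPatternHelper
{
    // Hypothetical helper (not an Aspose.Pdf API): turns a literal phrase into a
    // regex that matches one or more whitespace characters wherever the phrase
    // contains whitespace. Each word is escaped so it is still matched literally.
    public static string ToWhitespaceTolerantPattern(string phrase)
    {
        string[] words = Regex.Split(phrase.Trim(), @"\s+");
        return string.Join(@"\s+", words.Select(w => Regex.Escape(w)));
    }
}
```

The resulting pattern could then be passed to the absorber with regex searching enabled, for example new TextFragmentAbsorber(pattern, new TextSearchOptions(true)), in place of the literal string in the code above.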

Any guidance is appreciated -- I am in a bit of a time crunch and I have used TextFragmentAbsorbers successfully many times, but I am really not sure why I can't find text that I know is present (I pulled the text from the PDF in the first place, so I know it is in there!)



