Channel: Aspose.Pdf Product Family

Java code to read a "Marathi" (Indian local language) PDF, store it in MySQL, and retrieve it

Hi,
I am developing a project in Java that reads data from a PDF written in Marathi (an Indian local language) and formats it, i.e. only the required fields will be stored in the database, e.g. name of voter, address, and age (we can use split() or any other String function for this).
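To make the field-splitting idea concrete, here is a minimal sketch. It assumes each extracted line looks like "name , address , age" with commas as separators, which may not match the real PDF layout; the class name VoterLineParser and the Voter holder are hypothetical names used only for illustration.

```java
import java.util.Arrays;

// Hypothetical helper: splits one extracted line into name, address and age.
// Assumes the layout "name , address , age"; the real PDF text may differ.
final class VoterLineParser {

    static final class Voter {
        final String name;
        final String address;
        final int age;

        Voter(String name, String address, int age) {
            this.name = name;
            this.address = address;
            this.age = age;
        }
    }

    static Voter parse(String line) {
        // Limit -1 keeps trailing empty fields so malformed lines are detected.
        String[] parts = line.split(",", -1);
        if (parts.length < 3) {
            throw new IllegalArgumentException("Expected name, address, age: " + line);
        }
        // First field is the name, last field is the age,
        // everything in between is the address (it may contain commas).
        String name = parts[0].trim();
        int age = Integer.parseInt(parts[parts.length - 1].trim());
        String address = String.join(",",
                Arrays.copyOfRange(parts, 1, parts.length - 1)).trim();
        return new Voter(name, address, age);
    }

    public static void main(String[] args) {
        // Unicode escapes spell a Devanagari name and place so the source file stays ASCII.
        Voter v = parse("\u0930\u093E\u092E , \u092A\u0941\u0923\u0947 , 25");
        System.out.println(v.name + " | " + v.address + " | " + v.age);
    }
}
```

split() works on Java's internal Unicode strings, so it behaves the same for Marathi text as for English, provided the text was decoded correctly in the first place.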

When a user searches by name, all information about that voter should be displayed. I tried to read the data from the PDF using UTF-8. It shows output, but not in the proper format, i.e. some Marathi words with stray characters in between them. I want to store clean "Marathi" data in MySQL and also retrieve it.
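For the MySQL part, the following is only a rough sketch of what I have in mind, using plain JDBC. The table name voters, its columns, and the class name VoterStore are my own placeholders, and it assumes the MySQL Connector/J driver is on the classpath and that the table was created with a Unicode character set such as utf8mb4.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical storage helper. Assumed table (not from any real schema):
//   CREATE TABLE voters (name VARCHAR(200), address VARCHAR(500), age INT)
//   DEFAULT CHARACTER SET utf8mb4;
final class VoterStore {

    // Builds a Connector/J URL that asks the driver to talk to the server in UTF-8.
    static String jdbcUrl(String host, String database) {
        return "jdbc:mysql://" + host + "/" + database
                + "?useUnicode=true&characterEncoding=UTF-8";
    }

    // Parameters are bound as Java strings, so no manual byte conversion is needed.
    static void insert(Connection con, String name, String address, int age)
            throws SQLException {
        String sql = "INSERT INTO voters (name, address, age) VALUES (?, ?, ?)";
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setString(1, name);
            ps.setString(2, address);
            ps.setInt(3, age);
            ps.executeUpdate();
        }
    }

    // Returns every matching row as "name, address, age".
    static List<String> findByName(Connection con, String name) throws SQLException {
        List<String> rows = new ArrayList<>();
        String sql = "SELECT name, address, age FROM voters WHERE name = ?";
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setString(1, name);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    rows.add(rs.getString(1) + ", " + rs.getString(2) + ", " + rs.getInt(3));
                }
            }
        }
        return rows;
    }
}
```

A connection would be opened with DriverManager.getConnection(VoterStore.jdbcUrl(host, database), user, password); the host, database name, and credentials depend on the local setup.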

I tried the following code to display the "Marathi" data in the console as an initial step. After that I will store it in MySQL and then display it. But the output shows only some Marathi words and some symbols.

The project also requires a "Marathi" keyboard, i.e. the user will enter data in Marathi and will get "Marathi" output.

Note: I also changed the default encoding in Eclipse by pressing ctrl+Enter. Encoding: UTF-8
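To show the kind of garbling I suspect, here is a small sketch, independent of any PDF library. It encodes a Marathi word as UTF-8 and decodes the bytes again with two different charsets. The class name EncodingRoundTrip is just a placeholder, and ISO-8859-1 stands in for whatever the platform default charset happens to be; I am not certain this is the only cause of the bad output.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo: shows how decoding UTF-8 bytes with the wrong charset
// turns Devanagari text into unrelated symbols.
final class EncodingRoundTrip {

    // Encodes the text as UTF-8, then decodes those bytes with the given charset.
    static String roundTrip(String text, Charset decodeWith) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, decodeWith);
    }

    public static void main(String[] args) {
        // The word "Marathi" in Devanagari, written with Unicode escapes.
        String marathi = "\u092E\u0930\u093E\u0920\u0940";

        // Same charset on both sides: the text survives unchanged.
        System.out.println(marathi.equals(roundTrip(marathi, StandardCharsets.UTF_8)));

        // Different charset on the decoding side: each UTF-8 byte becomes its own
        // character, so the result no longer matches the original.
        System.out.println(marathi.equals(roundTrip(marathi, StandardCharsets.ISO_8859_1)));
    }
}
```

Calling new String(bytes) without naming a charset decodes with the platform default, so the result depends on the machine the program runs on.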

The following is the code I tried as a first step:
---------------------------------------------------------------------------------------------------------------------
import java.io.IOException;
import java.io.PrintStream;

//iText imports
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.parser.PdfTextExtractor;

public class iTextReadDemo {
  public static void main(String[] args) {
    try {
      // Print through an explicit UTF-8 stream instead of converting the
      // String to bytes and back with the platform default charset.
      PrintStream out = new PrintStream(System.out, true, "UTF-8");

      PdfReader reader = new PdfReader("D://Vikram//Workspace//Projects//Election//List.pdf");
      int pageCount = reader.getNumberOfPages();
      out.println("This PDF has " + pageCount + " pages.");

      for (int pageNo = 1; pageNo <= pageCount; pageNo++) {
        // Use the loop counter here; the earlier version always read page 1.
        String page = PdfTextExtractor.getTextFromPage(reader, pageNo);
        out.println("Page Content:\n\n" + page + "\n\n");
      }
      reader.close();
    } catch (IOException e) {
      e.printStackTrace();
    }
  }
}






