Extract data from a scanned-image PDF file

Summary:

Use Case:

  1. The user uploads a file from the front end.
  2. Based on the user’s instructions, specific details are extracted from the document.
  3. The extracted details are stored in corresponding table columns, where they can be reviewed and processed further by the user.
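To make step 3 concrete, here is a minimal sketch of writing extracted details into table columns. It uses an in-memory SQLite database purely as a stand-in for the real database, and the table name `extracted_details`, the column names, and the sample values are all hypothetical, chosen only for illustration.

```python
import sqlite3

# Hypothetical schema: these column names are illustrative, not from the post.
COLUMNS = ("invoice_number", "invoice_date", "total_amount")


def store_extracted(conn, file_name, details):
    """Insert one row per uploaded file; fields that were not extracted become NULL."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS extracted_details ("
        "file_name TEXT, invoice_number TEXT, invoice_date TEXT, total_amount TEXT)"
    )
    row = [file_name] + [details.get(col) for col in COLUMNS]
    conn.execute("INSERT INTO extracted_details VALUES (?, ?, ?, ?)", row)
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for the real database
store_extracted(
    conn,
    "invoice_001.pdf",
    {"invoice_number": "INV-42", "total_amount": "199.00"},
)
rows = conn.execute("SELECT * FROM extracted_details").fetchall()
```

Leaving missing fields as NULL keeps the row reviewable: the user can see which details the extraction did not find and fill them in later.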

Current Approach:

| Step | Description                              |
|------|------------------------------------------|
| 1    | **Start**                                |
| 2    | **Upload PDF file**                      |
| 3    | **Extract content and instructions**     |
| 4    | **Convert to vector embeddings**         |
| 5    | **Store in 23ai vector database**        |
| 6    | **Perform vector search**                |
| 7    | **Pass relevant vectors to LLM**         |
| 8    | **LLM refines the result**               |
| 9    | **Output final refined result**          |
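The flow in steps 3 to 7 can be sketched roughly as follows. This is a toy illustration, not the actual implementation: `embed` is a simple word-count vector standing in for a real embedding model, a Python list stands in for the 23ai vector store, and the LLM call itself is left out. All function names and the sample text are assumptions made for the example.

```python
import math
import re
from collections import Counter


def chunk_text(text, size=40):
    """Split extracted text into chunks of about `size` words (step 3)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Toy embedding: a word-count vector. A real pipeline would call an embedding model (step 4)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def vector_search(store, query, k=2):
    """Return the k stored chunks most similar to the query (step 6)."""
    q = embed(query)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(instruction, chunks):
    """Assemble the text that would be handed to the LLM (step 7)."""
    context = "\n---\n".join(chunks)
    return f"Instruction: {instruction}\n\nContext:\n{context}"


document = (
    "Invoice number INV-42 issued to Acme Corp. "
    "Payment terms are net 30 days. "
    "The total amount due is 199.00 USD."
)
chunks = chunk_text(document, size=7)
store = [(c, embed(c)) for c in chunks]  # in-memory stand-in for the vector store (step 5)
top = vector_search(store, "What is the total amount due?", k=1)
prompt = build_prompt("Extract the total amount.", top)
```

Every stage after step 3 depends on text having been extracted from the PDF in the first place; if step 3 returns nothing, there is nothing to embed or search.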

This approach works well when the PDF is text-based.
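A scanned PDF typically stores each page as an image, so a plain text extractor returns little or nothing for it. One possible way to tell the two cases apart before running the pipeline is sketched below. It assumes the per-page text has already been pulled out by whatever extractor step 3 uses; the function name `has_text_layer` and both thresholds are hypothetical and would need tuning on real documents.

```python
def has_text_layer(page_texts, min_chars_per_page=20):
    """Return True if at least half of the pages yielded a meaningful amount of text.

    `page_texts` is a list with one entry per page: the extracted string,
    or None/empty when the extractor found nothing on that page.
    """
    if not page_texts:
        return False
    pages_with_text = sum(
        1 for text in page_texts if len((text or "").strip()) >= min_chars_per_page
    )
    return pages_with_text / len(page_texts) >= 0.5


# Illustrative inputs, not real extraction output.
text_pdf_pages = [
    "Invoice number INV-42 issued to Acme Corp.",
    "Total amount due: 199.00 USD",
]
scanned_pdf_pages = ["", "  \n", None]

text_based = has_text_layer(text_pdf_pages)
scanned = has_text_layer(scanned_pdf_pages)
```

Files for which this check returns False would likely need an OCR pass before step 3 can produce anything useful.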

Obstacle:

We’ve encountered a significant limitation in processing the PDF documents due to the following reasons:
