Oracle Text (MOSC)

Full-text indexing of PDF files

edited Apr 22, 2009 11:19PM in Oracle Text (MOSC) · 5 comments · Answered
Our customer requires full-text search of PDF files. Most of these files are created by a utility that runs text recognition on scanned documents and converts them to Adobe PDF format.

The customer is using Oracle 9i.

We have found that the native INSO filter fails to index many of these files. We have therefore implemented the workaround that uses the Adobe IFilter with the ifilter.bat / filtdump.exe solution.

Indexing is much improved. However, when indexing many files, the indexing process occasionally appears to become corrupted while processing a file (we suspect this happens with very large files). Subsequent files are then not indexed, although no error is reported in the CTXSYS.DR$INDEX_ERROR table.
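
Because no error is logged, one way to narrow this down might be to compare the files submitted for indexing, in order, against the files that can actually be found afterwards. The sketch below is only an illustration under assumptions: the function name `find_stall_point`, the file names, and the idea that a list of "confirmed indexed" files has already been gathered by some separate check are all hypothetical and not part of Oracle Text.

```python
from typing import Optional, Sequence, Set, Tuple


def find_stall_point(
    submitted: Sequence[str], indexed: Set[str]
) -> Optional[Tuple[int, str, bool]]:
    """Locate the first submitted file that is missing from the index.

    submitted: file names in the order they were handed to the indexer
               (hypothetical input, gathered outside this script).
    indexed:   file names confirmed searchable after the run
               (hypothetical input, gathered outside this script).

    Returns (position, name, all_later_missing), or None if nothing is missing.
    all_later_missing is True when every file after the first gap is also
    missing, which would match the "indexing stops after one file" symptom.
    """
    for pos, name in enumerate(submitted):
        if name not in indexed:
            later = submitted[pos + 1:]
            all_later_missing = all(n not in indexed for n in later)
            return pos, name, all_later_missing
    return None


if __name__ == "__main__":
    # Made-up example data, for illustration only.
    submitted = ["a.pdf", "b.pdf", "big_scan.pdf", "c.pdf", "d.pdf"]
    indexed = {"a.pdf", "b.pdf"}
    print(find_stall_point(submitted, indexed))
```

If the third value comes back True, the first missing file (or possibly the one submitted just before it) would be the natural candidate to check for size or content problems.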
