The TextExport utility of the Outside In engine is what you need for this requirement. WCC uses it for full-text indexing, where the entire contents of a file are exported to a text file and then sent to the database.
If you have a WCC environment set up, you can run and test this from the following location (assuming this is WCC 11g - 18.104.22.168.0 or higher):
Navigate to <mw_home>/Oracle_ECM1/oit/<platform>/lib/contentaccess
Then run ./textexport and it will list the parameters.
The generated output file will be a text file with the entire contents of the input file.
Hope this helps.
You can use Dynamic Converter, which converts a document to a web page so that the body can be viewed without the native application. This works only when the document is requested through a web browser. There is no out-of-the-box solution to archive the body of documents.
I'm not able to run the TextExport utility exe file on Windows.
error: '.' is not recognized as an internal or external command,
operable program or batch file.
Also, is there any documentation for this utility?
On Windows you don't need the leading dot (.). Just type:
textexport.exe
and it will show the options.
A sample manifest file is also listed in the options, so all you have to do is create a manifest.hda file similar to the one shown there and use it to run the command. Then verify the output.
I have created a manifest file and placed both the manifest and the input file in the folder where the textexport utility resides,
and ran the command textexport startbatch... It shows nothing.
Is this the correct way to do it, or do I need to specify the manifest file name?
Any documentation on this would help.
The command syntax is as follows:
textexport.exe -c manifest.hda
Sample manifest file contents are as follows:
Run this and check that the output.txt file contains the Word file's contents in tokenized form.
Documentation link: Oracle Outside In Technology
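If you later want to launch the same command from a program rather than by hand, a minimal Java sketch could look like the following. It only wraps the `-c manifest.hda` invocation quoted above. The class and method names are made up for illustration, and the working directory is an assumption: pass your own contentaccess folder as the first argument.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class TextExportRunner {

    // On Windows the utility is started as textexport.exe; elsewhere as textexport.
    static String executableName(String osName) {
        boolean windows = osName.toLowerCase(Locale.ROOT).startsWith("windows");
        return windows ? "textexport.exe" : "textexport";
    }

    // Builds the command line quoted in this thread: <executable> -c <manifest>.
    static List<String> buildCommand(String executable, String manifest) {
        return Arrays.asList(executable, "-c", manifest);
    }

    // Runs the command in workDir and returns the exit code of the process.
    static int run(File workDir, List<String> command)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.directory(workDir);
        pb.inheritIO(); // show whatever the utility prints in this console
        return pb.start().waitFor();
    }

    public static void main(String[] args) throws Exception {
        // Assumption: args[0] is the folder that holds textexport and manifest.hda.
        File workDir = new File(args.length > 0 ? args[0] : ".").getAbsoluteFile();
        String name = executableName(System.getProperty("os.name"));
        // An absolute path avoids depending on how the OS resolves relative names.
        String exe = new File(workDir, name).getAbsolutePath();
        int exitCode = run(workDir, buildCommand(exe, "manifest.hda"));
        System.out.println("textexport exited with code " + exitCode);
    }
}
```

The exit code is printed only as a rough signal. Opening the output file named in the manifest is still the way to confirm that the export worked.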
Thanks Srinath, this gives me exactly what I am looking for.
One more question: does the utility provide a way to get the metadata along with the content of the documents?
I'm aware that this can be achieved through Archiver.
I am looking for the content of the document plus its metadata in a single file.
For that you may have to write Java code that searches for the items, iterates through the result set, and performs these two actions for each item:
1. Run the textexport utility on the content item's native file.
2. Export the metadata from the DOC_INFO result set to a corresponding file.
This way you can use a single custom utility to achieve your requirement.
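As a rough illustration of the last step, here is a minimal Java sketch that merges metadata and the exported text into a single file. It assumes the text file has already been produced by textexport and is UTF-8 encoded. The metadata values are hard-coded placeholders standing in for whatever you read from the DOC_INFO result set, and the "name=value" layout with a "---" separator is just one possible format. The search and the DOC_INFO call are not shown.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

final class ContentWithMetadata {

    // Writes metadata as "name=value" lines, then a separator line, then the text.
    static String combine(Map<String, String> metadata, String exportedText) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> field : metadata.entrySet()) {
            sb.append(field.getKey()).append('=').append(field.getValue()).append('\n');
        }
        sb.append("---\n");
        sb.append(exportedText);
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: ContentWithMetadata <exported.txt> <combined.txt>");
            return;
        }
        // Placeholder values; real ones would come from the DOC_INFO result set.
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("dDocName", "EXAMPLE_001");
        metadata.put("dDocTitle", "Example document");

        Path exported = Paths.get(args[0]); // text file written by textexport
        Path combined = Paths.get(args[1]); // single file with metadata + content
        // Assumption: the exported text is UTF-8; adjust if your output differs.
        String text = new String(Files.readAllBytes(exported), StandardCharsets.UTF_8);
        Files.write(combined, combine(metadata, text).getBytes(StandardCharsets.UTF_8));
    }
}
```

A LinkedHashMap is used so the metadata fields appear in the file in the order they were added.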
Do we need a manifest file for every document, or is there a way to add 'n' input files and 'n' output files in a single manifest?