We have many TIFF documents stored in UCM WebCenter Content and we need to perform redaction on all of these documents. We need to replace the original file with the new redacted file so that there is no trace of the original file. First, we need to do this as a one-off exercise on all documents that are currently stored, and second, on any new documents that are automatically stored in UCM.
My questions are:
- Is this feasible with AutoVue? (I believe that it is if done manually but can this be achieved as a batch process?)
- Is this much effort (days, weeks or months)?
- Do people who have used AutoVue believe that this is a valid solution?
I would be really interested to hear your opinions. I know it's a very open question, but I would still like to canvass opinion.
We need to redact all credit card numbers within our TIFF documents in WebCenter Content 11g.
We have a few million scanned documents that are stored as TIFFs, although not all of these contain credit card details. We are only concerned with obscuring the 16-digit credit card number, but all of these are handwritten. To further complicate things, there is no standard form used, meaning that these details could appear on any part of the scanned image.
We first need to identify that a 16-digit number exists on the document and then redact that data.
Yes, we do have customers that are using AutoVue to redact documents.
Have you read: https://blogs.oracle.com/enterprisevisualization/entry/redaction_in_autovue
Most of our batch redaction is done in the following way:
1) Search the document for the required text.
2) Get the bounding rectangle of the found text.
3) Construct a Rectangle Markup Entity with the same coordinates (to hide the underlying text).
4) Convert the document with the opaque markup to TIFF.
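To make steps 1 to 3 a bit more concrete, here is a rough Python sketch of the geometry only. It does not use the AutoVue API: the `Word` record and the `find_redaction_rects` helper are hypothetical names, and the sketch assumes that some search or OCR layer has already returned each word with its bounding box in reading order. It looks for consecutive words that together contain exactly 16 digits and returns one covering rectangle per match, which is what you would then use for the opaque markup.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Word:
    """One recognised word and its bounding box (hypothetical format)."""
    text: str
    x0: float
    y0: float
    x1: float
    y1: float


Rect = Tuple[float, float, float, float]


def find_redaction_rects(words: List[Word], digits_needed: int = 16) -> List[Rect]:
    """Return one covering rectangle per run of consecutive digit-only words
    whose digits add up to exactly `digits_needed`."""
    rects: List[Rect] = []
    i = 0
    while i < len(words):
        count = 0
        j = i
        # Extend the run while each word is digits only (hyphens/spaces ignored).
        while j < len(words) and count < digits_needed:
            digits = re.sub(r"[\s-]", "", words[j].text)
            if not re.fullmatch(r"[0-9]+", digits):
                break
            count += len(digits)
            j += 1
        if count == digits_needed:
            run = words[i:j]
            # Union of the word boxes: this is the area to cover with markup.
            rects.append((min(w.x0 for w in run), min(w.y0 for w in run),
                          max(w.x1 for w in run), max(w.y1 for w in run)))
            i = j
        else:
            i += 1
    return rects
```

For example, four words "1234", "5678", "9012", "3456" sitting side by side would yield a single rectangle spanning all four boxes. A run that spills over two lines would produce one tall rectangle, so a real implementation would probably want to split runs per line.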
The problem with your situation is that you have TIFF images of your documents. These documents do not contain any searchable text.
Because AutoVue does not have character recognition, you would have to draw the markup rectangle manually.
The only way that I see this working is if you first convert the TIFF images to PDF using another tool that makes them searchable.
Also, our UCM integration can automatically check-in the redacted conversion, but will not delete the original. You will have to do it externally.
I presume you are aware that the IBR TIF to PDF converter requires a third-party tool from a company called CVISION (http://www.cvisiontech.com/). See http://docs.oracle.com/cd/E23943_01/doc.1111/e10724/c02_wcc.htm , the Tiff Converter section.
Not sure if you are aware, but CVISION also provides a handwriting recognition component that extends the integrated OCR component (http://www.cvisiontech.com/solutions/general/handwriting-recognition.html?lang=eng). It might be worth checking out for your use case.
As George said, assuming you can use CVISION to convert from TIF to PDF, including OCR of the handwritten credit card details, then the AutoVue APIs can be used to search for the CCN, automatically redact, burn, and check in the new TIF/PDF rendition/version.
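On the "search for CCN" part, a minimal sketch of how candidate numbers might be picked out of the OCR text before handing them to the redaction step. This is plain Python, not AutoVue or CVISION code, and `CARD_PATTERN`, `luhn_valid` and `find_card_numbers` are names I made up for illustration. It assumes 16 digits optionally grouped with single spaces or hyphens, and uses the Luhn checksum to drop random 16-digit strings such as reference numbers.

```python
import re
from typing import List

# 16 digits, optionally separated by single spaces or hyphens,
# and not part of a longer run of digits.
CARD_PATTERN = re.compile(r"(?<![0-9])[0-9](?:[ -]?[0-9]){15}(?![0-9])")


def luhn_valid(number: str) -> bool:
    """Luhn checksum: double every second digit from the right, sum, test mod 10."""
    total = 0
    for index, ch in enumerate(reversed(number)):
        d = int(ch)
        if index % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str, require_luhn: bool = True) -> List[str]:
    """Return the digit strings of all 16-digit candidates found in `text`."""
    hits: List[str] = []
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if not require_luhn or luhn_valid(digits):
            hits.append(digits)
    return hits
```

One caveat: with handwriting, a single misread digit will make a genuine card number fail the Luhn check, so for redaction it may be safer to call `find_card_numbers(text, require_luhn=False)` and accept some over-redaction rather than miss a real number.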
Interested to know how you get on (I have a couple of other WCC/AV redaction ops that require similar solution).