When OCR is too much of a good thing

By Steve Weissman posted 11-28-2012 10:40


As an information management consultant and best-practices instructor, I spend a lot of time touting the virtues of OCR, which we all know can save organizations gobs of time and money when capturing enterprise content. But there are times when it can be too much of a good thing, most notably in situations involving personal information.

OCR's greatest strength lies in its ability to pick up and interpret pretty much every letter and number on a given page. This does wonders for throughput and efficiency, but the technology’s indiscriminate nature also means it may capture personal information that really ought to be protected.

Healthcare is one arena in which this issue is especially acute, not least because the Health Insurance Portability and Accountability Act (HIPAA) includes specific language regarding privacy. In fact, the US Department of Health and Human Services published a document just two days ago on the subject of (deep breath) “Guidance Regarding Methods for De-identification of Protected Health Information in Accordance with the [HIPAA] Privacy Rule.”

Developed by the agency’s Office for Civil Rights (coincidentally known by the acronym OCR), it focuses on the removal of personal identifiers from people’s “protected health information.” But the detail it contains speaks well to the broader issue, and what it reminds me of is that optical character recognition, as great as it is, needs to be deployed in partnership with other compliance-aware solutions and personnel (yes, actual human beings) to ensure that data that shouldn't be made generally available isn't.
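
To make the "in partnership with other solutions" idea a bit more concrete, here is a minimal sketch of what a post-OCR redaction pass might look like. It is only an illustration: the `PATTERNS` table and the `redact` function are hypothetical names, the three regular expressions cover just US-style Social Security numbers, phone numbers, and email addresses, and nothing here comes from the HHS guidance itself.

```python
import re

# Hypothetical patterns for a few common identifier formats (assumption:
# US-style SSNs and phone numbers). Real documents need far broader rules.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def redact(ocr_text: str) -> tuple[str, dict[str, int]]:
    """Mask matches of each pattern and report how many were masked."""
    counts = {}
    for label, pattern in PATTERNS.items():
        # subn returns the rewritten text and the number of substitutions.
        ocr_text, hits = pattern.subn(f"[{label} REDACTED]", ocr_text)
        counts[label] = hits
    return ocr_text, counts


if __name__ == "__main__":
    page = "Patient SSN 123-45-6789, call 555-867-5309 or mail jane.doe@example.com."
    cleaned, counts = redact(page)
    print(cleaned)
    print(counts)
```

Pattern matching like this catches only what it has been told to look for. Names, addresses, and anything the OCR engine misreads will slip straight through, which is why the per-type counts are returned: they give a human reviewer something to check against the page.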

#ElectronicRecordsManagement #OCR #privacy #ScanningandCapture #Capture #InformationGovernance