The Skyhigh Security DLP engine uses Optical Character Recognition (OCR) to extract text from supported image files. You can use policies and Classifications in Skyhigh Security Service Edge or Skyhigh CASB to detect violations in the extracted text and generate incidents. OCR and Classifications are available for both Skyhigh Security Service Edge and Skyhigh CASB.
OCR extends DLP protection to tax paperwork, passports, credit card information, and any other personally identifiable data that could be uploaded to the cloud or shared as images. It also closes the gap where confidential content could still be shared as an image even though users are prevented from copying and pasting the data.
The OCR engine extracts text from images and evaluates it against the match rule criteria configured in your DLP policies. For instance, if an image of a credit card is encountered, the number is extracted and matched against the Classification and conditions configured in the DLP policy. Similarly, if sections of a design document are encountered as images, either standalone or embedded within another file, the text is extracted and matched against the fingerprint to detect and prevent the leak. No changes to DLP policies are needed, because the rules, exception criteria, and response rules apply to images as well.
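The extract-then-match flow described above can be illustrated with a short Python sketch. This is not the Skyhigh implementation: the pattern, the function names `luhn_valid` and `find_card_numbers`, and the sample string are all hypothetical, and the OCR step itself is replaced by a hard-coded string standing in for text recovered from an image. It only shows the general idea of matching extracted text against a credit card rule.

```python
import re

# Hypothetical rule: 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(extracted_text: str) -> list[str]:
    """Return digit strings in the extracted text that look like card numbers."""
    matches = []
    for m in CARD_PATTERN.finditer(extracted_text):
        digits = re.sub(r"[ -]", "", m.group())
        if luhn_valid(digits):
            matches.append(digits)
    return matches


# Stand-in for text an OCR engine might return from a photo of a card.
sample_text = "VALID THRU 12/29  4111 1111 1111 1111  J SMITH"
print(find_card_numbers(sample_text))
```

The checksum step is one common way to reduce false positives on arbitrary digit runs; whether the product's Classifications apply the same check is not stated here.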
If you purchase the OCR feature, it is enabled by default for Skyhigh Security Service Edge and Skyhigh CASB DLP policies. You can disable the feature to avoid any slowdown caused by OCR processing. For details, see Enable OCR.
- OCR only works with Classifications. It does not support legacy data identifiers.
- FedRAMP/GovCloud environments do not support OCR.
Supported File Types
The following file types are supported with OCR:
- JPEG, JPEG 2000, JFIF
- JB2, JBIG2