Save Shadow/Web DLP Evidence and Match Highlights
Advanced DLP: To access the Save Shadow/Web DLP Evidence feature, you need an additional entitlement. Contact Skyhigh Support or your account manager for assistance.
DLP Evidence is a copy of the content that violated a Web Data Loss Prevention (DLP) policy, captured during policy evaluation. Match highlights are excerpts from a document that show the highlighted keywords that violated a DLP policy, along with the surrounding text. Both the evidence copy and the document excerpts are associated with the corresponding incident and saved in your data storage so that you can perform additional forensics on generated incidents.
Skyhigh saves evidence files of up to 250 MB for Shadow/Web DLP incidents. This lets you review large evidence files related to potential data breaches, conduct in-depth analysis, and resolve incidents effectively. For details on the contents of Shadow/Web DLP evidence, see the Shadow/Web DLP Evidence Contents section.
You can download saved evidence files or view match highlights that are linked to Shadow/Web DLP incidents using:
- Shadow/Web DLP Policy Incident Cloud Card: Download evidence files or view match highlights linked to DLP incidents individually from the Shadow/Web DLP Policy Incident Cloud Card on the Policy Incidents page.
- API: Use the API to download evidence files or match highlight files associated with DLP incidents. You can also download all the evidence files or match highlight files for DLP incidents in bulk via API.
Getting Started
Follow these steps to get started with the Save Evidence and Match Highlights features:
- Enable Match Highlighting to store match highlights for Shadow/Web DLP incidents. For details, see Enable Match Highlighting.
- Configure your own data storage provider to store your evidence files or match highlight files. Currently, you can store your web evidence files or match highlight files only on Amazon Web Services. For details on how to set up storage, see Data Storage.
- Create a Web DLP Policy Rule and select Save Evidence as the response. For details, see Create a Shadow/Web DLP Policy.
How It Works
When you create a new rule in a Web DLP Policy, you can set an additional response named Save Evidence. When the Web DLP Policy is violated, the Save Evidence response is triggered, and the evidence files or match highlight files are saved with the generated incidents. For this to work, Match Highlighting should be enabled, and your own data storage provider must be configured to store the evidence files or match highlight files.
Deleting a DLP policy does not affect the evidence files or match highlight files already saved for that policy. A backup of each evidence file or match highlight file is retained in the data storage provider (AWS). You can use the Shadow/Web DLP Incident Cloud Card or the API to download those files from the data storage provider.
The data stored with the provider is encrypted. To decrypt the data, use one of the following methods:
- View match highlights linked to DLP incidents individually from the Shadow/Web DLP Policy Incident Cloud Card on the Policy Incidents page. For details, see View Match Highlights for Shadow/Web DLP Incidents.
- Download evidence files linked to DLP incidents individually from the Shadow/Web DLP Policy Incident Cloud Card on the Policy Incidents page. For details, see Download Shadow/Web DLP Evidence.
- Download evidence files or match highlight files associated with DLP incidents via API. You can also download all the evidence files or match highlight files for DLP incidents in bulk via API. For details, see Retrieve Evidence API.
Shadow/Web DLP Evidence Contents
Shadow/Web DLP evidence covers various file types that are stored in your data storage provider (such as AWS) and are linked to specific DLP incidents. This evidence is crucial for understanding and investigating potential data loss events in shadow and web services. It includes the following file types:
- Match Highlight Files. These files contain excerpts from documents that include highlighted keywords or phrases that violate DLP policies, along with surrounding contextual text.
    - Each match highlight file can include up to 250 match strings per classification, with a total limit of 4,000 match strings.
 
- Source Files. These files contain the complete, unmodified data processed by the DLP classification engine.
    - Each source file includes headers and encoded content, so it may not be human-readable, but it serves as an accurate and comprehensive record of the transferred data.
    - Files larger than 250 MB are not saved to your data storage provider.
 
- Extracted Files. These files consist of DLP content that has been detected and extracted from the original documents. This content is decoded and saved in its native format.
    - The extraction process facilitates easier review by DLP administrators, as these files present the relevant sensitive information in a more accessible format.
    - Files larger than 250 MB are not saved to your data storage provider.
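
The limits described in this section can be illustrated with a short Python sketch. It is not Skyhigh's implementation; it only models the documented numbers (250 MB per file, 250 match strings per classification, 4,000 match strings in total). The function names, the interpretation of 1 MB as 1024 × 1024 bytes, and the assumption that match strings are kept in detection order until a limit is reached are all hypothetical and used for illustration only.

```python
from collections import defaultdict

# Documented limits. Treating 1 MB as 1024 * 1024 bytes is an assumption.
MAX_EVIDENCE_BYTES = 250 * 1024 * 1024
MAX_MATCHES_PER_CLASSIFICATION = 250
MAX_MATCHES_TOTAL = 4000


def is_within_size_limit(file_size_bytes):
    """Return True if a source or extracted file is small enough to be saved."""
    return file_size_bytes <= MAX_EVIDENCE_BYTES


def cap_match_strings(matches):
    """Apply the per-classification and total limits to match strings.

    `matches` is an iterable of (classification, match_string) pairs.
    Keeping the earliest matches first is an assumption; the source
    documentation does not say which matches are dropped.
    """
    kept = []
    per_classification = defaultdict(int)
    for classification, match_string in matches:
        if len(kept) >= MAX_MATCHES_TOTAL:
            break  # total limit reached
        if per_classification[classification] >= MAX_MATCHES_PER_CLASSIFICATION:
            continue  # this classification is already at its limit
        per_classification[classification] += 1
        kept.append((classification, match_string))
    return kept
```

For example, under these assumptions an incident with 300 matches for a single classification would keep 250 of them in its match highlight file, and a 251 MB source file would not be saved.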
 
