Skyhigh Security

Create a Custom Dictionary

You can define a Custom Dictionary and its keywords to classify your sensitive data, for example, sensitive keywords specific to your organization. You can import dictionary keywords in .CSV format or enter keywords manually.

Create Custom Dictionary Classification

To create a custom dictionary:

  1. Log in to Skyhigh CASB.
  2. Go to Policy > DLP Policies > Classifications.
  3. Click Actions > Create Classification and configure the following:
    • Classification Name. Enter a name for this classification. For example, Medical Dictionary. Enter an optional description to describe its use or purpose.
    • Category. Select a Category from the list. For example, Sensitive.
    • Conditions. Click Select Criteria and select Dictionary. The Select Dictionaries cloud card appears.

NOTE: If you enable the Count each match string only one time checkbox, unique matching is applied separately to each dictionary in the classification. For example, if your classification has two dictionaries that contain the same keyword, the classification triggers two separate matches for that keyword.

 

  4. From the Dictionaries menu, click New. Enter a name and an optional description for the dictionary. For example, Medical Keywords.
  5. To enter keywords into the dictionary:
    • Click Add keyword to manually enter words or phrases.
    • Click the three-dot menu to:
      • Import.csv. Import the keywords you want to add to this dictionary from a CSV file. For more details, see Import Keywords from CSV.
      • Advanced Settings. Advanced settings are flags that give more information to the DLP scanning engine. Define a keyword as Case-sensitive, Starts with, or Ends with, and add a Score to weight individual entries. Scores can be negative or positive, from -99 to 999. The higher the score, the more weight the keyword contributes toward the threshold that triggers an incident. To learn more about Advanced Settings options, see Set Advanced Settings.
    • Click Save. The custom dictionary is added to the Dictionaries list. Select the custom dictionary to add it to the Classification editor. You can add one or more custom dictionaries to the Dictionaries list.

NOTE: You can select any number of Custom dictionaries from the Select Dictionaries cloud card, but only the first 10 Custom dictionaries are displayed on the Classification editor.

  6. Add more classification conditions as needed and click Save.

Your custom classification with a custom dictionary is saved to the selected category in the Classifications list. Add the classification to your data protection policies as needed.

Import Keywords from CSV 

When you import a dictionary from a CSV file, the file must start with a heading row that contains specific column names.

  • To import a basic (simple) dictionary, use the heading row phrase.
  • To import an advanced dictionary, use the heading row phrase, caseSensitive, startWith, endWith, score.

The dictionary import fails if the heading row is not added correctly.

Example for a basic dictionary:

"phrase"

  • Email address
  • Date of Birth
  • Confidential

Example for the advanced dictionary:

"phrase", "caseSensitive", "startWith", "endWith", "score"

  • email address, false, false, false, 5
  • date of birth, false, true, true, 12
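
The advanced file layout above can be generated with a short script. The following is a minimal sketch, not a Skyhigh tool: the function name build_advanced_csv is hypothetical, and it assumes the importer accepts standard comma-separated values without extra spaces or quoting, which this article does not confirm. The column names and the score range of -99 to 999 are taken from this article.

```python
import csv
import io

# Column names taken from the heading row described in this article.
ADVANCED_HEADER = ["phrase", "caseSensitive", "startWith", "endWith", "score"]


def build_advanced_csv(rows):
    """Return CSV text: the advanced heading row, then one line per keyword.

    Hypothetical helper for illustration only.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(ADVANCED_HEADER)
    for phrase, case_sensitive, start_with, end_with, score in rows:
        # The article states that scores range from -99 to 999.
        if not -99 <= score <= 999:
            raise ValueError(f"score out of range for {phrase!r}: {score}")
        writer.writerow([
            phrase,
            str(case_sensitive).lower(),  # written as true/false, as in the example rows
            str(start_with).lower(),
            str(end_with).lower(),
            score,
        ])
    return buffer.getvalue()


csv_text = build_advanced_csv([
    ("email address", False, False, False, 5),
    ("date of birth", False, True, True, 12),
])
print(csv_text)
```

Save the printed text to a .csv file before importing it. If the import rejects the file, compare the heading row with the one required above first.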

Custom Dictionary Use Cases

Count each match string only one time feature

Suppose you have a medical document with multiple instances of the keyword Patient Code, and you have set the score for this keyword to 10 in the custom dictionary. This means that a match is triggered only if the keyword Patient Code appears 10 or more times in the document. To avoid counting duplicate occurrences, select the Count each match string only one time checkbox. During policy evaluation, the keyword is then counted only once, even though its score is set to 10. To find this option in the UI, see Count each match string only one time.
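
The difference between counting every occurrence and counting each keyword once can be sketched as follows. This is an illustration only: the function count_matches and its count_once parameter are hypothetical, the search is a simple case-insensitive substring search, and how the DLP scanning engine actually tallies matches is not described in this article.

```python
def count_matches(text, keywords, count_once=False):
    """Count keyword occurrences in text (hypothetical model of the checkbox)."""
    lowered = text.lower()
    total = 0
    for keyword in keywords:
        occurrences = lowered.count(keyword.lower())
        if count_once:
            # Model of "Count each match string only one time":
            # repeated occurrences of the same keyword count as one.
            occurrences = min(occurrences, 1)
        total += occurrences
    return total


document = "Patient Code: A1. Patient Code: B2. Patient Code: C3."
print(count_matches(document, ["Patient Code"]))                   # 3
print(count_matches(document, ["Patient Code"], count_once=True))  # 1
```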

Set Advanced Settings Options on the Custom Dictionary Keywords List

Let's say you have an important, recurring medical document that contains confidential information or sensitive keywords that only authorized personnel should access. To protect the document, you can set advanced dictionary options that give the DLP scanning engine more precise information about when sensitive keywords appear beyond a specific limit. If a match is found, an incident is triggered to maintain the document's security.

To set Advanced Settings options such as weights (scores), case sensitivity, and partial string match for each keyword in a custom dictionary list, perform the following actions:

  1. Create a classification using a custom dictionary. Perform the initial steps of creating your dictionary classification as provided in steps 1 to 5 in the Create Custom Dictionary Classification section.
  2. Score. After adding the necessary keywords, you can set a different score for each keyword in the list by editing the default score [1], and set the threshold for the custom dictionary. For details, see Set the Threshold for the number of Keyword Matches in the Custom Dictionary. If the sum of all the triggered scores exceeds the threshold set for the custom dictionary, a match is triggered. For example, if you set the threshold for the dictionary to 25:
    • Case 1: A score is defined for one keyword, Patient Phone Number, and set to 10. When the keyword Patient Phone Number is found in the content, the total score is 10. This does not exceed the threshold set for the custom dictionary, so the classification does not trigger a match.
    • Case 2: Scores are defined for two keywords, Patient Phone Number and Patient Address. The score for Patient Phone Number is 10, and the score for Patient Address is 20. When both keywords are found in the content, the total score is 30. This exceeds the threshold set for the custom dictionary, so the classification triggers a match.
  3. Case sensitive. After adding the necessary keywords, you can set the case-sensitivity option for your keywords to refine matching and get the most accurate and relevant results.
    • Case 3: To match only the lowercase keyword patient name, select the Case Sensitive checkbox. If the keyword appears in the content in uppercase, that is, PATIENT NAME, a match is not triggered.
    • Case 4: If you want the keyword Patient Code to be matched regardless of its case (lower or upper), do not select the Case Sensitive checkbox. A match is then triggered based on the keyword and the other advanced dictionary options, and the case is ignored.
  4. Partial String Match (Starts with, Ends with). After adding the necessary keywords, you can set the Starts with or Ends with options for your keywords to perform a partial string match.
    • Case 5: The Starts with and Ends with options are activated for the keyword Medical Diagnosis. The engine checks whether the string begins with Medical and whether it ends with Diagnosis. If either condition or both conditions are true, a match is triggered.
  5. Combined example. Case 6 combines all three Advanced Settings options:
    • Case 6: The Ends with option is activated for the keyword DISCHARGE CODE. The keyword is set as case-sensitive and has a score of 2. The scanning engine searches for the word CODE in uppercase letters, and the keyword DISCHARGE CODE should not appear in the document more than twice.
    • Result: When a match is found for any of these options, the DLP scanning engine triggers an incident.
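
The score arithmetic in Case 1 and Case 2 can be worked through with a small sketch. Everything here is an assumption made for illustration: the names Keyword, total_score, and is_match are hypothetical, each keyword contributes its score once when it is present, matching is a case-insensitive substring search, and a total equal to the threshold is treated as a match (the text above says "exceeds", and neither case depends on that boundary).

```python
from dataclasses import dataclass


@dataclass
class Keyword:
    phrase: str
    score: int = 1  # the article gives 1 as the default score


def total_score(text, keywords):
    """Sum the scores of the keywords that are present in text."""
    lowered = text.lower()
    return sum(kw.score for kw in keywords if kw.phrase.lower() in lowered)


def is_match(text, keywords, threshold):
    # Assumption: reaching the threshold counts as a match.
    return total_score(text, keywords) >= threshold


dictionary = [Keyword("Patient Phone Number", 10), Keyword("Patient Address", 20)]
threshold = 25

case_1 = "Patient Phone Number: 555-0100"
case_2 = "Patient Phone Number: 555-0100, Patient Address: 1 Main St"

print(total_score(case_1, dictionary), is_match(case_1, dictionary, threshold))  # 10 False
print(total_score(case_2, dictionary), is_match(case_2, dictionary, threshold))  # 30 True
```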

Match all Keywords in the Custom Dictionary List

  1. Create a classification using a custom dictionary. Perform the initial steps of creating your dictionary classification as provided in steps 1 to 5 in the Create Custom Dictionary Classification section.
  2. Click i to view the keywords for the selected custom dictionary, which are displayed on the second side panel. Count the number of keywords in the selected custom dictionary. For example, the number of keywords found is 10.
  3. To match all keywords in the Custom Dictionary list, set the match threshold on the Classification editor equal to the number of items found in the dictionary list. For example, set the threshold as 10.
  4. To avoid counting duplicate matched keywords multiple times, enable the checkbox Count each match string only one time and save your classification.
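
The reason for combining a threshold equal to the keyword count with Count each match string only one time can be shown with a short sketch. It is a simplified model under stated assumptions: the function names are hypothetical, every keyword is assumed to have the default score of 1, and matching is a case-insensitive substring search.

```python
def raw_match_count(text, keywords):
    """Count every occurrence of every keyword (checkbox disabled)."""
    lowered = text.lower()
    return sum(lowered.count(kw.lower()) for kw in keywords)


def unique_match_count(text, keywords):
    """Count each keyword at most once (checkbox enabled)."""
    lowered = text.lower()
    return sum(1 for kw in keywords if kw.lower() in lowered)


keywords = ["diagnosis", "prescription", "patient code"]
threshold = len(keywords)  # threshold equal to the number of keywords

# Only one of the three keywords appears, but it appears three times.
document = "Diagnosis pending. Diagnosis revised. Diagnosis confirmed."

print(raw_match_count(document, keywords) >= threshold)     # True: duplicates reach the threshold
print(unique_match_count(document, keywords) >= threshold)  # False: not all keywords are present
```

With unique counting, the threshold is reached only when every keyword in the list appears at least once.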

Set the Threshold for the number of Keyword Matches in the Custom Dictionary

  1. Create a classification using a custom dictionary. Perform the initial steps of creating your dictionary classification as provided in steps 1 to 5 in the Create Custom Dictionary Classification section.
  2. Select your custom dictionary, for example, Price List Dictionary, and click i to view the number of keywords associated with it. Suppose the keyword list contains more than 40 terms, but you need only 25 of those keywords, in any combination, to match and trigger an incident.
  3. To set the threshold to 25, add your dictionary to the Classification editor. Click [1] to edit the threshold, enter 25 to indicate the weight of the dictionary in threshold matching, and save your classification.

Exclude Matches on Keywords with Special Characters in Custom Dictionary List

Suppose you have a medical document that contains a broad range of sensitive keywords with special characters, and you want the DLP engine to match specific keywords exactly, including their special characters. To exclude matches on similar keywords with other special characters, create a custom dictionary list that contains the keywords exactly as written, including their special characters. The DLP engine then excludes matches on similar keywords with other special characters in the document, which reduces false positives and improves the accuracy of your data protection measures.

To exclude matches on keywords with special characters in a custom dictionary list:

  1. Create a classification using a custom dictionary. Follow the steps of creating your dictionary classification as outlined in steps 1 to 5 in the Create Custom Dictionary Classification section.
  2. Add the keywords to the dictionary exactly as written, including their special characters, so that the DLP engine does not treat similar keywords with other special characters as matches. For example, add two keywords with special characters, HIV-1 and Fever&Cough. A match is then triggered only if the keywords HIV-1 and Fever&Cough, with their respective special characters - and &, appear in the document.
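
The idea of matching special characters exactly can be illustrated with a literal search. This is a sketch, not the DLP engine's matching logic: the function exact_keyword_hits is hypothetical, and for simplicity the search is case-sensitive and ignores word boundaries.

```python
import re


def exact_keyword_hits(text, keywords):
    """Return the keywords that appear in text exactly as written."""
    hits = []
    for keyword in keywords:
        # re.escape makes characters such as - and & match literally.
        if re.search(re.escape(keyword), text):
            hits.append(keyword)
    return hits


keywords = ["HIV-1", "Fever&Cough"]
print(exact_keyword_hits("Diagnosis: HIV-1, symptoms: Fever&Cough", keywords))
# ['HIV-1', 'Fever&Cough']
print(exact_keyword_hits("Diagnosis: HIV 1, symptoms: Fever/Cough", keywords))
# []
```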

Match Keywords in Specific Email Sections

Suppose you have a medical email that contains a broad range of sensitive keywords, but you want the DLP engine to match keywords only in specific sections of the email. To do this, first create a classification using a custom dictionary list of keywords. Then configure a DLP policy with the newly created classification and specify the section of the email (Everywhere, Email Header). The DLP engine then triggers matches on keywords only in the specified section, which reduces false positives and improves the accuracy of your data protection measures.

For example, create a classification using a custom dictionary of keywords named Medical Keywords, and configure a Sanctioned DLP policy with the new classification to specify the Email Header section of the email. A match is then triggered only if the keywords in the Medical Keywords dictionary appear in the header section of the email.

To match keywords in specific email sections:

  1. Create a classification by selecting one or more custom dictionaries.
  2. Create a Sanctioned or Shadow DLP policy using the newly created classification. For example, create a Sanctioned DLP policy.
  3. Use the Skyhigh CASB DLP policy wizard to perform the initial steps of creating your Sanctioned DLP policy as provided in steps 1 to 4 in Create a Sanctioned DLP Policy.
  4. On the Rules & Exceptions page, configure the following:
    • Rules. For IF, select Classifications. The Select Classification cloud card appears.
      • Classification. Select the newly created classification from the list of supported classifications and click Done.
      • Location. Select Email Header. By default, All is selected.
  5. Complete the remaining steps to configure your DLP policy as mentioned from step 5 (c) in Create a Sanctioned DLP Policy.
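
The effect of restricting the Location to Email Header can be sketched with Python's standard email parser. This is an illustration under assumptions: the function find_keywords and the sample message are hypothetical, the location labels are borrowed from the Location setting above, and which fields Skyhigh CASB treats as the email header is not specified in this article.

```python
from email import message_from_string

RAW_EMAIL = """From: nurse@example.com
To: records@example.com
Subject: Patient Code update

Nothing sensitive in the body.
"""


def find_keywords(raw_email, keywords, location="All"):
    """Return the keywords found in the chosen part of a plain-text email."""
    message = message_from_string(raw_email)
    header_text = "\n".join(f"{name}: {value}" for name, value in message.items())
    if location == "Email Header":
        searchable = header_text
    else:
        # "All": headers plus the body of a non-multipart message.
        searchable = header_text + "\n" + message.get_payload()
    lowered = searchable.lower()
    return [kw for kw in keywords if kw.lower() in lowered]


keywords = ["Patient Code", "sensitive"]
print(find_keywords(RAW_EMAIL, keywords, location="Email Header"))  # ['Patient Code']
print(find_keywords(RAW_EMAIL, keywords, location="All"))           # ['Patient Code', 'sensitive']
```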