Skyhigh Security

AI RegEx Generator for Custom Advanced Patterns

The Custom Advanced Pattern includes an AI RegEx Generator that helps you build and understand complex Google RE2-compliant regular expressions through a conversation-based interface. Use it to generate expressions for scenarios that Skyhigh's predefined classifications do not cover. The AI RegEx Generator simplifies the task of building complex expressions and is helpful for administrators unfamiliar with the details of regular expressions. This approach can reduce the inaccuracies that lead to false positives or false negatives.

► Advantages of AI-based Regular Expression Generator
  • AI-Powered Expression Building. Harness the power of AI to create intricate expressions effortlessly.
  • Conversational Approach. Seamlessly construct and comprehend complex expressions through a conversation-based interface.
  • Rapid Expression Generation. Quickly produce expressions for scenarios where Skyhigh predefined classifications are absent.
  • Tailored Regular Expression Assistance. Specialized in addressing queries solely related to regular expressions.
  • Precise RE2 Format Suggestions. Provide customers with accurate expression recommendations, exclusively in the Google RE2 format.
  • Risk Reduction. Minimize the risk of inaccurate expressions, preventing false positives/negatives.
  • Mitigate App Blockages. Overcome organizational app restrictions, boosting data admins' productivity.
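Because only Google RE2 syntax is accepted, it can help to screen an expression for constructs that RE2 does not support, such as lookahead, lookbehind, and backreferences. The following is a minimal sketch, not part of the product: the function name re2_warnings is hypothetical, the list of constructs is not exhaustive, and the scan is a naive text search that can misreport escaped or bracketed characters.

```python
import re

# Constructs that Google RE2 does not support (non-exhaustive list).
# Each probe is a plain text search over the pattern source.
UNSUPPORTED = {
    "lookahead": re.compile(r"\(\?[=!]"),
    "lookbehind": re.compile(r"\(\?<[=!]"),
    "backreference": re.compile(r"\\[1-9]"),
}


def re2_warnings(pattern: str) -> list[str]:
    """Return the names of unsupported constructs found in the pattern text."""
    return [name for name, probe in UNSUPPORTED.items() if probe.search(pattern)]


print(re2_warnings(r"\b[A-Z]{2}[0-9]{6}\b"))   # no unsupported constructs
print(re2_warnings(r"(?<![0-9])[0-9]{4}(?![0-9])"))  # uses lookaround
```

An empty result does not prove the expression is valid RE2; the classification builder remains the authoritative check.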

IMPORTANT: Do not enter confidential, personal, or sensitive data into the query field. Information you enter is sent to an external AI service to generate the required answers. The AI-generated answers are intended only to aid expression building. Skyhigh Security strongly recommends validating the results before you use an AI-generated answer in a regular expression. For example, confirm that the regular expression used in your classification builder produces the expected results.
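One way to validate a generated expression outside the product is to run it against sample strings that should and should not match. The sketch below makes several assumptions: the helper check_pattern and the employee ID format are invented for illustration, and Python's re module is not an RE2 engine, so the pattern is kept to syntax that both accept. Use non-sensitive, made-up samples only.

```python
import re


def check_pattern(pattern: str, should_match: list[str],
                  should_not_match: list[str]) -> dict[str, list[str]]:
    """List the samples on which the pattern behaves unexpectedly."""
    compiled = re.compile(pattern)
    return {
        # Expected hits that the pattern failed to find (false negatives).
        "missed": [s for s in should_match if not compiled.search(s)],
        # Strings that should be ignored but matched (false positives).
        "false_hits": [s for s in should_not_match if compiled.search(s)],
    }


# Hypothetical pattern for an invented employee ID format: EMP-, then 6 digits.
EMPLOYEE_ID = r"\bEMP-[0-9]{6}\b"

report = check_pattern(
    EMPLOYEE_ID,
    should_match=["Badge EMP-004217 issued", "EMP-123456"],
    should_not_match=["EMP-12345", "TEMP-1234567", "EMP-1234567"],
)
print(report)
```

Two empty lists mean the pattern agrees with every sample. The result is only as good as the samples, so the check in the classification builder is still needed.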

Create Custom Advanced Patterns using AI RegEx Generator

To create custom Advanced Patterns with AI RegEx Generator:

  1. Go to Policy > DLP Policies > Classifications.
  2. Click Actions > Create Classification.
  3. Classification Name. Enter a name for the classification. For example, Confidential Data.
  4. Add description. Enter an optional description to describe its use or purpose.
  5. Category. Select a Category from the list. For example, Sensitive.
  6. Conditions. Click Select Criteria and choose Advanced Pattern. The Select Advanced Patterns cloud card displays.
    • Count each match string only one time. Select the checkbox to count each match string only one time, or clear it to count every occurrence. Selecting the checkbox eliminates duplicate match counts during DLP policy evaluation. To learn more about the use case, see Count each match string only one time feature.
  7. Click New.
  8. Enter a name and optional description for your custom Advanced Pattern.
  9. Click AI RegEx Generator. Skyhigh provides this AI-driven chat facility to help you generate and tune complex advanced patterns.
  10. On the AI RegEx Generator pop-up, write your request. For example, Give me a regular expression for UK driving license numbers.
  11. Once the regular expression is generated, click Insert RegEx to insert it into the custom advanced pattern.
  12. The generated regular expression is added to Google RE2 expression.
  13. To help ensure that your regular expressions are accurate, click No Validation to open the Validation Algorithm cloud card.
  14. Select the appropriate Validation Algorithm from the list and click Done. To add the Luhn 10 Validation Algorithm and BINs for your custom regular expressions, click Add BIN Validator. For more details, see Add BIN Validator.
  15. Add a Score to weigh the new regex Advanced Pattern. Scores can be negative or positive, from -99 to 999. The higher the number, the greater the weight given to the pattern and the more readily a match exceeds the threshold and triggers an incident.
  16. To reduce false positives, add expressions in the Exceptions tab to exclude specific keywords or regular expressions from being processed as matches in DLP classifications.
  17. Click Save.
  18. The new Advanced Pattern is now added to the Advanced Pattern and Classification list.
  19. Optionally, you can edit the threshold by clicking [1]. Enter a number to indicate the weight of the Advanced Pattern in threshold matching.
  20. Add more classification conditions as needed and click Save.

Your custom classification, with its custom advanced patterns and validation, is saved to the selected category in the Classifications list. Add the classification to your DLP policies as needed.
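The steps above mention the Luhn 10 Validation Algorithm and BINs. For readers unfamiliar with them, here is a minimal sketch of the standard Luhn checksum together with a simple prefix check. The function names and the BIN value are hypothetical, and how Skyhigh applies the BIN Validator internally is not described in this article, so treat the prefix check as an assumption.

```python
ASCII_DIGITS = "0123456789"


def luhn_valid(number: str) -> bool:
    """Standard Luhn (mod 10) checksum; spaces and dashes are ignored."""
    digits = [int(ch) for ch in number if ch in ASCII_DIGITS]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return total % 10 == 0


def bin_allowed(number: str, bins: set[str]) -> bool:
    """Assumed behavior: accept only numbers that start with a listed prefix."""
    digits = "".join(ch for ch in number if ch in ASCII_DIGITS)
    return any(digits.startswith(prefix) for prefix in bins)


# Widely used test number, not a real card; the BIN below is illustrative.
candidate = "4111 1111 1111 1111"
print(luhn_valid(candidate), bin_allowed(candidate, {"411111"}))
```

A checksum of this kind filters out digit strings that only look like card numbers, which is why pairing it with a regular expression tends to reduce false positives.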

Custom Advanced Pattern Use Cases

Count each match string only one time feature 

Suppose you have a bank document with multiple instances of the pattern for France IBAN, and you have set the score for this regular expression to 10 in the custom advanced pattern. This means that a match is triggered only if the France IBAN pattern appears 10 or more times in the document. However, if you want to avoid triggering matches for duplicate counts, you can activate the Count each match string only one time checkbox. During policy evaluation, a repeated match string then counts only once, even though the score for the regular expression is set to 10. To find this option in the UI, see Count each match string only one time.

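The difference between counting every occurrence and counting each match string once can be sketched as follows. This is an illustration under assumptions, not the DLP engine's implementation: count_matches is a hypothetical helper, the regular expression checks only the compact 27-character shape of a French IBAN without validating check digits, and the sample strings are made up.

```python
import re

# Simplified shape: FR, 2 check digits, 23 alphanumeric characters.
FRANCE_IBAN_SHAPE = r"\bFR[0-9]{2}[0-9A-Z]{23}\b"


def count_matches(pattern: str, text: str, count_each_string_once: bool) -> int:
    """Count matches, optionally collapsing identical match strings."""
    found = [m.group(0) for m in re.finditer(pattern, text)]
    if count_each_string_once:
        return len(set(found))  # duplicates of the same string count once
    return len(found)


# Made-up strings in IBAN shape, not real accounts.
first = "FR00" + "1" * 23
second = "FR00" + "2" * 23
document = f"{first} {first} {second} {first}"

print(count_matches(FRANCE_IBAN_SHAPE, document, count_each_string_once=False))
print(count_matches(FRANCE_IBAN_SHAPE, document, count_each_string_once=True))
```

With the option off, the repeated string inflates the count; with it on, only the two distinct strings are counted.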

Set Scores for Regular Expressions on the Custom Advanced Pattern List

Let's say you have a confidential bank document containing sensitive information or patterns that should only be accessed by authorized personnel. To protect the document, you can set scores for regular expressions so that the DLP scanning engine reports a match only when a sensitive pattern occurs beyond a specific limit. If a match is found, an incident is triggered to maintain the document's security.

To set scores for each regular expression in a custom advanced pattern list, follow these steps:

  1. Create a classification using custom advanced patterns. Perform the initial steps of creating your advanced pattern classification as provided in steps 1 to 14 in the AI RegEx Generator for Custom Advanced Patterns section.
  2. Score. Once you add the necessary regular expressions, you can set a different score for each regular expression in the list by editing the default score [1]. For example, configure the scores for three regular expressions: France IBAN, German IBAN, and UK IBAN. Set the score for France IBAN to 10, German IBAN to 6, and UK IBAN to 5. This means that a match is triggered when the France IBAN pattern appears 10 or more times in the content, the German IBAN pattern appears 6 or more times, and the UK IBAN pattern appears 5 or more times.
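The per-expression scores in this example can be pictured with a short sketch. It assumes the reading given in the step above, where each score acts as the number of occurrences a pattern must reach; how the engine combines the individual results is not specified here, so the sketch only reports each pattern separately. The helper name, the simplified IBAN shapes (no check-digit validation), and the sample strings are all illustrative.

```python
import re

# Hypothetical pattern list: name -> (simplified regex, score).
SCORED_PATTERNS = {
    "France IBAN": (r"\bFR[0-9]{2}[0-9A-Z]{23}\b", 10),
    "German IBAN": (r"\bDE[0-9]{20}\b", 6),
    "UK IBAN": (r"\bGB[0-9]{2}[A-Z]{4}[0-9]{14}\b", 5),
}


def patterns_meeting_score(text, scored_patterns=SCORED_PATTERNS):
    """Map each pattern name to True when its occurrence count reaches its score."""
    outcome = {}
    for name, (pattern, score) in scored_patterns.items():
        occurrences = len(re.findall(pattern, text))
        outcome[name] = occurrences >= score
    return outcome


# Made-up strings in IBAN shape, not real accounts.
german = "DE" + "0" * 20
uk = "GB00ABCD" + "0" * 14
sample = " ".join([german] * 6 + [uk] * 4)

print(patterns_meeting_score(sample))
```

In this sample the German pattern reaches its score of 6, while the UK pattern falls one occurrence short of 5 and the France pattern does not appear at all.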

Re-use Regular Expressions in Custom Advanced Pattern List

Suppose you have multiple confidential documents containing common patterns, such as credit card numbers, that should only be accessed by authorized personnel. To ensure the security of these documents, you can create a custom advanced pattern list using regular expressions. This list can then be reused across classifications, eliminating the need to create or update custom advanced pattern lists repeatedly. 

To re-use regular expressions in a custom advanced pattern list:

  1. Create a classification using custom advanced patterns. Follow the steps of creating your advanced pattern classification as outlined in steps 1 to 6 in the AI RegEx Generator for Custom Advanced Patterns section. 
  2. On the Select Advanced Patterns cloud card, click All and select Custom.
  3. Select one or more existing Custom Advanced Patterns.
  4. Click i to view the Usage of the selected Advanced Patterns in other classifications.

Exclude Matches on Keywords in Custom Advanced Pattern List

Suppose you have a financial document that contains a broad range of sensitive keywords, but you want to exclude specific keywords from being processed as matches by the DLP engine. To exclude matches on keywords, you can create a custom advanced pattern list using regular expressions and exceptions. These exceptions prevent specific keywords from triggering matches, thereby reducing false positives and ensuring accuracy in your data protection measures. 

To exclude matches on keywords in a custom advanced pattern list:

  1. Create a classification using custom advanced patterns. Follow the steps of creating your advanced pattern classification as outlined in steps 1 to 16 in the AI RegEx Generator for Custom Advanced Patterns section.
  2. Exception and Type. Once you add the necessary regular expressions, you can add exceptions to exclude specific keywords or regular expressions from being processed as matches by the DLP engine. For example, add two keywords, Account No and Balance, and a regular expression for Spain IBAN as exceptions. This means that a match is not triggered when the keywords Account No and Balance, and the pattern for Spain IBAN, appear in the document.

Exclude Matches on Regular Expressions in Custom Advanced Pattern List

Suppose you have a financial document that contains a broad range of sensitive patterns, but you want to exclude specific patterns from being processed as matches by the DLP engine. To exclude matches on regular expressions, you can create a custom advanced pattern list using regular expressions and exceptions. These exceptions prevent specific patterns from triggering matches, thereby reducing false positives and ensuring accuracy in your data protection measures. 

To exclude matches on regular expressions in a custom advanced pattern list:

  1. Create a classification using custom advanced patterns. Follow the steps of creating your advanced pattern classification as outlined in steps 1 to 16 in the AI RegEx Generator for Custom Advanced Patterns section.
  2. Exception and Type. Once you add the necessary regular expressions, you can add exceptions to exclude specific keywords or regular expressions from being processed as matches by the DLP engine. For example, add two regular expressions, Netherlands IBAN and Italian IBAN, and a keyword, Account No, as exceptions. This means that a match is not triggered when the patterns for Netherlands IBAN and Italian IBAN, and the keyword Account No, appear in the document.
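The effect of keyword and regular expression exceptions can be sketched as a filter over candidate matches. This is an assumption about the behavior, not the engine's actual logic: the helper apply_exceptions is hypothetical, keywords are compared case-insensitively by choice, a regular expression exception must cover the whole candidate string, and the Netherlands IBAN shape is simplified (no check-digit validation).

```python
import re


def apply_exceptions(candidates, keyword_exceptions, regex_exceptions):
    """Drop candidates that equal an excepted keyword or fully match an excepted regex."""
    keywords = {k.casefold() for k in keyword_exceptions}
    compiled = [re.compile(p) for p in regex_exceptions]
    kept = []
    for candidate in candidates:
        if candidate.casefold() in keywords:
            continue  # excluded by keyword exception
        if any(rx.fullmatch(candidate) for rx in compiled):
            continue  # excluded by regular expression exception
        kept.append(candidate)
    return kept


# Simplified shape: NL, 2 check digits, 4 letters, 10 digits.
NETHERLANDS_IBAN_SHAPE = r"NL[0-9]{2}[A-Z]{4}[0-9]{10}"

# Made-up candidate matches, not real accounts.
candidates = ["Account No", "NL00ABCD0000000000", "DE" + "0" * 20]
kept = apply_exceptions(
    candidates,
    keyword_exceptions=["Account No"],
    regex_exceptions=[NETHERLANDS_IBAN_SHAPE],
)
print(kept)
```

Only the candidate that neither exception covers remains, which is how exceptions narrow a broad pattern list and reduce false positives.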