Understand URL-Related Properties
Introduction
I wrote this guide to help explain how properties are used within rules, and how to formulate list entries that fit the corresponding rules. A question I get a lot is: "I added [INSERT-SITE-HERE].com to [INSERT-LIST-HERE], but the site is still blocked. Why isn't the allowlist entry working as expected?" Understanding the rule criteria is essential to managing the Secure Web Gateway's rules and how they apply. This article attempts to simplify some very common examples and explain use cases for certain properties. To start, I will focus on URL-based properties only.
Best Practices
If you read only one part of this document, please read this section. After you read it, you can use the "Good/Bad Examples" for further detail and reference. The examples below outline use cases for the most commonly used URL-related properties.
URL.SmartMatch Purpose
The URL.SmartMatch property was created to allow for greater flexibility and usability. URL.SmartMatch has relaxed syntax requirements, whereas other properties, such as URL, URL.Host, and URL.Domain, have specific syntax requirements and require the management of multiple lists, with one list needed per property. With URL.SmartMatch, those lists can be combined into a single list. The URL.SmartMatch property accepts the list as "input" and returns TRUE if the given URL, URL host, or URL path variation is found in the list. In the past, we found that customers often had to deal with multiple lists whose entries were in differing formats. The URL.SmartMatch property was created to accommodate these variations, thereby reducing the need to manage multiple lists.
URL.SmartMatch
The URL.SmartMatch property was introduced in version 7.4.1. Similar to the URL.Host.BelongsToDomains property, it was designed to simplify the allowlisting process.
SmartMatch list entries can be entered in the form of a host, a domain, a URL, or a fragment of a URL. Example entries (wildcards "*" assumed on both sides of the entry):
- host.domain.tld - Is equivalent to URL.Host matches *.host.domain.tld or host.domain.tld
- domain.tld - Is equivalent to URL.Host matches *.domain.tld or domain.tld
- http://domain.tld - Is equivalent to URL matches http://*.domain.tld or http://domain.tld
- domain.tld/path - Is equivalent to URL matches *.domain.tld/path* or domain.tld/path*
- /path - Is equivalent to URL matches */path*
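The equivalences above can be written down as a small Python lookup. This is only a sketch of the table as listed here, not the gateway's implementation. The function name `smartmatch_equivalents` and its return shape are hypothetical, and it covers only the five entry forms shown above.

```python
def smartmatch_equivalents(entry):
    """Return (property, wildcard patterns) that this guide lists as
    equivalent to a SmartMatch-style entry. Hypothetical helper."""
    if entry.startswith("/"):
        # Path-only entry: any host, followed by the path prefix.
        return "URL", ["*" + entry + "*"]
    if "://" in entry:
        # Scheme-qualified domain: any subdomain, or the domain itself.
        scheme, rest = entry.split("://", 1)
        return "URL", [scheme + "://*." + rest, scheme + "://" + rest]
    if "/" in entry:
        # Host or domain followed by a path prefix.
        return "URL", ["*." + entry + "*", entry + "*"]
    # Bare host or domain.
    return "URL.Host", ["*." + entry, entry]


for example in ("host.domain.tld", "domain.tld", "http://domain.tld",
                "domain.tld/path", "/path"):
    prop, patterns = smartmatch_equivalents(example)
    print(example, "->", prop, "matches", " or ".join(patterns))
```

Reading the output next to the list above is a quick way to check which property an entry effectively targets.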
Good
Entries in "Good: URL SmartMatch List"
- Entry: mcafee.com
Why it's good: Using this entry, it would correctly match for all mcafee.com subdomains, including mcafee.com, www.mcafee.com, secure.mcafee.com, etc.
- Entry: mcafee.com/us/products/
Why it's good: Using this entry would allow content from the 'mcafee.com' domain, which includes the path of '/us/products/'.
- Entry: http://mcafee.com
Why it's good: Using this entry will allow only HTTP access to all 'mcafee.com' and subdomains.
- Entry: http://www.mcafee.com/
Why it's good: Using this entry will allow only HTTP access to all 'www.mcafee.com'.
- Entry: http://mcafee.com/us/products/
Why it's good: Using this entry would allow content from the 'mcafee.com' domain, which includes the path of '/us/products/'.
- Entry: mcafee.com:80
Why it's good: Using this entry will allow only HTTP access to all 'mcafee.com' and subdomains on port 80.
- Entry: mcafee.com:80/
Why it's good: Using this entry will allow only HTTP access to all 'mcafee.com' and subdomains on port 80.
- Entry: http://mcafee.com:80/
Why it's good: Using this entry will allow only HTTP access to all 'mcafee.com' and subdomains on port 80 if it's HTTP.
- Entry: mcafee.com.
Why it's good: Using this entry will allow only HTTP access to all 'mcafee.com' and subdomains.
Bad
Entries in "Bad: URL SmartMatch List"
- Entry: /us/products/
Why it's bad: Using this entry could potentially match on other hosts which contain the path '/us/products/', for example: http://maliciousdomain.mwginternal.com/us/products/.
- Entry: http://www.mcafee.com:8080/
Why it's bad: This entry would not match because the request is for port 80, but the entry specifies port 8080.
- Entry: http://download.mcafee.com/
Why it's bad: The subdomain in the request is 'www', not 'download'.
- Entry: *.mcafee.com
Why it's bad: Wildcards are not used in URL.SmartMatch entries.
- Entry: *.mcafee.com/*
Why it's bad: Wildcards are not used in URL.SmartMatch entries.
- Entry: .mcafee.com
Why it's bad: The leading period prevents the entry from matching.
Example URL Breakdown
Example URL
http://www.mcafee.com/us/products/web-gateway.aspx
The following shows how the Example URL above could be allowlisted when using various properties. Please notice the syntax flexibility of the URL.SmartMatch property.
URL.SmartMatch (example entries)
mcafee.com
http://mcafee.com/us/products/
/us/products/
mcafee.com/us/products/
URL.Host
www.mcafee.com
URL.Domain (7.4+)
mcafee.com
URL.Host.BelongsToDomains (example entry)
mcafee.com
URL.Protocol
http
URL.Path
/us/products/web-gateway.aspx
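To see where these values come from, here is a minimal Python sketch that splits the Example URL with the standard library. The mapping of each part to a gateway property is my own comparison, and the "last two labels" rule used for the domain is a naive assumption that breaks for suffixes such as co.uk.

```python
from urllib.parse import urlsplit

url = "http://www.mcafee.com/us/products/web-gateway.aspx"
parts = urlsplit(url)

protocol = parts.scheme    # compare with URL.Protocol
host = parts.hostname      # compare with URL.Host
path = parts.path          # compare with URL.Path

# Naive assumption for illustration only: the domain is the last two
# labels of the host. Real suffix handling (co.uk, etc.) is more involved.
domain = ".".join(host.split(".")[-2:])  # compare with URL.Domain

print("protocol:", protocol)
print("host:    ", host)
print("domain:  ", domain)
print("path:    ", path)
```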
Operator importance
is in list
Use of "is in list" implies an exact string match. Wildcard characters will be interpreted as literal characters.
matches in list
Use of "matches in list" allows for wildcard matches. Although wildcard characters are accepted, they are not completely necessary.
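The difference between the two operators can be sketched in Python. This is only an approximation: `is_in_list` and `matches_in_list` are hypothetical helper names, and Python's `fnmatch` also treats `?` and `[...]` as special, which may not mirror the gateway's wildcard syntax.

```python
from fnmatch import fnmatchcase


def is_in_list(value, entries):
    # Exact string comparison: "*" in an entry is just a literal character.
    return value in entries


def matches_in_list(value, entries):
    # Wildcard comparison: "*" in an entry matches any run of characters.
    return any(fnmatchcase(value, entry) for entry in entries)


entries = ["*.mcafee.com"]
for host in ("www.mcafee.com", "mcafee.com", "*.mcafee.com"):
    print(host, is_in_list(host, entries), matches_in_list(host, entries))
```

With "is in list", only the literal string `*.mcafee.com` is found. With "matches in list", `www.mcafee.com` matches but `mcafee.com` does not, because the pattern requires a dot before `mcafee`.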
Good/Bad Examples by Property
The following examples are listed by the property used in the rule, along with the corresponding operator.
URL using "is in list"
Using the property "URL" implies that you will create list entries that take into account the full URL. Using the operator "is in list" implies an exact string match.
Good
Entries in "Good: URL String List"
- Entry: http://www.mcafee.com/us/products/web-gateway.aspx
Why it's good: The full URL is used, as required by the "is in list" operator.
Bad
Entries in "Bad: URL String List"
- Entry: www.mcafee.com/us/products/web-gateway.aspx
Why it's bad: The entry doesn't include the protocol information (http://). The URL property evaluates the full URL, and the operator "is in list" implies an exact string match.
URL using "matches in list"
Using the property "URL" implies that you will create list entries that take into account the full URL. Using the operator "matches in list" allows for wildcard matches.
Good
Entries in "Good: URL Wildcard List"
- Entry: http://www.mcafee.com/*
Why it's good: This entry contains a trailing wildcard which will allow any HTTP request to www.mcafee.com. However, it will not match on requests for http://mcafee.com/.
- Entry: regex(^htt(p|ps):\/\/([\w.-]*\.|\.?)mcafee\.com(\/.*|\/?))
Why it's good: This entry is a bit more complex as it uses regular expressions. This entry will allow any request, HTTP or HTTPS, to mcafee.com and its subdomains.
- Entry: regex(^htt(p|ps):\/\/([\w.-]*\.|\.?)mcafee\.(com|co\.uk)(\/.*|\/?))
Why it's good: This entry is the same as the previous entry but demonstrates how you can allow other top-level domains, such as '.com' or '.co.uk'.
Bad
Entries in "Bad: URL Wildcard List"
- Entry: *.mcafee.com*
Why it's bad: Using this entry, the entry could match on another string within the URL, for example: http://malicious-download-site.mwgin...www.mcafee.com
- Entry: regex(htt(p|ps)://(.*\.|\.?)mcafee.com(\/.*|\/?))
Why it's bad: This entry is a bit more complex as it uses regular expressions. This entry will allow any request, HTTP or HTTPS, to mcafee.com and its subdomains. However, the entry could match on another string within the URL, for example: http://malicious-download-site.mwgin...www.mcafee.com
- Entry: regex(htt(p|ps)://(.*\.|\.?)mcafee.(com|co.uk)(\/.*|\/?))
Why it's bad: The entry could match on another string within the URL, for example: http://malicious-download-site.mwgin...www.mcafee.com
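The difference between the good and bad regular expressions above can be checked with Python's `re` module. This is a sketch under assumptions: the gateway's regex engine may differ from Python's, I assume the expression is applied to the whole URL, and the hostile URL below is a made-up example of my own.

```python
import re

# Patterns copied from the good and bad list entries above.
good = re.compile(r"^htt(p|ps):\/\/([\w.-]*\.|\.?)mcafee\.com(\/.*|\/?)")
bad = re.compile(r"htt(p|ps)://(.*\.|\.?)mcafee.com(\/.*|\/?)")

wanted = "http://www.mcafee.com/us/products/web-gateway.aspx"
# Hypothetical hostile URL that merely mentions the allowed host.
hostile = "http://malicious.example/download?ref=www.mcafee.com"

for name, pattern in (("good", good), ("bad", bad)):
    for url in (wanted, hostile):
        print(name, bool(pattern.fullmatch(url)), url)
```

The good pattern limits the part before `mcafee\.com` to host characters (`[\w.-]`), so it cannot skip past the first `/`. The bad pattern uses `.*`, which happily consumes the hostile host and path before reaching `www.mcafee.com`.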
URL.Host using "is in list"
Using the property "URL.Host" implies that you will create list entries that take into account only the domain portion of the URL. Using the operator "is in list" implies an exact string match.
Good
Entry in "Good: URL.Host String List"
- Entry: www.mcafee.com
Why it's good: The domain of the requested URL is 'www.mcafee.com' which is an exact string match.
Bad
Entries in "Bad: URL.Host String List"
- Entry: mcafee.com
Why it's bad: The entry value (mcafee.com) is incorrect; the actual property value is 'www.mcafee.com'.
- Entry: *.mcafee.com
Why it's bad: The operator is "is in list", which implies an exact string match, so wildcards will not match.
- Entry: *.mcafee.com/us*
Why it's bad: The URL.Host property is limited only to the domain portion of the URL, not the path (/us). In addition, the operator is "is in list", which implies an exact string match, so wildcards will not match.
URL.Host using "matches in list"
Using the property "URL.Host" implies that you will create list entries that take into account only the domain portion of the URL. Using the operator "matches in list" allows for wildcard matches.
Good
Entries in "Good: URL.Host Wildcard List"
- Entry: mcafee.com
Why it's good: This entry will not match for 'www.mcafee.com', but if you intend to allow access to mcafee.com (no www), you will need it unless you use regular expressions.
- Entry: *.mcafee.com
Why it's good: This entry will match on any subdomain of mcafee.com (but not actually mcafee.com itself).
- Entry: regex((.*\.|\.?)mcafee\.com)
Why it's good: This single entry uses regular expressions and will allow both mcafee.com and any subdomains of mcafee.com.
Bad
Entries in "Bad: URL.Host Wildcard List"
- Entry: *.mcafee.com*
Why it's bad: Using this entry, the entry could match on another string within the URL, for example: http://www.mcafee.com.malicious-down...ginternal.com/
- Entry: *.mcafee.com/us*
Why it's bad: The URL.Host property is limited only to the domain portion of the URL, not the path (/us).
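To compare the URL.Host entries above side by side, here is a small Python sketch. It approximates "matches in list" with `fnmatch` and assumes the regex entry must cover the whole host value. Both are assumptions about behavior, and the hostile host name is a made-up example.

```python
import re
from fnmatch import fnmatchcase

hosts = [
    "mcafee.com",
    "www.mcafee.com",
    "www.mcafee.com.malicious.example",  # hypothetical hostile host
]

# Wildcard entries from the good and bad lists above.
for entry in ("mcafee.com", "*.mcafee.com", "*.mcafee.com*"):
    matched = [h for h in hosts if fnmatchcase(h, entry)]
    print(entry, "->", matched)

# Regex entry from the good list, assumed to match the whole value.
host_regex = re.compile(r"(.*\.|\.?)mcafee\.com")
regex_matched = [h for h in hosts if host_regex.fullmatch(h)]
print("regex ->", regex_matched)
```

Only the entry with the trailing wildcard lets the hostile host through, which is why it belongs in the bad list.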
URL.Domain vs. URL.Host.BelongsToDomains
The URL.Domain property was introduced in 7.4. It was designed to be more consistent with other URL-related properties (URL.Host, URL, etc.). It acts nearly identically to URL.Host.BelongsToDomains, but does not require a list as a setting; instead, the list can be the operand.
URL.Domain is a string property that contains the top-level domain of the requested URL (e.g. "mcafee.com").
URL.Host.BelongsToDomains<ListName> is a boolean property that returns true if the URL's top-level domain is in the list specified on the rule (ListName). If the domain of the URL is not in the list, the property returns false.
URL.Domain using "is in list"
Using the property "URL.Domain" implies that you will create list entries that take into account just the top-level domain of the URL. Using the operator "is in list" implies an exact string match.
Good
Entries in "Good: URL.Domain String List"
- Entry: mcafee.com
Why it's good: URL.Domain will simply equal "mcafee.com".
Bad
Entries in "Bad: URL.Domain String List"
- Entry: www.mcafee.com
Why it's bad: URL.Domain is "mcafee.com", not "www.mcafee.com". Use URL.Host instead.
- Entry: *.mcafee.com
Why it's bad: URL.Domain equals "mcafee.com", so "*." would prevent matching. "is in list" implies a string, not a wildcard.
URL.Domain using "matches in list"
Using the property "URL.Domain" implies that you will create list entries that take into account just the top-level domain of the URL. Using the operator "matches in list" allows for wildcard matches.
Good
Entries in "Good: URL.Domain Wildcard List"
- Entry: regex(mcafee\.(com|co\.uk))
Why it's good: URL.Domain equals "mcafee.com" so it will match. "mcafee.co.uk" will also match.
Bad
Entries in "Bad: URL.Domain Wildcard List"
- Entry: *.mcafee.com
Why it's bad: URL.Domain of "mcafee.com" will not match due to the "*.".
- Entry: *mcafee.com
Why it's bad: It will match on "mcafee.com", BUT it could match on "maliciousdomainmcafee.com" too.
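The URL.Domain entries above behave as follows in a quick Python check. As before, `fnmatch` stands in for the gateway's wildcard matching, and I assume the regex must cover the whole domain value. Neither is confirmed product behavior.

```python
import re
from fnmatch import fnmatchcase

domains = ["mcafee.com", "mcafee.co.uk", "maliciousdomainmcafee.com"]

# Good entry: regex(mcafee\.(com|co\.uk)), assumed to match the whole value.
regex_entry = re.compile(r"mcafee\.(com|co\.uk)")
by_regex = [d for d in domains if regex_entry.fullmatch(d)]

# Bad entries: "*." never fits a bare domain, and "*" alone is too loose.
by_star_dot = [d for d in domains if fnmatchcase(d, "*.mcafee.com")]
by_star = [d for d in domains if fnmatchcase(d, "*mcafee.com")]

print("regex        ->", by_regex)
print("*.mcafee.com ->", by_star_dot)
print("*mcafee.com  ->", by_star)
```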
URL.Host.BelongsToDomains
The URL.Host.BelongsToDomains property was introduced in 7.2. It was designed to simplify the complexity of adding list entries. Using the property "URL.Host.BelongsToDomains" allows you to simply enter the domain of interest.
So if you wish to allowlist all mcafee.com sites (including subdomains), you can simply enter mcafee.com. There is no need to worry about wildcards.
Good
Entries in "Good: Only Domain List"
- Entry: mcafee.com
Why it's good: Using this entry, it would correctly match for all mcafee.com subdomains, including mcafee.com, www.mcafee.com, secure.mcafee.com, etc.
- Entry: www.mcafee.com
Why it's good: Using this entry, it would correctly match only for www.mcafee.com subdomains. It would not allow other subdomains of the top domain 'mcafee.com'. This is useful in case you wanted to allow a subdomain, but not the entire domain.
Bad
Entries in "Bad: Only Domain List"
- Entry: *.mcafee.com
Why it's bad: URL.Host.BelongsToDomains does not need wildcards. The property requires an exact domain match, such as 'www.mcafee.com' or the top domain 'mcafee.com'.
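The behavior described for URL.Host.BelongsToDomains can be approximated with a suffix check on whole labels. This is my own minimal sketch of the idea, not the product's code. The function name `host_belongs_to_domains` is hypothetical.

```python
def host_belongs_to_domains(host, domains):
    """Return True if host equals a listed domain or is a subdomain of it."""
    host = host.lower()
    for domain in domains:
        domain = domain.lower()
        # Require a dot before the domain so that
        # "maliciousdomainmcafee.com" does not count as "mcafee.com".
        if host == domain or host.endswith("." + domain):
            return True
    return False


print(host_belongs_to_domains("secure.mcafee.com", ["mcafee.com"]))
print(host_belongs_to_domains("secure.mcafee.com", ["www.mcafee.com"]))
print(host_belongs_to_domains("www.mcafee.com", ["*.mcafee.com"]))
```

Because the comparison is on plain strings, an entry such as `*.mcafee.com` is treated literally and never matches, which lines up with the bad example above.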
Test Ruleset
You can use the test ruleset in your own environment to see how it works! The test ruleset will work in versions 7.4.1+.
Conclusion
From the examples, it should be clear that the cleanest and easiest way to create domain-based allowlist entries is through the use of the "URL.SmartMatch" property. I hope this helps clarify use cases for the various URL-related properties. Perhaps it will help with understanding other properties as well.