Understanding and Optimizing your Rules
Introduction
One of the most beautiful things about Skyhigh Secure Web Gateway is its near infinite flexibility. If you can dream up an idea for a rule, you can probably make it happen. This flexibility can be a double-edged sword, however - while it gives you nearly limitless potential, it also introduces the possibility of inefficiencies in your ruleset logic. My goal with this document is to equip you with the understanding and techniques you need to get the best performance possible out of your Secure Web Gateway.
Understanding Cycles
Before we can begin discussing rulesets, rules, and how to lay them out, we must first grasp the concept of cycles. Secure Web Gateway has 4 different cycles. I will touch on all of them here briefly, but for the rest of the document, we’re only going to be concerning ourselves with the first three.
- Request
- Response
- Embedded
- Logging
The Request Cycle works with anything available in the initial user request. This means things like URL, Client IP, User Name (if we’re authenticating), and Headers sent by the client’s browser will be available.
The Response Cycle works with, as you’d guess, the response coming back from the web server after we’ve completed the request cycle. This will have the actual data requested, as well as any server-side headers.
The Embedded Cycle comes into play when we have an opener called in either the Request or Response Cycle. Openers allow your Secure Web Gateway to look more deeply into content of a given type. Currently, the two openers available are:
- Composite Opener - used for looking inside of other files -- such as .zip, .exe, etc.
- HTML Opener - very rarely used, typically only in very advanced and specialized configs
The Logging Cycle kicks off after the Request, Response and all Embedded cycles have completed -- allowing you to write to log files.
Important note about the logging cycle
If you have values in your access.log that you always want filled (for example: category information) -- you might have to create a rule at the very top of your rule sets that calls the properties that fill these values.
The action on this kind of population rule would be 'Continue' - as we don't want to block the traffic, but we still want all the subsequent rules to apply.
Without this kind of rule, some log fields may have blank entries if we are hitting a block action or a stop-cycle action before a specific point.
Here is an example to illustrate better what the issue is:
Your very first rule is your Global Whitelist, which contains youtube.com and the action is stop cycle. When a user goes to youtube.com, the request will be allowed, but your log files will not show the category information for youtube.com because the property URL.Categories was never called in the rules. To prevent this, you can create an initialization rule above your Global Whitelist that uses the property URL.Categories.
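The whitelist scenario above can be modeled in a few lines of Python. This is only a sketch of the behavior described here, not gateway rule syntax: the `Transaction` class, the `CATEGORY_DB` sample data and the function names are all assumptions made for illustration.

```python
# Toy model: a property is filled only when a rule reads it for the first time.
CATEGORY_DB = {"youtube.com": "Streaming Media"}  # assumed sample data
GLOBAL_WHITELIST = {"youtube.com"}                # assumed sample list


class Transaction:
    def __init__(self, host):
        self.host = host
        self.filled = {}  # property name -> value, filled lazily

    def url_categories(self):
        # Stand-in for the category lookup; it runs at most once per transaction.
        if "URL.Categories" not in self.filled:
            self.filled["URL.Categories"] = CATEGORY_DB.get(self.host, "Uncategorized")
        return self.filled["URL.Categories"]


def request_cycle(tx, populate_first):
    if populate_first:
        tx.url_categories()      # initialization rule with action 'Continue'
    if tx.host in GLOBAL_WHITELIST:
        return "Stop Cycle"      # later rules, including URL filtering, never run
    tx.url_categories()          # the URL filtering rule would read the property here
    return "Allow"


def access_log_line(tx):
    # The log line contains only what was filled earlier in the transaction.
    return f"{tx.host} category={tx.filled.get('URL.Categories', '')}"


for populate in (False, True):
    tx = Transaction("youtube.com")
    request_cycle(tx, populate_first=populate)
    print(populate, "->", access_log_line(tx))
```

Without the initialization step the category field in the modeled log line stays empty; with it, the field is filled even though the whitelist stops the cycle.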
Here is a graphical representation of how the request and response cycle work, when handling a request from a client.
As a general principle, you ideally want to try and get any traffic that is going to be ‘blocked’ out of the way as soon as possible (to limit the amount of work your MWG needs to do to block the traffic).
For an example - take URL Filtering. We will have the URL and can perform URL filtering in the request cycle (which is ideally where we’d want to do it). If we were to perform URL filtering in the response cycle - we would have already retrieved the page, only to find out the request is to be blocked.
Information that we have right off the bat (URL, Client.IP, etc.) should always be checked for in the request cycle if possible. If we’re checking the Client.IP or URL in the request cycle and allowing it, what’s the point of checking the same Client.IP/URL again in the response cycle? It’s simply additional work that doesn’t need to be done.
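To make the cost difference concrete, here is a small Python model (an illustration only; `handle` and its parameters are hypothetical names) that counts how often the gateway would have to contact the web server depending on the cycle in which the URL check is placed.

```python
def handle(url_host, blocked_hosts, filter_in_request):
    """Return (verdict, origin_fetches) for one modeled transaction."""
    fetches = 0
    if filter_in_request and url_host in blocked_hosts:
        return "Block", fetches      # blocked before the web server is contacted
    fetches += 1                     # the gateway retrieves the page
    if not filter_in_request and url_host in blocked_hosts:
        return "Block", fetches      # same verdict, but the fetch was wasted
    return "Allow", fetches


blocked = {"bad.example"}
print(handle("bad.example", blocked, filter_in_request=True))
print(handle("bad.example", blocked, filter_in_request=False))
```

Both placements reach the same verdict; only the request-cycle placement avoids retrieving content that will be thrown away.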
As a rule of thumb, these are the kinds of things you’d want to be doing in each cycle:
Request:
- URL Filtering
- Blacklisting
- User Authentication
- Rules based on browser-sent headers (User-Agent, etc.)
- Anti-Malware Scanning for Uploads
Response:
- Anti-Malware Scanning
- Media Type Filtering
- Rules based on website-sent headers (Content-Length, etc.)
Embedded:
- Body filtering for specific content
- Anti-Malware Scanning (if using Composite Opener to look into archives/etc.)
- Media Type Filtering
There are some other things that we will sometimes want to occur in both request and response cycles - such as Whitelisting.
Understanding Criteria
With both rulesets and rules, we can specify criteria to limit when a particular rule or ruleset will trigger. This is useful as we generally will not want to apply the same rules to all users across the board.
Some of the most common criteria used are:
- URL / URL.Host (used for looking at URL or URL host)
- Client.IP (IP address of client machine making request)
- URL.Categories (Categories the requested URL falls into from the TrustedSource db)
- Proxy.Port (The current proxy port being used - can be useful to differentiate between different clients)
- System.Hostname (Useful if you want rules to only occur on a single Secure Web Gateway in a cluster)
There are many more criteria available; you can see a full listing of them in the product guide, addendum A.
- Web Gateway 7.7.0 Product Guide (McAfee Corporate KB, PD26705)
- Web Gateway 7.8.0 Product Guide (McAfee Corporate KB, PD27276)
One important thing to keep in mind regarding criteria is that calling criteria that has not yet been filled will initiate whatever mechanism is required to fill it.
For example, the first time you call the criteria of URL.Categories - MWG will, at that point in the processing, perform a URL lookup. Likewise, the first time you call Antimalware.Infected, MWG will start its antimalware scanning process.
Because of this, it is very important for optimal performance that we structure our ruleset in a manner that attempts to check as many ‘cheap’ criteria as we can first -- before we resort to ‘expensive’ criteria such as Antimalware scanning. This is true both on a large-scale ruleset design, as well as on a smaller-scale when dealing with multiple criteria for a rule or ruleset.
If we can block a website based on having undesirable URL categories, then your Secure Web Gateway will never have to scan it for viruses since the traffic will already have been blocked.
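The fill-on-first-use behavior can be sketched as a lazy property store. This is a simplified Python model, not the gateway's implementation; `LazyProperties`, `evaluate` and the sample values are assumptions for illustration.

```python
class LazyProperties:
    """Properties are computed the first time a rule asks for them, then reused."""

    def __init__(self, fillers):
        self._fillers = fillers   # property name -> zero-argument function
        self._values = {}
        self.fill_log = []        # records which mechanisms actually ran

    def get(self, name):
        if name not in self._values:
            self.fill_log.append(name)
            self._values[name] = self._fillers[name]()
        return self._values[name]


def evaluate(props, blocked_categories):
    # Cheaper check first: a category hit blocks before any scan is started.
    if props.get("URL.Categories") & blocked_categories:
        return "Block"
    if props.get("Antimalware.Infected"):
        return "Block"
    return "Allow"


props = LazyProperties({
    "URL.Categories": lambda: {"Gambling"},   # stand-in for the URL lookup
    "Antimalware.Infected": lambda: False,    # stand-in for the malware scan
})
print(evaluate(props, {"Gambling"}), props.fill_log)
```

In this run only the category lookup appears in `fill_log`; the scan stand-in is never called because the request was already blocked.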
Criteria by Cost/Weight
Low | Medium | High |
---|---|---|
Client.IP | URL.Destination.IP* | Antimalware.Infected |
URL // URL.Host | Media.EnsuredTypes | DLP |
Proxy.IP // Proxy.Port | Authentication* | HTML Opener (Event) |
URL.Categories* | Composite Opener (Event) | |
System.Hostname | | |
HTTP Headers | | |
* Some of these properties rely on external services like Active Directory, DNS or cloud lookups that could introduce delays beyond the control of Secure Web Gateway.
Rule Engine Logic
It is possible to combine multiple criteria together. With two criteria, the logical operators AND and OR come into play. It is important to note that with an AND statement, if the first criteria checked is false, it will not check the second -- and the same is true with an OR statement if the first criteria is true.
As a rule of thumb, you will want to use the least expensive of the two as your first criteria.
AND rule (two variables)
Secure Web Gateway will first check the Client’s IP address. If it does not match, the rule will not be applied, and it would continue moving on in the ruleset.
If the Client IP was 1.2.3.4, then and only then would Secure Web Gateway do the additional check of looking at the URL Host to see if it matched the wildcard of *abcd.com. If it did, the request would be blocked, if it didn’t, it will not apply the rule and the traffic will continue.
We want to check the Client.IP for a match first, because if the user has a different IP, we never have to evaluate the wildcard match against the URL.Host. A ‘matches’ operator with wildcards/asterisks isn’t incredibly taxing, but it is more work than a direct comparison against the Client.IP.
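The AND behavior described above is ordinary short-circuit evaluation. The following Python sketch mirrors the rule from the text (Client.IP equals 1.2.3.4 AND URL.Host matches *abcd.com); the function names and the `calls` recorder are hypothetical and exist only to show which checks run.

```python
from fnmatch import fnmatchcase

calls = []  # records which criteria were actually evaluated


def client_ip_equals(tx, ip):
    calls.append("Client.IP")
    return tx["client_ip"] == ip


def url_host_matches(tx, pattern):
    calls.append("URL.Host")
    return fnmatchcase(tx["url_host"], pattern)


def and_rule(tx):
    # Python's 'and' skips the right-hand side when the left-hand side is False.
    if client_ip_equals(tx, "1.2.3.4") and url_host_matches(tx, "*abcd.com"):
        return "Block"
    return None  # rule does not apply; processing continues with the next rule


and_rule({"client_ip": "5.6.7.8", "url_host": "www.abcd.com"})
print(calls)  # only the Client.IP check ran
```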
OR rule (two variables)
With this rule, if the first parameter is true, we won’t bother to check the second since we will already have confirmed the criteria as true.
So, we will first check to see if the URL.Host matches *xyz.com. If it does, we will stop the search and apply our action (Block).
If it doesn’t match, we will proceed further and check the URL.Destination.IP (by performing a DNS lookup), and then check to see if it matches in the range. If it is in the range, we will block. If not, then the rule does not match either parameter and will not be applied.
In this example, we check the URL.Host wildcard first because a quick wildcard check against the URL.Host (a value we have right from the start) adds far less work and latency than asking your MWG to go out and perform a DNS lookup to resolve the destination address -- which is what the URL.Destination.IP criteria has your MWG do.
If we match *xyz.com -- there will be no need to check the second criteria because this is an OR statement.
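The same ordering argument can be sketched for the OR case. In this Python model the DNS resolver is passed in as a function so the lookup can be counted; the address range `192.0.2.0/24`, the `fake_resolve` helper and its sample answers are assumptions, since the text does not state the actual range.

```python
import ipaddress
from fnmatch import fnmatchcase

BLOCKED_NET = ipaddress.ip_network("192.0.2.0/24")  # assumed example range


def or_rule(url_host, resolve):
    # Cheap check first: no DNS traffic is needed if the host name already matches.
    if fnmatchcase(url_host, "*xyz.com"):
        return "Block"
    # Expensive check second: resolving the destination IP requires a lookup.
    if ipaddress.ip_address(resolve(url_host)) in BLOCKED_NET:
        return "Block"
    return None  # neither criteria matched; the rule is not applied


lookups = []


def fake_resolve(host):
    """Stand-in for a DNS lookup that records every call."""
    lookups.append(host)
    return {"example.org": "192.0.2.10"}.get(host, "198.51.100.1")


print(or_rule("www.xyz.com", fake_resolve), lookups)
```

For `www.xyz.com` the rule blocks without a single entry in `lookups`; the resolver is only called for hosts that miss the wildcard.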
More than two criteria
When you get beyond 2 criteria, you can involve another level of complexity -- that of parentheses (). These work just like they did in algebra class -- meaning whatever is in them will be evaluated first.
We generally suggest keeping your rules as straightforward as possible (ideally no more than 2 criteria per rule/ruleset) -- not because MWG cannot handle the complexity -- but because incredibly complex rules can be very difficult for you as the administrator to read later on.
Compare the two sets of rules and see which is easier to understand logically.
Both these rulesets accomplish the same thing -- the first is all done as a single rule with 4 different criteria and a couple sets of parentheses.
The second is accomplished by splitting the logic out to multiple rules, no more than 2 criteria per ruleset. It is also much easier to read and understand.
If the URL.Host matches *testdomain.com, then we check the proxy port. Any proxy ports other than 9090 will result in a block page. After that, we check to see if the user is not an admin user (signified by the group membership and IP range), and if they are not, they will be blocked.
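One way to convince yourself that a split layout behaves like the single combined rule is to compare both over every input combination. The Python sketch below does that for the logic described above. The exact combined expression is an assumption reconstructed from the prose (in particular, that an admin must satisfy both the group membership and the IP range), and all function names are hypothetical.

```python
from itertools import product


def single_rule(host_match, port, in_admin_group, in_admin_range):
    # Everything in one rule: four criteria and nested parentheses.
    if host_match and (port != 9090 or not (in_admin_group and in_admin_range)):
        return "Block"
    return None


def split_rules(host_match, port, in_admin_group, in_admin_range):
    # Rule set criteria: only enter for *testdomain.com.
    if not host_match:
        return None
    # Rule 1: any proxy port other than 9090 is blocked.
    if port != 9090:
        return "Block"
    # Rule 2: block users who are not admins (group membership and IP range).
    if not (in_admin_group and in_admin_range):
        return "Block"
    return None


def equivalent():
    """Check both layouts against every combination of inputs."""
    combos = product([True, False], [9090, 8080], [True, False], [True, False])
    return all(single_rule(*c) == split_rules(*c) for c in combos)


print(equivalent())
```

The split version reads top to bottom like the prose description, which is the main argument for preferring it.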
One last note about criteria - if you find yourself wanting to add more than 2 of any specific thing (Client.IPs, URL.Host checks, etc.) -- or if you see yourself wanting to add to them in the future, you would be well served to create a list and then use the criteria of ‘is in list’ or ‘matches in list’.
This will help to keep your criteria neat and easy to read, but allow you to have large lists of data in situations where it might be appropriate/necessary (such as whitelists/blacklists, group policy assignments, etc.).
Here’s an example whitelist ruleset that makes use of lists:
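As a rough illustration in Python rather than gateway rule syntax, list-based criteria keep the rule itself short while the data lives in lists that can grow. The list names and their entries here are invented sample data, and `is_in_list` / `matches_in_list` are simple stand-ins named after the operators mentioned above.

```python
from fnmatch import fnmatchcase

# Assumed sample lists; in practice these would be maintained by the administrator.
WHITELISTED_HOSTS = {"intranet.example.com", "updates.example.com"}
WHITELISTED_HOST_PATTERNS = ["*.example.org", "*.partner.example.net"]


def is_in_list(value, entries):
    """Exact membership, like an 'is in list' comparison."""
    return value in entries


def matches_in_list(value, patterns):
    """Wildcard membership, like a 'matches in list' comparison."""
    return any(fnmatchcase(value, pattern) for pattern in patterns)


def global_whitelist(url_host):
    # Two criteria, regardless of how many entries the lists contain.
    if is_in_list(url_host, WHITELISTED_HOSTS) or matches_in_list(url_host, WHITELISTED_HOST_PATTERNS):
        return "Stop Cycle"
    return None


print(global_whitelist("www.example.org"))
```

Adding another whitelisted host means adding a list entry; the rule logic stays untouched.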
Rulesets and Policy Architecture
Now that we understand cycles and criteria, we can have a look at rulesets. Rulesets are the means by which we organize our rules and sub-rulesets and make a configuration easier to understand and manage. Rulesets are also where we specify which cycle(s) the rules and sub-rulesets within them will run in.
Much like with our criteria, it’s important for maximum performance to structure your ruleset in a manner that progresses from ‘least expensive’ to ‘most expensive’. While no one specific layout is necessarily ‘correct’ -- from reviewing a number of configurations, a rule of thumb would be a ruleset that looks a little something like this:
- Whitelists/Blacklists
- SSL Scanner
- Authentication
- URL Category filtering
- Common rules (cache/progress indications/composite opener)
- Media Type Filtering
- Gateway Anti-Malware
Obviously, all of these are optional - as you can pick and choose what rulesets you wish to use in your Secure Web Gateway configuration.
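The relationship between rule sets and cycles can be pictured as a small data structure. The Python sketch below is a model only: the `RuleSet` class is hypothetical, and the cycle assignments are assumptions taken from the rule-of-thumb lists earlier in this document, not a recommended configuration.

```python
from dataclasses import dataclass, field


@dataclass
class RuleSet:
    name: str
    cycles: frozenset                      # cycles in which this rule set runs
    children: list = field(default_factory=list)


def applicable(rule_sets, cycle):
    """Names of rule sets that run in the given cycle, in top-down order."""
    names = []
    for rule_set in rule_sets:
        if cycle in rule_set.cycles:
            names.append(rule_set.name)
            # Sub-rulesets are only reached if their parent runs in this cycle.
            names.extend(applicable(rule_set.children, cycle))
    return names


POLICY = [
    RuleSet("Whitelists/Blacklists", frozenset({"Request", "Response"})),
    RuleSet("Authentication", frozenset({"Request"})),
    RuleSet("URL Category filtering", frozenset({"Request"})),
    RuleSet("Media Type Filtering", frozenset({"Response", "Embedded"})),
    RuleSet("Gateway Anti-Malware", frozenset({"Request", "Response", "Embedded"})),
]

for cycle in ("Request", "Response", "Embedded"):
    print(cycle, applicable(POLICY, cycle))
```

Listing the rule sets per cycle like this makes it easy to spot work that is scheduled in a cycle where it is not needed.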
The clear majority of customers tend to go with a rather stock layout when it comes to most of the rulesets. By and large, the bulk of the customization comes by way of whitelists/blacklists, and applying URL Category Filtering based on criteria (Username/Group/IP/etc.).
User-Defined Properties
User-Defined Properties can help you reduce the number of checks that need to be done for rule evaluations. For example, group memberships in enterprise environments can get complex. It is not uncommon to see users with several hundred group memberships in AD.
On the Secure Web Gateway side, you would have to check against that long list of groups every time you need the group membership for policy assignments. Instead, you could do the check once and write the resulting policy name into a user-defined property.
Once you have your User-Defined Property set, you can decide which rules to apply based on this simple string value instead of having to check the whole list of group memberships every time.
The check of User-Defined.URLFilteringPolicy equals "Admins" is cheaper than the check for Authentication.UserGroups contains "Administrators".
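The check-once, reuse-later pattern can be sketched in Python as follows. The property names come from the text above; the `GROUP_TO_POLICY` mapping, the "Standard" and "Default" policy names and both functions are assumptions made for illustration.

```python
# Assumed mapping from directory group to policy name, in priority order.
GROUP_TO_POLICY = [("Administrators", "Admins"), ("Staff", "Standard")]


def assign_policy(tx):
    """Walk the (possibly long) group list once and store the resulting policy name."""
    groups = tx["Authentication.UserGroups"]
    tx["User-Defined.URLFilteringPolicy"] = next(
        (policy for group, policy in GROUP_TO_POLICY if group in groups),
        "Default",
    )


def url_filtering(tx):
    """Later rules compare one short string instead of re-reading the group list."""
    policy = tx["User-Defined.URLFilteringPolicy"]
    if policy == "Admins":
        return "admin category list"
    if policy == "Standard":
        return "standard category list"
    return "default category list"


# A user with several hundred group memberships, as described above.
tx = {"Authentication.UserGroups": [f"group-{i}" for i in range(300)] + ["Administrators"]}
assign_policy(tx)
print(tx["User-Defined.URLFilteringPolicy"], "->", url_filtering(tx))
```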
If you are interested in more information about policy mappings, please see this article:
Keep in mind that this was just one example for the usage of User-Defined Properties. You can take advantage of this feature every time you need to temporarily store information for later use. User-Defined properties persist for the duration of a transaction (request + response + logging).
Conclusion
You should now have a better understanding of how MWG works with cycles, logic and criteria -- and can use this knowledge to help weed out the inefficiencies in your configuration.
Takeaways:
- No more than 2 criteria per rule (for easy administration!)
- Remember cheap vs expensive criteria! Block as much ‘cheaply’ as you can.
- Use appropriate cycles for your rules. There’s no need to run a URL-Category rule in the response cycle, since we could have blocked it in the request and saved the time and bandwidth!