Rules for HTML filtering
To enable HTML filtering on the appliance, a rule set containing appropriate rules must be implemented. This section describes a sample rule set from the rule set library.
After the initial setup, an HTML filtering rule set is not implemented on the appliance. You can import the HTML Filtering rule set from the library and modify it according to your requirements or create a rule set of your own.
HTML Filtering (rule set)
This section describes the rules of a library rule set for HTML Filtering.
Library rule set — HTML Filtering |
---|
Criteria — Always |
Cycles — Requests (and IM), responses, embedded objects |
The rule set contains a rule and the following two nested rule sets:
- Enable HTML Filtering
- HTML Filtering
The following rule is contained in the rule set in addition to the nested rule sets:
Remove Content-Encoding header
Always –> Continue — Header.RemoveAll (“Accept-Encoding”)
The rule uses an event to remove the Accept-Encoding header from a request. The name of the header is specified by the event parameter.
This header is not needed because filtering is applied only to the content, which is eventually sent unencoded to the user who requested it.
Processing continues with the next rule set.
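The effect of a Header.RemoveAll event can be pictured with a short Python sketch. This is not the appliance's implementation: `remove_all` is a hypothetical helper and the sample headers are made up. The only behavior assumed is that every header with the given name is dropped, with names compared case-insensitively, as is usual for HTTP header names.

```python
def remove_all(headers, name):
    """Return a new header list without any entry named `name`.

    Header names are compared case-insensitively.
    """
    target = name.lower()
    return [(key, value) for (key, value) in headers if key.lower() != target]


# Hypothetical request headers, used only for illustration.
request_headers = [
    ("Host", "example.com"),
    ("Accept-Encoding", "gzip, deflate"),
    ("Accept", "text/html"),
]

# Mirrors: Always -> Continue - Header.RemoveAll ("Accept-Encoding")
filtered = remove_all(request_headers, "Accept-Encoding")
```

After the call, `filtered` holds only the Host and Accept entries, and processing would continue with the next rule set.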
Enable HTML Filtering
This nested rule set prepares HTML filtering by enabling the HTML opener and removing a header element.
Nested library rule set — Enable HTML Filtering |
---|
Criteria — Always |
Cycles — Requests (and IM) and responses |
The rule set contains the following rules:
Enable HTML opener
Always –> Continue — Enable HTML Opener<HTML Filtering>
The rule uses an event to enable the HTML opener. The settings of this module are specified with the event.
Processing continues with the next rule.
Remove header for “Content-Length”
Always –> Continue — Header.RemoveAll (“Content-Length”)
The rule uses an event to remove the header providing the content length from a request.
Processing continues with the next rule set.
HTML Filtering (nested rule set)
This nested rule set removes different types of objects embedded in HTML pages, using a nested rule set for each type.
Nested library rule set — HTML Filtering |
---|
Criteria — MediaType.EnsuredTypes contains text/html |
Cycles — Embedded objects |
The following rule sets are nested in this rule set:
- Embedded Objects
- Embedded Scripts
- ActiveX Controls
Note: This rule set is not enabled by default.
- Advertising Filter
Note: This rule set is not enabled by default.
Embedded Objects
This nested rule set removes Java applets embedded in HTML pages, as well as other embedded media types if they are on a blocking list.
It is processed in the embedded object cycle when these objects are sent with requests or responses.
Nested library rule set — Embedded Objects |
---|
Criteria — Always |
Cycle — Embedded objects |
The rule set contains the following rules:
Java applets
HTMLElement.Name equals “APPLET” OR (
HTMLElement.Name equals “OBJECT” AND
HTMLElement.HasAttribute (“codetype”) equals true AND
HTMLElement.Attribute (“codetype”) equals “application/java”) –> Remove
The rule uses several HTMLElement ... properties to decide whether an element is removed from an HTML page. An element is removed if its name is APPLET, or if its name is OBJECT and it has a codetype attribute with application/java as its value.
Processing of the embedded object cycle stops then and the HTML page is forwarded without the removed element to the user who requested it or to the web if a user attempted to upload it.
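The criteria of the Java applets rule can be written out as a small Python predicate. This is a sketch only: `is_java_applet` is a hypothetical name, the element is modeled as a name plus a dictionary of attributes, and names are compared exactly as they are spelled in the rule, because the rule text does not say whether the appliance normalizes case.

```python
def is_java_applet(name, attributes):
    """Return True if the Java applets rule would remove this element."""
    if name == "APPLET":
        return True
    return (
        name == "OBJECT"
        and "codetype" in attributes                          # HasAttribute ("codetype") equals true
        and attributes["codetype"] == "application/java"      # Attribute ("codetype") equals "application/java"
    )
```

For example, an OBJECT element whose codetype is application/java matches, while an OBJECT element without a codetype attribute does not and is passed on to the rules that follow.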
Stop if element is not interesting
(HTMLElement.Name does not equal “OBJECT” AND
HTMLElement.Name does not equal “embed”) OR
HTMLElement.HasAttribute (“type”) equals false –> Stop Rule Set
The rule uses several HTMLElement ... properties to check whether an element does not need to be removed. This is the case if its name is neither OBJECT nor embed, or if it has no type attribute at all.
Processing of the rule set stops then, so the rule that removes elements from HTML pages (and follows this rule in the rule set) is not processed. Processing continues with the next rule set.
Default action for unlisted media types
HTMLElement.Attribute (“type”) is not in list Media Type Whitelist AND
HTMLElement.Attribute (“type”) is not in list Media Type Blocklist –> Stop Rule Set
The rule uses the HTMLElement.Attribute property to check whether an element is of a type that is neither on the relevant whitelist nor the blocking list. In this case, a default action is executed, which for this rule is Stop Rule Set.
Processing of the rule set stops then, so the whitelisting and blocking rules for media types that follow in the rule set are not processed. Processing continues with the next rule set.
Handle whitelisted media types
HTMLElement.Attribute (“type”) is in list Mediatype whitelist –> Stop Rule Set
The rule uses the HTMLElement.Attribute property to check whether the type of an element is on a media type whitelist. If it is, the rule applies.
Processing of the rule set stops then, so the removing rule that follows this rule in the rule set is not processed. Processing continues with the next rule set.
Note: This rule is not enabled by default.
Handle blocklisted media types
HTMLElement.Attribute (“type”) is in list Mediatype blocklist –> Remove
The rule uses the HTMLElement.Attribute property to check whether the type of an element is on a media type blocklist. If it is, the rule applies and the media type in question is removed from the HTML page.
Processing of the embedded object cycle stops then and the HTML page is forwarded without the removed element to the user who requested it or to the web if a user attempted to upload it.
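The order in which the four media type rules are evaluated can be summarized in Python. This is a minimal sketch under stated assumptions, not the appliance's logic: `decide_by_media_type` is a hypothetical function, the list contents are invented, and the action of the whitelist rule is taken to be Stop Rule Set, as its description suggests. The Java applets rule is left out here.

```python
REMOVE = "Remove"
STOP_RULE_SET = "Stop Rule Set"


def decide_by_media_type(name, attributes, whitelist, blocklist,
                         whitelist_rule_enabled=False):
    """Return the action of the first rule that applies, or None if none does."""
    # Stop if element is not interesting
    if (name != "OBJECT" and name != "embed") or "type" not in attributes:
        return STOP_RULE_SET
    media_type = attributes["type"]
    # Default action for unlisted media types
    if media_type not in whitelist and media_type not in blocklist:
        return STOP_RULE_SET
    # Handle whitelisted media types (this rule is not enabled by default)
    if whitelist_rule_enabled and media_type in whitelist:
        return STOP_RULE_SET
    # Handle blocklisted media types
    if media_type in blocklist:
        return REMOVE
    return None


# Invented list contents, for illustration only.
whitelist = {"application/pdf"}
blocklist = {"application/x-shockwave-flash"}
```

With these lists, an embed element of type application/x-shockwave-flash yields Remove, while an element of an unlisted type such as video/mp4 yields Stop Rule Set and is left in the page.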
Embedded Scripts
This nested rule set removes script code embedded in HTML pages, providing options for keeping some code types.
It is processed in the embedded object cycle when this code is sent with requests or responses.
Nested library rule set — Embedded Scripts |
---|
Criteria — HTMLElement.Name equals “SCRIPT” |
Cycle — Embedded objects |
The rule set criteria specify that the rule set applies when a SCRIPT element is embedded in an HTML page.
The rule set contains the following rules:
Variable resetter
Always –> Continue – Set User-Defined.removeOneScript = false
The rule sets the User-Defined.removeOneScript property to false, so the break rules that follow this rule later in the rule set do not apply. Processing continues with the next rule.
Note: This rule is not enabled by default.
JavaScript
HTMLElement.Script.Type (“type”) equals “text/javascript” –> Stop Rule Set – Set User-Defined.removeOneScript = true
The rule uses the HTMLElement.Script.Type property to check whether an element is of the JavaScript type. If it is, the rule applies.
Processing of the rule set stops then, so the rule that removes script code at the end of the rule set is not processed. This way, the embedded script code is kept in the HTML page. Processing continues with the next rule set.
If you want to remove JavaScript code, replace the Stop Rule Set action with the Remove action. The rule also sets the User-Defined.removeOneScript property to true. This property is evaluated by the break rule that follows this JavaScript rule.
When this rule applies with Stop Rule Set or Remove as its action, processing of the rule set is stopped. If you let the rule use an action that does not stop the rule set, you can enable the break rule. It will find that the value for the User-Defined.removeOneScript property is true and stop processing of the rule set accordingly.
To reset the value of the User-Defined.removeOneScript property to false, you need to enable the reset rule at the beginning of the rule set. With this value for the property, the break rules of the rule set will not apply.
Break;
User-Defined.removeOneScript equals true –> Stop Rule Set
The rule stops processing of the rule set if the User-Defined.removeOneScript property has true as its value. Processing continues with the next rule set.
Note: This rule is not enabled by default.
JScript
HTMLElement.Script.Type equals “text/jscript” –> Stop Rule Set – Set User-Defined.removeOneScript = true
This rule removes or keeps JScript within HTML pages in the same way as the JavaScript rule.
Break;
User-Defined.removeOneScript equals true –> Stop Rule Set
This rule works in the same way as the break rule that follows the JavaScript rule.
Note: This rule is not enabled by default.
Visual Basic script
HTMLElement.Script.Type equals “text/vbscript” –> Stop Rule Set – Set User-Defined.removeOneScript = true
This rule removes or keeps Visual Basic script within HTML pages in the same way as the JavaScript rule.
Break;
User-Defined.removeOneScript equals true –> Stop Rule Set
This rule works in the same way as the break rule that follows the JavaScript rule.
Note: This rule is not enabled by default.
Other scripts
Always –> Remove
The rule removes all embedded script code from HTML pages, unless one of the preceding rules stops the rule set before processing reaches this rule. The preceding rules can do so for JavaScript, JScript, and Visual Basic script code if they are enabled. If you want other script code to be kept as well, you can add appropriate rules.
The break rules of the rule set, if enabled, can also stop it, so that the removing rule is not processed.
If the removing rule is processed, it stops processing of the embedded objects cycle. Processing then continues with the next cycle.
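How the script type rules, the break rules, and the final removing rule interact can be traced with a Python sketch. It is an illustration under assumptions, not the appliance's behavior: `process_script` is a hypothetical function, the flag is modeled as freshly set to false for each element (which is what the variable resetter rule would establish), and "Continue" stands for any action that does not stop the rule set.

```python
SCRIPT_TYPES = ("text/javascript", "text/jscript", "text/vbscript")


def process_script(script_type, actions=None, breaks_enabled=False):
    """Return "kept" or "removed" for one SCRIPT element.

    `actions` maps a script type to the action of its rule:
    "Stop Rule Set" (the default), "Remove", or "Continue".
    """
    actions = actions or {}
    remove_one_script = False  # value set by the variable resetter rule
    for rule_type in SCRIPT_TYPES:
        if script_type == rule_type:
            remove_one_script = True  # event of the script type rule
            action = actions.get(rule_type, "Stop Rule Set")
            if action == "Stop Rule Set":
                return "kept"
            if action == "Remove":
                return "removed"
            # "Continue": fall through to the break rule that follows
        if breaks_enabled and remove_one_script:
            return "kept"  # Break; rule: Stop Rule Set
    return "removed"  # Other scripts: Always -> Remove
```

With the default actions, JavaScript is kept and a script of an unlisted type is removed. If a type rule is switched to "Continue" while the break rules stay disabled, the element falls through to the Other scripts rule and is removed; enabling the break rules keeps it.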
ActiveX Controls
This nested rule set removes ActiveX controls embedded in HTML pages.
It is processed in the embedded object cycle when these objects are sent with requests or responses.
Note: This rule set is not enabled by default.
Nested library rule set — ActiveX Controls |
---|
Criteria — Always |
Cycle — Embedded objects |
The rule set contains several rules and the nested Filter ActiveX in Scripts rule set.
Advertising Filter
The nested Advertising Filter library rule set removes advertising elements embedded in HTML pages, such as images, layers, forms, and others.
It is processed in the embedded object cycle when these objects are sent with requests or responses.
Note: This rule set is not enabled by default.
Nested library rule set — Advertising Filter |
---|
Criteria — Always |
Cycle — Embedded objects |
The rule set contains a rule and the following nested rule sets:
- Link Filter
- Dimension Filter
- Popup Filter
- Script Filter