Skyhigh Security

Understand NTLM and Windows Domain Membership

Introduction

This article covers the requirements, configuration, common issues, and troubleshooting of Active Directory (AD) NTLM domain communication on the Secure Web Gateway. Because NTLM is the most commonly used form of authentication, the article also addresses the questions and issues most often seen in support. It does not cover authentication issues such as intermittent authentication prompts.

Prerequisites

For NTLM authentication, the Secure Web Gateway must become a member of your AD domain. Because the Secure Web Gateway cannot join a domain through a Read-Only Domain Controller (RODC), the Domain Controller must be a Read-Write Domain Controller (RWDC). Make sure the following are set up correctly:

  1. Secure Web Gateway must be able to connect to your AD server over TCP port 445 (no other ports are required).
  2. For successful NTLM authentication, the Secure Web Gateway needs both the IP address (for TCP-level communication) and the Fully Qualified Domain Name (FQDN) of the Domain Controller (for SMB-level communication). One of the two (either IP or FQDN) is provided in the Secure Web Gateway configuration. You must ensure that the other one can be resolved by your DNS.
  3. When initially setting up the domain membership on the Secure Web Gateway, a domain administrator account is needed so that a computer account can be created in AD for the Secure Web Gateway. The domain administrator account is used only to create the Secure Web Gateway account on the domain, and those credentials are not stored on the Secure Web Gateway.
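
The network prerequisites above can be spot-checked before attempting a join. The following Python sketch is only an illustration, not a Skyhigh tool: the function names are made up for this example, dc1.company.com is a placeholder hostname, and the script assumes it runs on a host that uses the same DNS servers and network path as the Secure Web Gateway. It resolves the FQDN to an IP, resolves that IP back to a name, compares the two names, and tries a TCP connection to port 445.

```python
import socket


def normalize(name: str) -> str:
    """Lower-case a hostname and drop a trailing dot so names compare cleanly."""
    return name.rstrip(".").lower()


def names_match(configured: str, reverse_name: str) -> bool:
    """True when the reverse-DNS name is the same host as the configured FQDN."""
    return normalize(configured) == normalize(reverse_name)


def check_dc(fqdn: str, port: int = 445, timeout: float = 5.0) -> dict:
    """Check forward DNS, reverse DNS, and TCP reachability for one DC."""
    result = {"fqdn": fqdn, "ip": None, "reverse": None,
              "names_match": False, "port_open": False}
    try:
        result["ip"] = socket.gethostbyname(fqdn)  # forward lookup
    except OSError:
        return result  # no IP address, so nothing else can be tested
    try:
        result["reverse"] = socket.gethostbyaddr(result["ip"])[0]  # reverse lookup
        result["names_match"] = names_match(fqdn, result["reverse"])
    except OSError:
        pass  # no reverse (PTR) record for this IP
    try:
        with socket.create_connection((result["ip"], port), timeout=timeout):
            result["port_open"] = True
    except OSError:
        pass  # port 445 filtered, closed, or timed out
    return result
```

Calling check_dc("dc1.company.com") returns a dictionary in which ip, names_match, and port_open should all be populated or true for a usable Domain Controller. A names_match value of False is worth investigating, since it means the reverse lookup returns a different hostname than the one configured.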

Configuration

The first step is to join the Secure Web Gateway(s) to the domain(s) that users will authenticate against. This is done within Configuration > > Windows Domain Membership > Join.

clipboard_e2fa08a3d9a73af0605a56747fa7efd15.png

 

  1. Windows Domain Name: The AD domain (NetBIOS name) to which the Secure Web Gateway should be joined. If you have trouble determining the correct NetBIOS name, run nbtstat -n from a Windows command prompt; the name returned with type 'GROUP' is the domain that the computer is part of.
  2. Gateway account name: This will be the name of the Secure Web Gateway computer account that's added to Active Directory when it successfully joins the domain. After this account is created, it should not be modified, nor should it be created manually.
  3. Overwrite existing account: If checked, this will overwrite the existing Secure Web Gateway computer name if it exists on Active Directory. Each Secure Web Gateway will need a unique account (computer name) on Active Directory, so if a computer name has been used by another Secure Web Gateway or computer, it will be overwritten. Keep in mind that if the same account is overwritten, the Secure Web Gateway that was using it will no longer be part of the domain and will no longer be able to authenticate against it.
  4. Use NTLMv2: It's recommended to use NTLMv2 if it's supported by your Active Directory environment. This option only enforces NTLMv2 for the Secure Web Gateway while it is joining your AD domain. It does not enforce NTLMv2 for client requests.
  5. Timeout for requests: This is the amount of time that the Secure Web Gateway will wait for a response from Active Directory before timing out. In case this timeout is reached, the domain controller in question will be flagged as down and we will fail over to the next one (if other DCs have been configured).
  6. Reconnection Timeout: Time to wait before reconnecting to a domain after a failure.
  7. Configured Domain Controllers: A comma-separated list of Active Directory Domain Controllers that the Secure Web Gateway should use for this domain. Using the fully qualified domain name (FQDN) of each Active Directory server is suggested, since forward DNS (hostname to IP) is more likely to resolve properly than reverse DNS (IP to hostname), which is needed when you enter an IP address. You can leave this field blank to force the Secure Web Gateway to auto-discover your DCs, but auto-discovery is not recommended because it introduces more complex DNS requirements. Hard coding the DCs is recommended for most environments.

NOTE: Read-only domain controllers are not supported, as the machine account password cannot be updated or rotated every 5 days after joining the domain controller.

  8. Number of Active DCs: The total number of active domain controllers the Secure Web Gateway will use for authentications. The Secure Web Gateway will distribute authentication requests between the active DCs. See Understand Active Domain Controllers, failover, and authentication request distribution.
  9. Administrator account/password: The domain admin account and password used to create the computer account in AD. The account and password are not stored anywhere on the Secure Web Gateway after use (just like when joining a Windows PC to the domain).

To join the domain, click OK.

Secure Web Gateway joined the domain successfully

After joining the domain, you'll want to see a consistently green status indicator in the GUI after selecting refresh, as seen below. If the status is red, there is an issue. See Troubleshooting.

clipboard_eb830bf2917bd7cc4177ddf3ddfcfd9ab.png

Additionally, if the account creation was successful, the computer name should be visible within Active Directory.

The best method to test user credentials after joining the domain is to see what is returned in an authentication test. The settings for authentication can be found under Policy > Settings > Engines > Authentication > select your configured NTLM engine (or create one) > select the arrow next to 'Authentication Test' and test with your domain credentials. Here are examples of a successful and a failed test.

Good credentials

clipboard_e2150e9e07cef5ca1caf67e9c7adafd70.png

Bad credentials

clipboard_e2677333a73e92422f9f85663023c3d5d.png

 

If nothing has failed so far and your authentication tests were successful, you are ready to start deploying an authentication policy for your users. 

Understand Active Domain Controllers, failover, and authentication request distribution

How Secure Web Gateway finds active DCs and handles failover

In this example, there are 4 configured Domain Controller IP addresses, which we’ll refer to as DC1, DC2, DC3, and DC4, and the ‘Number of active Domain Controllers’ is set to 2. Default timeout values are used.

Note: Secure Web Gateway tries to connect to up to 2 DCs. It does not connect to all 4 defined DCs simultaneously and pick the 2 that answer first. Rather, the DC list defines the order in which Secure Web Gateway tries to connect to the servers. A DC is marked offline for 3 minutes after a communication error (for example, Secure Web Gateway cannot connect, or the connection to the DC is aborted by a timeout (15 seconds)).

  • Secure Web Gateway tries to connect to DC1 and DC2. The connection to DC1 failed and DC1 is marked offline for 3 minutes. The connection to DC2 is successful and DC2 is marked active.
  • Secure Web Gateway looks for a second active DC and tries to connect to DC3 (next in the list). Connection to DC3 is successful and DC3 is marked active. Both DC2 and DC3 are active.
  • DC2 is no longer reachable and is marked offline for 3 minutes. DC3 is still active.
  • Secure Web Gateway looks for a second active DC and tries to connect to DC4 (next in the list). The connection to DC4 failed and DC4 is marked offline for 3 minutes. No additional DC can be contacted right now (DC1, DC2, DC4 are all still within the 3 minutes offline status).
  • DC1 status changes to standby status (3 minutes offline status expired).
  • Secure Web Gateway tries to connect to DC1. The connection to DC1 is successful and DC1 is marked active. Both DC3 and DC1 are active.
  • DC2 and DC4 status changes to standby status. DC3 and DC1 remain the active servers until one or both go offline.

As described above, the ‘active’ domain controller(s) are sticky, and DCs in standby status are not checked unless an active DC goes offline. A restart forces Secure Web Gateway to start over from the beginning to find active DCs.
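
The walkthrough above can be replayed with a small model. This Python sketch is a simplified illustration of the described behavior, not the product's implementation: the class and method names are invented, the 60-second timestamp for the DC2 failure is an assumption, and the real gateway may schedule its connection attempts differently. The model keeps the configured order, fills the active set only from DCs in standby, and holds a failed DC offline for the 3 minutes given in the article.

```python
from dataclasses import dataclass, field

OFFLINE_SECONDS = 180  # the article's 3-minute offline window


@dataclass
class DcPool:
    names: list   # configured DCs, in the order they are tried
    wanted: int   # the 'Number of Active DCs' setting
    active: list = field(default_factory=list)
    offline_until: dict = field(default_factory=dict)

    def mark_offline(self, name, now):
        """Drop a DC from the active set and hold it offline for 3 minutes."""
        if name in self.active:
            self.active.remove(name)
        self.offline_until[name] = now + OFFLINE_SECONDS

    def is_standby(self, name, now):
        """Standby means not active and not inside an offline window."""
        return name not in self.active and self.offline_until.get(name, 0) <= now

    def fill(self, now, reachable):
        """Try standby DCs in list order until enough DCs are active."""
        for name in self.names:
            if len(self.active) >= self.wanted:
                break  # active DCs are sticky; standby DCs are left alone
            if not self.is_standby(name, now):
                continue
            if name in reachable:
                self.active.append(name)
            else:
                self.mark_offline(name, now)
        return list(self.active)


pool = DcPool(names=["DC1", "DC2", "DC3", "DC4"], wanted=2)
print(pool.fill(0, {"DC2", "DC3"}))    # ['DC2', 'DC3'] (DC1 failed, now offline)
pool.mark_offline("DC2", 60)           # DC2 drops at an assumed t=60 seconds
print(pool.fill(60, {"DC3"}))          # ['DC3'] (DC4 was tried and failed)
print(pool.fill(180, {"DC1", "DC3"}))  # ['DC3', 'DC1'] (DC1 left offline status)
```

The second argument to fill is the set of DCs that would answer at that moment, which stands in for real connection attempts. Once two DCs are active, further calls return immediately without touching DCs in standby, which mirrors the sticky behavior.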

Authentication request distribution

Authentication requests are distributed across the active DCs where the fastest DC (first available of the active DCs) handles the next request.

What if the number of DCs in active status is fewer than the specified number of active DCs?

In an example with 3 configured Domain Controllers and 2 active, if 2 DCs are offline and only 1 remains active, Secure Web Gateway will attempt to reconnect to the offline DCs once they return to standby status in an effort to find a second active DC. If all DCs are offline, all requests fail immediately until DCs return to standby status and Secure Web Gateway is able to find an active DC.

Troubleshooting

Here are a few troubleshooting examples where the Secure Web Gateway did not join the domain successfully or has issues communicating with the DCs.

For example, the log message 'NTLM CFilterAuth: FDEvent pipe timed out, request( 0 ) from client(x.y.a.b) is dropped. Timeout value was=5' indicates that the server did not respond in time for the request from the client with IP address x.y.a.b.
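
When many of these timeout messages appear, it can help to count them per client. The Python sketch below is an illustration only: the regular expression is derived from the single sample message quoted above, so it is an assumption that other occurrences follow exactly the same layout, and the function names are invented for this example. It works on any iterable of text lines, such as an exported log file.

```python
import re
from collections import Counter

# Pattern built from the one sample message in the article (assumed layout).
DROPPED = re.compile(
    r"FDEvent pipe timed out, request\(\s*(?P<request>\d+)\s*\) "
    r"from client\((?P<client>[^)]+)\) is dropped\. "
    r"Timeout value was=(?P<timeout>\d+)"
)


def dropped_requests(lines):
    """Yield (client, timeout_value) for each timed-out request line."""
    for line in lines:
        match = DROPPED.search(line)
        if match:
            yield match.group("client"), int(match.group("timeout"))


def count_by_client(lines):
    """Count timed-out requests per client address."""
    return Counter(client for client, _ in dropped_requests(lines))
```

For a downloaded log file, count_by_client(open(path)) returns a Counter whose most_common() method lists the clients with the most dropped requests first. Here path is whatever location you saved the log to.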

There are two main troubleshooting tools:

  1. The Secure Web Gateway authentication debug log
  2. A network capture/TCP dump taken on the Secure Web Gateway (this will give you the most comprehensive troubleshooting data).

Authentication Debug Log

You can find the authentication debug log under Configuration > > Troubleshooting > Authentication Troubleshooting.

The log files written can be found under Troubleshooting > Log files > Debug > mwg-core_Auth.debug.log.

There are two main options for the authentication debug log:

  1. Log management events. We recommend that this option is permanently enabled. It will log all events that have to do with your AD connection, joining or leaving the domain or failing over from one DC to another. Very little log data is being written, which allows you to always have this option enabled.
  2. Log authentication events. We recommend that you enable this option only for specific troubleshooting, limit it to a specific IP, and disable it again as soon as possible after replicating an issue. This option logs all events related to actual user authentications. The log grows quickly when enabled, because every authentication request from a client is logged along with group memberships and so on. It is most useful if specific clients are constantly prompted for credentials or cannot log in at all. Enable the authentication event option and specify the IP address of the client that will replicate the problem (for example, open the browser and get a prompt). Right afterwards, disable the option again so the log does not grow to a point where it becomes a problem.

clipboard_ec376c25346e17009ffa5962ca53d64a1.png

Tcpdump

You can take a packet capture (tcpdump) from the Secure Web Gateway UI or from the command line (recommended option) as 'root':

Command Line: (ssh or console access)

cd /opt

tcpdump -i any -s0 -w ntlmcapture.cap port 445 or port 53

Reproduce the problem and let the capture run for at least 3 minutes (the default interval after which Secure Web Gateway attempts to reconnect to a DC).

Stop the capture (Ctrl+C).

The file will be in the directory (/opt) in which you ran the command.

SWG UI: (Troubleshooting > Packet Tracing)

Add these command line parameters:

-s0 -i any port 445 or port 53

Start Capture.

Reproduce the problem and let the capture run for at least 3 minutes (the default interval after which Secure Web Gateway attempts to reconnect to a DC).

Stop Capture.

You can view the created traces on your desktop with the free tool Wireshark.

Below are a few examples of what you might see:

No IP address (Forward DNS failed)

In this example, I tried to join the Active Directory server by providing the FQDN bob.jimc.local in the Secure Web Gateway UI (see the 'Configured Domain Controllers' field above), but there is no DNS record for this name, so DNS returns 'No such name.'

Joining the domain will fail immediately.

clipboard_e270e8e1d02e7bf8d723954f1689ca7e0.png

No or incorrect hostname (reverse DNS failed)

In this example, I tried to join via the IP 10.10.95.12, which has no reverse record in DNS (or an incorrect hostname is returned). The Secure Web Gateway can establish the TCP connection to the DC because it has the IP address provided in the UI, but once the TCP connection is established and the protocol switches to SMB, the connection fails because the correct hostname is required.

A similar situation applies when the domain controllers are being load balanced via a virtual hostname. For example, if you provide the FQDN of DCpool.company.com (virtual name for load balanced DCs) and it resolves to the IP of one of your DCs (for example dc1.company.com), your connection will fail because as soon as the protocol switches to SMB, the hostname provided is DCpool.company.com and not the expected/correct hostname of dc1.company.com. Do NOT use virtual hostnames for your DCs. Use the real hostnames and let the Secure Web Gateway do the load balancing for you.

clipboard_eed2d73dca849ad2c18ca52eb6ccf6e63.png

Bad admin credentials

In this example, the credentials for the administrator used to join the domain were not valid.

clipboard_e762b0cf2f288eee1ec68261a8541e59f.png

Computer account deleted or disabled

In this example, the computer account for the Secure Web Gateway was deleted in AD, but the same error could also be thrown if the account is disabled/modified. Also note that the error message is the same as when the incorrect administrator credentials were used while trying to initially join the domain.

clipboard_e0cf8e2d1c9b8c16f67f68d5356ccd7a2.png

'Logon To' Account Permissions in Active Directory

When you join the Secure Web Gateway to the domain, a computer account is created within Active Directory. When Secure Web Gateway talks to the Domain controller to authenticate users, it uses this computer account.

Some users in Active Directory may have restrictions on which workstations they can log on to. If a user is only allowed to log on to specific workstations, you must make sure the Secure Web Gateway computer account is also added as an allowed workstation. Otherwise, authentication fails and the user is prompted to authenticate.

In this scenario, the Secure Web Gateway is joined to the domain with computer account 'Secure Web Gateway'. The user 'user1' is only able to log on to workstation 'Desktop1'.

clipboard_e39f9656ae5ff7e9ebb10124316f113f6.png

clipboard_e01da885fbf338c6dcb27add9a3c6d82b.png

The example below shows the Secure Web Gateway trying to authenticate 'user1' using the computer account 'Secure Web Gateway'. The domain controller responds with an error message indicating that authentication failed. The error the Domain Controller sends is STATUS_INVALID_WORKSTATION as seen in the screenshot below.

clipboard_e37284c3d69e5f6c68c7401784a5e1285.png

It is important to add the Secure Web Gateway's computer account into the user's allowed workstations or to allow the user to log on to all workstations for this to work properly.

clipboard_ef5ffadb49ddcffd9f9d042b010f760d7.png

Alerting

If you would like to be notified when issues arise with your domain membership, you can use some of the dashboard alerts that the Secure Web Gateway produces. Please see the following article on incident alerting:

clipboard_e48c00e186ae5a54f17b2ed50807eb9c4.png

Last resorts

Hosts file entry

If DNS issues cannot be overcome (temporarily or permanently), an entry in the hosts file of each Secure Web Gateway will likely be required. Make this change in the GUI as seen below (do not make /etc/hosts changes on the command line).

clipboard_e0a610564257c6cab1b5f55817c15ca94.png

 

Rolling captures for intermittent issues

Log into the Secure Web Gateway as the 'root' user with a tool like PuTTY. Browse to /var (cd /var) and verify with 'df -k' that you have enough free space to store the captures. The syntax below needs 2 GB of free space on /var. You can change that, but keep in mind that if you reduce the number of stored captures too much, the relevant capture may be deleted before you stop the rolling capture.

nohup tcpdump -Z root -s 0 -i any -C 100 -W 20 -w capturefilename.pcap port 445 or port 53 &

Press Enter twice after entering the command.

-C is the maximum size of a capture file, in MB, before a new one is started.

-W is the number of capture files kept before the oldest is deleted to make room for a new one.

port 445 is for Active Directory and port 53 is for DNS.

The other parameters should remain unchanged.

To stop the capture, run 'ps aux | grep tcpdump' and get the process ID for the rolling capture, then run 'kill -9 processID' to stop the rolling capture. The completed captures will be in /var/empty/
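
The 2 GB figure follows from the two rolling-capture options: 100 MB per file times 20 files. If you change either value, the short Python sketch below recomputes the approximate disk budget and compares it with the free space on a path. It is an illustration only: the function names are invented, it assumes tcpdump's documented -C unit of 1,000,000 bytes, and individual files may end up slightly larger than the -C value.

```python
import shutil

BYTES_PER_UNIT = 1_000_000  # tcpdump counts -C in millions of bytes


def capture_budget_bytes(file_size_mb: int, file_count: int) -> int:
    """Approximate disk space used by 'tcpdump -C file_size_mb -W file_count'."""
    return file_size_mb * BYTES_PER_UNIT * file_count


def fits(path: str, file_size_mb: int, file_count: int) -> bool:
    """True when the filesystem holding 'path' has room for the full ring."""
    free = shutil.disk_usage(path).free
    return free >= capture_budget_bytes(file_size_mb, file_count)
```

With the values from the command above, capture_budget_bytes(100, 20) gives 2,000,000,000 bytes, and fits("/var", 100, 20) reports whether /var currently has that much free.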

Takeaways

  • Always hard code the 'Configured Domain Controllers' field with the address of your Domain Controllers. Do NOT use a Virtual Hostname.
  • Secure Web Gateway needs both the IP and FQDN of the configured Domain Controller. You'll specify one in the field provided; the other needs to be resolved by DNS.
  • Remember to enable the 'Log Management Events' debugging option.