Skip to main content

Check out Interactive Visual Stories to gain hands-on experience with the SSE product features. Click here.

Skyhigh Security

Network Down on Node in central management

Title: Network Down on Node in central management

Category / Product: Webgateway

Environment: Central Management

 

Central Management allows you to administer multiple Web Gateway appliances in your network as nodes in a common configuration. A configuration of multiple appliances administered through Central Management is also referred to as a cluster.

When administering a Central Management cluster, you are dealing mainly with:

 • Nodes — Appliances run as nodes that are connected to each other sending and receiving data to perform updates, backups, downloads, and other jobs.

 • Node groups — Nodes are assigned to diferent types of node groups that allow different ways of transferring data.

 • Scheduled jobs — Data can be transferred according to time schedules that you configure.

clipboard_e2dbe5be630bcdc176544fe375b91e5e8.png

 

In the Above Diagram,  SWG Cluster has 2 Network Groups i.e Network Alpha and Network Bravo.

Network Alpha has 3 nodes and 1 cluster Management Node:

YBALC1A956CM01
YBALC7A956UP01
YBALC7A956UP02
YBALC7A956UP03

Network Bravo has 3 nodes and 1 cluster management node:

YBBRC1A956CM01
YBBRC7A956UP01
YBBRC7A956UP02
YBBRC7A956UP03

 

 

Issue: Network Drop for the Node YBALC7A956UP01 (Alpha Group). This impact on the production traffic creating instability in the network.
In the below Figure we can observe that Traffic drops to Zero Every 15 minutess and regains back to normal.

 

clipboard_e8744a591b03fd6bbb93a1adbdddf55d8.png


Causes:

1.       Cluster Nodes if not in sync will create this problem. An in sync cluster will show the same Current Storage Timestamp for all nodes.

In below figure we can see that 3 nodes are not in sync with rest of the nodes . There is variation in the timestamp


clipboard_e6f49406524eeaa3b1213b6070d89dc64.png

 

2.       Secure Web Gateways experience an intermittent network issue while one node attempts to transmit shared data to another node (i.e., TCP handshake failed or node wasn't reachable).

clipboard_e535a62ac2ae6e1b400d4ff387bc191f4.png

 

Troubleshooting and checks to be carried out:

Verify the synchronization of nodes
The user interface displays, among other general information, a timestamp for each node in a Central Management Configuration, which allows you to verify whether all nodes are synchronized.

Task
1 Select Configuration | Appliances.
2 On the appliances tree, select Appliances (Cluster). Status and general information about the configuration and its nodes appear on the settings pane. Under Appliances Information, a list is shown that contains a line with information for every node. The timestamp is the last item in each line.
3 Compare the timestamps for all nodes. If they are same for all nodes, the Central Management configuration is synchronized.

 

You can check your NTP settings from the GUI under Configuration > Date & Time. If your time is too far out of sync (more than ~1000 seconds) then NTP will not sync automatically

Or from SSH Console using command

timedatectl

clipboard_e57bb9538175609c391ba750901f5b862.png


If not in sync then Force a sync (MWG 7.3 and above):

service ntpd stop
service ntpdate start
service ntpd start

 

Examples:
Time out of sync, Version mismatch, Network problems.


Here is an example error: Time out of sync

[2023-08-11 16:52:55.938 +05:30] [Coordinator] [MessageWillRespondWithError] Will respond to message <co_distribute_init_subscription> with error: (405 - 'time difference too high') - "time difference too high: 2023-08-11T16:52:00.732+05:30 / 2023-08-11T16:52:55.937+05:30".


Things to check for in cluster settings: distribution Timeout, distribution Timeout Multiplier, allowed Time Difference, check Version match value on all cluster members.

In our Example Since the other parameters were correct, we had to make change for time difference to 300 seconds so that configuration changes can be accepted. Post making the changes the nodes were synchronized and getting the updates, no network fluctuations observed for the reported node.

clipboard_ec4d4b0422ce1f7d840e07b40aedb935b.png

 

Considerations 

  • Once a cluster is established, you can only access one GUI at a time
  • All changes made by that GUI are synced to all other cluster members
  • The system time is important in the CM cluster.
  • if the times get out of sync (more than 2 minutes), CM communication will fail, and errors will be reported
  • All appliances should be running the same version & build. It is never recommended to mix versions in a Central Management cluster

 

 

 

 

 

 

Attachments:

  • Was this article helpful?