Title: Network Down on Node in Central Management
Category / Product: Web Gateway
Environment: Central Management
Central Management allows you to administer multiple Web Gateway appliances in your network as nodes in a common configuration. A configuration of multiple appliances administered through Central Management is also referred to as a cluster.
When administering a Central Management cluster, you are dealing mainly with:
• Nodes — Appliances run as nodes that are connected to each other, sending and receiving data to perform updates, backups, downloads, and other jobs.
• Node groups — Nodes are assigned to different types of node groups that allow different ways of transferring data.
• Scheduled jobs — Data can be transferred according to time schedules that you configure.
In the above diagram, the SWG cluster has 2 network groups: Network Alpha and Network Bravo.
Network Alpha has 3 nodes and 1 cluster management node:
YBALC1A956CM01
YBALC7A956UP01
YBALC7A956UP02
YBALC7A956UP03
Network Bravo has 3 nodes and 1 cluster management node:
YBBRC1A956CM01
YBBRC7A956UP01
YBBRC7A956UP02
YBBRC7A956UP03
Issue: Network drops for the node YBALC7A956UP01 (Alpha group), impacting production traffic and creating instability in the network.
In the figure below, we can observe that traffic drops to zero every 15 minutes and then returns to normal.
Causes:
1. Cluster nodes that are not in sync will create this problem. An in-sync cluster shows the same Current Storage Timestamp for all nodes.
In the figure below, we can see that 3 nodes are not in sync with the rest of the nodes; there is variation in the timestamps.
2. Secure Web Gateways experience an intermittent network issue while one node attempts to transmit shared data to another node (for example, the TCP handshake failed or the node was not reachable).
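To rule out this second cause, you can test basic reachability and the Central Management port between nodes from an SSH session. A minimal sketch, using node names from the example above and assuming the default Central Management port 12346 (confirm the port actually configured in your cluster settings):

ping -c 4 YBALC7A956UP02
nc -vz YBALC7A956UP02 12346

If nc is not available on the appliance, any equivalent TCP port check can be used instead.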
Troubleshooting and checks to be carried out:
Verify the synchronization of nodes
The user interface displays, among other general information, a timestamp for each node in a Central Management Configuration, which allows you to verify whether all nodes are synchronized.
Task
1 Select Configuration | Appliances.
2 On the appliances tree, select Appliances (Cluster). Status and general information about the configuration and its nodes appear on the settings pane. Under Appliances Information, a list is shown that contains a line with information for every node. The timestamp is the last item in each line.
3 Compare the timestamps for all nodes. If they are the same for all nodes, the Central Management configuration is synchronized.
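If the user interface is not reachable, a quick alternative is to compare the system clocks over SSH. A minimal sketch, assuming root SSH access is enabled on the nodes and using the Alpha group node names from the example above:

for node in YBALC7A956UP01 YBALC7A956UP02 YBALC7A956UP03; do
  ssh root@$node 'hostname; date -u'
done

All nodes should report the same UTC time to within a few seconds.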
You can check your NTP settings from the GUI under Configuration | Date & Time. If your time is too far out of sync (more than ~1000 seconds), NTP will not sync automatically.
Or check from the SSH console using the command:
timedatectl
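Example output (illustrative values; exact field names vary with the systemd version on the appliance). The fields to check are System clock synchronized and NTP service:

               Local time: Fri 2023-08-11 16:52:55 IST
           Universal time: Fri 2023-08-11 11:22:55 UTC
                Time zone: Asia/Kolkata (IST, +0530)
System clock synchronized: yes
              NTP service: active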
If not in sync, force a sync (MWG 7.3 and above):
service ntpd stop
service ntpdate start
service ntpd start
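After restarting ntpd, you can confirm that the appliance is tracking an NTP peer with ntpq. The output below is illustrative, and ntp1.example.com is a placeholder for whatever NTP source is configured on the appliance; the asterisk in the first column marks the peer currently selected for synchronization:

ntpq -p

     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*ntp1.example.com 10.17.1.5       2 u   35   64  377    1.234    0.567   0.089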
Examples:
Time out of sync, Version mismatch, Network problems.
Here is an example error: Time out of sync
[2023-08-11 16:52:55.938 +05:30] [Coordinator] [MessageWillRespondWithError] Will respond to message <co_distribute_init_subscription> with error: (405 - 'time difference too high') - "time difference too high: 2023-08-11T16:52:00.732+05:30 / 2023-08-11T16:52:55.937+05:30".
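To gauge how often this error occurs, you can search the coordinator error log over SSH. The path below is an assumption based on the usual /opt/mwg/log layout and may differ between versions:

grep -c 'time difference too high' /opt/mwg/log/mwg-errors/mwg-coordinator.errors.log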
Things to check in the cluster settings: Distribution Timeout, Distribution Timeout Multiplier, Allowed Time Difference, and the Check Version Match value on all cluster members.
In our example, since the other parameters were correct, we changed the Allowed Time Difference to 300 seconds so that configuration changes could be accepted. After making the change, the nodes were synchronized and receiving updates, and no network fluctuations were observed for the reported node.
Considerations
- Once a cluster is established, you can only access one GUI at a time.
- All changes made by that GUI are synced to all other cluster members.
- The system time is important in the CM cluster.
- If the times get out of sync (by more than 2 minutes), CM communication will fail and errors will be reported.
- All appliances should be running the same version and build. It is never recommended to mix versions in a Central Management cluster.