Connection Recovery

Overview

The Connection Recovery feature in flexiWAN helps to ensure that devices maintain connectivity to flexiManage, even when there are network issues or misconfigurations. It automatically attempts to restore connectivity when certain failures occur, minimizing downtime and reducing the need for manual intervention.

When Connection Recovery is Triggered

Connection Recovery is automatically activated when the following conditions are met:

  • WAN Failure: All monitored WAN interfaces are down while the carrier is up, meaning that pings to the monitored IP addresses fail. This often results from an incorrect network configuration on the WAN interface, or from an incorrect configuration applied when the vRouter is stopped.

  • Reconnection Attempts: At least two consecutive reconnection attempts have failed.

  • vRouter Status: The vRouter is not in the middle of a starting or stopping process.
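
The conditions above can be read as a single check. The following is a minimal sketch of that reading, not the flexiWAN Agent's actual code: the names `WanInterface`, `wan_failure`, and `should_trigger_recovery` are hypothetical, and it assumes that all three conditions must hold at the same time.

```python
from dataclasses import dataclass
from typing import Iterable

# From the text: "at least two consecutive" failed reconnection attempts.
MIN_FAILED_RECONNECTS = 2


@dataclass
class WanInterface:
    """Hypothetical view of one monitored WAN interface."""
    name: str
    carrier_up: bool  # physical link is detected
    ping_ok: bool     # the monitored IP address answered


def wan_failure(interfaces: Iterable[WanInterface]) -> bool:
    """True when every monitored interface has carrier but fails its ping check."""
    interfaces = list(interfaces)
    if not interfaces:
        return False  # nothing is monitored, so nothing can be judged as failed
    return all(i.carrier_up and not i.ping_ok for i in interfaces)


def should_trigger_recovery(
    interfaces: Iterable[WanInterface],
    failed_reconnects: int,
    router_in_transition: bool,
) -> bool:
    """Combine the three conditions (assumed here to be a logical AND)."""
    return (
        wan_failure(interfaces)
        and failed_reconnects >= MIN_FAILED_RECONNECTS
        and not router_in_transition  # vRouter is not starting or stopping
    )
```

For example, two interfaces with carrier but failing pings, two failed reconnection attempts, and a vRouter that is not starting or stopping would satisfy the check.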

How Connection Recovery Operates

  • Saving Configuration: Upon every successful connection, the system saves the WAN configuration to a recovery file. This configuration is called the “Recovery Configuration”.

  • Detection of Misconfiguration: If a new configuration is made and the connection is lost within a defined window, the recovery process assumes a misconfiguration has occurred.

  • Stopping the vRouter: In case of misconfiguration, the vRouter is stopped and the device reverts to the default configuration.

  • Recovery Configuration Fallback: If the default configuration fails, the system tries to utilize the recovery configuration saved during the last successful connection.

  • Monitoring Thread Failures: The system continuously monitors for stuck threads and will restart the flexiWAN Agent if such an event is detected.

  • Manual Intervention: If all recovery actions fail, manual intervention is required to resolve the issue.

  • Resetting the Recovery Process: A successful reconnection resets the recovery process and returns the system to normal operation.
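
The misconfiguration check described above compares the time of the last configuration change with the time the connection was lost. The sketch below illustrates that comparison only. The function name and its parameters are hypothetical, the default of 240 mirrors the `router_cfg_window` setting, and treating that value as seconds is an assumption.

```python
from typing import Optional


def is_suspected_misconfiguration(
    config_applied_at: Optional[float],
    connection_lost_at: float,
    window: float = 240,  # mirrors router_cfg_window; the unit (seconds) is assumed
) -> bool:
    """Return True if the connection dropped within `window` of the last config change."""
    if window <= 0 or config_applied_at is None:
        # A window of 0 disables the check; no recorded change means nothing to blame.
        return False
    elapsed = connection_lost_at - config_applied_at
    return 0 <= elapsed <= window
```

With this reading, a connection lost 100 time units after a configuration change falls inside the window, while one lost 300 units later does not.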

Sample recovery cases are given below:

Handling Incorrect WAN IP Configurations

This case applies when the device loses its connection within a defined window after the last configuration was applied. This can happen because of an incorrect IP configuration on its WAN interface (e.g., incorrect IP address, gateway, or subnet), an incorrect static route, etc. Connection Recovery will attempt to restore connectivity by:

  • Stopping the vRouter: The system will stop the vRouter to return to the default configuration.

  • Applying the Default Configuration: The system tries to apply the default configuration.

  • Switching to a Recovery Configuration: If reapplying the default configuration fails, Connection Recovery will attempt to use a recovery configuration saved during a previous connection.

  • Pausing Recovery: If the issue persists, further recovery attempts are paused, requiring manual intervention to resolve the problem.
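
The steps above form a simple fallback chain. The following sketch shows only the ordering. `run_recovery` and its three callables are hypothetical stand-ins, and each `apply_*` callable is assumed to return True when connectivity is restored after that configuration is applied.

```python
from typing import Callable


def run_recovery(
    stop_router: Callable[[], None],
    apply_default: Callable[[], bool],
    apply_recovery: Callable[[], bool],
) -> str:
    """Try the default configuration, then the recovery configuration.

    Returns "default" or "recovery" for the step that restored connectivity,
    or "paused" when both failed and manual intervention is needed.
    """
    stop_router()              # stopping the vRouter returns to the default configuration
    if apply_default():
        return "default"
    if apply_recovery():       # configuration saved during a previous connection
        return "recovery"
    return "paused"            # further attempts are paused
```

The recovery configuration is tried only after the default configuration has failed, which matches the order of the list above.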

Handling Issues in the Default Configuration

The default configuration is used when the device is first set up, before the vRouter is started. It is also applied every time the vRouter is stopped. When a misconfiguration is detected in the default configuration, Connection Recovery will take the following steps:

  • Switching to a Recovery Configuration: If applying the default configuration fails, Connection Recovery will attempt to use a recovery configuration.

  • Pausing Recovery: If the issue persists, further recovery attempts are paused, requiring manual intervention to resolve the problem.

What Happens After Recovery?

  • Successful Recovery: When the recovery process successfully resolves the issue, the device returns to normal operation, and the recovery state is reset to idle.

  • Failed Recovery: If recovery fails after all attempts (e.g., both the default and recovery configurations are unsuccessful), the system enters a “FAILED” state. In this state, further reconnection attempts are paused until the problem is manually resolved.
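
The two outcomes above can be pictured as a small state tracker. This is an illustrative sketch only: the text names the idle and “FAILED” states, while the `RECOVERING` state, the `RecoveryTracker` class, and its method names are hypothetical.

```python
from enum import Enum


class RecoveryState(Enum):
    IDLE = "idle"
    RECOVERING = "recovering"  # hypothetical intermediate state, not named in the text
    FAILED = "failed"


class RecoveryTracker:
    """Tracks the recovery state and gates reconnection attempts."""

    def __init__(self) -> None:
        self.state = RecoveryState.IDLE

    def begin(self) -> None:
        # Recovery starts only from idle; a FAILED state stays FAILED.
        if self.state is RecoveryState.IDLE:
            self.state = RecoveryState.RECOVERING

    def finish(self, connected: bool) -> None:
        # Success resets to idle; failure of all attempts enters FAILED.
        self.state = RecoveryState.IDLE if connected else RecoveryState.FAILED

    def reconnect_allowed(self) -> bool:
        # Reconnection attempts are paused while in FAILED.
        return self.state is not RecoveryState.FAILED

    def manual_reset(self) -> None:
        # Called after the problem has been resolved manually.
        self.state = RecoveryState.IDLE
```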

Disabling Connection Recovery

Connection Recovery is enabled by default to ensure automatic restoration of connectivity during outages or misconfigurations. However, it can be disabled if manual control over the connection process is preferred.

To disable Connection Recovery, update the watchdog section of the /etc/flexiwan/agent/fwagent_conf.yaml configuration file as follows:

watchdog:
    deadlock:
      enabled: true  # Set to false to disable thread recovery
    connection:
      enabled: true  # Set to false to disable the recovery process
      router_cfg_window: 240  # Set to 0 to disable the configuration check, so the router is not stopped by mistake

Afterwards, restart the device or restart the service with the command systemctl restart flexiwan-router. Once Connection Recovery is disabled, the system no longer attempts to recover connectivity automatically, so any connectivity issue requires manual intervention.
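
If the change needs to be scripted, the update can be expressed as a small transformation of the parsed configuration. The sketch below is an assumption-laden example: `disable_connection_recovery` is a hypothetical helper, and it expects the file to have already been parsed into a dictionary by a YAML library of your choice (loading and saving are not shown). Note that rewriting the file with a library such as PyYAML drops the comments it contains.

```python
import copy


def disable_connection_recovery(config: dict) -> dict:
    """Return a copy of the parsed config with watchdog.connection.enabled set to False."""
    updated = copy.deepcopy(config)  # leave the caller's dictionary untouched
    connection = updated.setdefault("watchdog", {}).setdefault("connection", {})
    connection["enabled"] = False    # other keys, such as router_cfg_window, are preserved
    return updated
```

The service still has to be restarted afterwards for the change to take effect, as described above.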

Limitations

Connection Recovery does not work for the following connection types:

  • Connection over VLAN

  • Connection over Wireless (LTE/5G/WiFi)

  • Connection over PPPoE