[Space Platform] Junos Space VMWare Best Practices

Article ID: KB35153 KB Last Updated: 30 Oct 2019 Version: 2.0
Summary:

This article describes the best practices for using VMWare snapshots and vMotion with a Junos Space Fabric deployment.

Solution:

VMWare Snapshots

Single Node Space Deployment

  • Snapshot and restore of a standalone Space node system can be done at any time as there are no other node communications to consider.

  • Ensure that "Snapshot the virtual machine's memory" is NOT selected. If memory is included, the system restores in a powered-on state. After the restore, the system finds that the clock has jumped by a large amount of time, and all device connections break.
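
For teams that script snapshots rather than use the vSphere client, the memory option above corresponds to a flag on the snapshot request. The following is a minimal sketch, assuming a pyVmomi-style VM object that exposes `CreateSnapshot_Task`; the helper name `snapshot_without_memory` is hypothetical, and `quiesce=False` is an assumption that this article does not address.

```python
def snapshot_without_memory(vm, name: str, description: str = ""):
    """Request a snapshot that excludes the VM's memory state.

    `vm` is assumed to be a pyVmomi-style VirtualMachine object.
    """
    # memory=False mirrors leaving "Snapshot the virtual machine's memory"
    # unselected. quiesce=False is an assumption, not taken from the article.
    return vm.CreateSnapshot_Task(
        name=name,
        description=description,
        memory=False,
        quiesce=False,
    )
```

The call returns whatever task object the VM API provides; waiting for that task to finish is left to the caller.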

     

Multiple Node Space Deployment

  • Snapshots of Space nodes MUST be created and restored in such a way that all nodes revert with data that is consistent across the Fabric.

  • There is no mechanism to reconcile data differences that the system believes to have replicated already.

  • Executing a snapshot across multiple VMs at exactly the same moment is nearly impossible without powering the Space nodes off first so that there is no chance of data changing. Even a 1-second difference in snapshot times could cause data problems that may not be discovered for weeks or months.

Creating a Snapshot
  1. Power down all Space Fabric nodes.
  2. Snapshot all nodes.
     The VMWare team should confirm that all VMs are seen as off before running the snapshot.
  3. Power on the Space nodes.
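
The ordering in this procedure matters more than the tooling, so it can help to encode it. The following is a minimal sketch, assuming the caller supplies four callbacks (`power_off`, `is_powered_off`, `take_snapshot`, `power_on`) that wrap whatever VMWare tooling is in use; all of these names are hypothetical and not part of any Junos Space or VMWare API.

```python
from typing import Callable, Iterable, List


def snapshot_fabric(
    nodes: Iterable[str],
    power_off: Callable[[str], None],
    is_powered_off: Callable[[str], bool],
    take_snapshot: Callable[[str], None],
    power_on: Callable[[str], None],
) -> List[str]:
    """Snapshot every node only after all of them are confirmed off."""
    nodes = list(nodes)

    # Step 1: power down every Fabric node before touching any snapshot.
    for node in nodes:
        power_off(node)

    # Confirmation: refuse to continue if any VM is still reported as on.
    still_on = [node for node in nodes if not is_powered_off(node)]
    if still_on:
        raise RuntimeError(f"nodes not powered off: {still_on}")

    # Step 2: snapshot all nodes while the whole Fabric is down.
    for node in nodes:
        take_snapshot(node)

    # Step 3: bring the Fabric back up.
    for node in nodes:
        power_on(node)
    return nodes
```

If any node is still reported as running, the sketch raises before a single snapshot is taken, so a partial snapshot set is never created.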

Restoring a Snapshot
  1. Restore snapshots for all nodes in VMWare (nodes will move to a powered down state to match the snapshot).
  2. After the snapshots of ALL nodes have been reverted successfully, power on all nodes.

(Any nodes that do not successfully revert to a snapshot taken at the same powered-down timestamp should be discarded and re-created.)
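
The restore sequence can be sketched the same way. This is an illustration only, assuming a hypothetical `revert_snapshot` callback that returns `True` on success and a hypothetical `power_on` callback; neither is a real VMWare or Junos Space call.

```python
from typing import Callable, Dict, Iterable, List


def restore_fabric(
    nodes: Iterable[str],
    revert_snapshot: Callable[[str], bool],
    power_on: Callable[[str], None],
) -> Dict[str, List[str]]:
    """Revert every node, then power on only the nodes that reverted cleanly."""
    reverted: List[str] = []
    failed: List[str] = []

    # Step 1: attempt the revert on all nodes; nothing is powered on yet.
    for node in nodes:
        if revert_snapshot(node):
            reverted.append(node)
        else:
            failed.append(node)

    # Step 2: power on only after every revert attempt has finished.
    for node in reverted:
        power_on(node)

    # Nodes listed as failed are candidates to be discarded and re-created.
    return {"powered_on": reverted, "discard_and_recreate": failed}
```

Keeping the power-on loop separate from the revert loop means no node starts replicating while another node is still on pre-revert data.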

Disaster Recovery (DR) Space Deployment

  • Similar to a multiple node deployment: each DR site should be powered off and its snapshots taken at the same moment in time.

  • DR replication is not built to repair data that the system believes to already be synced between DR sites.

  • In the case where you are preparing for an upgrade, DR should be deactivated prior to powering off.

  • In the event of a rollback from an upgrade, you can restore one of the sites and operate a site independently, leaving one site for investigation of the upgrade failure.

jmp-dr stop / start is required to reset the replication between Fabric nodes.

Creating a Snapshot
  1. Stop DR: jmp-dr stop
  2. (Optional) Completely disable DR to make the two sites independent.
     Make sure that you know the steps used to configure DR, and then reset the DR configuration: jmp-dr reset
  3. Ensure that DR is off (or disabled) on both sites: jmp-dr health
  4. Power off all nodes on the site before running the snapshot.
     The VMWare team should confirm that all the VMs are seen as off before running the snapshot.
  5. If you are running a snapshot of all nodes, power off both sites.
  6. Snapshot all nodes in the powered-off state.
  7. Power on all nodes.

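
Because the DR procedure has an optional branch, it can be useful to generate the checklist for a given maintenance window. The following is a minimal sketch; the function name `dr_snapshot_steps` and its parameters are hypothetical, and the only commands it mentions are the `jmp-dr` subcommands named in this article.

```python
from typing import List


def dr_snapshot_steps(disable_dr: bool = False, both_sites: bool = True) -> List[str]:
    """Return the DR snapshot checklist in the order this article gives it."""
    steps = ["Stop DR: jmp-dr stop"]
    if disable_dr:
        # Optional branch: make the two sites independent before the snapshot.
        steps.append("Reset the DR configuration: jmp-dr reset")
    steps.append("Confirm DR is off or disabled on both sites: jmp-dr health")
    scope = "both sites" if both_sites else "the site being snapshotted"
    steps.append(f"Power off all nodes on {scope}")
    steps.append("Confirm in VMWare that every VM is seen as off")
    steps.append("Snapshot all nodes in the powered-off state")
    steps.append("Power on all nodes")
    return steps


# Example: print the checklist for a window that also disables DR.
for number, step in enumerate(dr_snapshot_steps(disable_dr=True), start=1):
    print(f"{number}. {step}")
```

The sketch only prints instructions; it does not run `jmp-dr` or contact VMWare.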
Restoring a Snapshot
  • The VMWare snapshot restore needs to be completed across all nodes at a DR site prior to powering on the nodes.

 
Note: Other Space nodes such as Log Collector and Policy Enforcer should be powered off and a snapshot taken at the same moment in time for best data consistency between Space and the special function node.
 

Junos Space and vMotion (DRS)

Standalone

  • A Junos Space node does not have any problems with vMotion because data is not replicating across a Space Fabric.

Fabric

  • A Junos Space Fabric deployment can see performance problems during a vMotion operation. In the worst case, Space may enter an inconsistent state after a vMotion event, requiring all nodes to be rebooted to completely recover the environment.

  • Disabling DRS is preferred. Instead, distribute the Junos Space nodes across multiple VMWare hosts, allowing for the Junos Space node to go down with the VM host in the event of a VMWare node failure. Ensure that Space node distribution between VMWare hosts is appropriate based on the Space Fabric node role.
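
One way to review node distribution is to look for VMWare hosts that carry more than one Space node of the same role. The following is a minimal sketch under stated assumptions: the placement data is entered by hand, the helper name `shared_host_conflicts` is hypothetical, and the role labels in the example are illustrative only.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def shared_host_conflicts(
    placement: Dict[str, Tuple[str, str]],
) -> Dict[Tuple[str, str], List[str]]:
    """Find (host, role) pairs that carry more than one Space node.

    placement maps node name -> (vmware_host, fabric_role).
    """
    grouped: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for node, (host, role) in placement.items():
        grouped[(host, role)].append(node)
    # Keep only the pairs where one host failure would take down
    # two or more nodes that share a role.
    return {key: sorted(names) for key, names in grouped.items() if len(names) > 1}


# Example with illustrative host names and role labels.
example = {
    "space-1": ("esx-a", "database"),
    "space-2": ("esx-a", "database"),
    "space-3": ("esx-b", "application"),
}
print(shared_host_conflicts(example))
```

An empty result means no single host carries two nodes of the same role; whether that distribution is otherwise appropriate still depends on the Fabric design.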

Modification History:
2019-10-30: Updated 2nd bullet under Single Node Space Deployment.