Manual Chapter:
Disaster Recovery Between ARX Clusters
Applies To:
ARX
- 6.3.0
Use the activate configs command to enable all of the volumes and services in a pre-loaded global-config file. If this site is a backup for another site, this is the final step in performing a disaster-recovery operation.
file-name (1-1024 characters) is the name of the global-config file. You can activate a replicated global-config from a remote cluster or a locally-defined global-config file. Use show replicated-configs for a list of replicated-config files from other clusters, or use show configs for a list of locally-defined configuration files.
partial (optional) activates a subset of the full configuration. This can be any of the following keywords, or any combination of them:
- global-server fqdn (1-128 characters) activates a particular global-server configuration, along with all of its component objects (such as its namespace and all the shares behind it). This is the fully-qualified domain name (for example, myserver.myorg.org) for one global server in the global-config. Use show replicated-configs file-name to view the global config and see all of its global servers.
- shares activates all shares.
- volumes activates all volumes.
- global-servers activates all global servers.
- policies activates all rules and share farms.
If you specify any of the above, the command activates only the specified option(s). If you omit all of them, the command enables all global servers, volumes, and shares but does not activate any policy rules.
take-ownership (optional, but recommended) causes the command to use the take-ownership option for every managed-volume share that it enables (see the documentation for enable (gbl-ns-vol-shr)). The filer-replication process typically copies hidden files from the back-end shares at the source site, including files that mark the back-end share as already imported by the source site's ARX. The managed volumes will not import these shares unless the take-ownership flag is raised.
tentative (optional) creates a report to describe how the command runs, but does not actually enable any configuration objects. The CLI shows the name of the report after you issue the command. You can use show reports report-name to review the report and confirm that the command implements the configuration as desired.
activate replicated-configs file-name ... is an alternative syntax, with the same options as described above for activate configs.
Then you begin the process of building a global configuration to be regularly copied from the active site to the backup site. The global configuration includes all storage-service parameters; see show global-config. The peers at the active site run the global configuration during normal service, and the peers at the backup site run it after a disaster.

The ARX peers at both sites must share the same master key in order to share the same global configuration. The master key encrypts and decrypts all of the passwords and other Critical Security Parameters (CSPs) in the global-config file. Use the show master-key command to show an encrypted copy of the master key on the current ARX. The master keys should be synchronized at installation time, as described in the Hardware Installation manuals. If the backup peers have the wrong master key, reset the peers to factory defaults and reset the master key. They should only have their network parameters configured (show running-config), along with the correct master key. The documentation for show master-key provides details for this process.

At the ARX CLI, use the ron tunnel command to connect both ARX peers at the active site to both peers at the backup site. That is, each ARX must have two tunnels, one to each peer at the other site.

We recommend removing all namespaces, global servers, and front-end services from the backup cluster before you begin. This prevents any significant configuration conflicts between the active site and the backup site, which could prevent a successful disaster recovery. Ideally, the backup cluster should only have network parameters configured (which you can see with show running-config), along with administrative accounts (show group users) and any AD forests that have been discovered in the network (show active-directory). If there are any storage services on the backup cluster, we recommend removing them. From the backup cluster's CLI, use remove service namespace for each namespace in the backup cluster's configuration. The remove service command removes the namespace configuration along with all associated external-filer definitions, and it also removes ARX metadata from the filers behind the namespace. Then use no global server, no cifs, and/or no nfs to remove all of the global servers and front-end services at the backup cluster.
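For example, a minimal cleanup at the backup cluster might begin with a sketch like the one below. The namespace names (medarcv and wwmed) are hypothetical placeholders for whatever namespaces exist at the backup site; you would follow these commands with no global server, no cifs, and/or no nfs for any remaining global servers and front-end services, using the names from your own configuration.

newptA# show global-config
newptA# remove service medarcv
newptA# remove service wwmed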
Once all ARX peers have the same master key and are connected by RON tunnels, use the cluster-name command to establish a cluster name for each site. The cluster-name command also assigns each peer to a cluster. You then expand the active cluster's global configuration so that it can also include the backup cluster's file servers and Virtual-IP addresses (using the filer, metadata share, sam-reference, and virtual server commands, as described under cluster-name).
This prepares the active cluster's global config so that it is valid at either the active cluster or the backup cluster. Then create a config-replication rule to regularly copy that global config from the active cluster to the backup cluster. If you use any snapshot rules to back up client data, and if the filer-replication application copies the snapshots to the backup site, we advise that you also prepare for snapshot reconstitution. You prepare for snapshot reconstitution by regularly copying snapshot reports off of the active cluster, along with a Perl script in the configs directory. The guidelines for the snapshot rule command describe this preparation process in detail; see Guidelines: Preparing for Snapshot Reconstitution.
After the global-config is successfully copied to the backup cluster, we recommend an occasional test of the load configs command at that cluster. This loads all of the active cluster's storage services into the database without enabling any shares or volumes. It also loads the Active Directory configuration, which may not be correct for the backup cluster; after the load operation, run active-directory update seed-domain there to confirm that the AD configuration is correct.

If you are running CIFS services with constrained delegation, you need some additional configuration at the Domain Controller (DC) to support your CIFS clients. The back-end CIFS servers at the backup cluster must be on the delegate-to list for each front-end CIFS service; for each cifs service, run the probe delegate-to command to find any back-end servers that need to be added to the list, and add them at the DC. When you finish, each CIFS service's delegate-to list includes the back-end servers behind both clusters. This configuration occurs at the DC, and it persists after this load-configs test.

Check the remaining configuration to confirm that it is correct. Then use remove service to remove each of the namespaces that you added with the load configs command. Finally, use no global server, no cifs, and/or no nfs to remove all of the global servers and front-end services.
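As a sketch of this periodic test: the file name provSvcs.rcfg matches the replicated-config examples later in this chapter, the namespace name medarcv is a hypothetical placeholder for a loaded namespace, and the active-directory update seed-domain invocation is shown as named above and may take additional arguments in your environment.

newptA# load configs provSvcs.rcfg
newptA# active-directory update seed-domain
newptA# show active-directory
newptA# remove service medarcv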
To recover from a disaster, go to the backup cluster and run load configs on the global-config file that was copied earlier. This loads the global-config into the database without enabling any shares, volumes, global servers, or policy rules. You can use this command, activate configs, to enable any or all of those objects. Unless you use the tentative option, the CLI prompts for confirmation before activating the global-config file. Enter yes to proceed.
Alternatively, you can bring up the shares, volumes, and global servers all at once by omitting any specific keywords. After confirming that the services are running properly, you can use activate configs ... policies to activate all rules. If you also prepared for snapshot reconstitution, play back the snapshot reconstitution script as described in Guidelines: Reconstituting ARX Snapshots. If the cluster at the failed site is still on the network after the disaster, log into that cluster and use no enable (gbl-cfg-repl) to stop copying the global configuration from the now-backup site.
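Put together, a recovery at the backup cluster might resemble the following sketch. The file name provSvcs.rcfg matches the examples in this chapter; the keyword choices (take-ownership first, then policies once the services are confirmed healthy) follow the recommendations above rather than any required order. The final command runs at the failed site, if it is still reachable.

newptA# load configs provSvcs.rcfg
newptA# activate configs provSvcs.rcfg take-ownership
newptA# activate configs provSvcs.rcfg policies
provA(gbl-cfg-repl[prov2newport])# no enable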
This command always creates a report to show its actions. The CLI shows the name of the report after you issue this command. You can use tail report-name follow to follow the report as the command runs, or show reports report-name to view the report after the command has finished. The configuration file is an ordered list of CLI commands that recreate the global configuration, including all volumes and storage services. The report shows these CLI commands as it runs them. If it reaches a configuration object that already exists in the local database, it shows an INFO message to explain that the object is being ignored. An X appears at the beginning of each ignored line. This command also runs the enable command for each activated configuration object (such as a share or volume); the load configs command does not enable any configuration objects.
newptA# activate configs provSvcs.rcfg tentative
newptA# activate configs provSvcs.rcfg shares take-ownership
Figure 35.1 Sample Report: Activate Configs Report
newptA# show reports act_provSvcs.rcfg_20100709113618.rpt
A cluster is a single ARX site, comprised of a redundant pair of ARX systems or a standalone ARX. Use this command to declare a cluster name for any ARX (or redundant-ARX pair) in the current Resilient Overlay Network (RON). This name is required for configuring disaster recovery between two ARX sites. Use no cluster-name to remove a cluster name from the ARX configuration.
cluster-name (1-64 characters) is the name you choose for this cluster.
This command establishes an ARX site as a cluster, capable of failing its global configuration over to another cluster. You can create two clusters on your RON and set them up in a disaster-recovery configuration, so that one can take over service for the other in the event of a catastrophic failure. The ron tunnel command creates a RON tunnel between any two ARX systems on the same WAN; use that command to create a full mesh of RON tunnels between all four peers. Additionally, ensure that all four peers have the same master key, so that they can share their global configurations (see show master-key for details).

To set up disaster recovery between two clusters, you designate one cluster as active and the other as the backup site. Use filer-replication facilities (for example, NetApp's SnapMirror or EMC's SRDF) to copy all files and directories from the active site's back-end filers to their counterparts at the backup site. Once the files and directories are synchronized, add all of the backup cluster's external-filers to the active cluster's configuration. Use the filer, metadata share, and sam-reference commands to establish how those filers should be used in the backup cluster. Then use the virtual server command to establish a new Virtual-IP address and configuration for each global server in the backup cluster. Finally, create a config-replication rule to regularly copy this configuration from the active cluster to the backup cluster. This prepares the backup site to take over service from the active site.

If the active site fails, you can then resume your storage services at the backup cluster. To accomplish this, go to the backup cluster and use load configs to load the active site's configuration into the backup cluster's database, then use activate configs to enable all volumes and services at the backup cluster.
A cluster configuration allows only a single metadata share per managed volume. If any managed volume has more than one metadata share, this command fails. A single metadata share per volume is required to ensure that the disaster-recovery software can reliably re-create the volume configuration on a backup cluster.
To remove a remote cluster along with all its associated filers, metadata shares, and virtual servers, use the remove cluster-config command. You can use the same command on the local cluster to remove the cluster object and all references to it, without actually removing the above objects. The no cluster-name command removes only the cluster definition; the CLI rejects this operation if the cluster name is referenced by any of the above objects.
provA(gbl)# cluster-name providence member provA provB
provA(gbl)# cluster-name newport member newptA newptB
Use the config-replication command to create a rule for replicating the global configuration to an ARX cluster at another site. A cluster is a redundant pair of ARX systems (or a standalone ARX). This prepares for a disaster-recovery scenario. If your back-end filers mirror their data between the sites, this rule mirrors the configuration data for managing those filers at each site. Use the no form of the command to delete the rule.
name (1-32 characters) is the name you choose for the rule.
cluster-name (1-64 characters) is the name of the source cluster. This cluster must be defined in the database; you can use the show cluster command for a complete list of pre-defined clusters.
The master key for an ARX encrypts and decrypts the passwords in its global-config file, so it must be the same for all switches that share the global-config. The backup switches should have the same master key as their active counterparts; the master keys should be set accordingly when the backup switches are installed. Refer to the documentation for the show master-key command for information on showing the key and possibly changing it on the backup switches.

Every switch in the active cluster must have a ron tunnel to every switch in the backup cluster. This provides multiple pathways for copying the global configuration between clusters.

The backup cluster should not have any storage services configured, so that it accepts the latest storage-services configuration from the active cluster without any conflicts. If the backup cluster has any namespaces, use remove service to cleanly remove each of them. Also use no global server, no cifs, and/or no nfs to remove all of the global servers and front-end services at the backup site.
The config-replication command creates a rule to regularly copy the active cluster's global configuration and send it to the backup cluster. The command puts you into gbl-cfg-repl mode, which has various commands for configuring the config-replication rule. Use the target-cluster command to choose the ARX cluster to receive this configuration. Then use the user (gbl-cfg-repl) command to enter your administrative credentials at the remote cluster; this gives you permission to write the configuration file there. The target-file command provides a name for the file copy. Use the schedule (gbl-cfg-repl) command to choose a regular schedule for duplicating the configuration file. We recommend using the report (gbl-cfg-repl) command to create a report about each config-replication event. You can also use the description (gbl-cfg-repl) command to describe this rule in the output of show policy. Finally, use the enable (gbl-cfg-repl) command to start the rule.
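Drawing together the individual examples from the sections that follow, a complete rule definition might look like this sketch; the rule name, cluster names, target file, schedule, report prefix, and credentials are the sample values used elsewhere in this chapter.

provA(gbl)# config-replication prov2newport cluster providence
provA(gbl-cfg-repl[prov2newport])# target-cluster newport
provA(gbl-cfg-repl[prov2newport])# user admin
Password: s3cr3tpasswd
Validate Password: s3cr3tpasswd
provA(gbl-cfg-repl[prov2newport])# target-file provSvcs.rcfg
provA(gbl-cfg-repl[prov2newport])# schedule oncePerDay
provA(gbl-cfg-repl[prov2newport])# report gblRepl2newport
provA(gbl-cfg-repl[prov2newport])# description "send service config to Newport site"
provA(gbl-cfg-repl[prov2newport])# enable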
To create a schedule to be applied to the config-replication rule, use the gbl schedule command at the source cluster.
As mentioned above, we recommend that you use the config-replication report feature to keep a detailed log of all config-replication events. The reports appear on the source cluster: the show reports command shows all reports on the switch, including config-replication reports. You can use the standard file-management commands with these reports: delete, rename, show reports file-name, tail, and/or grep.
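For example, using the sample report name from Figure 35.3 (gblRepl2newport.rpt; your report names begin with the prefix set by the report command):

provA# show reports
provA# show reports gblRepl2newport.rpt
provA# tail gblRepl2newport.rpt follow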
provA(gbl)# config-replication prov2newport cluster providence
Use the optional description command to set a descriptive string for the current config-replication rule. This appears in show commands. Use the no form of the command to delete the description.
description text
text (1-48 characters) is your description. Surround the text with quotation marks ("") if it contains any spaces.
provA(gbl-cfg-repl[prov2newport])# description "send service config to Newport site"
Use the enable command to enable the current config-replication rule. Use no enable to disable the rule.
You must enable the config-replication rule for the policy engine to use it. Once enabled, the config-replication rule follows its assigned schedule (set with the schedule (gbl-cfg-repl) command) and copies its global-config file to the target ARX cluster (target-cluster). The no form of the command is useful in a formerly-active cluster that is still on the network. Use the no enable command to stop the cluster at the failed site from copying its configuration to the newly-active site.
bstnA(gbl-cfg-repl[dr-test])# no enable
The global-config parameters are shared among both ARXes in a redundant pair. (The traditional running-config applies only to the local switch.) You can copy the global-config from one redundant pair to another, so that the second redundant pair (called an ARX cluster) can act as a backup for the first. Use this command, load configs, to load a global-config into the local database without enabling any configuration objects.
file-name (1-1024 characters) is the name of the global-config file. You can load a replicated global-config from a remote cluster or a locally-defined global-config file. Use show replicated-configs for a list of replicated-config files from other clusters, or use show configs for a list of locally-defined configuration files.
fqdn (optional, 1-128 characters) selects a particular global-server configuration, along with all of its component objects (such as a namespace and all the external-filers behind it). This is the fully-qualified domain name (for example, myserver.myorg.org) for one global server in the global-config. Use show replicated-configs file-name to view the global config and see all of its global servers.
tentative (optional) creates a report to describe how the command runs, but does not actually commit any configuration changes to the database. The CLI shows the name of the report after you issue the command. You can use show reports report-name to review the report and confirm that the command implements the configuration as desired.
load replicated-configs file-name ... is an alternative syntax, with the same options as described above for load configs.
After proper preparations have been made for disaster recovery (see the documentation for the config-replication command), you can use this command as a first step toward recovering from a disaster at a remote site. After you run this command to load the configuration, use activate configs to enable it and start client services at the current cluster.
This command always creates a report to show its actions. The CLI shows the name of the report after you issue this command. You can use tail report-name follow to follow the report as the command runs, or show reports report-name to view the report after the command has finished. The configuration file is an ordered list of CLI commands that recreate the global configuration, including all volumes and storage services. The report shows these CLI commands as it runs them. For every configuration object that the load operation ignores, an INFO message appears before the object to explain why it was skipped. An X appears at the beginning of each ignored line.
The load operation skips configuration objects that meet any of the following criteria:
The load operation prompts for confirmation if it finds that the configuration is from a newer release. After you edit the configuration, enter yes to proceed.
We recommend running this command on a cluster without any storage services in its global configuration. If any object in the global-config file has the same name as an object that already exists, the load operation typically does not change the existing object. Therefore, if you load a configuration onto a backup cluster and leave it there, change the configuration on the active cluster, and then copy and load the changed configuration at the backup cluster again, many (possibly all) of the configuration changes are ignored at the backup cluster. If you run a test load before any disaster, we recommend using remove service for each loaded service after you finish. Also use no global server, no cifs, and/or no nfs to remove all of the global servers and front-end services.
newptA# load configs provSvcs.rcfg tentative
newptA# load configs provSvcs.rcfg
Figure 35.2 Sample Report: Load Configs Report
newptA# show reports load_provSvcs.rcfg_20100709113148.rpt
Use the remove cluster-config command to delete a cluster-name from the database along with any references to that cluster name. If you use this with the remote cluster name, this removes all filers, metadata shares, and/or virtual servers associated with that cluster. If you use this with the current cluster name, this only removes the cluster name from those configuration objects, leaving them otherwise intact.
remove cluster-config cluster-name
cluster-name (1-64 characters) identifies the cluster. You can use the show cluster command for a complete list of configured clusters.
The CLI prompts for confirmation before removing or changing any configuration objects; enter yes to proceed.
If any of the above object types refer to a cluster name, you cannot use no cluster-name to remove the cluster name. This command exists as a convenience; it is a single command for removing the cluster name and all of its dependent configuration. To change the name of the local cluster, you can use this command to remove the current name and then use the cluster-name command to add the new cluster name to all of the above objects at once.
bstnA# remove cluster-config portland
Use the report command to create a progress report for each run of the current config-replication rule. Use no report to prevent progress reports.
report file-prefix [verbose]
file-prefix (1-1024 characters) sets a prefix for all config-replication reports from this rule. Each report has a unique name that begins with this prefix.
verbose (optional) enables verbose data in the reports.
Use show reports for a list of reports, or show reports file-name to show the contents of one report.
provA(gbl-cfg-repl[prov2newport])# report gblRepl2newport enables reports for the config-replication rule, prov2newport. For a sample report, see Figure 35.3.
bstnA(gbl-cfg-repl[dr-test])# report drTest verbose
newptA(gbl-cfg-repl[newpt2prov])# no report
Figure 35.3 Sample Report: Config-Replication Report
provA# show reports gblRepl2newport.rpt
Use this schedule command to assign a schedule to the current config-replication rule. Use no schedule to remove the rule's schedule.
schedule name
name (1-64 characters) identifies the schedule. Use show schedule for a list of configured schedules.
You cannot use a schedule with a fixed duration; a config-replication rule must always run to completion in order to succeed.
provA(gbl-cfg-repl[prov2newport])# schedule oncePerDay
Use the show cluster command to get a list of all ARX clusters, or redundant pairs, known to the current ARX. You can set up ARX clusters so that one cluster can act as a backup for another.
show cluster [cluster-name]
cluster-name (optional, 1-64 characters) shows only the chosen cluster.
Cluster Name identifies the cluster. You set this when you use the cluster-name command to create the cluster. Member identifies an ARX switch in the cluster; the Member in the next column is that switch's redundant peer, if there is one.
provA# show cluster shows all clusters. See Figure 35.4 for sample output.
provA# show cluster newport shows one cluster. See Figure 35.5 for sample output.
Figure 35.4 Sample Output: show cluster
provA# show cluster
Figure 35.5 Sample Output: show cluster newport
provA# show cluster newport
Use the show config-replication command to see the current status of one or more config-replication rules. A config-replication rule copies the global config (see show global-config) from the current site to an ARX at a disaster-recovery site.
show config-replication [rule-name]
rule-name (optional, 1-32 characters) provides configuration details for a particular config-replication rule.
Name identifies the rule. You set this when you use the config-replication command to create the rule.
Cluster is the name of the target ARX cluster (or redundant pair) where the rule sends its global-configuration file. You can set this with the target-cluster command.
Filename is the name of the configuration file that the rule creates at the remote cluster. You can change this with the target-file command.
Schedule is the name of the schedule for the config-replication rule. Use the schedule (gbl-cfg-repl) command to assign a schedule to the rule.
Admin State is Enabled or Disabled, depending on the setting of the enable (gbl-cfg-repl) command.
Name identifies the rule, as described above.
Description describes the rule, as specified with the description (gbl-cfg-repl) command.
Cluster is the name of the target cluster, as described for the summary output above.
Target Filename is the name of the configuration file that the rule creates. This is the same as the Filename in the summary output.
Report Prefix shows the prefix used in all of the rule's reports. The reports appear on the source cluster. You can use the report (gbl-cfg-repl) command to change this prefix.
Username is the administrative user name used for authentication at the remote cluster. These credentials confirm that the rule is authorized to write the configuration file. This is a valid administrative user at the ARX cluster that receives the copy.
Admin State is Enabled or Disabled, as described above for the summary output.
Status shows the results of the last attempt to copy the global-config:
Schedule is the schedule's configured name, chosen with the schedule (gbl-cfg-repl) command. The fields in this table show the configuration of the schedule.
provA# show config-replication shows the status of all config-replication rules. See Figure 35.6 for sample output.
provA# show config-replication prov2newport shows the detailed status for a particular rule. See Figure 35.7 for sample output.
Figure 35.6 Sample Output: show config-replication
provA# show config-replication
provA# show config-replication prov2newport
Use the target-cluster command to choose a remote ARX cluster as a target for copies of the local cluster's configuration. The current config-replication rule copies the global configuration from the local cluster on a schedule. This prepares the remote cluster as a backup for the local cluster in the event of a disaster at the local site.
target-cluster cluster-name
cluster-name (1-64 characters) is the target cluster for the copy of the global-config file. This cluster must be defined on the current ARX; use the show cluster command for a list of defined clusters, or the cluster-name command to define one.
This command sets a cluster target for the current config-replication rule. A cluster is a redundant pair of ARX peers (see redundancy), or possibly a standalone ARX.
provA(gbl-cfg-repl[prov2newport])# target-cluster newport
The current config-replication rule copies the global configuration from the local cluster to a remote cluster on a schedule. Use the target-file command to choose the name of the remote copy. The copy operation prepares the remote cluster as a backup for the local cluster in the event of a disaster.
target-file file-name
file-name (1-255 characters) is the file name for the destination file.
This command sets a file-name target for the current config-replication rule. A cluster is a redundant pair of ARX peers (see redundancy), or possibly a standalone ARX. A config-replication rule copies the local cluster's global configuration (see show global-config) to a remote cluster. This prepares the remote cluster as a backup for the local cluster, in case of a disaster at the local site.
provA(gbl-cfg-repl[prov2newport])# target-file provSvcs.rcfg
When you use a config-replication rule to copy the global configuration to a remote ARX cluster, you need an administrative username and password for the remote peer. Use the user command to set this username and password. Use the no form of this command to remove the administrative credentials, effectively disabling the config-replication rule.
user name
provA(gbl-cfg-repl[prov2newport])# user admin
Password: s3cr3tpasswd
Validate Password: s3cr3tpasswd