Storage Based Replication
- 1: Introduction
- 2: Configuration
- 3: Failover and Failback
- 4: Management
- 5: Monitoring
- 6: FAQ
- 7: Best Practices
- 8: Examples
- 9: Troubleshooting
1 - Introduction
Storage Based Replication
Storage Based Replication (SBR) is a feature that lets you replicate storage volumes in IP4G between paired storage controllers in two different regions.
This replication is performed asynchronously at the block level, and is transparent to the VM operating system. Using SBR can allow customers to perform a disaster recovery test or failover with a low recovery point objective (RPO), and minimal configuration changes.
Disaster recovery is always a complex topic, and while SBR can assist in replicating your data volumes, and potentially OS volumes, it is important to consider how you plan to handle operating system failover, networking, testing, and many other topics for a resilient disaster recovery plan.
Recovery Point and Time Objectives
Two common terms in disaster recovery are Recovery Point Objective (RPO) and Recovery Time Objective (RTO).
Recovery Point Objective: This can be thought of as the maximum amount of data you might lose in the event of a failover.
In IP4G, with storage replication, customers can expect an RPO of roughly 25 minutes or less. This time is based on the cycle time of the asynchronous replication.
Recovery Time Objective: This can be thought of as the amount of time it takes to recover.
A customer's RTO depends mostly on their choices for operating system recovery method, boot times, application start times, networking, and other considerations. Failing over SBR itself takes time, but it is usually a minor factor in the overall RTO.
Asynchronous Replication
Volumes configured for replication are replicated to the remote storage asynchronously using change volumes. Synchronous replication is not available in IP4G. Change volumes increase the recoverability of the asynchronously replicated volumes and optimize the replication. Keeping all of your volumes consistent using a single change-volume freeze time provides a highly recoverable consistency point across a large number of volumes, with efficient transport of changed blocks.
The cycle rate of the change volumes is not customer configurable.
Volume Groups
Volumes can be grouped into a volume group which allows multiple volumes to share a single consistency point, allowing for recovery of all volumes in the volume group to the same point in time. Volumes are typically grouped into a volume group per VM.
Storage and Regions
Select storage pools in one region are paired with specific pools in a paired region. Customers creating a volume on a replication enabled storage pool can then enable replication for that volume, creating a matching volume in the storage pool in the other region.
Auxiliary Volumes
When replication is enabled for a volume, an “auxiliary volume” is created in the paired storage pool in the remote region. Auxiliary volumes must be onboarded into your Cloud instance in the second region. This process registers the auxiliary volumes with your cloud instance in the secondary region. Once onboarded, you can see the auxiliary volume in that cloud instance, which is the replication target of the master volume.
You cannot read or write from an auxiliary volume while replication is ongoing. Instead, you must either clone the auxiliary volume (typical for a DR test) or fail over to the DR volume (typical for an actual DR event).
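As a minimal sketch of those two paths (the commands are covered in detail in the Failover and Failback section; the volume and volume group names here are hypothetical):
# DR test: clone the auxiliary volumes while replication keeps running (names are illustrative)
pcloud compute volumes clones create vm1-drtest -v vm1-boot-aux -v vm1-data1-aux
# Actual DR event: stop the volume group to allow access to the auxiliary volumes
pcloud compute volume-groups stop vm1-vg --allow-read-access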
2 - Configuration
Prerequisites
Volumes to be configured for replication must exist in a replication-enabled pool.
Pools which currently support Storage Based Replication:
- USE4: General-Flash-100 (Target: USC1: General-Flash-200)
- USC1: General-Flash-200 (Target: USE4: General-Flash-100)
More regions and pools will be added over time.
You can see if there is a pool that supports Storage Based Replication in your Cloud Instance by using the pcloud CLI to list the pools:
pcloud compute volumes list-pools
If a pool supports replication, it will show true in the Replication Capable column.
Your Cloud Instance must also be allowed to use storage based replication. (Currently this must be part of your private offer)
You must be using the latest version of the pcloud CLI.
Volume Creation
Your volume should be created in one of the replication-enabled storage pools. (If a volume already exists in a replication-enabled pool, you can skip this step.) To view a list of pools in your region, use
pcloud compute volumes list-pools
To create a volume in a specific pool use
pcloud compute volumes create <name> -s <size> -T <type> -p <pool>
Enable Replication
To enable replication on a volume that exists in a replication-enabled pool:
pcloud compute volumes replication enable <volume>
This will create the auxiliary volume at the target site and begin the copying process.
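For example, assuming a hypothetical volume named vm1-data1 that already exists in a replication-capable pool:
# volume name is illustrative
pcloud compute volumes replication enable vm1-data1
You can then watch the volume's mirroring state with describe, as shown in the next section.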
Volume Replication Status
You should familiarize yourself with the status of the volume as it replicates.
pcloud compute volumes describe <volume>
You will see the following information:
replicationStatus: enabled
mirroringState: <state>
auxVolumeName: <aux volume>
masterVolumeName: <master volume>
groupID: <group ID>
Replication status will be enabled for volumes that are being replicated.
Mirroring state will begin as inconsistent_copying; when the volume has caught up, it will change to consistent_copying.
The master volume is the source; the aux volume is the target.
Group ID is the name of the volume group the volume is a member of (initially, volumes are not in a volume group).
Volume Groups
It is strongly recommended that you group all replicated volumes from a single system into a volume group. This provides a single recovery point across the entire VM and makes managing VMs during failover easier.
Note: You cannot add volumes to a volume group until they are all in a consistent_copying state
pcloud compute volume-groups create <name> --volume <volume 1> --volume <volume 2>
You can list your volume groups with
pcloud compute volume-groups list
You can see the members of a volume group with
pcloud compute volume-groups describe <name>
You can expand a volume group with
pcloud compute volume-groups add-volumes <volume group> --volume <volume 1> (--volume <volume 2>)
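For example, to group all of the replicated volumes belonging to a single hypothetical VM named vm1 (volume and group names are illustrative):
# names are illustrative
pcloud compute volume-groups create vm1-vg --volume vm1-boot --volume vm1-data1 --volume vm1-data2
pcloud compute volume-groups describe vm1-vg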
Onboarding at the Target Site
This process registers the auxiliary volume with your cloud instance in the secondary region, making it visible and manageable there.
Before onboarding volumes at the target site, it is recommended that your volume groups be fully built at the source site. Volume groups are automatically created at the target site based on the volume group at the source site.
Once all of your volumes are created and grouped, gather the following:
- Your source Cloud Instance ID (from pcloud config list)
- The master and auxiliary volume name for each volume (from pcloud compute volumes describe or pcloud compute volume-groups describe)
Log in to your target Cloud Instance
Onboard your volumes with:
pcloud compute volumes replication onboard --name <name> --source <source cloud ID> --volume <aux volume name 1>:<target volume name 1> --volume <aux volume name 2>:<target volume name 2>
The target volume name is the name the volume will have in your target Cloud Instance. It is recommended that you adopt a naming convention for your target volume names and apply it consistently.
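For example, one possible convention is to reuse the source volume name with a -dr suffix (the source cloud ID and aux volume names below are hypothetical placeholders):
# the --source ID and aux_volume names are placeholders; substitute your own values
pcloud compute volumes replication onboard --name vm1-onboard --source <source cloud ID> --volume aux_volume-vm1-boot-xxxx:vm1-boot-dr --volume aux_volume-vm1-data1-xxxx:vm1-data1-dr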
A volume group is automatically created when you onboard the target volumes. You can confirm it with:
pcloud compute volume-groups list
3 - Failover and Failback
Concepts
Failover and failback fall into two domains: testing and disaster. There may be other cases, but primarily those are the two concerns customers need to address. Actual failover is disruptive to a customer’s DR posture and ability to recover, so it is generally not used for testing. Instead, a cloning process is used to create copies of the DR volumes, which can be validated and removed after the test, without disrupting the ability to fail over should an actual emergency occur.
In order to perform any of these activities, the auxiliary volumes must be onboarded at the target site.
Cloning (DR Testing)
Validate the volume group you want to test.
pcloud compute volume-groups list
Validate the list of all volumes in the volume group.
pcloud compute volume-groups describe <volume-group>
Clone the volumes. Note: to keep the volumes at a consistent point, clone them all in one command.
pcloud compute volumes clones create <clone name> -v <vol1> (-v volX)
Validate the cloning status with
pcloud compute volumes clones status <clone name>
When the status is completed, you’ll have volumes named clone-<clone name>-X, which can be attached to VMs for testing.
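Putting those steps together for a hypothetical volume group named vm1-vg containing two onboarded auxiliary volumes (all names are illustrative):
# names are illustrative
pcloud compute volume-groups describe vm1-vg
pcloud compute volumes clones create vm1-drtest -v vm1-boot-dr -v vm1-data1-dr
pcloud compute volumes clones status vm1-drtest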
Failover
To stop a volume group so that the Target site volumes can be used, validate the volume group you want to fail over.
pcloud compute volume-groups list
Validate the list of all volumes in the volume group.
pcloud compute volume-groups describe <volume-group>
Stop the volume group, allowing the target auxiliary volumes to be accessed.
pcloud compute volume-groups stop <volume group> --allow-read-access
You can now attach the volumes directly to a VM.
Note: At this point the volumes at BOTH sites can be modified. To restart replication post failover, the volumes which will be the TARGET must not be attached to a VM. You also have to select a replication direction. In doing so, data at the specified TARGET site will be overwritten with the changes/data from the specified SOURCE.
Restarting Replication
To determine which direction you want, look at the volume group relationship:
pcloud compute volume-groups relationship <group>
The primary field indicates whether the primary (source) is currently the volume(s) listed as master or aux. In the start command, you specify which side will be the primary when replication resumes.
For example, if the relationship shows primary: master and you specify master when you restart the volume group, it keeps its original copy direction: data on the aux volume will be overwritten with data from the master volume.
To restart replication in the original direction (overwriting the target):
pcloud compute volume-groups start <group> master
Failback
To fail back to the original site, first restart your replication so that the original aux volume is now the source:
pcloud compute volume-groups start <group> aux
Note: This will copy all data from the aux volumes to the master volumes.
Once the volume-group is in a consistent_copying state, use the same process as above to stop the replication, enable access to the master volumes at the original site, and access the volumes.
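A consolidated sketch of the failback sequence for a hypothetical volume group named vm1-vg (wait for consistent_copying between steps; the group name is illustrative):
# copy changes from the aux volumes back to the master volumes
pcloud compute volume-groups start vm1-vg aux
# wait until every relationship reports consistent_copying
pcloud compute volume-groups relationship vm1-vg
# stop the group to allow access to the master volumes at the original site
pcloud compute volume-groups stop vm1-vg --allow-read-access
# once running at the original site again, resume replication in the original direction
pcloud compute volume-groups start vm1-vg master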
4 - Management
Resizing a Replicated Volume
When a volume is a member of a volume group, you cannot resize it directly. You must first remove it from the volume group. This does not disrupt replication of the volume or the volume group, but while the volume is outside the group it no longer shares the group's consistency point.
At the source site remove the volume from the volume group
pcloud compute volume-groups remove-volumes <volume group> --volume <volume>
Verify the volume is still consistent_copying
pcloud compute volumes describe <volume>
Resize the volume
pcloud compute volumes update <volume> --size <new size>
Note: If you get an error that the volume must be consistent_copying before updating, wait 1-2 minutes and reattempt the resize. This can happen due to how change volumes work.
Add the volume back into the volume group
pcloud compute volume-groups add-volumes <volume group> --volume <volume>
Verify the volume is in the volume group
pcloud compute volume-groups describe <volume group>
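Putting the steps together for a hypothetical volume named vm1-data1 in volume group vm1-vg, grown to 200 GB (names and size are illustrative):
pcloud compute volume-groups remove-volumes vm1-vg --volume vm1-data1
# confirm the volume is still consistent_copying before resizing
pcloud compute volumes describe vm1-data1
pcloud compute volumes update vm1-data1 --size 200
pcloud compute volume-groups add-volumes vm1-vg --volume vm1-data1
pcloud compute volume-groups describe vm1-vg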
Adding an Aux Volume Group After Volumes Are Onboarded
Typically, you want your master site volumes to be in a volume group before onboarding the volumes at the aux site. If you onboard the volumes first and then create a volume group at the master site, there will not be a matching volume group at the aux site.
To correct this, log in to the aux site and use the command:
pcloud compute volumes replication create-aux-group <source volume group name> --source <source cloud ID>
This will create the volume group at the Aux site, and place the correct Aux volumes in the group.
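For example, assuming a hypothetical source volume group named vm1-vg:
# the volume group name and source cloud ID are placeholders
pcloud compute volumes replication create-aux-group vm1-vg --source <source cloud ID>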
5 - Monitoring
To monitor the status of volumes and volume groups, use the pcloud CLI.
To see all configured volume-groups:
pcloud compute volume-groups list
This shows each volume group's status and whether replication is enabled.
To show a volume group's extended description, including the volumes which belong to it, use:
pcloud compute volume-groups describe <volume-group>
To show the detailed relationship information for a volume group, including the status of each individual volume replication, use:
pcloud compute volume-groups relationship <volume-group>
This output contains a lot of detail, including the progress of the copy and the freeze time for the last change volume.
This can be of particular use in investigating the copy status of a volume group.
Volumes which are ready for a failover should be in a consistent_copying state.
You can also see the auxiliary volume, master volume, and mirroring state of an individual volume:
pcloud compute volumes describe <volume>
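For routine checks, the relationship output can be filtered with standard shell tools; a minimal sketch for a hypothetical volume group named vm1-vg:
# group name is illustrative; state, progress, and freezeTime appear in the relationship output
pcloud compute volume-groups relationship vm1-vg | grep -E 'state:|progress:|freezeTime:'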
6 - FAQ
- What happens if the inter-region link drops during replication?
Replication between regions flows over IP4G managed redundant interconnects. In the event that all connectivity between regions is down, replication between regions will be disrupted. The source volumes will continue to function as normal, and the target volumes will remain as they were when the disruption occurred. When connectivity is restored, the replication will need to be manually restarted.
- How does SBR handle split-brain scenarios if communication is lost and then restored?
The target site will not automatically be writable when connectivity is lost, so no split-brain scenario will occur on its own, and replication will be stopped at both sites. A customer may choose to make the target site writable. When the original source site comes online, replication will be in a disabled state in both sites. When turning replication back on, the customer will be in control of which site is the source for the replication.
- Is there a performance impact on the primary volumes when replicating?
There is a small delay to I/O operations that can occur during the application of the change volumes. Customers will see an increase in max service times for brief intervals. The overall delay times are minimal. Following best practices for volume group sizing and configuration will help minimize the IO impacts of SBR.
7 - Best Practices
Use a single volume-group per VM
- Can be more than one AIX VG or IBM i iASP
- Creates a single consistency group per VM
- Controls cost and failover complexity
Onboard all volumes as soon as the volume-group is created (or any time volumes are added to a volume-group)
- Volumes are easily “forgotten” if not onboarded promptly
- Volumes must be onboarded for failover to function
- Onboarding post-disaster is not possible
When cloning volumes at the target site for a DR test, clone all the volumes in one volume group with a single command.
- All volumes will be at the same consistency point.
For backup volumes (i.e., volumes used as the target for your database backups), it is best practice to replicate them using an alternate method, not SBR. Replicating via another method helps ensure that these volumes are recoverable in the event of an issue with SBR. Replicating them with SBR can also affect the RPO of all volumes in the volume-group they are in, or in extreme cases cause consistency issues or problems cycling the change volumes in a reasonable time.
- Examples may include:
- NFS
- Google Storage Buckets
- rsync
- Oracle RMAN to a remote destination
Test and document your DR process
- OS Recovery
- Data volume recovery
- Network considerations (IP, Routing, DNS, etc)
- Application Recovery
8 - Examples
End to End Walkthrough of SBR
The following example walks through:
- Creating 3 volumes at a source site
- Enabling replication
- Adding the volumes to a volume group
- Onboarding the volumes at the target site
- Cloning the Volumes for DR Testing
- Failing over the volumes to allow access at the target site
- Failing back the volumes (replicating data back to the original site)
See which pools in the region are Replication Capable.
❯ pcloud compute volumes list-pools
Pool Name Storage Type Max Allocation Size Total Capacity Replication Capable
General-Flash-006 ssd 14695 20480 false
General-Flash-001 ssd 35205 45056 true
General-Flash-006 standard 14695 20480 false
General-Flash-001 standard 35205 45056 true
General-Flash-001 is used for this example, and 3 volumes are created with a type of SSD.
❯ pcloud compute volumes create demo-vol1 -s 100 -T ssd -p General-Flash-001
VolumeID Name Size StorageType State Shareable Bootable Volume Pool Replication Status Mirroring State
4dbdcd7e-7abb-4142-88f1-7986b7acb05e demo-vol1 100 ssd creating false false General-Flash-001
1 Volume(s) created
❯ pcloud compute volumes create demo-vol2 -s 100 -T ssd -p General-Flash-001
VolumeID Name Size StorageType State Shareable Bootable Volume Pool Replication Status Mirroring State
bc15743f-3f7c-46f5-85a1-f6e58a500aaa demo-vol2 100 ssd creating false false General-Flash-001
1 Volume(s) created
❯ pcloud compute volumes create demo-vol3 -s 100 -T ssd -p General-Flash-001
VolumeID Name Size StorageType State Shareable Bootable Volume Pool Replication Status Mirroring State
d58550e6-c2ee-449f-be94-60abf42b8f30 demo-vol3 100 ssd creating false false General-Flash-001
1 Volume(s) created
Enable Replication on the new volumes
❯ pcloud compute volumes replication enable demo-vol1
replication enable request in process
❯ pcloud compute volumes replication enable demo-vol2
replication enable request in process
❯ pcloud compute volumes replication enable demo-vol3
replication enable request in process
Describe the volumes and confirm they are enabled for replication; note the mirroringState: inconsistent_copying. Wait for that to become consistent_copying before proceeding.
❯ pcloud compute volumes describe demo-vol1
volumeID: 4dbdcd7e-7abb-4142-88f1-7986b7acb05e
name: demo-vol1
cloudID: 75b10103873d4a1ba0d52b43159a2842
storageType: ssd
size: 100
shareable: false
bootable: false
state: available
instanceIDs: []
creationDate: "2025-05-09T20:24:42.000Z"
updateDate: "2025-05-09T20:25:46.000Z"
ioThrottleRate: 1000 iops
wwn: "600507681281026650000000000004E4"
volumePool: General-Flash-001
volumeType: SSD-General-Flash-001-DR
replicationStatus: enabled
mirroringState: inconsistent_copying
auxVolumeName: aux_volume-demo-vol1-4dbdcd7e-7abb84886785
masterVolumeName: volume-demo-vol1-4dbdcd7e-7abb
groupID: ""
❯ pcloud compute volumes describe demo-vol2
volumeID: bc15743f-3f7c-46f5-85a1-f6e58a500aaa
name: demo-vol2
cloudID: 75b10103873d4a1ba0d52b43159a2842
storageType: ssd
size: 100
shareable: false
bootable: false
state: available
instanceIDs: []
creationDate: "2025-05-09T20:25:03.000Z"
updateDate: "2025-05-09T20:25:52.000Z"
ioThrottleRate: 1000 iops
wwn: "600507681281026650000000000004E5"
volumePool: General-Flash-001
volumeType: SSD-General-Flash-001-DR
replicationStatus: enabled
mirroringState: inconsistent_copying
auxVolumeName: aux_volume-demo-vol2-bc15743f-3f7c84886785
masterVolumeName: volume-demo-vol2-bc15743f-3f7c
groupID: ""
❯ pcloud compute volumes describe demo-vol3
volumeID: d58550e6-c2ee-449f-be94-60abf42b8f30
name: demo-vol3
cloudID: 75b10103873d4a1ba0d52b43159a2842
storageType: ssd
size: 100
shareable: false
bootable: false
state: available
instanceIDs: []
creationDate: "2025-05-09T20:25:13.000Z"
updateDate: "2025-05-09T20:25:55.000Z"
ioThrottleRate: 1000 iops
wwn: "600507681281026650000000000004E7"
volumePool: General-Flash-001
volumeType: SSD-General-Flash-001-DR
replicationStatus: enabled
mirroringState: inconsistent_copying
auxVolumeName: aux_volume-demo-vol3-d58550e6-c2ee84886785
masterVolumeName: volume-demo-vol3-d58550e6-c2ee
groupID: ""
Look for consistent_copying for all volumes before proceeding.
volumeID: 4dbdcd7e-7abb-4142-88f1-7986b7acb05e
name: demo-vol1
cloudID: 75b10103873d4a1ba0d52b43159a2842
storageType: ssd
size: 100
shareable: false
bootable: false
state: available
instanceIDs: []
creationDate: "2025-05-09T20:24:42.000Z"
updateDate: "2025-05-09T20:25:46.000Z"
ioThrottleRate: 1000 iops
wwn: "600507681281026650000000000004E4"
volumePool: General-Flash-001
volumeType: SSD-General-Flash-001-DR
replicationStatus: enabled
mirroringState: consistent_copying
auxVolumeName: aux_volume-demo-vol1-4dbdcd7e-7abb84886785
masterVolumeName: volume-demo-vol1-4dbdcd7e-7abb
groupID: ""
Once all 3 volumes are in a state of consistent_copying, put them in a volume group. Note that all the volumes are specified at once.
❯ pcloud compute volume-groups create demo-vg --volume demo-vol1 --volume demo-vol2 --volume demo-vol3
volume group create request in process
Validate the status of the volume group to make sure it looks correct.
❯ pcloud compute volume-groups describe demo-vg
name: demo-vg
id: ec3ecc1c-de75-4095-abc7-4c1cfa74665b
availabilityZone: nova
description: rccg-ec3e-4665b
groupType: dbab3433-a138-4a68-bdd7-fd01fc113d5e
cloudID: 75b10103873d4a1ba0d52b43159a2842
replicationStatus: enabled
status: available
volumeTypes:
- 00758ee7-d99a-4b56-9723-820fd2744c47
- 25df1544-6d64-42ab-bd2a-6d915ab34f5e
volumes:
- 4dbdcd7e-7abb-4142-88f1-7986b7acb05e
- bc15743f-3f7c-46f5-85a1-f6e58a500aaa
- d58550e6-c2ee-449f-be94-60abf42b8f30
Onboard the volumes at the target site. Switch to that region’s Cloud Instance:
❯ pcloud auth login
Attention: pcloud currently has a valid configuration for "Demo User" user ("demo.user@converge" mail)
If you would like to login as a different user, please run:
pcloud auth revoke
pcloud auth login
You have in your Billing Account several clouds, which cloud would you like to use:
[0]: Region1 (75b10103873d4a1ba0d52b43159a2842)
[1]: Region2 (7960a7c7cb58481388129a8c6fbd79af)
2
You are successfully logged with the following configuration:
accountID: account-id
cloudID: c6880a140bca43ca952118762d2681e7
cloudName: Region2
region: us-east4
Switched to target Cloud Instance (Region2). Operations now affect the DR Site.
Onboard the aux volumes. Refer back to the output of describing the volumes earlier to get the master and auxiliary volume names. For the demo, the onboarded volumes were given a name matching the prod side, with the suffix -aux.
❯ pcloud compute volumes replication onboard --name demo-onboard --source 75b10103873d4a1ba0d52b43159a2842 --volume aux_volume-demo-vol1-4dbdcd7e-7abb84886785:demo-vol1-aux --volume aux_volume-demo-vol2-bc15743f-3f7c84886785:demo-vol2-aux --volume aux_volume-demo-vol3-d58550e6-c2ee84886785:demo-vol3-aux
replication onboarding request in process
Validate the volume-group with the newly onboarded volumes.
❯ pcloud compute volume-groups list
Name Volume Group ID Status Replication Status
rccg-ec3e-4665b db1ad3c3-4ed4-4070-a43b-2a66830bfc98 available enabled
Describe it to validate the state of the volume-group:
❯ pcloud compute volume-groups describe rccg-ec3e-4665b
name: rccg-ec3e-4665b
id: db1ad3c3-4ed4-4070-a43b-2a66830bfc98
availabilityZone: ""
description: rccg-ec3e-4665b
groupType: 04feda3c-3e5e-4bde-abea-603ff8972aad
cloudID: 7960a7c7cb58481388129a8c6fbd79af
replicationStatus: enabled
status: available
volumeTypes:
- 3f39e8bc-874e-4611-aea5-55ee454fd58f
- 419ac126-a84e-4512-8074-cccebb701aaa
- 5762a360-7679-4e4f-b691-8a0a9b5baf8d
- b4ab63f8-3774-449f-9b0b-3af02d65b161
- f200b1e2-66d6-4219-a1d2-2ee12c22bcd8
volumes:
- 2e6ad217-6f21-46ef-9569-e85b9b68bd55
- 649bf262-d6ab-4185-ba46-324377f20142
- a5f6711c-c166-43ce-b7be-88f41dac06d9
Examine the relationship to get more information.
❯ pcloud compute volume-groups relationship rccg-ec3e-4665b
- id: "4837"
name: rcrel93
masterClusterID: 00000204A0409994
masterClusterName: ""
masterVdiskID: "465"
masterVdiskName: volume-demo-vol1-4dbdcd7e-7abb
auxClusterID: 00000204A0609988
auxClusterName: stg2a1stor201
auxVdiskID: "4837"
auxVdiskName: aux_volume-demo-vol1-4dbdcd7e-7abb84886785
primary: master
consistencyGroupID: "27"
consistencyGroupName: rccg-ec3e-4665b
state: consistent_copying
bgCopyPriority: "50"
progress: "100"
copyType: global
cyclingMode: multi
masterChangeVdiskID: "465"
masterChangeVdiskName: chg_volume-demo-vol1-4dbdcd7e-7abb
auxChangeVdiskID: "4838"
auxChangeVdiskName: chg_aux_volume-demo-vol1-4dbdcd7e-7abb84886785
freezeTime: 2025/05/09/16/44/20
previousPrimary: ""
channel: none
- id: "4839"
name: rcrel94
masterClusterID: 00000204A0409994
masterClusterName: ""
masterVdiskID: "466"
masterVdiskName: volume-demo-vol2-bc15743f-3f7c
auxClusterID: 00000204A0609988
auxClusterName: stg2a1stor201
auxVdiskID: "4839"
auxVdiskName: aux_volume-demo-vol2-bc15743f-3f7c84886785
primary: master
consistencyGroupID: "27"
consistencyGroupName: rccg-ec3e-4665b
state: consistent_copying
bgCopyPriority: "50"
progress: "100"
copyType: global
cyclingMode: multi
masterChangeVdiskID: "466"
masterChangeVdiskName: chg_volume-demo-vol2-bc15743f-3f7c
auxChangeVdiskID: "4840"
auxChangeVdiskName: chg_aux_volume-demo-vol2-bc15743f-3f7c84886785
freezeTime: 2025/05/09/16/44/20
previousPrimary: ""
channel: none
- id: "4841"
name: rcrel95
masterClusterID: 00000204A0409994
masterClusterName: ""
masterVdiskID: "467"
masterVdiskName: volume-demo-vol3-d58550e6-c2ee
auxClusterID: 00000204A0609988
auxClusterName: stg2a1stor201
auxVdiskID: "4841"
auxVdiskName: aux_volume-demo-vol3-d58550e6-c2ee84886785
primary: master
consistencyGroupID: "27"
consistencyGroupName: rccg-ec3e-4665b
state: consistent_copying
bgCopyPriority: "50"
progress: "100"
copyType: global
cyclingMode: multi
masterChangeVdiskID: "467"
masterChangeVdiskName: chg_volume-demo-vol3-d58550e6-c2ee
auxChangeVdiskID: "4854"
auxChangeVdiskName: chg_aux_volume-demo-vol3-d58550e6-c2ee84886785
freezeTime: 2025/05/09/16/44/20
previousPrimary: ""
channel: none
A DR test can be performed by creating a clone of the aux volumes.
❯ pcloud compute volumes list
a5f6711c-c166-43ce-b7be-88f41dac06d9 demo-vol3-aux 100 ssd available false false General-Flash-004 enabled consistent_copying
649bf262-d6ab-4185-ba46-324377f20142 demo-vol2-aux 100 ssd available false false General-Flash-004 enabled consistent_copying
2e6ad217-6f21-46ef-9569-e85b9b68bd55 demo-vol1-aux 100 ssd available false false General-Flash-004 enabled consistent_copying
❯ pcloud compute volumes clones create demo-dr-test -v demo-vol1-aux -v demo-vol2-aux -v demo-vol3-aux
cloneName: demo-dr-test
volume(s): [demo-vol1-aux demo-vol2-aux demo-vol3-aux]
Initiated cloneTaskID e0a525d6-7b4a-4f63-8775-740accebc666
Check status with:
pcloud compute volumes clones status e0a525d6-7b4a-4f63-8775-740accebc666
❯ pcloud compute volumes list
d127f9e2-7cbb-4700-a725-711a4adfcaf7 clone-demo-dr-test-93902-2 100 ssd available false false SSD-General-Flash-004 not-capable
cdb5f7f7-6ba4-460a-ab35-f4df8bf52174 clone-demo-dr-test-93902-1 100 ssd available false false SSD-General-Flash-004 not-capable
098a8eb8-b6c1-460f-9671-3526aec653fc clone-demo-dr-test-93902-3 100 ssd available false false SSD-General-Flash-004 not-capable
a5f6711c-c166-43ce-b7be-88f41dac06d9 demo-vol3-aux 100 ssd available false false General-Flash-004 enabled consistent_copying
649bf262-d6ab-4185-ba46-324377f20142 demo-vol2-aux 100 ssd available false false General-Flash-004 enabled consistent_copying
2e6ad217-6f21-46ef-9569-e85b9b68bd55 demo-vol1-aux 100 ssd available false false General-Flash-004 enabled consistent_copying
The clone volumes can then be attached to a VM, used, and when the testing is complete they can be deleted.
These clone-demo-dr-test-X volumes are independent copies. Attaching and using them for testing does not impact the ongoing replication of the original demo-volX-aux volumes.
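When testing is complete, the clones can be removed; a minimal sketch using the clone names from the listing above:
❯ pcloud compute volumes delete clone-demo-dr-test-93902-1
❯ pcloud compute volumes delete clone-demo-dr-test-93902-2
❯ pcloud compute volumes delete clone-demo-dr-test-93902-3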
A failover can be performed, and the DR volumes can be used, by stopping replication and allowing access.
❯ pcloud compute volume-groups stop rccg-ec3e-4665b --allow-read-access
volume group stop request in process
At this point volumes at BOTH sites can be modified and used
To re-enable replication, first decide which site should be the new source. For this example, the volumes will fail back, meaning the aux volumes will be the new primary (source) volumes.
❯ pcloud compute volume-groups start rccg-ec3e-4665b aux
volume group start request in process
This is now copying data from the “Aux” volumes back to the “Master” volumes.
The state can be checked by looking at the relationship.
The copy direction can be determined by looking at the primary field.
❯ pcloud compute volume-groups relationship rccg-ec3e-4665b
- id: "4837"
name: rcrel93
masterClusterID: 00000204A0409994
masterClusterName: ""
masterVdiskID: "465"
masterVdiskName: volume-demo-vol1-4dbdcd7e-7abb
auxClusterID: 00000204A0609988
auxClusterName: stg2a1stor201
auxVdiskID: "4837"
auxVdiskName: aux_volume-demo-vol1-4dbdcd7e-7abb84886785
primary: aux
consistencyGroupID: "27"
consistencyGroupName: rccg-ec3e-4665b
state: consistent_copying
bgCopyPriority: "50"
progress: "100"
copyType: global
cyclingMode: multi
masterChangeVdiskID: "465"
masterChangeVdiskName: chg_volume-demo-vol1-4dbdcd7e-7abb
auxChangeVdiskID: "4838"
auxChangeVdiskName: chg_aux_volume-demo-vol1-4dbdcd7e-7abb84886785
freezeTime: 2025/05/09/16/57/56
previousPrimary: ""
channel: none
- id: "4839"
name: rcrel94
masterClusterID: 00000204A0409994
masterClusterName: ""
masterVdiskID: "466"
masterVdiskName: volume-demo-vol2-bc15743f-3f7c
auxClusterID: 00000204A0609988
auxClusterName: stg2a1stor201
auxVdiskID: "4839"
auxVdiskName: aux_volume-demo-vol2-bc15743f-3f7c84886785
primary: aux
consistencyGroupID: "27"
consistencyGroupName: rccg-ec3e-4665b
state: consistent_copying
bgCopyPriority: "50"
progress: "100"
copyType: global
cyclingMode: multi
masterChangeVdiskID: "466"
masterChangeVdiskName: chg_volume-demo-vol2-bc15743f-3f7c
auxChangeVdiskID: "4840"
auxChangeVdiskName: chg_aux_volume-demo-vol2-bc15743f-3f7c84886785
freezeTime: 2025/05/09/16/57/56
previousPrimary: ""
channel: none
- id: "4841"
name: rcrel95
masterClusterID: 00000204A0409994
masterClusterName: ""
masterVdiskID: "467"
masterVdiskName: volume-demo-vol3-d58550e6-c2ee
auxClusterID: 00000204A0609988
auxClusterName: stg2a1stor201
auxVdiskID: "4841"
auxVdiskName: aux_volume-demo-vol3-d58550e6-c2ee84886785
primary: aux
consistencyGroupID: "27"
consistencyGroupName: rccg-ec3e-4665b
state: consistent_copying
bgCopyPriority: "50"
progress: "100"
copyType: global
cyclingMode: multi
masterChangeVdiskID: "467"
masterChangeVdiskName: chg_volume-demo-vol3-d58550e6-c2ee
auxChangeVdiskID: "4854"
auxChangeVdiskName: chg_aux_volume-demo-vol3-d58550e6-c2ee84886785
freezeTime: 2025/05/09/16/57/56
previousPrimary: ""
channel: none
9 - Troubleshooting
Fixing “Phantom” Onboarded Volumes
Warnings
- These instructions are for the deletion of “phantom onboarded volumes” only. A phantom onboarded volume would have been created by the following process:
- The source volume would have had replication enabled at the primary cloud instance
- The auxiliary volume would have been previously onboarded at the target cloud instance
- Replication would have been disabled on the source volume
- Replication would have been re-enabled on the source volume
- Volumes matching this condition will be in an Error state at the target site only
- DO NOT DO THIS against an onboarded volume that is associated with a source volume
- This will potentially cause permanent data loss of the volume at BOTH source and target sites
- As long as you are deleting a phantom onboarded volume, there is no risk of the source volume being deleted
At the Source & Target: List the volumes and find the matching source volume and target auxiliary volume. Compare them; note that the sizes will not match if the volume was resized.
pcloud compute volumes list
At the Target: Verify the phantom onboarded volume is in an error state
pcloud compute volumes describe <phantom-onboarded-volume> | grep state
state: error
At the Target: If you are at all unsure this is the correct target volume, you can attempt to clone it. The clone operation will fail.
pcloud compute volumes clones create <clone-name> -v <phantom-onboarded-volume>
At the Target: Delete the onboarded target volume. Warning: doing this against a valid, replicating volume may cause data loss.
pcloud compute volumes delete <phantom-onboarded-volume>
At the Source: Ensure the volume is in a volume group
pcloud compute volume-groups describe <volume-group>
At the Source: Gather the aux volume name
pcloud compute volumes describe <source volume>
At the Target: Onboard the volume (use an appropriate naming convention)
pcloud compute volumes replication onboard --name <name> --source <cloud instance ID> --volume <aux volume name>:<target volume name>
At the Source & Target: Describe the volume and ensure both the source and target volumes are in a volume group (check the groupID field).
pcloud compute volumes describe <volume>