Veeam Off-host Proxy, DataCore VSS and MS Hyper-V - lessons learned

Veeam is an excellent backup tool when it comes to making backups efficient and fast. DataCore is one of my favorite storage solutions, and Hyper-V... well, I still prefer VMware vSphere by a wide margin. Microsoft may have caught up with VMware in most features and probably even surpassed it in some special functions, but in one very important area it is still years behind: ease of use. I don't mean the hypervisor itself or the management tools like SCVMM or even Failover Cluster Manager. A long-time Windows admin can probably handle this kind of construct, although I think VMware has smarter solutions, but that is my personal opinion.

The problem begins when it comes to efficient backup and third-party integration. Hyper-V currently relies completely on VSS to make consistent snapshots of VMs. In the guest OS this approach is okay, but using VSS on the Hyper-V host as well, putting VSS on top of VSS, is not really a good thing. MS has acknowledged this problem and will probably find another way to replace VSS, at least at the hypervisor level. VMware has done this for a very long time with its snapshots, and that's what makes offloading backup processes to other systems so easy. Even with a Windows-based backup server it's much easier to offload VMware backups to an external system (one that isn't aware of a filesystem called VMFS) than it is for any Microsoft-based hypervisor environment.

But back to the main point of this article. We have several customers running Hyper-V on FC- or iSCSI-based DataCore storage who want to use Veeam as their backup product. This setup isn't hard to install, but when it comes to making the backup process as efficient as it can be, you will stumble across some problems sooner or later.

The main problem in this scenario is the way you have to set up off-host proxying. Why "off-host proxying"? Well, the answer is quite simple: you don't want your hypervisor server to do all the compression and deduplication work during a backup session. This can be okay in smaller or overpowered environments, but normally your hypervisor host has enough to do with running the VMs. It shouldn't have to do all the backup work as well. That's the reason why Veeam supports "off-host proxying". In this case, the compression and dedup load is transferred to an external backup server.

Sounds clear, sounds cool and is as simple as hell in a VMware-based environment, where you simply map the VMFS datastores to your Windows-based backup server. That's all. One could think: hm, I use Hyper-V, I have a Windows-based backup server as well, this should be easy too. Well, it isn't... and the reason behind this is the way Hyper-V handles the backup process. As the backup process relies on VSS on the Hyper-V host, you can't simply map the CSV to your backup server and let Veeam do the rest. You need to worry about permissions, the Windows features installed on the backup server, the OS version on the backup server, the storage configuration and, last but not least, hardware VSS providers that are able to create "transportable" snapshots.

I don't want to bore you with details, so I set up a test lab to find out what is really needed on the Veeam side, the Hyper-V side and the storage side to get things to work. One note here: this setup guide is only validated for this particular environment. It can probably be transferred easily to other environments where other storage solutions are used, but I can't guarantee it. I will also do the same tests with environments based on MSA storage solutions from HP, but this will take some time and I will post a new article once I've finished my tests.

 

What my setup looks like:

I use a fully virtualized environment, but you can also do this on physical hardware. I have two virtualized DataCore SANsymphony-V 10 servers running the latest PSP4 that use iSCSI for mirror and front-end traffic. Both nodes provide two iSCSI FE ports so that MPIO works correctly. I serve a single mirrored vDisk to the Hyper-V host.

This Hyper-V host is a nested hypervisor running Server 2012 R2 with the latest patches on top of VMware vSphere 6 (send me a note if you want to know how to achieve this, or simply google for it). DataCore WIK 3.0 is installed (MPIO and VSS) on the host. The host is part of a single-node failover cluster (this is supported as well and is only done to reflect a realistic real-world scenario). The mirrored vDisk is set up as a cluster resource and forms a CSV. On top of that CSV there is a single VM running Windows XP. This VM is used as the backup source within Veeam.

The backup server is a Windows Server 2012 R2 based VM as well, connected to iSCSI and configured with WIK 3.0. Veeam B&R v8 Update 3a is installed. The Hyper-V host is imported into Veeam as a cluster resource.

 

Detailed configuration. These steps show you how to configure the servers and where you have to pay special attention to get things working.

  1. The DataCore nodes were initially joined to the same domain that the Hyper-V host and the backup server were joined to. This is allowed by DataCore but was one of the main reasons the whole setup didn't work. I will explain this later in this article, but for now simply DO NOT JOIN the DCS to a domain. Follow the best practices from DataCore and put them into their own workgroup. Also make sure Windows doesn't append a domain to the name, so that the DCS are only known by their short names (e.g. DCS01 and not DCS01.demo.lab).
  2. Configure map stores for both DCS nodes, because without map stores you won't be able to create snapshots. The following screenshot shows what each DCS should look like for map store and computer name

    VSS01

  3. Add the backup server to DataCore without using the FQDN. Make sure you only use NetBIOS (short) names.

    VSS02

  4. Do not map the vDisk used as CSV to the backup server! Even DataCore support will tell you to do so, but this isn't necessary. That would be the way to go if you used Veeam in combination with vSphere, but not in a Hyper-V environment.
  5. Make sure your backup server runs the same OS version as your Hyper-V host. The only important thing here is to use the same major release (e.g. R2 or R1); different patch levels can be ignored.
  6. Patch all Windows servers to the latest patch level! Nothing is more unstable than an unpatched Hyper-V server.
  7. Install the Hyper-V role on the backup server. Sounds silly, but Veeam needs this role to act as an off-host backup proxy. You don't have to configure anything in the Hyper-V role on the backup server, simply make sure it is installed.
  8. Install WIK (MPIO and VSS) on the Hyper-V server and the backup server. Both VSS providers are used later: the one on the Hyper-V server creates and deletes the snapshot, the one on the backup server maps it. So be sure to have it on all hosts.
  9. Add the backup server to Veeam as an off-host backup proxy and manually edit the connected volumes

    VSS03

    Add all CSVs you want to back up VMs from.
  10. Check that Veeam recognizes the DataCore VSS hardware provider, that the provider is configured to be used by Veeam and that failover to software VSS is disabled

    VSS09
  11. Change the account your VSS provider runs under on your backup server and all Hyper-V hosts. There is a FAQ from DataCore that deals with this problem
    1. Open a command prompt with administrative rights
    2. Enter dcomcnfg and press ENTER
    3. Go to Component Services -> Computers -> My Computer -> COM+ Applications
    4. Right-click DataCore VSS Hardware Provider and select "Properties"
    5. Go to the Identity tab and change the account to an administrative user account on the backup server. In my case I selected a domain admin account, but it should also work with a local admin account.

      VSS04
  12. Create a backup job that includes the VMs you want to back up. Nothing special to consider here, simply make sure you have "Off-host backup" selected in the backup proxy menu. If you want to see a success message only when off-host proxying actually worked, disable the fallback to on-host mode in the advanced settings.
  13. Fire the backup job and wait until it completes. In my lab the hardware snapshot step took a noticeable amount of time. Or, better said, the snapshot creation itself was instant, but it took quite long (1,38min) until Veeam jumped to the next step. Don't worry about it.

    VSS05
  14. During backup you will see a snapshot created in the SSYV GUI.

    VSS06

    This snapshot will be removed after the backup job finishes successfully. I've seen snapshots remain after a backup job failed, so make sure you check the snapshot state after a failed backup.
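
Steps 1 and 3 both boil down to the same rule: every name registered in DataCore should be a short name without a domain suffix. Here is a minimal sketch of a pre-flight check for that rule. It is not part of Veeam or DataCore; the function names and the example inventory are my own assumptions, and the pattern is deliberately conservative (letters, digits and hyphens, at most 15 characters) rather than a complete NetBIOS name validator.

```python
import re

# Conservative pattern for a NetBIOS-style short name: no dots,
# at most 15 characters. This is an assumption for illustration,
# not a full implementation of the NetBIOS naming rules.
_SHORT_NAME = re.compile(r"[A-Za-z0-9-]{1,15}")


def is_short_name(name: str) -> bool:
    """Return True if `name` looks like a short name without a domain suffix."""
    return _SHORT_NAME.fullmatch(name) is not None


def check_names(names):
    """Return the names that should be re-registered with their short form."""
    return [n for n in names if not is_short_name(n)]


if __name__ == "__main__":
    # Hypothetical inventory: the names as they appear in the DataCore GUI.
    registered = ["DCS01", "DCS02.demo.lab", "VVBR8P"]
    for bad in check_names(registered):
        print(f"{bad}: use the short name {bad.split('.')[0]} instead")
```

You would have to type in the names yourself as they show up in the DataCore GUI; the sketch does not query DataCore or Windows in any way.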

That's it. Quite simple if you know where to look. Unfortunately neither Veeam support nor DataCore support could really help in this case. Believe me, we tried several times. Both had no idea how to combine the two products. Sometimes I even got definitely wrong information from them, so I hope I can help you with this article.

A last word about the FQDN/short name problem mentioned above. I mentioned that the DCS were initially joined to a domain. That's the reason why both nodes were displayed with their FQDN in the DC GUI as well as with their short name.

VSS07

Whenever I tried to do a backup with these settings, the snapshot of the vDisk within DataCore was created but Veeam/DataCore failed to mount the snapshot to the backup server. The error was "Unable to create snapshot DataCore VSS Hardware Provider on Hyper-V Off-Host Backup Proxy (mode: Crash consistent). Details: Failed to import snapshot: Import snapshot task has failed. No shadow copies were successfully imported."

This error is caused by the VSS provider on the backup server. By way of explanation: the VSS provider on the Hyper-V server is responsible for creating and deleting the snapshot of the CSV, while the VSS provider on the backup server is responsible for mapping the snapshot to the backup server. That's why both (or rather ALL Hyper-V and backup servers) need the DataCore VSS provider installed and configured to use a user account as mentioned above.

Back to the problem. Whenever the backup server tried to connect to the DCS nodes (it will use the one you set as preferred node for the backup server in the DC GUI) to issue the mapping command, it searched for the node with the FQDN DCS02.demo.lab.

Snippet of the VSS log at C:\Program Files\DataCore\Host Integration Kit\VSSlog.txt:

[08.12.2015 14:47 opened]
Info:    [14:47:19.467] OnLoad called...
Info:    [14:47:19.529] LocateLuns enter...
Info:    [14:47:19.529]    Calling from importing.
Info:    [14:47:19.529]       DeviveIdDescriptor.Identifiers.Identifier = DCS02.demo.lab,
Info:    [14:47:19.529]       DeviveIdDescriptor.Identifiers.Identifier = 4b6d6c9d-9b6f-4d6f-bf28-862afe6e83ae,
Info:    [14:47:19.529]    Connecting to server DCS02.demo.lab...
Info:    [14:47:22.170]    Connected to server DCS02.demo.lab successfully.
Error:   [14:47:22.279] System.Exception: Host DCS02.demo.lab is not found.
Error:   [14:47:22.279] System.Exception: Host DCS02.demo.lab is not found.
   at DataCore.VSSHardwareProvider.VssServiceAgent.GetServerData(String name)
   at DataCore.VSSHardwareProvider.VssSnapshot.Init(_VDS_LUN_INFORMATION lunInfo)
   at DataCore.VSSHardwareProvider.VssSnapshot..ctor(VssServiceAgent vssAgent, _VDS_LUN_INFORMATION lunInfo)
   at DataCore.VSSHardwareProvider.HWProvider.LocateLuns(Int32 lLunCount, _VDS_LUN_INFORMATION[] luns)
Error:   [14:47:22.295] Failed to locate LUNS : Host DCS02.demo.lab is not found.
Info:    [14:50:22.446] OnUnLoad called...
Info:    [14:50:22.446]    DeleteAbortedSnapshots enter...
Info:    [14:50:22.446]       snapshot set id 00000000-0000-0000-0000-000000000000 is deleted.
Info:    [14:50:22.446]    DeleteAbortedSnapshots exit...
[14:50:22.461] Logger closed.

I don't know why, and DataCore support couldn't tell me either, but it tries to use the FQDN while the node is only known by its short name (see picture above, computer name is FQDN, node name is DCS01, the short name). As you can't change the computer name of a DC node within the GUI, I dis-joined them from the domain one after the other (this isn't a problem at all: simply stop the virtualization, disjoin the computer, reboot and start the virtualization). After both had rebooted I saw them displayed with their short name (see screenshot at step 2 above). Suddenly the mapping process worked and the backup completed successfully. One more reason not to join a DCS node to a domain.

The same problem, by the way, will occur if you have your backup server added to the DC GUI with the FQDN. This name can be changed at runtime, and DC support first asked me to change the host registration from short name to FQDN. When you do this you will see this error in VSSlog:

 

Info:    [15:43:18.302] OnLoad called...
Info:    [15:43:18.334] LocateLuns enter...
Info:    [15:43:18.334]    Calling from importing.
Info:    [15:43:18.334]       DeviveIdDescriptor.Identifiers.Identifier = DCS02,
Info:    [15:43:18.334]       DeviveIdDescriptor.Identifiers.Identifier = 4b6d6c9d-9b6f-4d6f-bf28-862afe6e83ae,
Info:    [15:43:18.334]    Connecting to server DCS02...
Info:    [15:43:20.208]    Connected to server DCS02 successfully.
Info:    [15:43:20.818]    Serving snapshot HVTestDisk @ 15.12.2015 15:41:44 (a97d0bf93650475c96c64c8a662779bc) to VVBR8P
Error:   [15:43:20.990] System.Exception: Host VVBR8P is not registered.
Error:   [15:43:20.990] System.Exception: Host VVBR8P is not registered.
   at DataCore.VSSHardwareProvider.VssServiceAgent.ServeVirtualDisk(String clienHostName, String virtualDiskId, String token)
   at DataCore.VSSHardwareProvider.HWProvider.LocateLuns(Int32 lLunCount, _VDS_LUN_INFORMATION[] luns)
Error:   [15:43:21.005] Failed to locate LUNS : Host VVBR8P is not registered.

 

This time the VSS provider again searches for the short name of the system and not the FQDN. So changing the host registration back to short names solved the problem.
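
Both failures leave a recognizable line in VSSlog.txt, so a few lines of script can tell you which of the two you are looking at. The following is a minimal sketch that relies only on the log format shown in the two excerpts above ("Error:" lines containing "Host <name> is not found." or "Host <name> is not registered."). The function name and the hint texts are my own; the hints just restate the fixes described in this article and are not an official DataCore diagnosis.

```python
import re

# An error line looks like: "Error:   [14:47:22.279] <message>"
ERROR_LINE = re.compile(r"^Error:\s+\[(?P<time>[\d:.]+)\]\s+(?P<msg>.*)$")
# The two messages seen in the excerpts above.
HOST_ERROR = re.compile(r"Host (?P<host>\S+) is (?P<kind>not found|not registered)\.")

# Hints restating the fixes from this article (illustrative wording).
HINTS = {
    "not found": "the DCS node is looked up under a name it is not known by; "
                 "make sure the node only has a short name (not domain-joined)",
    "not registered": "the backup server is looked up by its short name; "
                      "register the host in the DC GUI without the FQDN",
}


def diagnose(log_text):
    """Return unique (host, kind) pairs found in the error lines, in order."""
    findings = []
    for line in log_text.splitlines():
        error = ERROR_LINE.match(line.strip())
        if not error:
            continue  # Info lines and stack trace lines are skipped
        host_error = HOST_ERROR.search(error.group("msg"))
        if host_error:
            pair = (host_error.group("host"), host_error.group("kind"))
            if pair not in findings:  # the log repeats each error
                findings.append(pair)
    return findings


if __name__ == "__main__":
    sample = (
        "Info:    [14:47:22.170]    Connected to server DCS02.demo.lab successfully.\n"
        "Error:   [14:47:22.279] System.Exception: Host DCS02.demo.lab is not found.\n"
        "Error:   [14:47:22.295] Failed to locate LUNS : Host DCS02.demo.lab is not found.\n"
    )
    for host, kind in diagnose(sample):
        print(f"{host} is {kind}: {HINTS[kind]}")
```

To use it on a real system, read the log file into a string and pass it to diagnose(); anything that doesn't match one of the two messages is ignored, so an empty result does not mean the log is clean.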


People in this conversation

  • Guest - Jany

    Hello, Do you have this scenario at a productive Datacenter? I configure this at our Datacenter and the Datacore is making very Big Snapshots and after big data are writen to this Snapshot. When the Snapshot process is done, the releasing from Snapshot is making some lun Reclemation. Do you have this problem to? Do you have to set 1 Snapshot per Lun to at the Veeam Manage Volumes because failures during Backups? Kind regards Roger

  • Guest - Oliver Krehan

    Hi Jany,

    no, at the moment this setup is only in our lab but we will proceed to implement it at our datacenter within the next few weeks. Curious where the big snapshots came from because there shouldn't be more traffic to the vdisk as in normal situations. Is there an I/O intensive application working? We often see large snapshots on database servers where someone runs big batch jobs over the night where normally the backup happens.

    So to help you I need a bit more information.

    Regards,
    Oliver
