VMware CBT bug - additional information

Published: Wednesday, 12 November 2014 21:12

Update 12/05/2014

Veeam Support offers a tiny little script that resets CBT on specific VMs in a very convenient way. You can download the 7zipped file here.

The file includes a PowerShell script that has to be run inside the VMware PowerShell (PowerCLI). The syntax is: resetCBT.ps1 -vcenter example.vcenter.local -vms "vmname1,vmname2,vmname3,vmname4". The script first checks for virtual disks attached to the VMs listed, then disables CBT, creates a snapshot, deletes the snapshot immediately and enables CBT again. The output comes in a very user-friendly layout.
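To make the order of operations easier to follow, here is a minimal Python sketch of the four steps described above. It is not the Veeam script and it does not talk to a vCenter: the VmClient interface, the RecordingClient fake, the function names and the snapshot name "cbt-reset" are all hypothetical and exist only for illustration.

```python
from typing import Dict, List, Protocol, Tuple


class VmClient(Protocol):
    """Hypothetical interface standing in for whatever API talks to vCenter."""

    def list_disks(self, vm: str) -> List[str]: ...
    def set_cbt(self, vm: str, enabled: bool) -> None: ...
    def create_snapshot(self, vm: str, name: str) -> None: ...
    def delete_snapshot(self, vm: str, name: str) -> None: ...


def parse_vm_list(vms: str) -> List[str]:
    """Split a comma-separated list of VM names, dropping blanks and spaces."""
    return [name.strip() for name in vms.split(",") if name.strip()]


def reset_cbt(client: VmClient, vms: str, snapshot_name: str = "cbt-reset") -> List[str]:
    """Apply the four steps to each VM and return the VMs that were processed."""
    processed = []
    for vm in parse_vm_list(vms):
        if not client.list_disks(vm):
            continue  # no virtual disks attached, nothing to reset
        client.set_cbt(vm, False)                   # 1. disable CBT
        client.create_snapshot(vm, snapshot_name)   # 2. create a snapshot
        client.delete_snapshot(vm, snapshot_name)   # 3. delete it immediately
        client.set_cbt(vm, True)                    # 4. enable CBT again
        processed.append(vm)
    return processed


class RecordingClient:
    """Fake client that only records calls, so the sketch runs anywhere."""

    def __init__(self, disks: Dict[str, List[str]]) -> None:
        self.disks = disks
        self.calls: List[Tuple[str, str, object]] = []

    def list_disks(self, vm: str) -> List[str]:
        return self.disks.get(vm, [])

    def set_cbt(self, vm: str, enabled: bool) -> None:
        self.calls.append(("set_cbt", vm, enabled))

    def create_snapshot(self, vm: str, name: str) -> None:
        self.calls.append(("create_snapshot", vm, name))

    def delete_snapshot(self, vm: str, name: str) -> None:
        self.calls.append(("delete_snapshot", vm, name))


if __name__ == "__main__":
    fake = RecordingClient({"vmname1": ["Hard disk 1"], "vmname2": []})
    print(reset_cbt(fake, "vmname1,vmname2"))
    for call in fake.calls:
        print(call)
```

The fake client makes the sequence visible without touching a real environment. Against production VMs, use the script from Veeam Support instead.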

Update 11/13/2014

Things seem to be even worse. Initially it was assumed that the bug only affects disks that were resized beyond the 128 GB boundary. This is no longer correct. Any disk that has ever been enlarged, by any amount, can be hit by this bug. So far only extensions larger than 50 GB have been seen to be affected, but this value is not a hard limit, so smaller extensions could be affected too. As no official size has been given, you have to assume the worst case, so it is essential to reset CBT on ALL VMs and ALL disks. If you are absolutely sure that you have never extended any disk then you're fine, but can you really be sure? This bug has existed since vSphere 4.x, which dates from around 2009. I simply can't remember every single action on every VM I've ever touched.

Original article from 11/12/2014

As published a few days ago, VMware released a technical bulletin regarding a bug in the Changed Block Tracking (CBT) feature. The KB article from VMware lacks the most important information and only shows a workaround that users have to implement on their own.

Veeam Support, which helped VMware to identify and investigate the bug, also published some information in its own KB, but that article lacked most of the important information too.

Veeam has now updated its KB article, available at http://www.veeam.com/kb1940, and included some scripting to disable and re-enable CBT on VMs, which is the only workaround currently available.

You had better follow this article, as VMware seems to need a lot more time to publish a bug fix. Even with the fix, only new problems will be avoided; existing ones will remain.

If you look at the Veeam KB article you will find a hint at the bottom to use SureBackup to detect issues with your VMs. DO NOT RELY ON THIS HINT. SureBackup can only detect errors on system disks that have been expanded. If these disks are affected by the CBT bug, SureBackup will detect it because the VM won't boot. If a VM checked by SureBackup has data disks attached but its OS disk is NOT affected by this bug (most of your VMs will fall into this category, as the system disk is rather static and is rarely enlarged over the VM's lifetime), the VM will boot up perfectly and SureBackup will tell you everything is okay. This is because, with the default settings, SureBackup only pings the VM and checks for the VMware Tools heartbeat. These checks will report OK no matter whether the data on the data disk is corrupted or not. If you want SureBackup to check data disks, you have to write some scripts and add them to your SureBackup job.
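As an illustration of what such a data disk check could look like, here is a minimal Python sketch that compares files against a previously recorded SHA-256 manifest. Everything in it is an assumption made for this example: the script name, the JSON manifest format, the function names, and the idea that a non-zero exit code is what marks the check as failed. How the script is wired into a SureBackup job, where it runs and how it reaches the restored VM, is not covered here.

```python
import hashlib
import json
import sys
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map every file below root (as a relative path) to its SHA-256 hash."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify(root: Path, manifest: dict) -> list:
    """Return the relative paths that are missing or whose content differs."""
    problems = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(rel)
    return problems


def main(argv: list) -> int:
    # Hypothetical usage: check_data_disk.py <data-root> <manifest.json>
    if len(argv) != 3:
        print("usage: check_data_disk.py <data-root> <manifest.json>")
        return 2
    root, manifest_path = Path(argv[1]), Path(argv[2])
    manifest = json.loads(manifest_path.read_text())
    problems = verify(root, manifest)
    for rel in problems:
        print(f"MISMATCH: {rel}")
    return 1 if problems else 0  # assumption: non-zero marks the check as failed


if __name__ == "__main__":
    sys.exit(main(sys.argv))
```

A check like this only makes sense for files that are not supposed to change between recording the manifest (build_manifest) and taking the backup, for example a few static reference files placed on each data disk. It also only covers the files listed in the manifest, so a clean result proves nothing about the rest of the disk.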

The Veeam forum has a very good thread on this issue where affected users post their experience, as it's currently not 100% clear what triggers this bug and how to avoid it. Check scripts should also be available there. Here is the link: VMware CBT bug KB 2090639

Remember to check this forum thread regularly so you stay up to date on what you can do and which VMs are affected.