HP StoreVirtual manager requirements

During the last two years I have installed several HP StoreVirtual (formerly LeftHand) P4000 storage systems. These systems are quite easy to set up and manage, and they provide very decent performance. The only really annoying thing is the misleading information about how many regular managers, virtual managers, or failover managers are required to tolerate system outages.

I checked all the papers I could find about this, spoke to HP storage employees, and also ran some tests of my own. Finally I can give you some information on how to set up your StoreVirtual environment to keep it fault tolerant.

First, a bit of background information. StoreVirtual systems operate in clusters: two or more systems are "bundled" together to pool their storage resources. You define fault tolerance at the volume level, just as you would with a RAID controller. If you only have two storage systems in your cluster, you can choose between NRAID0 (no fault tolerance at all) and NRAID10, which works like RAID1: with NRAID10 you have two copies of each block. If you add a third node to the cluster, you can also choose NRAID5 (two copies plus parity information) or NRAID10+1, which keeps three copies of all data. Adding a fourth node gives you the option of NRAID10+2, which keeps four copies of all data.
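To keep the NRAID options straight, here is a small illustrative sketch (my own, not an HP tool) that maps each level described above to the minimum cluster size it needs:

```python
# The Network RAID levels as described in the text, with the number of
# data copies and the minimum cluster size each level requires.
# Names and structure are illustrative, not an HP API.
NRAID_LEVELS = {
    "NRAID0":    {"copies": 1, "min_nodes": 1},  # no fault tolerance at all
    "NRAID10":   {"copies": 2, "min_nodes": 2},  # like RAID1: two copies
    "NRAID5":    {"copies": 2, "min_nodes": 3},  # copies plus parity (per the text)
    "NRAID10+1": {"copies": 3, "min_nodes": 3},  # three copies of all data
    "NRAID10+2": {"copies": 4, "min_nodes": 4},  # four copies of all data
}

def available_levels(cluster_nodes: int) -> list[str]:
    """Return the NRAID levels a cluster of this size can use."""
    return [name for name, spec in NRAID_LEVELS.items()
            if cluster_nodes >= spec["min_nodes"]]

print(available_levels(2))  # ['NRAID0', 'NRAID10']
```

So a two-node cluster really only has one fault-tolerant choice (NRAID10); the richer levels only become available as you grow the cluster.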

Each node runs a so-called regular manager, a service that handles the communication with the other nodes in the management group. You need these managers to form a quorum. Each cluster has its own quorum that is used to avoid split-brain scenarios in the event of a failure. Besides regular managers there are also virtual managers and failover managers; the difference between these manager types is the way they work. A regular manager runs on the storage node itself and is normally active all the time. A virtual manager is an on-demand manager that you start manually when you need it; virtual managers are essentially only used in two-node clusters where you have no way to add a FOM. A failover manager (FOM) is a special system that runs part of the LeftHand OS but only acts as an additional manager. A FOM can never be used to serve storage.

Why do you need these managers? To avoid split-brain scenarios. The managers are responsible for deciding which storage systems are still alive and can act as storage resources, and which are faulty and must be fenced off. Best practice is to run any combination of the three manager types above, as long as the total number of running managers is odd. So running 4 regular managers plus a FOM is fine; running 3 regular managers plus a FOM is not. You should always prefer regular managers over FOMs, and FOMs over virtual managers.
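The rule behind all of this is simple majority voting. Here is a hedged sketch (the function name and structure are mine, not HP's) that shows why an odd manager count is preferred:

```python
# Quorum survives only if a strict majority of managers is still running.
def has_quorum(total_managers: int, failed_managers: int) -> bool:
    """True if the surviving managers still form a strict majority."""
    running = total_managers - failed_managers
    return running > total_managers / 2

# With 5 managers (e.g. 4 regular + 1 FOM), losing 2 still leaves quorum:
print(has_quorum(5, 2))  # True
# With 4 managers, losing 2 leaves exactly half -- no majority, no quorum:
print(has_quorum(4, 2))  # False
```

This is why an even number of managers buys you nothing: 4 managers tolerate no more failures than 3, they only add another vote that can be lost.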

One or more clusters are organized within a management group. At the management group level you define the servers that connect to the storage, as well as things like alerting, NTP and so on. One important thing to know: if you have two or more clusters within a management group, each cluster can act as a witness (manager) for the other clusters. For example, if you have two two-node clusters, you can use all of their managers together, giving you four regular managers. Note that you can only have one single FOM per management group; adding additional FOMs is not supported.

So here is my list of manager combinations for several setups and requirements:

  • one two-node cluster: run a regular manager on each node, plus a FOM if you need automatic failover. If you can tolerate a manual intervention during an outage, you can use a virtual manager instead of a FOM, but then you have to start the virtual manager by hand and decide yourself which node stays accessible (not recommended)
  • one four-node cluster, or two two-node clusters within the same management group and the same datacenter: either run regular managers on three of the nodes and manually stop the fourth, or run four regular managers plus a FOM. With only three managers your clusters stay accessible as long as you lose no more than a single node, or during updates, where the CMC takes care of which node can be safely restarted. With a FOM you can tolerate one node failure per cluster.
  • one four-node cluster, or two two-node clusters within the same management group, spread evenly across two datacenters: this is what HP calls a multi-site SAN. A multi-site SAN takes care of data placement depending on the NRAID level you choose. For example, with NRAID10 one copy is placed in DC1 and the other in DC2; without this site awareness, both copies could end up in the same DC. Running a multi-site SAN requires a FOM.
  • six nodes in a 2/4-node cluster setup within the same management group and datacenter: you don't need a FOM; simply start regular managers on all nodes.
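The setups above can be compared with a short self-contained sketch. The majority rule and the manager counts come from the text; the function and setup names are my own illustration:

```python
def failures_tolerated(total_managers: int) -> int:
    """Largest number of manager failures that still leaves a strict majority."""
    return (total_managers - 1) // 2

# Manager counts for the setups listed above (illustrative labels).
setups = {
    "2-node cluster + FOM (3 managers)":        3,
    "4-node cluster, 3 active managers":        3,
    "4-node cluster + FOM (5 managers)":        5,
    "2/4-node clusters, 6 regular managers":    6,
}

for name, managers in setups.items():
    print(f"{name}: tolerates {failures_tolerated(managers)} manager failure(s)")
```

Note that 6 managers tolerate only 2 failures, exactly as many as 5 managers would, which again shows why odd manager counts are the sweet spot.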

Why distinguish between running all nodes in the same DC and spreading them across two or more DCs? If all nodes run in the same DC and that DC suffers a power outage, all systems go down, and a FOM brings no benefit in that setup. So the main problems your fault-tolerant setup has to address there are ordinary hardware failures that take down a single node, and maintenance work, where you can plan the downtime of each node.

In a two-DC setup, a power outage in one DC takes down all nodes in that DC, but the nodes in the second DC can keep working. In this scenario it is very important to have an automatic failover mechanism in place, and that is only possible with a FOM installed.

2019  v-strange.de