vSphere Clustering Service (vCLS) is a new capability introduced in the vSphere 7 Update 1 release. Its first release provides the foundation for a decoupled and distributed control plane for clustering services in vSphere.

The basic architecture for the vCLS control plane consists of a maximum of 3 virtual machines (VMs), also referred to as system or agent VMs, which are placed on separate hosts in a cluster. These are lightweight agent VMs that form a cluster quorum. On smaller clusters with fewer than 3 hosts, the number of agent VMs is equal to the number of ESXi hosts. The agent VMs are managed by vSphere Cluster Services. Users are not expected to maintain the life-cycle or state of the agent VMs, and they should not be treated like typical workload VMs.
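The sizing rule above (at most 3 agent VMs, and one per host on clusters with fewer than 3 hosts) can be written as a one-line calculation. This is only an illustrative sketch: the function name `expected_agent_vm_count` and its parameters are hypothetical and are not part of any vSphere API.

```python
def expected_agent_vm_count(host_count: int, max_agents: int = 3) -> int:
    """Return how many vCLS agent VMs the rule above implies for a cluster.

    host_count: number of ESXi hosts in the cluster.
    max_agents: upper bound on agent VMs (3, per the text above).
    """
    if host_count < 0:
        raise ValueError("host_count must be non-negative")
    # One agent VM per host, capped at the maximum of 3.
    return min(host_count, max_agents)


if __name__ == "__main__":
    for hosts in (1, 2, 3, 16):
        print(hosts, "hosts ->", expected_agent_vm_count(hosts), "agent VMs")
```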

Cluster Service Health

The agent VMs that form the cluster quorum are self-correcting. This means that when the agent VMs are not available, vCLS will try to instantiate or power on the VMs automatically.

There are 3 health states for the cluster services:

  • Healthy – The vCLS health is green when at least 1 agent VM is running in the cluster. To maintain agent VM availability, there’s a cluster quorum of 3 agent VMs deployed.
  • Degraded – This is a transient state in which at least 1 of the agent VMs is not available but DRS has not skipped its logic due to the unavailability of agent VMs. The cluster can be in this state while vCLS VMs are being re-deployed or powered on after some impact to the running VMs.
  • Unhealthy – The vCLS state is unhealthy when the next run of the DRS logic (a workload placement or balancing operation) is skipped because the vCLS control plane, which needs at least 1 agent VM, is not available.
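To make the three states easier to compare, here is a small sketch that maps two observations to a state. It is not how vCenter Server computes health. The names `VclsHealth` and `classify_vcls_health` are hypothetical, and because the Healthy and Degraded descriptions overlap, the order of the checks (skipped DRS run first, then "at least 1 agent VM running") is an assumption made for this example.

```python
from enum import Enum


class VclsHealth(Enum):
    HEALTHY = "healthy"
    DEGRADED = "degraded"
    UNHEALTHY = "unhealthy"


def classify_vcls_health(running_agent_vms: int, drs_run_skipped: bool) -> VclsHealth:
    """Map two observations to one of the three states described above.

    Assumption: a skipped DRS run takes precedence, then the
    'at least 1 agent VM running' rule, and everything else is treated
    as the transient degraded state.
    """
    if drs_run_skipped:
        # DRS skipped its logic because the control plane was unavailable.
        return VclsHealth.UNHEALTHY
    if running_agent_vms >= 1:
        # At least 1 agent VM is running in the cluster.
        return VclsHealth.HEALTHY
    # No agent VM is running yet, but DRS has not skipped a run.
    return VclsHealth.DEGRADED


if __name__ == "__main__":
    print(classify_vcls_health(running_agent_vms=2, drs_run_skipped=False))
    print(classify_vcls_health(running_agent_vms=0, drs_run_skipped=False))
    print(classify_vcls_health(running_agent_vms=0, drs_run_skipped=True))
```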

In the vSphere Client, open the view where you can see all the VMs; you’ll find a new folder called vCLS that contains the vCLS VMs. You should not rename the vCLS folder or the vCLS VM(s).

Automation and vCLS

For customers using scripts to automate tasks, it’s important to build in awareness of the agent VMs so that, for example, clean-up scripts that delete stale VMs ignore them. The vCLS agent VMs are quick to identify in the vSphere Client, where they are listed in the vCLS folder. The VMs tab under Administration > vCenter Server Extensions > vSphere ESX Agent Manager also lists the agent VMs from all clusters managed by that vCenter Server instance.

Every agent VM has additional properties so that it can be ignored by specific automated tasks. These properties can also be found using the Managed Object Browser (MOB). The specific properties include:

    ManagedByInfo
        extensionKey == “com.vmware.vim.eam”
        type == “cluster-agent”

    ExtraConfig keys
        “eam.agent.ovfPackageUrl”
        “eam.agent.agencyMoId”
        “eam.agent.agentMoId”

vCLS Agent VMs have an additional data property key “HDCS.agent” set to “true”. This property is automatically pushed down to the ESXi host along with the other VM ExtraConfig properties explicitly.
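As a rough illustration of how a clean-up script could use these properties, the sketch below filters agent VMs out of a list of candidates. It works on plain Python records rather than a live vCenter connection: `VmRecord`, `is_vcls_agent` and `deletable_vms` are hypothetical names, and in a real script the fields would have to be filled from the VM's ManagedByInfo and ExtraConfig data. Treating any one of the markers as sufficient is a deliberately cautious assumption, so the filter may also skip other VMs that carry the same keys.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Values taken from the property list above.
EAM_EXTENSION_KEY = "com.vmware.vim.eam"
CLUSTER_AGENT_TYPE = "cluster-agent"
EAM_EXTRA_CONFIG_KEYS = {
    "eam.agent.ovfPackageUrl",
    "eam.agent.agencyMoId",
    "eam.agent.agentMoId",
}


@dataclass
class VmRecord:
    """Hypothetical, simplified view of a VM for this example."""
    name: str
    managed_by_extension_key: Optional[str] = None
    managed_by_type: Optional[str] = None
    extra_config: Dict[str, str] = field(default_factory=dict)


def is_vcls_agent(vm: VmRecord) -> bool:
    """Return True if the VM carries any of the agent markers listed above."""
    managed_by_eam = (
        vm.managed_by_extension_key == EAM_EXTENSION_KEY
        and vm.managed_by_type == CLUSTER_AGENT_TYPE
    )
    has_eam_keys = any(key in vm.extra_config for key in EAM_EXTRA_CONFIG_KEYS)
    has_hdcs_flag = vm.extra_config.get("HDCS.agent") == "true"
    # Assumption: be conservative and skip the VM if any marker matches.
    return managed_by_eam or has_eam_keys or has_hdcs_flag


def deletable_vms(vms: List[VmRecord]) -> List[VmRecord]:
    """Keep only the VMs a clean-up script may consider; agent VMs are dropped."""
    return [vm for vm in vms if not is_vcls_agent(vm)]


if __name__ == "__main__":
    candidates = [
        VmRecord("stale-test-vm"),
        VmRecord("vCLS (1)", EAM_EXTENSION_KEY, CLUSTER_AGENT_TYPE),
        VmRecord("vCLS (2)", extra_config={"HDCS.agent": "true"}),
    ]
    print([vm.name for vm in deletable_vms(candidates)])
```

The same check could be applied before any other bulk operation (power off, migrate, rename) that should leave the agent VMs alone.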

VMware vSphere Cluster Service is responsible for maintaining DRS operations in the event of vCenter Server unavailability. More services will be added in future releases. I imagine that vSphere would be capable of managing not only vSphere services, but probably also some networking, storage, or application services.

I hope this helps 🙂