Recently, I was working on NSX-T v2.5.1 and came across this problem.
This alert came up every time I went to NSX-T UI -> System -> Transport Nodes -> Host to display the Transport Node status for one of the hosts. This specific host displayed 9 tunnels instead of the 11 shown on the others, which is fine because there was no workload traffic on that node.
When the alert triggered, I found corresponding entries in the manager logs:
Fails to get TunnelInfo filtered by <node_uuid> and throws exception com.vmware.nsx.management.messaging.exceptions.MessagingException: Stub is not available for client <node_uuid>, application AggSvc
At the same time, on the transport node (nsx-syslog.*), I noticed:
comp="nsx-esx" subcomp="nsx-proxy"…RemoteService[vmware.nsx.agg_service.messaging.AggSvcHostService] Failed to resolve service: 6-No such device or address
I am sure such issues can be resolved with the two options below; however, I consider these a last resort.
- Put the host in Maintenance Mode and reboot it
- If the problem persists, un-prepare and re-prepare the host for NSX-T
I explored some APIs to get more information and found two useful ones.
The second one returns a high-level summary of a transport node. As the name suggests, pnic-bond-status shows whether the bond between the PNICs and the NSX-T N-VDS is UP. Its output is interesting in our case.
For the affected host, the pnic-bond-status API returned the response below:
"details": "Stub is not available for client <node_uuid>, application AggSvc",
"error_message": "General error has occurred."
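For anyone who wants to repeat this check, here is a minimal sketch of how the request URL could be built. The path /api/v1/transport-nodes/<node-id>/pnic-bond-status is my assumption based on the usual NSX-T REST API layout, and pnic_bond_status_url, the manager hostname and the credentials are hypothetical placeholders, so please verify against the API guide for your version.

```shell
# Hypothetical helper: build the pnic-bond-status URL for a transport node.
# $1 = NSX-T Manager hostname or IP, $2 = transport node UUID.
# ASSUMPTION: the endpoint path below matches your NSX-T version.
pnic_bond_status_url() {
  printf 'https://%s/api/v1/transport-nodes/%s/pnic-bond-status\n' "$1" "$2"
}

# Example call (not executed here; replace the placeholders first):
#   curl -k -u admin "$(pnic_bond_status_url nsx-mgr.example.com <node_uuid>)"
```

An error body like the one above, instead of a bond status, suggests the manager cannot reach the aggregation service stub for that node.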
I listed the host's TCP connections with the command below. On the affected host, nsx-exporter had 1 CLOSED and 2 ESTABLISHED connections, whereas on a working host all 3 were in the ESTABLISHED state.
esxcli network ip connection list | grep '^tcp' | grep 'nsx-exporter'
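To compare hosts quickly, the same output can be tallied by connection state. This is only a sketch: count_exporter_states is a hypothetical helper name, and it assumes each connection line carries the TCP state as a standalone uppercase word, as the esxcli output did in my case.

```shell
# Hypothetical helper: read `esxcli network ip connection list` output on
# stdin and print one "<STATE> <count>" line per TCP state seen for
# nsx-exporter connections.
count_exporter_states() {
  grep '^tcp' | grep 'nsx-exporter' | awk '
    {
      # Scan every field so the exact column position does not matter.
      for (i = 1; i <= NF; i++)
        if ($i ~ /^(ESTABLISHED|CLOSED|CLOSE_WAIT|SYN_SENT|TIME_WAIT|FIN_WAIT_1|FIN_WAIT_2|LAST_ACK|LISTEN)$/)
          count[$i]++
    }
    END { for (s in count) print s, count[s] }' | sort
}

# Example call on the host (not executed here):
#   esxcli network ip connection list | count_exporter_states
```

On a healthy host I would expect a single ESTABLISHED line; any other state showing up is worth a closer look.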
nsx-exporter is responsible for keeping the managers up to date.
As restarting it doesn't impact the data plane, I went ahead and restarted the nsx-exporter service on the affected host.
After the restart, all 3 connections showed up in the ESTABLISHED state and the alert in the UI went away.
I hope this article helps you if you face similar issues. 🙂