This is the second part of the article. Read the first part here.
In the first part we established why it is important to monitor the resource utilization of workloads deployed in a public cloud and hinted at how some hypervisor metrics can be obtained. In this second part I want to present an example architecture that shows how this can be achieved.
Architecture
The above diagram describes the whole setup. I am monitoring three workloads (VMs) deployed in a public cloud, in this case vCloud Hybrid Service. I have also deployed vFabric Hyperic into the same cloud, where it collects the workload metrics through Hyperic agents. vCenter Operations Manager, which is installed on premises, then collects the metrics from the Hyperic database and displays them in custom dashboards while doing its own magic (super metrics, dynamic thresholding and analytics).
I have also created a custom Hyperic plugin which collects Windows VM performance metrics through the VMware Tools Guest SDK.
Deployment and Configuration
vFabric Hyperic Deployment
- Create an Org VDC network in the public cloud for the Hyperic deployment.
- Upload the vFabric Hyperic 5.7.1 Server and vFabric Hyperic 5.7.1 DB installations in OVF format to a public cloud catalog.
- Deploy Hyperic DB to the Org VDC network first and note the IP address it has been allocated.
- Deploy Hyperic Server and enter the Hyperic DB IP address when asked.
- Install the Hyperic Agent on all VMs that are going to be monitored.
vFabric Hyperic Configuration
- Connect to the Hyperic GUI (http://<hyperic server IP>:7080). As I used a jumpbox in the public cloud I did not need to open the port to the internet; otherwise create the appropriate DNAT and firewall rules to reach the server from outside.
- Go to Administration > Plugin Manager, click the Add/Update Plugin(s) button and upload the vmPerfMon-plugin.xml custom plugin:
  <plugin>
    <server name="VM Performance Counters" platforms="Win32">
      <plugin type="measurement" class="org.hyperic.hq.product.MeasurementPlugin"/>

      <!-- You always need availability metric, so just pick some service -->
      <metric name="Availability" template="win32:Service=Eventlog:Availability" indicator="true"/>

      <!-- Template filter is passed to metrics -->
      <filter name="template" value="win32:Object=${object}:${alias}"/>

      <!-- Using object filter to reduce amount of xml -->
      <filter name="object" value="VM Memory"/>
      <metric name="Memory Reservation" alias="Memory Reservation in MB" units="MB"/>
      <metric name="Memory Limit" alias="Memory Limit in MB" units="MB"/>
      <metric name="Memory Shares" alias="Memory Shares"/>
      <metric name="Memory Active" alias="Memory Active in MB" units="MB"/>
      <metric name="Memory Ballooned" alias="Memory Ballooned in MB" units="MB"/>
      <metric name="Memory Swapped" alias="Memory Swapped in MB" units="MB"/>

      <!-- Win perf object is changed, using new one -->
      <filter name="object" value="VM Processor"/>
      <!-- Processor object needs instance information to access -->
      <filter name="instance" value="_Total"/>
      <!-- Giving new template since we now need instance info -->
      <filter name="template" value="win32:Object=${object},Instance=${instance}:${alias}"/>
      <metric name="CPU Reservation in MHz" alias="Reservation in MHz"/>
      <metric name="CPU Limit in MHz" alias="Limit in MHz"/>
      <metric name="CPU Shares" alias="Shares"/>
      <metric name="Effective VM Speed in MHz" alias="Effective VM Speed in MHz" indicator="true"/>
      <metric name="Host processor speed in MHz" alias="Host processor speed in MHz"/>
    </server>
  </plugin>
- The plugin should be automatically distributed to the workloads through the agents. However, we still need to configure it. In the Resources tab browse to the workload and select New Server from the Tools Menu. Type a name, find Windows VM Performance Counters in the Server Type drop-down and type something in the Install path field.
- After clicking OK, immediately click Configuration Properties and check Auto-discover services.
- To start collecting data we need to configure the collection interval for the metrics. Go to Monitor > Metric Data and click the Show All Metrics button. Select the metrics, enter the collection interval at the bottom and submit.
- Now that we are collecting data we could create indicators, alerts, etc. However, Hyperic is just a collection engine for us; we will feed its data to vCenter Operations Manager.
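To make the filter mechanics of the plugin descriptor easier to follow, here is a minimal Python sketch of how the `${object}`, `${instance}` and `${alias}` placeholders end up in each metric template. It only mimics the substitution as described by the comments in the plugin itself and is not Hyperic's actual implementation; the function name `expand_templates` and the shortened descriptor are illustrative assumptions.

```python
import re
import xml.etree.ElementTree as ET

# Shortened copy of the plugin descriptor: one memory and one CPU metric.
PLUGIN_XML = """
<plugin>
  <server name="VM Performance Counters" platforms="Win32">
    <filter name="template" value="win32:Object=${object}:${alias}"/>
    <filter name="object" value="VM Memory"/>
    <metric name="Memory Active" alias="Memory Active in MB" units="MB"/>
    <filter name="object" value="VM Processor"/>
    <filter name="instance" value="_Total"/>
    <filter name="template" value="win32:Object=${object},Instance=${instance}:${alias}"/>
    <metric name="CPU Shares" alias="Shares"/>
  </server>
</plugin>
"""


def expand_templates(plugin_xml):
    """Return {metric name: expanded template}.

    Assumption: filters apply in document order, so a later <filter>
    overrides an earlier one for all metrics that follow it.
    """
    filters = {}
    expanded = {}
    server = ET.fromstring(plugin_xml).find("server")
    for element in server:
        if element.tag == "filter":
            filters[element.get("name")] = element.get("value")
        elif element.tag == "metric" and "template" not in element.attrib:
            # The metric's own alias fills the ${alias} placeholder.
            variables = dict(filters, alias=element.get("alias"))
            expanded[element.get("name")] = re.sub(
                r"\$\{(\w+)\}",
                lambda match: variables[match.group(1)],
                filters["template"],
            )
    return expanded


templates = expand_templates(PLUGIN_XML)
for name, template in templates.items():
    print(f"{name}: {template}")
```

With the shortened descriptor this prints `win32:Object=VM Memory:Memory Active in MB` for the memory metric and `win32:Object=VM Processor,Instance=_Total:Shares` for the CPU metric, which shows why the second template filter is needed once an instance has to be addressed.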
vCenter Operations Manager Configuration
- Assuming there is already an existing on-premises installation of vCenter Operations Manager, we need to configure it to collect data from the cloud Hyperic instance. To do this we first need to create a DNAT and firewall rule on the Edge Gateway for the Hyperic DB server (TCP port 5432).
- Now we need to download the Hyperic vCOps Adapter from here.
- To install it, go to the vC Ops admin interface and install it through the Update tab.
- Then go to the vC Ops custom interface and in Admin > Support click the Describe icon.
- Next we need to configure the adapter. Go to Environment > Configuration > Adapter Instances and add a new Hyperic instance for the public cloud. Then configure the (public) IP address of the Hyperic DB, the port and the credentials.
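Before troubleshooting the adapter itself, it can help to confirm that the DNAT and firewall rule really expose the Hyperic DB port. The following is a minimal sketch of such a check, assuming it is run from a machine on the on-premises network; the helper name `tcp_port_open` is hypothetical and not part of any VMware tooling.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False
```

Calling `tcp_port_open("<public IP of Hyperic DB>", 5432)` should return True once the Edge Gateway rules are in place. A False result only says that the TCP handshake failed; it does not distinguish a missing DNAT rule from a blocked firewall.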
Dashboards
When all configuration is finished we can create custom dashboards from the collected metrics in vC Ops. This is, however, out of scope of this article as it depends on the use case.
As an example, above I am showing the real CPU usage of one cloud VM as reported by the hypervisor. The host physical core speed is 2.2 GHz, but the effective VM speed varied between 2.5 and 2.7 GHz (the max turbo frequency of the Intel E5-2660 CPU is 3 GHz). If I looked at the Task Manager inside the Windows guest OS, I would see just a meaningless 100% CPU utilization.
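To put the numbers from this example into perspective, the short sketch below expresses the effective VM speed as a percentage of the host's nominal core speed. The function name `relative_to_nominal` is only illustrative; the input values are the ones quoted above.

```python
def relative_to_nominal(effective_mhz, nominal_mhz):
    """Express effective VM speed as a percentage of the nominal host core speed."""
    return 100.0 * effective_mhz / nominal_mhz


NOMINAL_MHZ = 2200  # host physical core speed from the example (2.2 GHz)

for effective_mhz in (2500, 2700):
    pct = relative_to_nominal(effective_mhz, NOMINAL_MHZ)
    print(f"{effective_mhz} MHz -> {pct:.1f}% of nominal")
```

This prints 113.6% and 122.7%, so while the guest only sees a flat 100%, the hypervisor metric shows the vCPU receiving more than the nominal core speed thanks to turbo.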