How to get a list of hard disks in an ESXi host system?

This is the only way I have found so far: getting the information through CIM.

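# Credentials for the ESXi host (Basic authentication is used below)
$cred = Get-Credential
$esxi = 'myhost'
$CIMOpt = New-CimSessionOption -SkipCACheck -SkipCNCheck -SkipRevocationCheck -Encoding Utf8 -UseSsl
$Session = New-CimSession -Authentication Basic -Credential $cred -ComputerName $esxi -Port 443 -SessionOption $CIMOpt
# Keep only the HP Smart Array extents (this filter is what makes the snippet HP-specific)
Get-CimInstance -CimSession $Session -Namespace 'root/cimv2' -ClassName 'CIM_StorageExtent' | Where-Object { $_.CreationClassName -eq 'HPVC_SAStorageExtent' } | Select-Object PSComputerName, Caption, ElementName
# Close the CIM session when done
Remove-CimSession -CimSession $Session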

In return I get this:

PSComputerName   Caption         ElementName
--------------   -------         -----------
esxi01.local.lan Disk 1 on HPSA1 Disk 1 on HPSA1 : Port 1I Box 1 Bay 1 : 136GB : Data Disk
esxi02.local.lan Disk 2 on HPSA1 Disk 2 on HPSA1 : Port 1I Box 1 Bay 2 : 136GB : Data Disk

In this version it works only for HP hardware, because I filter the output by HPVC_SAStorageExtent.
Well, better this than nothing 😉
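To adapt the filter to other hardware, it may help to first see which CreationClassName values the host actually returns. This is only a sketch: it assumes the CIM session in $Session from the snippet above is still open (run it before the session is closed), and what it returns depends entirely on the CIM providers installed on the host.

```powershell
# Assumes $Session is the CIM session created above and has not been closed yet.
# Group the storage extents by their creation class to see which classes exist.
Get-CimInstance -CimSession $Session -Namespace 'root/cimv2' -ClassName 'CIM_StorageExtent' |
    Group-Object -Property CreationClassName |
    Select-Object -Property Count, Name
```

Whichever class name describes the physical disks on your hardware can then replace 'HPVC_SAStorageExtent' in the filter.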

Find CPU hogging vms using PowerCLI

Hello,
this time I would like to show how to check whether any VMs have been hogging CPU for too long. I bet you are already guarding against this with VC alarms, so let’s build a simple script that lists the VMs whose CPU usage has been near 100% for some time.
What we have to do here is:
1) figure out where the alarms are triggered, and for which VMs
2) figure out how to get the root folder of the alarms
3) figure out how to get the alarm definitions
4) figure out how to select the VMs that have been using too much CPU for a predefined period of time
5) figure out how to get the CPU stats for the VMs from point 4

Let’s go:
Let’s first get the service instance object:

$si=get-view serviceinstance

I have decided to put more explanations in this post.
So, ServiceInstance: what is that? I think the documentation describes it best: “The ServiceInstance managed object is the singleton root object of the inventory on both vCenter servers and servers running on standalone host agents. The server creates the ServiceInstance automatically, and also automatically creates the various manager entities that provide services in the virtual environment. Some examples of manager entities are LicenseManager, PerformanceManager, and ViewManager. You can access the manager entities through the content property.”
Make sure to follow that link and read the ServiceInstance description; it will really help in understanding how it works.
We will be using the alarm manager to achieve our goal, so let’s have a look at how we get there.
And? What to do now? In the content of the service instance we see Alarm-AlarmManager. If we return it, we just get something with Type and Value properties. So how can this help us?
Let’s see what this is by checking its type, and then use get-help on the get-view parameter called Id. This should give us a hint about what to do next. This magic Alarm-AlarmManager is actually a moref (ManagedObjectReference), and having that we can get its view using get-view.
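The inspection described above could look roughly like this. It is a sketch that assumes an active Connect-VIServer session and the $si variable from the earlier step; the variable names $moref and $alarmManagerView are only illustrative.

```powershell
# Assumes an active Connect-VIServer session and $si = Get-View ServiceInstance.
$moref = $si.Content.AlarmManager

# The value is only a reference: it carries a Type and a Value, nothing more.
$moref.GetType().FullName
$moref | Format-List -Property Type, Value

# The help for the Id parameter explains what Get-View expects here.
Get-Help Get-View -Parameter Id

# Get-View -Id resolves the reference into the full managed object view.
$alarmManagerView = Get-View -Id $moref
$alarmManagerView | Get-Member -MemberType Method
```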

Then define how far back we should build the statistics for VM CPU hogging:

$days=-7

Let’s get the VC root folder, and then find out which alarms are triggered on it.

$rootfolderviewalarms=(get-view -id $si.Content.RootFolder).TriggeredAlarmState

Then let’s get the alarm manager object:

$am=get-view -id $si.Content.AlarmManager

Let’s get the ids of the alarms defined in our root VC folder.
We can do this, for example, using the GetAlarm method of our AlarmManager. But.. but.. but.. how?! OK, let’s take a few steps back. What if we do not know how to do this, or whether there even is a method that can do it? The first thing we can do is inspect the alarm manager to see what it can do for us.
We have our alarm manager object in $am. To see what it can do for us, we use get-member to list its methods. From there we can see that it has a method called GetAlarm. To check what it does, we can use the documentation.
From the documentation of this method we know that it returns morefs (the ids of the alarms defined on a particular entity), and that to use it we need to give it an entity moref, the place where we look for alarms. The documentation for GetAlarm also states that if the entity is not set, it returns all visible alarms. If you would like to use it like that, you would have to run it with $null as the argument.
In this example we will get only alarms defined on the root of our Virtual Center server.
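For comparison, here is a small sketch of both ways of calling GetAlarm. It assumes $si and $am hold the service instance and alarm manager views used in this post; the other variable names are illustrative only.

```powershell
# Assumes $si = Get-View ServiceInstance and $am = Get-View -Id $si.Content.AlarmManager.
# List the methods the alarm manager offers and look for GetAlarm.
$am | Get-Member -MemberType Method

# Alarms defined on one entity (here: the root folder).
$rootAlarmIds = $am.GetAlarm($si.Content.RootFolder)

# No entity given: per the documentation this returns all visible alarms.
$allAlarmIds = $am.GetAlarm($null)

# Compare how many alarm references each call returned.
'{0} alarms on the root folder, {1} visible in total' -f @($rootAlarmIds).Count, @($allAlarmIds).Count
```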

$alarmids=$am.GetAlarm($si.Content.rootfolder)

Once again, what was returned? Morefs! Correct. As such they don’t hold much information. We can get that information, though. Using what? Get-View, pointed at those ids. That is how we get from ids to actual alarm definition objects.
Let’s retrieve the information about those alarms.

$alarmdefinitions=get-view -id $alarmids

Let’s find the alarm id that describes the alarm for virtual machine CPU usage.
Now this part is a bit tricky. I am assuming here that only one alarm is defined for VM CPU usage, and that it is defined in the root Virtual Center folder. This script will not work if you have defined more than one alarm for VM CPU usage, because I search only for the alarm system name ‘alarm.VmCPUUsageAlarm’ and do not filter by the alarm’s name. So we are looking at the $alarmdefinitions array, which holds alarm definitions with an .Info object that has a SystemName property. We filter it to get the VM CPU usage alarm, and select its alarm moref.

$vmcpuusagealarmid=($alarmdefinitions|Where-Object {$_.Info.Systemname -eq 'alarm.VmCPUUsageAlarm' }).info.alarm
$vmcpuusagealarmid
Type                                              Value
----                                              -----
Alarm                                             alarm-6

So the id of the alarm for VM CPU usage is Alarm-alarm-6.
Let’s now get the ids of the virtual machines that currently have that alarm triggered. Our root VC folder has a property called TriggeredAlarmState that holds the morefs of entities and the ids of the alarms triggered on those entities. We will now filter them to get only those that have the VM CPU usage alarm triggered.

$vmswithcpualarms=$rootfolderviewalarms|?{$_.Alarm -eq $vmcpuusagealarmid}

So now $vmswithcpualarms holds only the triggered alarm states for VMs with the VM CPU usage alarm.
Let’s turn the ids into VM view objects so that we can grab the VM names. So far the entries in $vmswithcpualarms only give us a property called Entity, which is just a moref.

$cpuhoggingVMs=get-view -property name -id ($vmswithcpualarms | %{$_.Entity})

Now let’s build statistics from the ‘cpu.usage.average’ metric. We will use 2-hour intervals, and we would like to get data from 7 days ago until now, as set in the $days variable. PowerCLI gives us the get-stat cmdlet, which we will use. It accepts entity names. I am using $cpuhoggingVMS|%{$_.name} to return only the names directly, the same as if you typed: -Entity 'vm1','vm2','vm3','vm4'

$result=Get-Stat -Entity ($cpuhoggingVMS|%{$_.name}) -Start (Get-Date).AddDays($days) -Finish (get-date) -Stat 'cpu.usage.average' -IntervalMins 120

Our data is now stored in the $result variable. There is a lot of it: CPU usage statistics for each virtual machine.
OK, what if you say that you cannot tell which VM a given statistic belongs to? I say: “We need to go deeper” 🙂
Running gm (get-member) on a result entry shows that there are more properties than just those displayed by default.
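A quick way to look at those extra properties, sketched under the assumption that $result holds the Get-Stat output from the step above:

```powershell
# Assumes $result was filled by the Get-Stat call above.
# List every property of a single sample, not just the default columns.
$result[0] | Get-Member -MemberType Property

# Show a few samples together with the VM they belong to.
$result | Select-Object -First 5 -Property Entity, Timestamp, Value, Unit
```

The Entity property is what ties each sample back to its VM, and it is the property the grouping step relies on.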


And the last line! We will group the statistics by VM name, and then for each VM we will measure its average CPU usage over that period.

$reportVMcpu=$result | select value,Entity | Group-Object -Property entity | % {$temp=$_; $temp.group | Measure-Object -Property value -average | select @{n='CPU % Average usage';e={[math]::round($_.average,3)}}, @{n='entity';e={$temp.name} }} | Sort-Object -Property 'CPU % Average usage' -Descending
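To see what the group-and-average step does without a vCenter connection, here is the same idea spread over several lines and fed with made-up data. The sample VM names and values are hypothetical stand-ins for Get-Stat output; only the Entity and Value properties are modelled.

```powershell
# Hypothetical sample data standing in for Get-Stat output (Entity + Value only).
$sample = @(
    [pscustomobject]@{ Entity = 'vm1'; Value = 99.0 }
    [pscustomobject]@{ Entity = 'vm1'; Value = 98.0 }
    [pscustomobject]@{ Entity = 'vm2'; Value = 40.0 }
    [pscustomobject]@{ Entity = 'vm2'; Value = 60.0 }
)

$report = $sample |
    Group-Object -Property Entity |
    ForEach-Object {
        # One group per VM; average all of its samples.
        $group = $_
        $avg = ($group.Group | Measure-Object -Property Value -Average).Average
        [pscustomobject]@{
            'CPU % Average usage' = [math]::Round($avg, 3)
            entity                = $group.Name
        }
    } |
    Sort-Object -Property 'CPU % Average usage' -Descending

$report
```

With this sample, vm1 averages 98.5 and vm2 averages 50, so vm1 is listed first.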

Now, if we display our report, we get a summary of the VMs and their average CPU usage over the last 7 days in our example.
In my case the report showed some VMs averaging 99% CPU usage over the last 7 days. That would indicate that something went wrong inside the VM, and we need to investigate it. We do not like VMs that hog CPUs for too long without any reason 😉
Why would we want this report?
OK, I bet you are using alarms for VM CPU usage, and the alarm kicks in after, for example, 5, 15 or 30 minutes. You might assume that something went wrong inside the VM, but some VMs work really hard only during a specific time window: for example, systems that run calculations at the end of the month, or that by design use CPU only a few days per week or month. Each time the alarm is triggered you would have to go to the VM performance view and check whether this is an abnormal situation, call the VM owner, or look for a pattern in its CPU usage. If you see that the VM behaves as expected, because it is normal for it to consume that much CPU only on Mondays, you would ignore that alarm and just wait to see if it clears as it did before.

I hope that this post will help you start using alarm manager and other managers, as well as understand morefs and using get-view.

Below is the code without any comments.

$si=get-view serviceinstance
$days=-7
$rootfolderviewalarms=(get-view -id $si.Content.RootFolder).TriggeredAlarmState
$am=get-view -id $si.Content.AlarmManager
$alarmids=$am.GetAlarm($si.Content.rootfolder)
$alarmdefinitions=get-view -id $alarmids
$vmcpuusagealarmid=($alarmdefinitions|Where-Object {$_.Info.Systemname -eq 'alarm.VmCPUUsageAlarm' }).info.alarm
$vmswithcpualarms=$rootfolderviewalarms|?{$_.Alarm -eq $vmcpuusagealarmid}
$cpuhoggingVMs=get-view -property name -id ($vmswithcpualarms | %{$_.Entity})
$result=Get-Stat -Entity ($cpuhoggingVMS|%{$_.name}) -Start (Get-Date).AddDays($days) -Finish (get-date) -Stat 'cpu.usage.average' -IntervalMins 120
$reportVMcpu=$result | select value,Entity | Group-Object -Property entity | % {$temp=$_; $temp.group | Measure-Object -Property value -average | select @{n='CPU % Average usage';e={[math]::round($_.average,3)}}, @{n='entity';e={$temp.name} }} | Sort-Object -Property 'CPU % Average usage' -Descending

Enjoy!

Get-VMHostTimeReport Reporting time from vmhost system

I wanted to check whether all of my vmhost systems are keeping time with NTP. I wrote recently about how to query the time from a vmhost system using esxcli. Today I want to show how to do this without esxcli, and also produce a nice report and a summary that will help investigate issues with host time. I compare the host time with the local OS time converted to UTC. If you are not sure whether your own time is accurate, that might be a problem. The solution is to query a remote UTC time source. On the internet you can find sites that offer time web services that can easily be integrated into this script.
Keep in mind that the DiffToUTC property is in seconds.
I have tried to optimize this report as much as I can. It was possible to run the report in 1 minute and 8 seconds, but that version was not as readable and did not contain all of this information. On 100+ vmhosts it should finish in 1 – 2 minutes. Previously I obtained the time using esxcli, but that was slower than this approach.
If you know how to query it faster, post a comment. Keep in mind that this function is not perfect. I believe there are better ways to show whether a vmhost system is out of sync or not. Anyway, this function helps a lot in diagnosing time problems on esx/esxi host systems.
It will state whether ntpd is running or not, and if you have more than 100 hosts, for example, you can spot a pattern in the ntp servers. Sometimes it is easy to overlook something, for example these ntp servers:
172.16.x.y
72.16.x.y
Where you can easily see that you had a typo.
Sometimes you are just not aware that ntp service is running.
Or sometimes… 😉
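One way to make an odd NTP entry stand out is to group the report by its NTPServers value and look at the smallest groups first. The sketch below uses hypothetical rows shaped like the function's output; the host names and addresses are invented for illustration, and in practice $myreport would come from Get-VMHostTimeReport.

```powershell
# Hypothetical rows shaped like the function's output (invented names and addresses).
$myreport = @(
    [pscustomobject]@{ Name = 'esx01'; NTPServers = @('172.16.0.10') }
    [pscustomobject]@{ Name = 'esx02'; NTPServers = @('172.16.0.10') }
    [pscustomobject]@{ Name = 'esx03'; NTPServers = @('72.16.0.10') }
)

# Group hosts by their (sorted, comma-joined) NTP server list; rare values sort first.
$ntpGroups = $myreport |
    Group-Object -Property { ($_.NTPServers | Sort-Object) -join ',' } |
    Sort-Object -Property Count |
    Select-Object -Property Count, Name, @{ n = 'Hosts'; e = { $_.Group.Name -join ', ' } }

$ntpGroups
```

Here the single host with 72.16.0.10 ends up at the top of the list, which is exactly the kind of typo that is hard to spot in a long report.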
A few examples from running this function:
In this run, where the report was stored in $myreport, we can see that we have some issues with host time, as there is a difference of 76 seconds compared to my local OS time.
If you see a DiffToUTC like 0.0xxxxx, you are OK. Readings above 1 second could indicate that something is wrong.
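Picking out the hosts above that threshold can be sketched as below. The rows are hypothetical and only shaped like the function's output (the two DiffToUTC values are borrowed from the summary example in the function help); $thresholdSeconds is an illustrative name.

```powershell
# Hypothetical rows shaped like the function's output.
$myreport = @(
    [pscustomobject]@{ Name = 'esx01.local.lan'; NTPServiceRunning = $true;  DiffToUTC = 0.03075 }
    [pscustomobject]@{ Name = 'esx02.local.lan'; NTPServiceRunning = $false; DiffToUTC = 76.20985 }
)

# Anything further than this many seconds from local UTC time is worth a look.
$thresholdSeconds = 1

$suspects = $myreport |
    Where-Object { $_.DiffToUTC -gt $thresholdSeconds } |
    Sort-Object -Property DiffToUTC -Descending

$suspects | Format-Table -AutoSize
```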
We can also produce a report for a single vmhost system.
We can produce a full report and, using the -Summary switch, make the function return a short description of the report.
Documentation for
HostDateTimeSystem -> Managed Object – HostDateTimeSystem
HostServiceSystem -> Managed Object – HostServiceSystem
I have now added an option to compare your local time, converted to UTC, with UTC time taken from the internet. An internet connection is required to use it. I added it for the case where you suspect that your local time might not be correct and would like to check it against another source. Please also read the web service’s disclaimer and usage restrictions before using it. If you spot that your time differs too much from the time returned from the internet, it might indicate an issue, either with the web service or with your system.
Function below:

function Get-VMHostTimeReport {
<#
	.SYNOPSIS
		Gets time from VM Hosts and checks with local time(to utc).

	.DESCRIPTION
		This function might help investigating issues with vm host time. If it is running without any
		parameter it checks time for all hosts registered in virtual center to which user is currently
		connected. Using parameter SingleVMHost will produce report for single vm host system.
		VMHost should be the name of the host, string.
		Report returns columns: Name (vmhost system name), VMHostTime (time from vmhost),
		UTCTime (this is the utc time from our local os), NTPServers (if any are in the vmhost
		configuration), NTPServiceRunning (checks if the ntp service is running on the vmhost),
		DiffToUTC (that's the difference in seconds between time reported by vmhost and our os).
		By default it sorts the report from lowest to highest time difference reported.
		

	.PARAMETER  SingleVMHost
		Specify single vmhost name that is registered in VC. This should be a string.
		
	.PARAMETER  Summary
		Indicate if you would like to receive short summary about produced report.

	.PARAMETER  CheckTimeFromInternet
		Indicate if you would like to see in summary information about your local time and remote 
		time. Time from remote web service will be checked and compared to your system utc time.

	.EXAMPLE
		PS C:\> Get-VMHostTimeReport
		Will produce report for all vmhosts that are registered within VirtualCenter to which user
		is currently connected. It is possible to store the report in a variable, for example:
		$timereport=Get-VMHostTimeReport
		You can then export this report to csv if needed for example, or view it again :
		$timereport | format-table -autosize
		For viewing convenience
		
		
	.EXAMPLE
		PS C:\> Get-VMHostTimeReport -SingleVMHost 'myesxhost.local.lan'
		Will produce report for given vmhost that is registered within VirtualCenter to which user
		is currently connected.
		

	.EXAMPLE
		PS C:\> Get-VMHostTimeReport -Summary
		Will produce the report and show a small summary which might indicate if there is a problem
		with time sync on vmhost systems. Example of the summary below:
		UTC time from the current system :Min and Max times during reporting period
		Min: 9/3/2013 4:17:38 PM
		Max: 9/3/2013 4:19:20 PM
		While function was creating this report, first date that was returned by local os was the Min
		and the last date that was returned by local os was the Max value.
		VMHosts reported Times :Min and Max date / time while creating report
		Min: 9/3/2013 4:16:35 PM
		Max: 9/3/2013 4:19:20 PM
		Our vmhost systems reported their date/time. If this time span is too big, this might indicate 
		issues with vmhost ntp sync.
		Time Difference between VMHost and UTC from local os time Min, Max, Avg
		Min: 0.03075
		Max: 76.20985
		Avg: 1.36604993421053
		This is the summary for comparing UTC vm host time to UTC time from local os. If you see
		a big Max value (>1/2 sec), then it might indicate that there is an issue with vmhost ntp time sync.
		If a switch parameter CheckTimeFromInternet is present in the summary section there will
		be small report generated about your local time converted to UTC and time taken from
		http://www.earthtools.org
		You can then quickly see if there is an issue with your local time

	.NOTES
		NAME:  Get-VMHostTimeReport
		
		AUTHOR: Grzegorz Kulikowski
		
		NOT WORKING ? #powercli @ irc.freenode.net 
		
		THANKS: http://www.earthtools.org

	.LINK

https://psvmware.wordpress.com

#>

   param(
   [string]$SingleVMHost,
   [switch]$Summary,
   [switch]$CheckTimeFromInternet
   )
    $TimeReport = @()
    if ($SingleVMHost) {
        $VMHosts = Get-View -ViewType HostSystem -Filter @{'name'=$SingleVMHost}
    }
    else {
        $VMHosts = Get-View -ViewType HostSystem -Property name,ConfigManager.DateTimeSystem,ConfigManager.ServiceSystem -Filter @{'runtime.ConnectionState'='connected'}
    }
    foreach ($VMHost in $VMHosts) {
        $VMHostDateTimeSystem = Get-View -Id $VMHost.ConfigManager.DateTimeSystem
        $VMHostServiceSystem = Get-View -Id $VMHost.ConfigManager.ServiceSystem
        $VMHostTime = $VMHostDateTimeSystem.QueryDateTime()
        $NtpServiceState = ($VMHostServiceSystem.ServiceInfo.Service | Where-Object {$_.Key -eq 'ntpd'}).Running
        $NtpServers = $VMHostDateTimeSystem.DateTimeInfo.NtpConfig.Server
        $UTCTime = (Get-Date).ToUniversalTime()
        $TimeReport += $VMHost | Select-Object -Property Name, @{n='VMHostTime';e={$VMHostTime}}, @{n='UTCTime';e={$UTCTime}}, @{n='NTPServers';e={$NtpServers}}, @{n='NTPServiceRunning';e={$NtpServiceState}}, @{n='DiffToUTC';e={[Math]::Round([math]::abs(($VMHostTime - $UTCTime).TotalSeconds),5)}}
    }
    if ($Summary) {
        $SummaryUTCTime = $TimeReport | Measure-Object -Property UTCTime -Min -Max
        $SummaryVMHostTime = $TimeReport | Measure-Object -Property VMHostTime -Min -Max
        $SummaryDiffToUTC = $TimeReport | Measure-Object -Property DiffToUTC -Min -Max -Average
        Write-Host "UTC time from the current system :Min and Max date/time while creating this report."
        Write-Host "Min: $($SummaryUTCTime.Minimum.ToString())"
        Write-Host "Max: $($SummaryUTCTime.Maximum.ToString())"
        Write-Host "VMHosts reported Times :Min and Max date/time while creating this report"
        Write-Host "Min: $($SummaryVMHostTime.Minimum.ToString())"
        Write-Host "Max: $($SummaryVMHostTime.Maximum.ToString())"
        Write-Host "Time Difference between VMHost and UTC from local os time Min, Max, Avg"
        Write-Host "Min: $($SummaryDiffToUTC.Minimum.ToString())"
        Write-Host "Max: $($SummaryDiffToUTC.Maximum.ToString())"
        Write-Host "Avg: $($SummaryDiffToUTC.Average.ToString())"
        if ($CheckTimeFromInternet) {
            [datetime]$TimeFromInternet = (Invoke-RestMethod -Uri 'http://www.earthtools.org/timezone/52.35000/4.86660').timezone.utctime
            $CurrentSystemTimeUTC = (Get-Date).ToUniversalTime()
            Write-Host "We took UTC time from internet and compared it to your local time converted to UTC time"
            $TimeDiff = [math]::abs(($CurrentSystemTimeUTC-$TimeFromInternet).TotalSeconds)
            Write-Host "Reported local time converted to UTC $CurrentSystemTimeUTC"
            Write-Host "Reported time taken from internet(http://www.earthtools.org/webservices.htm) $TimeFromInternet"
            Write-Host "Difference: $TimeDiff seconds"
        }
    }
    return $TimeReport | Sort-Object -Property DiffToUTC
}

Failed to launch the MKS client: The system cannot find the file specified

Evening troubleshooting…
I was sitting on the #vmware IRC channel today and got interested in a problem that one person had. He had a fresh installation of ESXi 5.1 on a brand new box with a few VMs, but for some reason he could not connect to the VM remote console using the vSphere client. We went through a lot of checks, many KBs and so on: cables, cards, nics, fw, dns… still nothing.
What we did in the end was watch the log with ‘tail -f /var/log/hostd.log’ while he tried to open the console. And we finally saw:

2013-08-06T20:55:00.177Z [47696B90 verbose 'Default'] Timed out reading between HTTP requests. : Read timeout after approximately 50000ms. Closing stream TCP(local=127.0.0.1:8309, peer=127.0.0.1:50050)
2013-08-06T20:55:00.177Z [470E2B90 error 'Solo.HttpSvc.HTTPService'] Failed to read request; stream: TCP(), error: N7Vmacore16TimeoutExceptionE(Operation timed out)
2013-08-06T20:55:02.061Z [47738B90 verbose 'SoapAdapter'] Responded to service state request

So there had to be something with the connection/port.

vSphere client log was showing:

2013-08-06T15:07:51.557-05:00| vmrc| I120: VMClient_ConnectMksClientEx - connecting the MKS client
2013-08-06T15:07:51.557-05:00| vmrc| I120: VMClientConnectMKSClientEx
2013-08-06T15:07:51.558-05:00| vmrc| I120: cui::MKS::OnSetAttachedError
2013-08-06T15:07:51.558-05:00| vmrc| I120: cui::vmrc::DlgMgrImpl: "Unable to connect to the MKS: Failed to launch the MKS client: The system cannot find the file specified

vmware.log from the VM had a lot of these entries:

2013-08-06T02:47:41.373Z| mks| I120: SSL Error: error:140780E5:SSL routines:SSL23_READ:ssl handshake failure
2013-08-06T02:47:41.373Z| mks| I120: SOCKET 6 (140) recv error 0: Success
2013-08-06T02:47:41.373Z| mks| W110: SOCKET 6 (140) Error during authd-VNC negotiation: (1) Asyncsocket error.

The Windows firewall was said to be disabled.

Solution: I asked whether it was possible to try connecting with the vSphere client from another workstation that was not in Active Directory, since the computer he was using to connect to that ESXi host was in Active Directory.
B.I.N.G.O 😉

That system was in an AD domain with very strict policies, which were causing problems with connections on TCP port 903, which is used when accessing the VM remote console. As soon as he connected from a workstation that was not part of that AD domain and its policies, the VM console opened without any issues.
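A quick way to check from a workstation whether TCP port 903 is reachable at all is sketched below. It assumes a Windows version that ships the Test-NetConnection cmdlet (it is not available on older systems), and the host name is a placeholder. This is not what we used on the night; it is just one way to narrow things down faster.

```powershell
# 'esxi01.local.lan' is a placeholder; use the ESXi host you connect to.
# Test-NetConnection is assumed to be available on this workstation.
$check = Test-NetConnection -ComputerName 'esxi01.local.lan' -Port 903

# True means a TCP connection to port 903 could be opened from this machine.
$check.TcpTestSucceeded
```

Running the same check from a domain-joined and a non-domain workstation should show quickly whether a policy on the client side is blocking the port.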

23:33 -Krazypoloc- Holy shit the non domain machine works fine!
23:34 -Krazypoloc- Thanks for all your help man
23:35 -Krazypoloc- nostrovia
23:35 -Krazypoloc- 🙂
23:35 -gregu- no problem

🙂 I’m pretty sure that he was trying to say:
Na zdrowie !

VAAI NFS Netapp plugin for vSphere unable to install using VSC

Right,
normally we should be able to install the NetApp plugin for NFS VAAI support using the VSC console. I managed to do it for some hosts, then I started updating my hosts to the latest build, and from that moment I could no longer install the NFS VAAI plugin on hosts using VSC. All hosts were updated to VMware ESXi 5.0.0 build-914586. Update Manager could not help, VSC could not help… What helped:
1) download the zip bundle from the NetApp website
2) put it directly on the host, for example on a datastore
3) install it using esxcli, so ssh to your ESXi host

cd /vmfs/volumes/your_datastore_where_the_zip_is
/vmfs/volumes/2d1303c2-270be8c4 # esxcli software vib install -d file://$PWD/NetAppNasPlugin.v18.zip
Installation Result
Message: The update completed successfully, but the system needs to be rebooted for the changes to be effective.
Reboot Required: true
VIBs Installed: NetApp_bootbank_NetAppNasPlugin_1.0-018
VIBs Removed:
VIBs Skipped:
4) Reboot host
5) Let’s party !!! vaai is now [ON] 🙂

You can then check that everything is correct:
1) esxcli software vib list -> check for the NetApp entry
2) vmkfstools -Ph /vmfs/volumes/your_datastore -> check for: “NAS VAAI Supported: YES”
3) check in the vSphere client -> Configuration/Storage whether this datastore now has the value Supported under Hardware Acceleration
4) After installation you should see this file on your ESXi host:
/altbootbank/NetAppNa.v00
After the host reboot you should see this file on your ESXi host:
/bootbank/NetAppNa.v00
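The first check can also be done from PowerCLI instead of an ssh session. This is a sketch only: it assumes an active Connect-VIServer session and a PowerCLI release whose Get-EsxCli supports the -V2 switch (older releases do not), and the host name is a placeholder.

```powershell
# Assumes an active Connect-VIServer session; the host name is a placeholder.
# The -V2 interface is assumed to be available in the installed PowerCLI release.
$vmhost = Get-VMHost -Name 'esxi01.local.lan'
$esxcli = Get-EsxCli -VMHost $vmhost -V2

# Equivalent of "esxcli software vib list", filtered to the NetApp plugin.
$esxcli.software.vib.list.Invoke() |
    Where-Object { $_.Name -like '*NetApp*' } |
    Select-Object -Property Name, Vendor, Version, InstallDate
```

An empty result would suggest the VIB is not installed on that host.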

When troubleshooting plugin installation using NetApp VSC, you can check the log /var/log/esxupdate.log to see why it is not installing correctly in your case.

Issues with vmmonitoring in HA clusters while using vsphere client 5.0.0 build 455964, vm monitoring with vmware tools down will not restart failed vm

I wanted to share this ‘news’ with everybody.

While trying to set up VM monitoring in an HA cluster, using VM monitoring only, I ran into some issues. VM monitoring was not working. I thought there is not that much to configure to have this working, right? Open cluster settings, go to vm monitoring section


Locate vms that are in particular cluster and are using particular datastore in powercli

A quick one:
Trying to find which VMs reside on datastore "datastore_name" and are located in cluster "your_cluster":

Get-Cluster "your_cluster"|Get-vm |?{($_.extensiondata.config.datastoreurl|%{$_.name}) -contains "datastore_name"}
or
Get-Cluster "your_cluster"|Get-vm |?{($_.extensiondata.config.datastoreurl|?{$_.name -eq "datastore_name"})}
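Another option might be to let Get-VM do the filtering itself. This sketch assumes that the installed PowerCLI version allows the -Location and -Datastore parameters of Get-VM to be combined in one call, and that a Connect-VIServer session is active; both names are the same placeholders as above.

```powershell
# Assumes an active Connect-VIServer session; both names are placeholders.
$cluster   = Get-Cluster -Name 'your_cluster'
$datastore = Get-Datastore -Name 'datastore_name'

# VMs located in the cluster that also have files on the given datastore.
Get-VM -Location $cluster -Datastore $datastore | Select-Object -Property Name
```

If the parameters cannot be combined in your version, the two one-liners above remain the safer choice.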

It’s easier if you only need the VMs that reside on a particular datastore and do not care which cluster they are in:

(Get-Datastore -Name 'datastore_name').Extensiondata.Vm|%{(Get-View -Id $_.toString()).name}

And an approach for when you want to check all datastores in a specific cluster.
I was thinking about this and came up with that 😉 :

get-vmhost -Location "Cluster_01"|get-datastore|%{$ds=$_; $ds.Extensiondata.Vm|%{$_|select @{n='vm name';e={(Get-View -property name -Id $_.toString()).name}},@{n='ds name';e={$ds.name}}  }}

In case you want to check several datastores with known names for VMs, you can do this, assuming for example that the names are written in the file c:\dsfile.txt:

Get-Datastore -Name (get-content c:\dsfile.txt) | %{$temp=$_; $temp.extensiondata.vm | % { $_ | select @{n='VMName';e={(get-view -property name -id $_).Name}},@{n='ds';e={$temp.name}}}}