Hello,
I have an Exchange Server VM that keeps going into a suspended state with no possibility
of recovery. I have to click Shutdown and then Power On. I can’t find anything useful in
the logs, except in the “dmesg” output of the host:
[47807.747606] *** Guest State ***
[47807.747633] CR0: actual=0x0000000000050032, shadow=0x0000000000050032, gh_mask=fffffffffffffff7
[47807.747671] CR4: actual=0x0000000000002050, shadow=0x0000000000000000, gh_mask=fffffffffffff871
[47807.747721] CR3 = 0x00000000001ad002
[47807.747739] RSP = 0xffffc20904fa3770 RIP = 0x0000000000008000
[47807.747766] RFLAGS=0x00000002 DR7 = 0x0000000000000400
[47807.747792] Sysenter RSP=0000000000000000 CS:RIP=0000:0000000000000000
[47807.747821] CS: sel=0x9100, attr=0x08093, limit=0xffffffff, base=0x000000007ff91000
[47807.747855] DS: sel=0x0000, attr=0x08093, limit=0xffffffff, base=0x0000000000000000
[47807.747889] SS: sel=0x0000, attr=0x08093, limit=0xffffffff, base=0x0000000000000000
[47807.747923] ES: sel=0x0000, attr=0x08093, limit=0xffffffff, base=0x0000000000000000
[47807.747957] FS: sel=0x0000, attr=0x08093, limit=0xffffffff, base=0x0000000000000000
[47807.747991] GS: sel=0x0000, attr=0x08093, limit=0xffffffff, base=0x0000000000000000
[47807.748025] GDTR: limit=0x00000057, base=0xffff80817e7d5fb0
[47807.748059] LDTR: sel=0x0000, attr=0x10000, limit=0x000fffff, base=0x0000000000000000
[47807.748093] IDTR: limit=0x00000000, base=0x0000000000000000
[47807.748128] TR: sel=0x0040, attr=0x0008b, limit=0x00000067, base=0xffff80817e7d4000
[47807.748162] EFER = 0x0000000000000000 PAT = 0x0007010600070106
[47807.748189] DebugCtl = 0x0000000000000000 DebugExceptions = 0x0000000000000000
[47807.748221] Interruptibility = 00000009 ActivityState = 00000000
[47807.748248] *** Host State ***
[47807.748263] RIP = 0xffffffffc0c65024 RSP = 0xffff9252bda5fc90
[47807.748290] CS=0010 SS=0018 DS=0000 ES=0000 FS=0000 GS=0000 TR=0040
[47807.748318] FSBase=00007f46d462a700 GSBase=ffff9252ffac0000 TRBase=ffff9252ffac4000
[47807.748351] GDTBase=ffff9252ffacc000 IDTBase=ffffffffff528000
[47807.748377] CR0=0000000080050033 CR3=000000105ac8c000 CR4=00000000001627e0
[47807.748407] Sysenter RSP=0000000000000000 CS:RIP=0010:ffffffff8f196cc0
[47807.748435] EFER = 0x0000000000000d01 PAT = 0x0007050600070106
[47807.748461] *** Control State ***
[47807.748478] PinBased=0000003f CPUBased=b6a1edfa SecondaryExec=00000ceb
[47807.748507] EntryControls=0000d1ff ExitControls=002fefff
[47807.748531] ExceptionBitmap=00060042 PFECmask=00000000 PFECmatch=00000000
[47807.748561] VMEntry: intr_info=00000000 errcode=00000006 ilen=00000000
[47807.748589] VMExit: intr_info=00000000 errcode=00000000 ilen=00000001
[47807.748618] reason=80000021 qualification=0000000000000000
[47807.748645] IDTVectoring: info=00000000 errcode=00000000
[47807.748669] TSC Offset = 0xfffff9b8c8d943b6
[47807.748699] TPR Threshold = 0x00
[47807.748715] EPT pointer = 0x000000105cd5601e
[47807.748735] PLE Gap=00000080 Window=00001000
[47807.748755] Virtual processor ID = 0x0003
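In case it helps, here is a minimal sketch of how the "reason" field in the dump above can be read. It assumes the VMX exit-reason layout described in the Intel SDM (bits 15:0 hold the basic exit reason, bit 31 flags a VM-entry failure). The function name decode_exit_reason and the small lookup table are only my illustration, not part of KVM or oVirt, and the table is deliberately incomplete.

```python
# Sketch: split a VMX exit-reason value into its parts.
# Assumed layout (Intel SDM): bits 15:0 = basic exit reason,
# bit 31 = VM-entry failure flag.

# Small, deliberately incomplete table of basic exit reasons.
BASIC_REASONS = {
    33: "VM-entry failure due to invalid guest state",
    34: "VM-entry failure due to MSR loading",
    41: "VM-entry failure due to machine-check event",
}


def decode_exit_reason(raw):
    """Return (entry_failure, basic_reason, description) for a raw value."""
    basic = raw & 0xFFFF                      # low 16 bits
    entry_failure = bool(raw & (1 << 31))     # bit 31
    description = BASIC_REASONS.get(basic, "not in this table")
    return entry_failure, basic, description


if __name__ == "__main__":
    # Value taken from the "reason=80000021" line of the dump.
    print(decode_exit_reason(0x80000021))
```

Under that assumption, reason=80000021 would mean a failed VM entry with basic reason 33 (invalid guest state), which would match the "Guest State" dump being printed.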
So something clearly went wrong. The VM has been going down at least twice a day for the
last 5 days.
At first I thought it was a hardware issue, so I restarted the VM on another host, and
the same thing happened.
The VM is configured with 10 CPUs and 48GB of RAM and runs on oVirt 4.3.10, with iSCSI
storage on a FreeNAS box, where the VM disks live: a 300GB disk for C:\ and a 2TB disk
for D:\.
Any idea on how to start troubleshooting this?
Thanks,