[JIRA] (OVIRT-609) Jenkins snapshot creation failed
by Evgheni Dereveanchin (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-609?page=com.atlassian.jira... ]
Evgheni Dereveanchin reassigned OVIRT-609:
------------------------------------------
Assignee: Evgheni Dereveanchin (was: infra)
> Jenkins snapshot creation failed
> --------------------------------
>
> Key: OVIRT-609
> URL: https://ovirt-jira.atlassian.net/browse/OVIRT-609
> Project: oVirt - virtualization made easy
> Issue Type: Bug
> Reporter: Evgheni Dereveanchin
> Assignee: Evgheni Dereveanchin
>
> [~ngoldin(a)redhat.com] issued a live snapshot creation on the Jenkins VM to prepare it for cluster move. This failed and it's not really clear why. Relevant event logs below, suggesting that the hypervisor started dumping VM memory to the snapshot which caused a storage slowdown.
> {quote}2016-Jun-23, 18:06 Snapshot 'ngoldin_before_cluster_move' creation for VM 'jenkins-phx-ovirt-org' was initiated by admin.
> 2016-Jun-23, 18:09 Failed to create live snapshot 'ngoldin_before_cluster_move' for VM 'jenkins-phx-ovirt-org'. VM restart is recommended. Note that using the created snapshot might cause data inconsistency.
> 2016-Jun-23, 18:13 Host ovirt-srv02 has network interface which exceeded the defined threshold [95%] (em1: transmit rate[100%], receive rate [0%])
> 2016-Jun-23, 18:13 Storage domain Production experienced a high latency of 18.7802 seconds from host ovirt-srv11. This may cause performance and functional issues. Please consult your Storage Administrator.{quote}
--
This message was sent by Atlassian JIRA
(v1000.98.4#100004)
8 years, 5 months
[JIRA] (OVIRT-609) Jenkins snapshot creation failed
by Evgheni Dereveanchin (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-609?page=com.atlassian.jira... ]
Evgheni Dereveanchin commented on OVIRT-609:
--------------------------------------------
The memory dump file is 33498670080 bytes (about 31 GiB) in size.
It took 11 minutes to copy, which means the average speed was
around 50 megabytes per second. As the logs show storage
latency errors on other hosts during this time, it means
the storage was overwhelmed again, this time not by builds
but by this single sequential write during snapshotting.
Similar messages can be seen while snapshotting Artifactory
earlier the same day, but as that VM has less RAM it managed
to dump its RAM within 3 minutes and succeeded.
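A quick way to check the throughput figure is to divide the dump size by the time the VM spent suspended. The sketch below uses the suspend and resume times reported in the host logs elsewhere in this thread (09:06:47 and 09:17:26); the function name is only illustrative. It works out to a little over 50 MB/s.

```python
from datetime import datetime

# Size of the memory dump file, as reported in the host log.
DUMP_BYTES = 33498670080

def average_throughput(size_bytes: int, start: str, end: str) -> float:
    """Return the average write speed in bytes per second."""
    fmt = "%H:%M:%S"
    elapsed = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return size_bytes / elapsed.total_seconds()

# Suspend and resume times of the VM, taken from the host log.
rate = average_throughput(DUMP_BYTES, "09:06:47", "09:17:26")
print(f"dump size: {DUMP_BYTES / 2**30:.1f} GiB")
print(f"average speed: {rate / 1e6:.1f} MB/s ({rate / 2**20:.1f} MiB/s)")
```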
8 years, 5 months
[JIRA] (OVIRT-609) Jenkins snapshot creation failed
by Evgheni Dereveanchin (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-609?page=com.atlassian.jira... ]
Evgheni Dereveanchin commented on OVIRT-609:
--------------------------------------------
Here are the host logs. As suspected, I did not find any errors: the VM was suspended at 09:06:47 MST and resumed at 09:17:26 MST, with Thread-7164775 returning successfully after copying 33498670080 bytes of RAM to the storage domain.
{quote}Thread-7164775::DEBUG::2016-06-23 09:06:45,911::BindingXMLRPC::1133::vds::(wrapper) client [66.187.230.60]::call vmSnapshot with ('e7a7b735-0310-4f88-9ed9-4fed85835a01', [{'baseVolumeID': 'f37836c6-4bbe-4c8d-abf4-275cf461262e', 'domainID': 'ba023ff2-4e0e-4a32-86f3-923414206667', 'volumeID': '3b105e9b-53fe-4452-be71-2ac2182ecfec', 'imageID': '140adf46-fce4-4dba-980d-37d91416b12b'}], 'ba023ff2-4e0e-4a32-86f3-923414206667,00000002-0002-0002-0002-000000000150,2beb0ee6-b70b-4f48-bdd9-d89650383d61,daef68b9-5967-4047-9b17-1f55b68e5d8a,3580f2a1-a55a-47d0-9e67-627afbc0f2da,6c20093d-a5f3-407a-8986-ca26a488cb20') {}
...
Thread-7164775::DEBUG::2016-06-23 09:06:47,459::vm::4432::vm.Vm::(snapshot) vmId=`e7a7b735-0310-4f88-9ed9-4fed85835a01`::<domainsnapshot>
<disks>
<disk name="vda" snapshot="external" type="file">
<source file="/rhev/data-center/00000002-0002-0002-0002-000000000150/ba023ff2-4e0e-4a32-86f3-923414206667/images/140adf46-fce4-4dba-980d-37d91416b12b/3b105e9b-53fe-4452-be71-2ac2182ecfec" type="file"/>
</disk>
</disks>
<memory file="/rhev/data-center/00000002-0002-0002-0002-000000000150/ba023ff2-4e0e-4a32-86f3-923414206667/images/2beb0ee6-b70b-4f48-bdd9-d89650383d61/daef68b9-5967-4047-9b17-1f55b68e5d8a" snapshot="external"/>
</domainsnapshot>
...
libvirtEventLoop::DEBUG::2016-06-23 09:06:47,645::vm::5571::vm.Vm::(_onLibvirtLifecycleEvent) vmId=`e7a7b735-0310-4f88-9ed9-4fed85835a01`::event Suspended detail 0 opaque None
...
Thread-7164775::DEBUG::2016-06-23 09:17:26,338::outOfProcess::169::Storage.oop::(padToBlockSize) Truncating file /rhev/data-center/00000002-0002-0002-0002-000000000150/ba023ff2-4e0e-4a32-86f3-923414206667/images/2beb0ee6-b70b-4f48-bdd9-d89650383d61/daef68b9-5967-4047-9b17-1f55b68e5d8a to 33498670080 bytes
...
libvirtEventLoop::DEBUG::2016-06-23 09:17:26,317::vm::5571::vm.Vm::(_onLibvirtLifecycleEvent) vmId=`e7a7b735-0310-4f88-9ed9-4fed85835a01`::event Resumed detail 0 opaque None
...
Thread-7164775::DEBUG::2016-06-23 09:17:26,450::BindingXMLRPC::1140::vds::(wrapper) return vmSnapshot with {'status': {'message': 'Done', 'code': 0}, 'quiesce': False}
{quote}
On Engine, the call timed out after about 3 minutes, while in reality the snapshot took about 11 minutes. This suggests the snapshot is most likely healthy. I'll take a sosreport from the host in case we need to investigate this further; maybe [~landgraf] can check the logs for more clues.
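For reference, the time the VM spent paused can be read straight from the two libvirt lifecycle lines in the excerpt above. This is only a rough sketch: the regular expression and the helper name are my own and assume the log line layout shown in this comment.

```python
import re
from datetime import datetime

# The two libvirt lifecycle lines from the VDSM log excerpt above.
VDSM_LOG = [
    "libvirtEventLoop::DEBUG::2016-06-23 09:06:47,645::vm::5571::vm.Vm::"
    "(_onLibvirtLifecycleEvent) vmId=`e7a7b735-0310-4f88-9ed9-4fed85835a01`"
    "::event Suspended detail 0 opaque None",
    "libvirtEventLoop::DEBUG::2016-06-23 09:17:26,317::vm::5571::vm.Vm::"
    "(_onLibvirtLifecycleEvent) vmId=`e7a7b735-0310-4f88-9ed9-4fed85835a01`"
    "::event Resumed detail 0 opaque None",
]

# Captures the timestamp field and the lifecycle event name.
EVENT_RE = re.compile(
    r"::(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::.*::event (\w+) detail"
)

def lifecycle_events(lines):
    """Map each lifecycle event name to the timestamp of its first occurrence."""
    events = {}
    for line in lines:
        match = EVENT_RE.search(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
            events.setdefault(match.group(2), stamp)
    return events

events = lifecycle_events(VDSM_LOG)
paused = events["Resumed"] - events["Suspended"]
print(f"VM was paused for {paused.total_seconds():.0f} s")
```

The pause is well beyond the roughly 3 minutes after which Engine reported the failure, which matches the picture of a slow but successful snapshot.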
8 years, 5 months
[JIRA] (OVIRT-609) Jenkins snapshot creation failed
by Evgheni Dereveanchin (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-609?page=com.atlassian.jira... ]
Evgheni Dereveanchin commented on OVIRT-609:
--------------------------------------------
[~eedri] we're working on this with Anton in OVIRT-604 and so far the results are good. I'll prepare more hosts to make use of SSDs on them and we can proceed with Prod migration next week after deciding the safest way to do so. In any case, as the VM was already rebooted and came up, there should be no issues with the disk image chain.
8 years, 5 months
[JIRA] (OVIRT-609) Jenkins snapshot creation failed
by eyal edri [Administrator] (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-609?page=com.atlassian.jira... ]
eyal edri [Administrator] commented on OVIRT-609:
-------------------------------------------------
I suggest halting work on the production DC until we move at least a few
hypervisors to use the VDSM scratchpad hook for local disk and migrate
their VMs to use it, so we'll see a significant improvement in storage
performance before moving on with the production DC.
_______________________________________________
Infra mailing list
Infra(a)ovirt.org
http://lists.ovirt.org/mailman/listinfo/infra
8 years, 5 months
Re: [JIRA] (OVIRT-609) Jenkins snapshot creation failed
by Eyal Edri
I suggest halting work on the production DC until we move at least a few
hypervisors to use the VDSM scratchpad hook for local disk and migrate
their VMs to use it, so we'll see a significant improvement in storage
performance before moving on with the production DC.
On Jun 24, 2016 11:01 AM, "Evgheni Dereveanchin (oVirt JIRA)" <
jira(a)ovirt-jira.atlassian.net> wrote:
[
https://ovirt-jira.atlassian.net/browse/OVIRT-609?page=com.atlassian.jira...
]
Evgheni Dereveanchin commented on OVIRT-609:
--------------------------------------------
Here are some relevant messages from engine.log:
{quote}
grep 1394b752 /var/log/ovirt-engine/engine.log
2016-06-23 09:06:34,099 INFO
[org.ovirt.engine.core.bll.CreateAllSnapshotsFromVmCommand]
(ajp--127.0.0.1-8702-1) [1394b752] Lock Acquired to object EngineLock
[exclusiveLocks= key: e7a7b735-0310-4f88-9ed9-4fed85835a01 value: VM
2016-06-23 09:06:35,708 INFO
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(org.ovirt.thread.pool-8-thread-15) Correlation ID: 1394b752, Job ID:
a8fab0bf-d45e-46eb-8314-e22db8e6a3f4, Call Stack: null, Custom Event ID:
-1, Message: Snapshot 'ngoldin_before_cluster_move' creation for VM
'jenkins-phx-ovirt-org' was initiated by admin.
2016-06-23 09:09:46,038 WARN
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(org.ovirt.thread.pool-8-thread-14) Correlation ID: 1394b752, Job ID:
a8fab0bf-d45e-46eb-8314-e22db8e6a3f4, Call Stack:
org.ovirt.engine.core.common.errors.VdcBLLException: VdcBLLException:
org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException:
java.util.concurrent.TimeoutException (Failed with error VDS_NETWORK_ERROR
and code 5022)
2016-06-23 09:09:47,859 ERROR
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(org.ovirt.thread.pool-8-thread-14) Correlation ID: 1394b752, Job ID:
a8fab0bf-d45e-46eb-8314-e22db8e6a3f4, Call Stack:
org.ovirt.engine.core.common.errors.VdcBLLException: VdcBLLException:
org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException:
java.util.concurrent.TimeoutException (Failed with error VDS_NETWORK_ERROR
and code 5022){quote}
Looks like VDSM was slow to respond (probably due to storage slowness)
while the snapshot is likely to have completed fine. I'll review host logs
and share my findings.
8 years, 5 months
[JIRA] (OVIRT-609) Jenkins snapshot creation failed
by Evgheni Dereveanchin (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-609?page=com.atlassian.jira... ]
Evgheni Dereveanchin commented on OVIRT-609:
--------------------------------------------
Here are some relevant messages from engine.log:
{quote}
grep 1394b752 /var/log/ovirt-engine/engine.log
2016-06-23 09:06:34,099 INFO [org.ovirt.engine.core.bll.CreateAllSnapshotsFromVmCommand] (ajp--127.0.0.1-8702-1) [1394b752] Lock Acquired to object EngineLock [exclusiveLocks= key: e7a7b735-0310-4f88-9ed9-4fed85835a01 value: VM
2016-06-23 09:06:35,708 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-8-thread-15) Correlation ID: 1394b752, Job ID: a8fab0bf-d45e-46eb-8314-e22db8e6a3f4, Call Stack: null, Custom Event ID: -1, Message: Snapshot 'ngoldin_before_cluster_move' creation for VM 'jenkins-phx-ovirt-org' was initiated by admin.
2016-06-23 09:09:46,038 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-8-thread-14) Correlation ID: 1394b752, Job ID: a8fab0bf-d45e-46eb-8314-e22db8e6a3f4, Call Stack: org.ovirt.engine.core.common.errors.VdcBLLException: VdcBLLException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: java.util.concurrent.TimeoutException (Failed with error VDS_NETWORK_ERROR and code 5022)
2016-06-23 09:09:47,859 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-8-thread-14) Correlation ID: 1394b752, Job ID: a8fab0bf-d45e-46eb-8314-e22db8e6a3f4, Call Stack: org.ovirt.engine.core.common.errors.VdcBLLException: VdcBLLException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: java.util.concurrent.TimeoutException (Failed with error VDS_NETWORK_ERROR and code 5022){quote}
Looks like VDSM was slow to respond (probably due to storage slowness), while the snapshot itself is likely to have completed fine. I'll review the host logs and share my findings.
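To see how long Engine waited before giving up, the entries for one correlation ID can be filtered and their timestamps compared. Below is a minimal sketch of that idea. The log lines are abbreviated copies of the entries quoted above, the last line is a made-up unrelated entry to show the filtering, and the helper names are hypothetical.

```python
from datetime import datetime

# Abbreviated copies of the engine.log entries quoted above. The last
# line is a made-up unrelated entry, included only to show the filtering.
ENGINE_LOG = [
    "2016-06-23 09:06:34,099 INFO [org.ovirt.engine.core.bll.CreateAllSnapshotsFromVmCommand] [1394b752] Lock Acquired to object EngineLock",
    "2016-06-23 09:06:35,708 INFO [AuditLogDirector] Correlation ID: 1394b752, Message: Snapshot creation was initiated by admin.",
    "2016-06-23 09:09:46,038 WARN [AuditLogDirector] Correlation ID: 1394b752, java.util.concurrent.TimeoutException (Failed with error VDS_NETWORK_ERROR and code 5022)",
    "2016-06-23 09:07:00,000 INFO [Example] [deadbeef] unrelated entry",
]

def timestamp(line: str) -> datetime:
    """Parse the leading 'YYYY-MM-DD HH:MM:SS,mmm' timestamp of a log line."""
    return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f")

def seconds_until_timeout(lines, correlation_id):
    """Seconds from the first matching entry to the first TimeoutException."""
    matching = [line for line in lines if correlation_id in line]
    timeout = next(line for line in matching if "TimeoutException" in line)
    return (timestamp(timeout) - timestamp(matching[0])).total_seconds()

elapsed = seconds_until_timeout(ENGINE_LOG, "1394b752")
print(f"engine gave up after {elapsed:.1f} s")
```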
8 years, 5 months
[JIRA] (OVIRT-609) Jenkins snapshot creation failed
by Evgheni Dereveanchin (oVirt JIRA)
[ https://ovirt-jira.atlassian.net/browse/OVIRT-609?page=com.atlassian.jira... ]
Evgheni Dereveanchin updated OVIRT-609:
---------------------------------------
Description:
[~ngoldin(a)redhat.com] issued a live snapshot creation on the Jenkins VM to prepare it for cluster move. This failed and it's not really clear why. Relevant event logs below, suggesting that the hypervisor started dumping VM memory to the snapshot which caused a storage slowdown.
{quote}2016-Jun-23, 18:06 Snapshot 'ngoldin_before_cluster_move' creation for VM 'jenkins-phx-ovirt-org' was initiated by admin.
2016-Jun-23, 18:09 Failed to create live snapshot 'ngoldin_before_cluster_move' for VM 'jenkins-phx-ovirt-org'. VM restart is recommended. Note that using the created snapshot might cause data inconsistency.
2016-Jun-23, 18:13 Host ovirt-srv02 has network interface which exceeded the defined threshold [95%] (em1: transmit rate[100%], receive rate [0%])
2016-Jun-23, 18:13 Storage domain Production experienced a high latency of 18.7802 seconds from host ovirt-srv11. This may cause performance and functional issues. Please consult your Storage Administrator.{quote}
was:
[~ngoldin(a)redhat.com] issued a live snapshot creation on the Jenkins VM to prepare it for cluster move. This failed and it's not really clear why. Relevant event logs below, suggesting that the hypervisor started dumping VM memory to the snapshot which caused a storage slowdown.
2016-Jun-23, 18:06 Snapshot 'ngoldin_before_cluster_move' creation for VM 'jenkins-phx-ovirt-org' was initiated by admin.
2016-Jun-23, 18:09 Failed to create live snapshot 'ngoldin_before_cluster_move' for VM 'jenkins-phx-ovirt-org'. VM restart is recommended. Note that using the created snapshot might cause data inconsistency.
2016-Jun-23, 18:13 Host ovirt-srv02 has network interface which exceeded the defined threshold [95%] (em1: transmit rate[100%], receive rate [0%])
2016-Jun-23, 18:13 Storage domain Production experienced a high latency of 18.7802 seconds from host ovirt-srv11. This may cause performance and functional issues. Please consult your Storage Administrator.
8 years, 5 months