Hi Benny,
I used the tool to track one of the illegal volumes:
image: e05874d2-fb8a-4fd2-94ff-2f4bc6438d47
[...]
- 887f486b-15cf-4083-9b35-8b7821a7841a
status: ILLEGAL, voltype: LEAF, format: COW, legality: ILLEGAL, type: SPARSE
So I tracked 887f486b-15cf-4083-9b35-8b7821a7841a in the logs and I saw:
2018-06-16 04:46:20,818+01 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetVolumeInfoVDSCommand] (pool-5-thread-3) [cfc392ec-dc9f-418d-8156-d05c8e7ab9f8] START, GetVolumeInfoVDSCommand(HostName = host.domain.es, GetVolumeInfoVDSCommandParameters:{expectedEngineErrors='[VolumeDoesNotExist]', runAsync='true', hostId='b2dfb945-d767-44aa-a547-2d1a4381f8e3', storagePoolId='75bf8f48-970f-42bc-8596-f8ab6efb2b63', storageDomainId='110ea376-d789-40a1-b9f6-6b40c31afe01', imageGroupId='e05874d2-fb8a-4fd2-94ff-2f4bc6438d47', imageId='887f486b-15cf-4083-9b35-8b7821a7841a'}), log id: 2a795424
2018-06-16 04:46:22,256+01 ERROR [org.ovirt.engine.core.bll.DestroyImageCheckCommand] (pool-5-thread-3) [cfc392ec-dc9f-418d-8156-d05c8e7ab9f8] The following images were not removed: [887f486b-15cf-4083-9b35-8b7821a7841a]
2018-06-16 04:47:44,900+01 ERROR [org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommand] (DefaultQuartzScheduler10) [cfc392ec-dc9f-418d-8156-d05c8e7ab9f8] Snapshot '7b6f43ac-d3ad-47b2-8882-f5dccd74cf07' images '887f486b-15cf-4083-9b35-8b7821a7841a'..'538600a5-31ab-40af-b326-d56bfc92bb0b' merged, but volume removal failed. Some or all of the following volumes may be orphaned: [887f486b-15cf-4083-9b35-8b7821a7841a]. Please retry Live Merge on the snapshot to complete the operation.
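For anyone repeating this kind of search, here is a minimal Python sketch of how the engine log entries for one volume could be grouped by correlation ID (the bracketed ID after the thread name). The line layout is only inferred from the excerpts quoted in this thread, and `LINE_RE` and `events_for_volume` are hypothetical names, not part of oVirt.

```python
import re
from collections import defaultdict

# Assumed engine.log layout, inferred from the excerpts in this thread:
# <date> <time> <LEVEL> [<logger>] (<thread>) [<correlation-id>] <message>
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}\S*)\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^)]+)\)\s+"
    r"\[(?P<corr>[^\]]*)\]\s+"
    r"(?P<msg>.*)$"
)


def events_for_volume(lines, volume_id):
    """Group log lines that mention volume_id by their correlation ID.

    Returns {correlation_id: [(timestamp, level, short_logger, message), ...]}.
    Lines that do not match the assumed layout are skipped.
    """
    by_flow = defaultdict(list)
    for line in lines:
        if volume_id not in line:
            continue
        m = LINE_RE.match(line.strip())
        if m is None:
            continue
        short_logger = m["logger"].rsplit(".", 1)[-1]
        by_flow[m["corr"]].append((m["ts"], m["level"], short_logger, m["msg"]))
    return dict(by_flow)


if __name__ == "__main__":
    # One of the lines quoted above, used as sample input.
    sample = [
        "2018-06-16 04:46:22,256+01 ERROR "
        "[org.ovirt.engine.core.bll.DestroyImageCheckCommand] "
        "(pool-5-thread-3) [cfc392ec-dc9f-418d-8156-d05c8e7ab9f8] "
        "The following images were not removed: "
        "[887f486b-15cf-4083-9b35-8b7821a7841a]",
    ]
    flows = events_for_volume(sample, "887f486b-15cf-4083-9b35-8b7821a7841a")
    for corr, events in flows.items():
        print(corr)
        for ts, level, logger, msg in events:
            print(f"  {ts} {level} {logger}: {msg}")
```

Grouping by correlation ID keeps all steps of one snapshot removal together, which makes it easier to see where in the flow the volume removal failed.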
Can you provide some additional steps?
Thank you!
On 2018-06-18 18:27, Benny Zlotnik wrote:
We prevent starting VMs with illegal images[1]
You can use "$ vdsm-tool dump-volume-chains"
to look for illegal images and then look in the engine log for the
reason they became illegal.
If it's something like this, it usually means you can remove them:
63696:2018-06-15 09:41:58,134+01 ERROR
[org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommand]
(DefaultQuartzScheduler2) [6fa97ea4-8f61-4a48-8e08-a8bb1b9de826]
Merging of snapshot 'e609d6cc-2025-4cf0-ad34-03519131cdd1' images
'1d01c6c8-b61e-42bc-a054-f04c3f792b10'..'ef6f732e-2a7a-4a14-a10f-bcc88bdd805f'
failed. Images have been marked illegal and can no longer be previewed
or reverted to. Please retry Live Merge on the snapshot to complete
the operation.
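As a rough illustration of the first step, the following Python sketch picks the illegal volumes out of saved dump-volume-chains output. It assumes the text looks like the excerpt quoted at the top of this thread (an "image:" line, "- <volume id>" lines, each followed by a "status: ..." line); real output may be laid out differently, and `illegal_volumes` is a hypothetical helper, not a vdsm API.

```python
import re

UUID = r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
IMAGE_RE = re.compile(rf"^image:\s*({UUID})")
VOLUME_RE = re.compile(rf"^-\s*({UUID})")


def illegal_volumes(dump_text):
    """Yield (image_id, volume_id, attrs) for volumes reported as ILLEGAL.

    dump_text is the saved text output of the tool; the layout is an
    assumption based on the excerpt quoted in this thread.
    """
    image = volume = None
    for raw in dump_text.splitlines():
        line = raw.strip()
        if m := IMAGE_RE.match(line):
            image, volume = m.group(1), None
        elif m := VOLUME_RE.match(line):
            volume = m.group(1)
        elif line.startswith("status:") and volume:
            # "status: ILLEGAL, voltype: LEAF, ..." -> {"status": "ILLEGAL", ...}
            attrs = dict(
                (key.strip(), value.strip())
                for key, value in (
                    pair.split(":", 1) for pair in line.split(",") if ":" in pair
                )
            )
            if "ILLEGAL" in (attrs.get("status"), attrs.get("legality")):
                yield image, volume, attrs


if __name__ == "__main__":
    # The excerpt quoted earlier in this thread, used as sample input.
    sample = """\
image: e05874d2-fb8a-4fd2-94ff-2f4bc6438d47
[...]
- 887f486b-15cf-4083-9b35-8b7821a7841a
status: ILLEGAL, voltype: LEAF, format: COW, legality: ILLEGAL, type: SPARSE
"""
    for image_id, volume_id, attrs in illegal_volumes(sample):
        print(f"image {image_id}: volume {volume_id} is {attrs['legality']}")
```

Each volume ID found this way can then be searched for in the engine log to find out why it was marked illegal.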
On Mon, Jun 18, 2018 at 5:46 PM, <nicolas@devels.es> wrote:
Indeed, when the problem started I think the SPM was the host whose
VDSM log I attached to the first e-mail. Currently it is the one I
sent in the second mail.
FWIW, if it helps to debug more fluently, we can provide VPN access
to our infrastructure so you can access and see whatever you need
(all hosts, DB, etc.).
Right now the machines that keep running work, but once shut down
they start showing the problem below...
Thank you
On 2018-06-18 15:20, Benny Zlotnik wrote:
I'm having trouble following the errors, I think the SPM changed or
the vdsm log from the right host might be missing.
However, I believe what started the problems is this transaction
timeout:
2018-06-15 14:20:51,378+01 ERROR
[org.ovirt.engine.core.bll.tasks.CommandAsyncTask]
(org.ovirt.thread.pool-6-thread-29)
[1db468cb-85fd-4189-b356-d31781461504] [within thread]: endAction for
action type RemoveSnapshotSingleDisk threw an exception.:
org.springframework.jdbc.CannotGetJdbcConnectionException: Could not
get JDBC Connection; nested exception is java.sql.SQLException:
javax.resource.ResourceException: IJ000460: Error checking for a
transaction
at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:80)
[spring-jdbc.jar:4.2.4.RELEASE]
at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:615)
[spring-jdbc.jar:4.2.4.RELEASE]
at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:680)
[spring-jdbc.jar:4.2.4.RELEASE]
at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:712)
[spring-jdbc.jar:4.2.4.RELEASE]
at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:762)
[spring-jdbc.jar:4.2.4.RELEASE]
at org.ovirt.engine.core.dal.dbbroker.PostgresDbEngineDialect$PostgresSimpleJdbcCall.executeCallInternal(PostgresDbEngineDialect.java:152)
[dal.jar:]
This looks like a bug.
Regardless, I am not sure restoring a backup would help, since you
probably have orphaned images on the storage which need to be
removed.
Adding Ala.
On Mon, Jun 18, 2018 at 4:19 PM, <nicolas@devels.es> wrote:
Hi Benny,
Please find the SPM logs at [1].
Thank you
[1]:
https://wetransfer.com/downloads/62bf649462aabbc2ef21824682b0a08320180618131825/036b7782f58d337baf909a7220d8455320180618131825/5550ee
On 2018-06-18 13:19, Benny Zlotnik wrote:
Can you send the SPM logs as well?
On Mon, Jun 18, 2018 at 1:13 PM, <nicolas@devels.es> wrote:
Hi Benny,
Please find the logs at [1].
Thank you.
[1]:
https://wetransfer.com/downloads/12208fb4a6a5df3114bbbc10af194c8820180618101223/647c066b7b91096570def304da86dbca20180618101223/583d3d
On 2018-06-18 09:28, Benny Zlotnik wrote:
Can you provide full engine and vdsm logs?
On Mon, Jun 18, 2018 at 11:20 AM, <nicolas@devels.es> wrote:
Hi,
We're running oVirt 4.1.9 (we cannot upgrade at this time) and
we're having a major problem in our infrastructure. On Friday,
snapshots were automatically created on more than 200 VMs and, as
this was just a test task, all of them were deleted at the same
time, which seems to have corrupted several VMs.
When trying to delete a snapshot on some of the VMs, a "General
error" is thrown with a NullPointerException in the engine log
(attached).
But the worst part is that when one of these machines is powered
off and then powered on again, the VM is corrupt:
VM myvm is down with error. Exit message: Bad volume specification
{u'index': 0, u'domainID': u'110ea376-d789-40a1-b9f6-6b40c31afe01',
'reqsize': '0', u'format': u'cow', u'bootOrder': u'1', u'address':
{u'function': u'0x0', u'bus': u'0x00', u'domain': u'0x0000',
u'type': u'pci', u'slot': u'0x06'}, u'volumeID':
u'1fd0f9aa-6505-45d2-a17e-859bd5dd4290', 'apparentsize':
'23622320128', u'imageID': u'65519220-68e1-462a-99b3-f0763c78eae2',
u'discard': False, u'specParams': {}, u'readonly': u'false',
u'iface': u'virtio', u'optional': u'false', u'deviceId':
u'65519220-68e1-462a-99b3-f0763c78eae2', 'truesize': '23622320128',
u'poolID': u'75bf8f48-970f-42bc-8596-f8ab6efb2b63', u'device':
u'disk', u'shared': u'false', u'propagateErrors': u'off', u'type':
u'disk'}.
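The specification in that exit message is a Python dict repr, so the IDs needed for cross-checking against the storage can be pulled out programmatically. The sketch below assumes the message always contains exactly one such dict between the first "{" and the last "}"; `parse_bad_volume_spec` is a hypothetical helper, not part of vdsm or the engine.

```python
import ast


def parse_bad_volume_spec(exit_message):
    """Return the drive specification dict embedded in the exit message.

    Assumes the spec is a Python literal spanning the first '{' to the
    last '}' of the message. Raises ValueError if no braces are found.
    """
    # Collapse e-mail line wrapping into single spaces before parsing.
    flat = " ".join(exit_message.split())
    start = flat.index("{")
    end = flat.rindex("}") + 1
    # literal_eval only evaluates literals and accepts the u'' prefixes
    # produced by Python 2.
    return ast.literal_eval(flat[start:end])


if __name__ == "__main__":
    # The exit message quoted above, used as sample input.
    message = """VM myvm is down with error. Exit message: Bad volume specification
{u'index': 0, u'domainID': u'110ea376-d789-40a1-b9f6-6b40c31afe01',
'reqsize': '0', u'format': u'cow', u'bootOrder': u'1', u'address':
{u'function': u'0x0', u'bus': u'0x00', u'domain': u'0x0000',
u'type': u'pci', u'slot': u'0x06'}, u'volumeID':
u'1fd0f9aa-6505-45d2-a17e-859bd5dd4290', 'apparentsize':
'23622320128', u'imageID': u'65519220-68e1-462a-99b3-f0763c78eae2',
u'discard': False, u'specParams': {}, u'readonly': u'false',
u'iface': u'virtio', u'optional': u'false', u'deviceId':
u'65519220-68e1-462a-99b3-f0763c78eae2', 'truesize': '23622320128',
u'poolID': u'75bf8f48-970f-42bc-8596-f8ab6efb2b63', u'device':
u'disk', u'shared': u'false', u'propagateErrors': u'off', u'type':
u'disk'}."""
    spec = parse_bad_volume_spec(message)
    for key in ("poolID", "domainID", "imageID", "volumeID"):
        print(f"{key}: {spec[key]}")
```

The imageID and volumeID printed here are the values to look for in the volume chain listing and in the engine log.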
We're really frustrated by now and don't know how to proceed. We
have a DB backup (made with engine-backup) from Thursday, which would
have a "sane" DB definition without all the snapshots, as they were all
created on Friday. Would it be safe to restore this backup?
Any help is really appreciated...
Thanks.
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-leave@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/P5OOGBL3BRZIQ2I46FYELBUIIWT5QK4C/
Links:
------
[1] https://wetransfer.com/downloads/62bf649462aabbc2ef21824682b0a08320180618131825/036b7782f58d337baf909a7220d8455320180618131825/5550ee
[2] https://wetransfer.com/downloads/12208fb4a6a5df3114bbbc10af194c8820180618101223/647c066b7b91096570def304da86dbca20180618101223/583d3d
[3] https://www.ovirt.org/site/privacy-policy/
[4] https://www.ovirt.org/community/about/community-guidelines/
[5] https://lists.ovirt.org/archives/list/users@ovirt.org/message/P5OOGBL3BRZIQ2I46FYELBUIIWT5QK4C/