Hi,

We compared engine backups and found some differences in the LUNs:

"public"."luns"  (restored db from 2020.04.09)
------------------------------------------------------------------------
physical_volume_id                      lun_id                volume_group_id                           serial                                 lun_mapping  vendor_id   product_id   device_size    discard_max_size
wEx3tY-OELy-gOtD-CFDp-az4D-EyYO-1SAAqd  repl_HanaDB_osd_01    a1q5Jr-Bd7h-wEVJ-9b0C-Ggnr-M1JI-kyXeDV    SHUAWEI_XSG1_2102350RMG10HC0000210053  6            HUAWEI      XSG1         4096           268435456
DPUtaW-Q5zp-aZos-HriP-5Z0v-hiWO-w7rmwG  repl_HanaLogs_osd_01  4TCXZ7-R1l1-xkdU-u0vx-S3n4-JWcE-qksPd1    SHUAWEI_XSG1_2102350RMG10HC0000200035  7            HUAWEI      XSG1         2048           268435456

"public"."luns"  (current db)
------------------------------------------------------------------------
physical_volume_id                      lun_id                volume_group_id                           serial                                 lun_mapping  vendor_id   product_id   device_size    discard_max_size
wEx3tY-OELy-gOtD-CFDp-az4D-EyYO-1SAAqd  repl_HanaDB_osd_01    a1q5Jr-Bd7h-wEVJ-9b0C-Ggnr-M1JI-kyXeDV    SHUAWEI_XSG1_2102350RMG10HC0000210053  6            HUAWEI      XSG1         4096           268435456
                                        repl_HanaLogs_osd_01                                            SHUAWEI_XSG1_2102350RMG10HC0000210054  7            HUAWEI      XSG1         2548           268435456

We observed that the physical_volume_id and volume_group_id are missing for the corrupted SD.
We also observed that the serial has changed on the corrupted SD/LUN.
Is the serial calculated, or is it read from somewhere?
Would it be possible to inject the missing values into the engine DB to recover to a consistent state?
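
To make the comparison concrete, here is a minimal sketch that diffs the two rows for repl_HanaLogs_osd_01 column by column. The values are copied from the two listings above; the helper name diff_rows is only illustrative, and None stands for a value that is empty in the current DB. Besides the two missing IDs and the serial, it also reports that device_size differs (2048 vs 2548).

```python
# Row for lun_id repl_HanaLogs_osd_01 in the restored DB (2020.04.09),
# copied from the listing above.
restored = {
    "physical_volume_id": "DPUtaW-Q5zp-aZos-HriP-5Z0v-hiWO-w7rmwG",
    "volume_group_id": "4TCXZ7-R1l1-xkdU-u0vx-S3n4-JWcE-qksPd1",
    "serial": "SHUAWEI_XSG1_2102350RMG10HC0000200035",
    "lun_mapping": 7,
    "vendor_id": "HUAWEI",
    "product_id": "XSG1",
    "device_size": 2048,
    "discard_max_size": 268435456,
}

# Same LUN in the current DB; None marks the columns that are empty there.
current = {
    "physical_volume_id": None,
    "volume_group_id": None,
    "serial": "SHUAWEI_XSG1_2102350RMG10HC0000210054",
    "lun_mapping": 7,
    "vendor_id": "HUAWEI",
    "product_id": "XSG1",
    "device_size": 2548,
    "discard_max_size": 268435456,
}


def diff_rows(old, new):
    """Return {column: (old_value, new_value)} for every column that differs."""
    return {col: (old[col], new[col]) for col in old if old[col] != new[col]}


changes = diff_rows(restored, current)
for col, (before, after) in changes.items():
    print(f"{col}: {before!r} -> {after!r}")
```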

Thanks for any help.
Arsene
 

On Wed, 2020-07-15 at 13:24 +0300, Ahmad Khiet wrote:
Hi Arsène, 

Can you please tell us which version you are referring to?

As shown in the log: Storage domains with IDs [6b82f31b-fa2a-406b-832d-64d9666e1bcc] could not be synchronized. To synchronize them, please move them to maintenance and then activate.
Can you put them into maintenance and then activate them again so that they get synced?
I guess the domain is out of sync, which is why the "Add" button appears for LUNs that have already been added.



On Tue, Jul 14, 2020 at 4:58 PM Arsène Gschwind <arsene.gschwind@unibas.ch> wrote:
Hi Ahmad,

I did the following:

1. Storage -> Storage Domains
2. Click the existing Storage Domain and click "Manage Domain"
I then see the "Add" button next to the LUN that is already part of the SD.

I do not want to click "Add" since it may destroy the existing SD or the content of the LUNs.
In the Engine Log I see the following:

2020-07-14 09:57:45,131+02 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-20) [277145f2] EVENT_ID: STORAGE_DOMAINS_COULD_NOT_BE_SYNCED(1,046), Storage domains with IDs [6b82f31b-fa2a-406b-832d-64d9666e1bcc] could not be synchronized. To synchronize them, please move them to maintenance and then activate.

Thanks a lot

On Tue, 2020-07-14 at 16:07 +0300, Ahmad Khiet wrote:
Hi Arsène Gschwind, 

It's really strange that you see "Add" on a LUN that has already been added to the database.
To verify, are these the steps you took at first?
1- Storage -> Storage Domains
2- New Domain - [ select iSCSI ]
3- Click "+" on the iSCSI target; you then see that the "Add" button is available
4- After clicking Add and OK, this error is shown in the logs
Is that right?

Can you also attach the vdsm log?




On Tue, Jul 14, 2020 at 1:15 PM Arsène Gschwind <arsene.gschwind@unibas.ch> wrote:
Hello all,

I've checked my entire multipath configuration and everything seems correct.
Is there a way to correct this, maybe in the DB?

I really need some help, thanks a lot.
Arsène

On Tue, 2020-07-14 at 00:29 +0000, Arsène Gschwind wrote:
Hi,

I'm seeing strange behavior with an SD. When trying to manage the SD, I see the "Add" button for the LUN that should already be the one used for that SD.
In the logs I see the following:

2020-07-13 17:48:07,292+02 ERROR [org.ovirt.engine.core.dal.dbbroker.BatchProcedureExecutionConnectionCallback] (EE-ManagedThreadFactory-engine-Thread-95) [51091853] Can't execute batch: Batch entry 0 select * from public.insertluns(CAST ('repl_HanaLogs_osd_01' AS varchar),CAST ('DPUtaW-Q5zp-aZos-HriP-5Z0v-hiWO-w7rmwG' AS varchar),CAST ('4TCXZ7-R1l1-xkdU-u0vx-S3n4-JWcE-qksPd1' AS varchar),CAST ('SHUAWEI_XSG1_2102350RMG10HC0000200035' AS varchar),CAST (7 AS int4),CAST ('HUAWEI' AS varchar),CAST ('XSG1' AS varchar),CAST (2548 AS int4),CAST (268435456 AS int8)) as result was aborted: ERROR: duplicate key value violates unique constraint "pk_luns"
  Detail: Key (lun_id)=(repl_HanaLogs_osd_01) already exists.
  Where: SQL statement "INSERT INTO LUNs (
        LUN_id,
        physical_volume_id,
        volume_group_id,
        serial,
        lun_mapping,
        vendor_id,
        product_id,
        device_size,
        discard_max_size
        )
    VALUES (
        v_LUN_id,
        v_physical_volume_id,
        v_volume_group_id,
        v_serial,
        v_lun_mapping,
        v_vendor_id,
        v_product_id,
        v_device_size,
        v_discard_max_size
        )"
PL/pgSQL function insertluns(character varying,character varying,character varying,character varying,integer,character varying,character varying,integer,bigint) line 3 at SQL statement  Call getNextException to see other errors in the batch.
2020-07-13 17:48:07,292+02 ERROR [org.ovirt.engine.core.dal.dbbroker.BatchProcedureExecutionConnectionCallback] (EE-ManagedThreadFactory-engine-Thread-95) [51091853] Can't execute batch. Next exception is: ERROR: duplicate key value violates unique constraint "pk_luns"
  Detail: Key (lun_id)=(repl_HanaLogs_osd_01) already exists.
  Where: SQL statement "INSERT INTO LUNs (
        LUN_id,
        physical_volume_id,
        volume_group_id,
        serial,
        lun_mapping,
        vendor_id,
        product_id,
        device_size,
        discard_max_size
        )
    VALUES (
        v_LUN_id,
        v_physical_volume_id,
        v_volume_group_id,
        v_serial,
        v_lun_mapping,
        v_vendor_id,
        v_product_id,
        v_device_size,
        v_discard_max_size
        )"
PL/pgSQL function insertluns(character varying,character varying,character varying,character varying,integer,character varying,character varying,integer,bigint) line 3 at SQL statement
2020-07-13 17:48:07,293+02 INFO  [org.ovirt.engine.core.utils.transaction.TransactionSupport] (EE-ManagedThreadFactory-engine-Thread-95) [51091853] transaction rolled back
2020-07-13 17:48:07,293+02 ERROR [org.ovirt.engine.core.bll.storage.domain.SyncLunsInfoForBlockStorageDomainCommand] (EE-ManagedThreadFactory-engine-Thread-95) [51091853] Command 'org.ovirt.engine.core.bll.storage.domain.SyncLunsInfoForBlockStorageDomainCommand' failed: ConnectionCallback; ]; ERROR: duplicate key value violates unique constraint "pk_luns"
  Detail: Key (lun_id)=(repl_HanaLogs_osd_01) already exists.
  Where: SQL statement "INSERT INTO LUNs (
        LUN_id,
        physical_volume_id,
        volume_group_id,
        serial,
        lun_mapping,
        vendor_id,
        product_id,
        device_size,
        discard_max_size
        )
    VALUES (
        v_LUN_id,
        v_physical_volume_id,
        v_volume_group_id,
        v_serial,
        v_lun_mapping,
        v_vendor_id,
        v_product_id,
        v_device_size,
        v_discard_max_size
        )"

It looks like the engine tries to add a LUN to an SD even though it already exists...
Any idea how to resolve that problem?
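
The failure mode in the log (an INSERT for a lun_id that is already present, rejected by the primary key and then rolled back) can be reproduced in isolation. The following is only a sketch under stated assumptions: it uses an in-memory SQLite database as a stand-in for the engine's PostgreSQL DB, a reduced luns table with four of the columns, and a hypothetical helper insert_lun that is not the engine's insertluns function.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the engine's PostgreSQL database.
conn = sqlite3.connect(":memory:")

# Reduced, illustrative version of the luns table; lun_id is the primary key,
# mirroring the "pk_luns" constraint named in the log.
conn.execute(
    """CREATE TABLE luns (
           lun_id TEXT PRIMARY KEY,
           physical_volume_id TEXT,
           volume_group_id TEXT,
           serial TEXT
       )"""
)

# Existing row: lun_id present, but PV and VG IDs missing.
conn.execute(
    "INSERT INTO luns VALUES (?, ?, ?, ?)",
    ("repl_HanaLogs_osd_01", None, None, "SHUAWEI_XSG1_2102350RMG10HC0000210054"),
)
conn.commit()


def insert_lun(conn, lun_id, pv_id, vg_id, serial):
    """Hypothetical helper: plain INSERT, rolled back on a key collision."""
    try:
        with conn:  # commits on success, rolls back on exception
            conn.execute(
                "INSERT INTO luns VALUES (?, ?, ?, ?)",
                (lun_id, pv_id, vg_id, serial),
            )
        return "inserted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


# Same values the engine tried to insert according to the log.
result = insert_lun(
    conn,
    "repl_HanaLogs_osd_01",
    "DPUtaW-Q5zp-aZos-HriP-5Z0v-hiWO-w7rmwG",
    "4TCXZ7-R1l1-xkdU-u0vx-S3n4-JWcE-qksPd1",
    "SHUAWEI_XSG1_2102350RMG10HC0000200035",
)
print(result)

# The existing row is untouched after the rejected insert.
rows = conn.execute("SELECT lun_id, physical_volume_id, serial FROM luns").fetchall()
print(rows)
```

In this sketch the existing row keeps its empty physical_volume_id after the insert is rejected.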

Thanks a lot


-- 

Arsène Gschwind <arsene.gschwind@unibas.ch>
Universitaet Basel
_______________________________________________
Users mailing list -- 
users@ovirt.org

To unsubscribe send an email to 
users-leave@ovirt.org

Privacy Statement: 
https://www.ovirt.org/privacy-policy.html

oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/

List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/2YE7ZX53W4WDLHJW34P5CQTGTHW4RJGY/


-- 

Arsène Gschwind
Fa. Sapify AG im Auftrag der universitaet Basel
IT Services
Klinelbergstr. 70 | CH-4056 Basel | Switzerland
Tel: +41 79 449 25 63 | http://its.unibas.ch
ITS-ServiceDesk: support-its@unibas.ch | +41 61 267 14 11

