On Sun, Aug 30, 2020 at 8:46 PM Nir Soffer <nsoffer@redhat.com> wrote:


On Fri, Aug 28, 2020, 08:36 Germano Veit Michel <germano@redhat.com> wrote:


On Fri, Aug 28, 2020 at 9:29 AM Nir Soffer <nsoffer@redhat.com> wrote:


On Thu, Aug 27, 2020, 16:38 Tal Nisan <tnisan@redhat.com> wrote:


On Fri, Aug 21, 2020 at 4:34 AM Germano Veit Michel <germano@redhat.com> wrote:
Hi,

Is there a reliable way to figure out if a snapshot is in preview only using information obtained from the storage domain metadata?
I'm trying to find a way to distinguish a problematic snapshot chain (double parent) from a snapshot in preview in order to improve dump-volume-chains.

Currently dump-volume-chains throws an error (DuplicateParentError) if a snapshot is in preview for the image, as there is a 'Y' shape split in the chain
with 2 volumes (previous chain + preview) pointing to a common parent:

   image:    dff0a0c0-b731-4e5b-9f32-d97310ca40de

             Error: more than one volume pointing to the same parent volume e.g: (_BLANK_UUID<-a), (a<-b), (a<-c)

             Unordered volumes and children:

             - e6c7bec0-53c6-4729-a4a0-a9b3ef2b8c38 <- 5eb2b29d-82d6-4337-8511-3c86705d566e
               status: OK, voltype: LEAF, format: COW, legality: LEGAL, type: SPARSE, capacity: 1073741824, truesize: 1073741824

             - e0475853-4514-4464-99e7-b185cce9b67d <- deceff83-9d88-4f87-8304-d5bf74d119b1
               status: OK, voltype: LEAF, format: COW, legality: LEGAL, type: SPARSE, capacity: 1073741824, truesize: 1073741824

             - e6c7bec0-53c6-4729-a4a0-a9b3ef2b8c38 <- e0475853-4514-4464-99e7-b185cce9b67d
               status: OK, voltype: INTERNAL, format: COW, legality: LEGAL, type: SPARSE, capacity: 1073741824, truesize: 1073741824

             - 00000000-0000-0000-0000-000000000000 <- e6c7bec0-53c6-4729-a4a0-a9b3ef2b8c38
               status: OK, voltype: INTERNAL, format: RAW, legality: LEGAL, type: PREALLOCATED, capacity: 1073741824, truesize: 1073741824

From the engine side it's easy, but I'd need to solve this problem using only metadata from the storage.

The only thing I could think of is that one of the volumes pointing to the common parent has voltype LEAF. Any better ideas?
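
To make the LEAF idea concrete, here is a minimal sketch. It assumes the metadata has already been parsed into a dict of uuid -> (parent uuid, voltype); that layout and the function name are made up for illustration and are not the dump-volume-chains data model. It lists, for every parent with more than one child, which of those children are LEAF:

```python
from collections import defaultdict


def leaf_children_of_shared_parents(volumes):
    """Return {parent: [LEAF children]} for parents with more than one child.

    volumes is a hypothetical dict: uuid -> (parent uuid, voltype).
    """
    children = defaultdict(list)
    for uuid, (parent, voltype) in volumes.items():
        children[parent].append((uuid, voltype))
    return {
        parent: sorted(u for u, t in kids if t == "LEAF")
        for parent, kids in children.items()
        if len(kids) > 1
    }


# The image from the output above.
volumes = {
    "5eb2b29d-82d6-4337-8511-3c86705d566e":
        ("e6c7bec0-53c6-4729-a4a0-a9b3ef2b8c38", "LEAF"),
    "deceff83-9d88-4f87-8304-d5bf74d119b1":
        ("e0475853-4514-4464-99e7-b185cce9b67d", "LEAF"),
    "e0475853-4514-4464-99e7-b185cce9b67d":
        ("e6c7bec0-53c6-4729-a4a0-a9b3ef2b8c38", "INTERNAL"),
    "e6c7bec0-53c6-4729-a4a0-a9b3ef2b8c38":
        ("00000000-0000-0000-0000-000000000000", "INTERNAL"),
}
result = leaf_children_of_shared_parents(volumes)
```

For this image the only shared parent is e6c7bec0 and its only LEAF child is 5eb2b29d. A LEAF child on a shared parent is consistent with a preview, but a genuinely broken chain could look the same, so this is a hint and not proof.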
I don't think there is any. Engine is the orchestrator, so that information exists only in the database.

There is no good way, but you can look at the length of the chain, and the "ctime" value.

For example if this was the original chain:

a <- b <- c

if we preview a:

a <- b <- c
a <- d

You know that d is a preview volume.

If we preview b, we will have two chains of the same length:

a <- b <- c
a <- b <- d

But the ctime value of d will be larger, since preview is created after
the leaf was created.

ctime is set from time.time(), so it is not affected by time zone changes,
but it may be wrong if the host clock was changed, so it is not reliable.
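
A rough sketch of this heuristic, assuming the volumes were already parsed into a dict of uuid -> {"parent", "voltype", "ctime"}. The dict layout and the function names are hypothetical, not vdsm code. With exactly two leaves, the leaf with the shorter chain is guessed to be the preview; on a tie, the leaf with the larger ctime:

```python
BLANK_UUID = "00000000-0000-0000-0000-000000000000"


def chain_from_leaf(volumes, leaf):
    """Follow parent links from a leaf to the base; return base-first list."""
    chain = []
    seen = set()
    vol = leaf
    while vol != BLANK_UUID:
        if vol in seen:
            raise ValueError("loop in chain at %s" % vol)
        seen.add(vol)
        chain.append(vol)
        vol = volumes[vol]["parent"]
    chain.reverse()
    return chain


def guess_preview_leaf(volumes):
    """Guess which of two leaves is the preview volume, or None.

    Only a guess: ctime depends on the host clock at creation time.
    """
    leaves = [u for u, md in volumes.items() if md["voltype"] == "LEAF"]
    if len(leaves) != 2:
        return None

    def rank(leaf):
        # Shorter chain first; on equal length, larger ctime first.
        return (len(chain_from_leaf(volumes, leaf)), -volumes[leaf]["ctime"])

    return min(leaves, key=rank)


# a <- b <- c with a preview of a (a <- d).
volumes = {
    "a": {"parent": BLANK_UUID, "voltype": "INTERNAL", "ctime": 100},
    "b": {"parent": "a", "voltype": "INTERNAL", "ctime": 200},
    "c": {"parent": "b", "voltype": "LEAF", "ctime": 300},
    "d": {"parent": "a", "voltype": "LEAF", "ctime": 400},
}
preview = guess_preview_leaf(volumes)
```

Here the guess is d in both cases from the example: when d sits on a (shorter chain) and when d sits on b (same length, larger ctime).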

Can you open a bug for this?

I have a prototype working with some code I pasted in the bugzilla, but I don't think it's reliable, and it overcomplicates what should be simple.

I don't think the code in the bug is the way to handle this.

It will be simpler and more useful to:
1. Find leaves
2. Follow the chain from each leaf, until the base (volume with no parent).
3. Display a tree instead of list, like lsblk.

For example:

dbf1e90c-41d5-4c2d-a8d2-15f2f04d3561 
├─ea6af566-922c-4ca2-af17-67f7cd08826c
└─aa5643ef-8c74-4b28-91e0-8d45d6ee426b
  └─30c4f6d1-7f1d-470b-96ae-7594cf367dfa
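
A minimal sketch of such a tree printer, assuming a dict of child uuid -> parent uuid (a hypothetical input, not the tool's real structure). Instead of walking up from each leaf, it inverts the parent links once and walks down from the base, which yields the same tree:

```python
from collections import defaultdict

BLANK_UUID = "00000000-0000-0000-0000-000000000000"


def render_tree(parents):
    """Return lsblk-like lines for a dict of child uuid -> parent uuid."""
    children = defaultdict(list)
    for child, parent in parents.items():
        children[parent].append(child)

    lines = []

    def walk(vol, prefix):
        kids = sorted(children.get(vol, []))
        for i, kid in enumerate(kids):
            last = i == len(kids) - 1
            lines.append(prefix + ("└─" if last else "├─") + kid)
            # Keep a vertical bar under branches that still have siblings.
            walk(kid, prefix + ("  " if last else "│ "))

    # Base volumes are the ones whose parent is the blank uuid.
    for base in sorted(children.get(BLANK_UUID, [])):
        lines.append(base)
        walk(base, "")
    return lines


# a <- b <- c plus a second child of a (a <- d).
tree = render_tree({"a": BLANK_UUID, "b": "a", "c": "b", "d": "a"})
print("\n".join(tree))
```

Children are sorted by uuid only to keep the output stable. The walk is recursive, so a chain with hundreds of volumes would need an iterative version.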
I like the idea of this visual representation, but it does not fix the problem.

The problem is dump-volume-chains throwing incorrect errors in case there is a snapshot in preview.

Error: more than one volume pointing to the same parent volume e.g: (_BLANK_UUID<-a), (a<-b), (a<-c)

There is still a double parent in the representation above. So if the analysis is done (text output), an
error will be detected no matter how we print it. If there is no way to distinguish a preview from a double-parent
problem beyond doubt based on storage metadata alone, then we can improve the
representation, but ultimately the problem remains.

Ideally I'd like to keep the DuplicateParentError logic and detect previews to eliminate the false errors.

The analysis should be done in the image discrepancy tool on the engine, which has dump-volume-chains
output (json - no analysis) and the engine db. And we are already doing some basic checks there. Maybe
we should even move the entire analysis logic there and make dump-volume-chains just print and dump
data without doing analysis if the analysis cannot be done based on partial data.

The main idea here was to simply stop false errors for those who look for them in dump-volume-chains
text output.
 

Users of the tool will have to check engine db to understand how to fix the disk.

Even if it was easy to detect a volume in preview, how do you know which chain
should be kept? Did it fail just after the user asked to commit the preview?

This tool is not used to diagnose and correct issues on its own. It is used for 2 things, but mainly the first:
a) Nice readable way to see volumes and their metadata, plus chain
b) Any obvious errors

The duplicate parent check reports false problems during preview, breaking the tool for (b).

The main use is still (a): dump-volume-chains lets us stop collecting the /dev/VG/metadata LV or *.meta files
and still have this info for the volumes in the sosreport.

I'm not aware of anyone using just the output of the tool to perform chain changes. Every failure
also requires checking the DB and, most importantly, the logs (unless rotated).


Storage format does not have a way to store info about the state of the disk, or make atomic
changes like remove one chain when committing after a preview. This is also the reason we
have trouble with removing snapshots.

Which means we cannot know for sure what is happening in the chain, right?
With this in mind, any suggestion to stop the false errors?

Since we cannot be sure of this based just on SD metadata, maybe the simplest is to remove the
duplicate parent error string and/or add some warning that it could be a snapshot in preview and just
print the unordered volumes.
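
As a rough sketch of the warning variant, assuming a dict of child uuid -> parent uuid (the function name and the message text are made up, not what the tool prints today):

```python
from collections import Counter

BLANK_UUID = "00000000-0000-0000-0000-000000000000"


def duplicate_parent_warnings(parents):
    """Return one warning line per parent that has more than one child."""
    counts = Counter(p for p in parents.values() if p != BLANK_UUID)
    return [
        "Warning: %d volumes point to parent %s "
        "(possible snapshot in preview)" % (n, parent)
        for parent, n in sorted(counts.items())
        if n > 1
    ]


warnings = duplicate_parent_warnings({"a": BLANK_UUID, "b": "a", "c": "a"})
```

The unordered volumes would still be printed after the warning, as they are now.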

The improved visual representation could be handled separately from this. I've thought of something
similar in the past but found it hard to print the volume metadata in a nice way (and we need to handle
big chains of several dozen snapshots).

Thanks,
Germano
 

Nir