[Users] GSoC 14 Idea Discussion - virt-sparsify integration

Wed Mar 19 23:33:43 UTC 2014

----- Original Message -----
> From: "Utkarsh Singh" <utkarshsins at gmail.com>
> To: users at ovirt.org
> Cc: fsimonce at redhat.com
> Sent: Friday, March 7, 2014 6:16:46 PM
> Subject: GSoC 14 Idea Discussion - virt-sparsify integration
> 
> Hello,
> 
> I am Utkarsh, a 4th year undergrad from IIT Delhi and a GSoC-14
> aspirant. I have lately been involved in an ongoing project at my
> institute, the Baadal Cloud Computing Platform, which has got me
> interested in oVirt for a potential GSoC project.
> 
> I was going through the virt-sparsify integration project idea. I have
> gone through the architecture documentation on the oVirt website. As
> far as I understand, the virt-sparsify integration needs to be done in
> the VDSM daemon, and its control is either going to be completely
> independent of ovirt-engine (for example, running it once every 24
> hours), or centrally controlled by ovirt-engine through XML-RPC
> calls. The details are not specified in the project ideas page. I
> would like to ask:

The request to sparsify an image is controlled by ovirt-engine.
The user picks one (or eventually more) disks that are not in use
(VM down) and requests to sparsify them.
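
A minimal sketch of that precondition, in Python. The DiskRequest
type, its fields and the "down" status string are hypothetical names
for illustration only, not the ovirt-engine data model; the point is
just that only disks whose VM is down are accepted.

```python
from dataclasses import dataclass


@dataclass
class DiskRequest:
    # Hypothetical request record; not an actual ovirt-engine type.
    disk_id: str
    vm_status: str  # assumed values: "down" or "up"


def split_by_eligibility(requests):
    """Separate disks that may be sparsified (VM down) from the rest."""
    eligible = [r.disk_id for r in requests if r.vm_status == "down"]
    rejected = [r.disk_id for r in requests if r.vm_status != "down"]
    return eligible, rejected


if __name__ == "__main__":
    picked = [DiskRequest("disk-a", "down"), DiskRequest("disk-b", "up")]
    print(split_by_eligibility(picked))
```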

> 1. What would be the proposed ideal implementation? (Central-Control
> or Independent-Control)

Central-Control

> 2. Is virt-sparsify usage going to be automated or
> administrator-triggered, or a combination of both?

administrator-triggered

> There are some aspects of the idea that I would like to discuss
> before I start working on a proposal.
> 
> An automated use of virt-sparsify does not have to be limited to a
> simple scheme. The architecture documentation states that
> ovirt-engine has features like Monitoring that allow administrators
> (and possibly users) to be aware of vm-guest performance as well as
> vm-host performance. I am not sure how this data is collected: is it
> done through MoM, directly by VDSM, or by some other component (for
> hosts)? It would be great if someone could explain that to me. This
> information about vm-guest usage and vm-host health can help in
> determining how virt-sparsify is to be used.

The VM and host statistics are gathered and provided by VDSM.
Anyway, I would leave this part out for the moment. The virt-sparsify
command is a long-running task, and in the current architecture it
can only be an SPM task.
There is some ongoing work to remove the pool and the SPM (making
virt-sparsify operable by any host), but I wouldn't block on that.
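
To illustrate why the command has to be handled as a long-running
task, here is a rough Python sketch that starts virt-sparsify in its
basic "indisk outdisk" form without blocking and lets the caller poll
for completion. The function names and the state strings are made up
for this example; this is not VDSM's task framework.

```python
import subprocess


def build_sparsify_command(src_path, dst_path):
    """Return the argv for a copying run: virt-sparsify indisk outdisk."""
    return ["virt-sparsify", src_path, dst_path]


def start_sparsify(src_path, dst_path):
    """Start virt-sparsify and return immediately with a process handle."""
    return subprocess.Popen(build_sparsify_command(src_path, dst_path))


def task_state(proc):
    """Map the process handle to a coarse, illustrative task state."""
    returncode = proc.poll()  # None while the process is still running
    if returncode is None:
        return "running"
    return "finished" if returncode == 0 else "failed"
```

The caller would keep the handle returned by start_sparsify() and
call task_state() periodically instead of waiting on the process.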

> I am also not very clear about the Shared Storage component in the
> architecture. Does oVirt make any assumptions about the Shared
> Storage? For example, is there a performance difference between
> running virt-sparsify on NFS and running it (if possible) directly
> on the storage hardware? If the storage solution is necessarily a NAS
> instance, then virt-sparsify on an NFS mount is the only option.

The storage connections are already managed by vdsm and the volume
chains are maintained transparently in /rhev/data-center/...
There are a few differences between image files on NFS/Gluster and
images stored on LVs, but with regard to how to reach them it is
transparent (similar path).

> Right now, I am in the process of setting up oVirt on my system and
> getting more familiar with the architecture. Regarding my experience,
> I am acquainted with both Java and Python. I have little experience
> with JBoss, but I have worked on some other Web Application Servers
> like web2py and Play Framework. My involvement in the Baadal Platform
> has got me acquainted with libvirt/QEMU, the details of which I have
> mentioned below (if anyone is interested).

Depending on the amount of time that you can dedicate to this project,
it seems that you could tackle both the vdsm and ovirt-engine parts.

-- 
Federico