Thin pool best practices
by jeremy_tourville@hotmail.com
After a week of downtime caused by a thin pool whose metadata was 100% full, I decided to rebuild my single node. My post was formerly titled: Ovirt 4.2.7 won't start and drops to emergency console. I have backups of nearly all my VMs and I am working on importing them. I went from version 4.2.7.1 to 4.3.6. I am comfortable with LVM, but I am new to administering thin pool LVM. So I am asking myself: what can I do to avoid this headache in the future? What practical steps should I take?
-Regular backups
-Monitoring
-Proper configuration
It's the last one that is the most difficult for me. You don't know what you don't know.... Can anyone share tips and advice?
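I learned that /etc/lvm holds backups of the LVM metadata (in /etc/lvm/backup and /etc/lvm/archive), so that directory should be part of the backup strategy. I now also know that a thin pool has a special metadata volume, and its usage has to be monitored separately from the data usage.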
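For the monitoring part, a minimal sketch of a check that could be run from cron is shown here. It assumes an lvm2 release recent enough to support `lvs --reportformat json`, and it treats any LV that reports a metadata percentage as a pool (cache volumes report one too, so that is a simplification). The function names and the 80% limit are made up for illustration, not taken from oVirt or gdeploy.

```python
import json
import subprocess


def parse_lvs_report(report_text):
    """Return {"vg/lv": (data_percent, metadata_percent)} for LVs that report metadata usage."""
    usage = {}
    for report in json.loads(report_text)["report"]:
        for lv in report.get("lv", []):
            data = lv.get("data_percent", "")
            meta = lv.get("metadata_percent", "")
            # Plain LVs and thin volumes leave metadata_percent empty; skip them.
            if meta in ("", None):
                continue
            name = "{}/{}".format(lv["vg_name"], lv["lv_name"])
            usage[name] = (float(data or 0), float(meta))
    return usage


def pools_over(usage, limit=80.0):
    """Names of pools whose data OR metadata usage is at or above limit (hypothetical default)."""
    return sorted(
        name for name, (data, meta) in usage.items()
        if data >= limit or meta >= limit
    )


def check_host(limit=80.0):
    """Run lvs on this host (needs root) and return the pools that need attention."""
    cmd = ["lvs", "--reportformat", "json",
           "-o", "vg_name,lv_name,data_percent,metadata_percent"]
    result = subprocess.run(cmd, check=True, stdout=subprocess.PIPE,
                            universal_newlines=True)
    return pools_over(parse_lvs_report(result.stdout), limit)
```

Calling `check_host()` from a cron job and mailing a non-empty result would be one way to get a warning well before the metadata volume reaches 100%. The same two columns can be read by hand with `lvs -o lv_name,data_percent,metadata_percent`.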
At the time I built the system, I used the gdeploy Gluster config file. I had assumed the values in there were "safe", and I found out the hard way that they were not. I was reading elsewhere about allowing the thin pool to grow by 20% as needed, but there was no real mention of how to do that and no good example. As I understand it, the config to make that happen lives in lvm.conf. Is that right?
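For reference, the settings that "grow 20% as needed" advice usually points to are `thin_pool_autoextend_threshold` and `thin_pool_autoextend_percent` in the `activation` section of lvm.conf, described in lvmthin(7). The sketch below only illustrates the arithmetic of a single extension step. The 80/20 values are example choices rather than recommendations, the function is hypothetical, and the real work is done by dmeventd, which needs monitoring enabled and free extents in the volume group.

```python
# Illustrative lvm.conf fragment (values are assumptions, not recommendations):
#
#   activation {
#       thin_pool_autoextend_threshold = 80   # 100, the default, disables autoextend
#       thin_pool_autoextend_percent = 20
#   }


def autoextend_once(size_gib, used_gib, threshold=80, percent=20):
    """Simplified model of one autoextend step; returns the pool size afterwards."""
    usage = 100.0 * used_gib / size_gib
    # A threshold of 100 turns the feature off; otherwise the pool grows
    # only once usage exceeds the threshold.
    if threshold >= 100 or usage <= threshold:
        return float(size_gib)
    return size_gib + size_gib * percent / 100
```

With these example values, a 100 GiB pool holding 85 GiB would be extended to 120 GiB, while the same pool at 50 GiB used would be left alone. If the volume group has no free space left, the extension cannot happen, so autoextend complements monitoring rather than replacing it.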
Has the Ansible GlusterFS setup changed since 4.2.7 so that newer versions choose different, "safer" default values?
I hope my comments won't be taken as critical by any Dev Team members regarding Ansible. My sole intent for this post is so I learn what to do better/right next time. I do have most of my VMs and the few I lost are not critical. It was lucky that I learned a hard lesson the easy way.
Thanks for your input!