Anyone using gluster storage domain with WAN geo-rep?
by Steve Dainard
I'm currently using a two-node combined virt/storage setup with oVirt 3.3.4
and Gluster 3.4.2 (replica 2, glusterfs storage domain). I'll call this
pair PROD.
I'm then geo-replicating to another gluster replica pair on the local net,
with btrfs as the underlying storage and volume snapshots, so I can recover my
storage domain from different points in time if necessary. It's also local, so
restore time is much better than off-site. I'll call this pair BACKUP.
I'm planning on setting up geo-replication from BACKUP to an EC2 gluster
target. I'll call this host EC2HOST.
PROD ---geo-rep-lan---> BACKUP ---geo-rep-wan---> EC2HOST
I'd like to avoid saturating my WAN link during office hours. I have some
ideas (or some combination of the following):
1. Limit bandwidth to the offsite hosts during certain hours. Realistically,
though, the bandwidth I would allocate is so low that I don't see the point.
Also, with 8 guests running I'm noticing quite a bit of data transfer to the
local backup nodes (avg 6-8MB/s), and I suspect a lot of it is thrashing
that isn't worth backing up offsite anyway.
2. Stop WAN geo-replication during office hours and restart it overnight and
on weekends.
3. Skip geo-rep between BACKUP ---> EC2HOST and instead rsync one of the
btrfs volume snapshots, which avoids the thrashing. In this case I could
limit WAN speed to 1MB/s, which should be fine for most of the day's changes.
(A rough sketch of options 2 and 3 follows below.)
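Roughly what I have in mind for options 2 and 3 - the volume names, paths and
schedule here are placeholders, and none of this is tested:
# option 2: root crontab entries on a BACKUP node to pause WAN geo-rep during office hours
0 8 * * 1-5 gluster volume geo-replication backupvol ec2host::backupvol stop
0 18 * * 1-5 gluster volume geo-replication backupvol ec2host::backupvol start
# option 3: push a read-only btrfs snapshot with rsync, capped at roughly 1MB/s (1024 KB/s)
btrfs subvolume snapshot -r /bricks/backupvol /snaps/backupvol-$(date +%F)
rsync -aH --delete --bwlimit=1024 /snaps/backupvol-$(date +%F)/ ec2host:/backup/backupvol/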
So my question is: how do you off-site your storage domains, what
constraints have you identified, and how have you dealt with them? And of
course, how would you deal with the scenario I've outlined above?
Thanks,
*Steve*
10 years, 5 months
Hosted Engine error -243
by Kevin Tibi
Hi all,
I have a problem with my hosted engine. Every 10 minutes I get this event in
the engine:
VM HostedEngine is down. Exit message: internal error Failed to acquire
lock: error -243
My data domain is a local NFS export.
Thanks for your help.
Kevin.
10 years, 5 months
difference between thin/dependent and clone/dependent virtual machines
by Tamer Lima
Hello,
I created VMs in two ways:
1) on tab virtual machines > new vm > template (centos_65_64bits)
1.1 configuration: I do not select the stateless checkbox
1.2 this process takes about 1h30 to create each machine.
2) on tab pools > new vm > template (centos_65_64bits)
2.1 default configuration : stateless
2.2 Here I created 3 virtual machines at once
2.3 this process takes only one minute
On the Virtual Machines tab I can see all virtual machines.
Pooled machines have a different icon image,
and the description is different too:
Machines generated from the Virtual Machines tab are described as clone/dependent
- is a clone a physical copy?
Machines generated from the Pools tab are described as thin/independent
- is thin just a reference to the template VM? What is physical? Just a
configuration file?
In practice, what is the difference between these machines?
http://www.ovirt.org/Features/PrestartedVm
"Today there are 2 types of Vm pools:
1. Manual - the Vm is supposed to be manually returned to the pool. In
practice, this is not really entirely supported.
2. Automatic - once the user shuts down the Vm - it returns to the pool
(stateless)."
Are all VMs created from a pool stateless?
thanks
10 years, 5 months
[ACTION REQUESTED] please review new oVirt look-and-feel patch
by Greg Sheremeta
Hi,
A while back, I sent out an email about the new oVirt look-and-feel feature [1]. The new look-and-feel patch is now ready for both code and UI review. At this point we're not looking for design review, although you are welcome to suggest design improvements. I'm mostly looking for help regression-testing the entire UI.
Especially if you're an oVirt *UI* developer, please download the patch and try it [2]. The patch introduces some new CSS globally, and I had to adjust many screens and dialogs. It's quite possible that there are dialogs that I don't know about, so if you maintain any especially-hidden dialogs, please try the patch and verify your dialogs look good.
And you can also just try out the patch to see the amazing new look and feel!
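If you haven't checked out a change from gerrit before, something along these
lines should work against the ovirt-engine repo (check gerrit for the current
patch set number - I'm using 1 as a placeholder):
git clone https://gerrit.ovirt.org/ovirt-engine
cd ovirt-engine
git fetch https://gerrit.ovirt.org/ovirt-engine refs/changes/94/24594/1
git checkout FETCH_HEAD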
Thanks for your feedback.
Greg
[1] http://www.ovirt.org/Features/NewLookAndFeelPatternFlyPhase1
[2] http://gerrit.ovirt.org/#/c/24594/
Greg Sheremeta
Red Hat, Inc.
Sr. Software Engineer, RHEV
Cell: 919-807-1086
gshereme(a)redhat.com
10 years, 5 months
Re: [ovirt-users] is spice html5 console actually working
by Maurice James
There are a few steps. Download the CA cert from your manager
https://<ovirtaddress>/ca.crt
Make sure it is trusted.
Make sure ovirt-webproxy-socket is installed and running.
Sent from my Galaxy S®III
-------- Original message --------
From: Jeremiah Jahn <jeremiah(a)goodinassociates.com>
Date: 04/17/2014 9:56 AM (GMT-05:00)
To: users(a)ovirt.org
Subject: [ovirt-users] is spice html5 console actually working
Has anyone gotten the html5 spice console to work, and did you have to
do anything special other than enable it? I've tried every browser
except opera and ie on linux and mac
_______________________________________________
Users mailing list
Users(a)ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
10 years, 5 months
Re: [ovirt-users] hosted engine health check issues
by Martin Sivak
Hi René,
> >> libvirtError: Failed to acquire lock: No space left on device
> >> 2014-04-22 12:38:17+0200 654 [3093]: r2 cmd_acquire 2,9,5733 invalid
> >> lockspace found -1 failed 0 name 2851af27-8744-445d-9fb1-a0d083c8dc82
Can you please check the contents of /rhev/data-center/<your nfs mount>/<nfs domain uuid>/ha_agent/?
This is how it should look:
[root@dev-03 ~]# ls -al /rhev/data-center/mnt/euryale\:_home_ovirt_he/e16de6a2-53f5-4ab3-95a3-255d08398824/ha_agent/
total 2036
drwxr-x---. 2 vdsm kvm 4096 Mar 19 18:46 .
drwxr-xr-x. 6 vdsm kvm 4096 Mar 19 18:46 ..
-rw-rw----. 1 vdsm kvm 1048576 Apr 23 11:05 hosted-engine.lockspace
-rw-rw----. 1 vdsm kvm 1028096 Mar 19 18:46 hosted-engine.metadata
The errors seem to indicate that you somehow lost the lockspace file.
--
Martin Sivák
msivak(a)redhat.com
Red Hat Czech
RHEV-M SLA / Brno, CZ
----- Original Message -----
> On 04/23/2014 12:28 AM, Doron Fediuck wrote:
> > Hi Rene,
> > any idea what closed your ovirtmgmt bridge?
> > as long as it is down vdsm may have issues starting up properly
> > and this is why you see the complaints on the rpc server.
> >
> > Can you try manually fixing the network part first and then
> > restart vdsm?
> > Once vdsm is happy hosted engine VM will start.
>
> Thanks for your feedback, Doron.
>
> My ovirtmgmt bridge seems to be on or isn't it:
> # brctl show ovirtmgmt
> bridge name bridge id STP enabled interfaces
> ovirtmgmt 8000.0025907587c2 no eth0.200
>
> # ip a s ovirtmgmt
> 7: ovirtmgmt: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue
> state UNKNOWN
> link/ether 00:25:90:75:87:c2 brd ff:ff:ff:ff:ff:ff
> inet 10.0.200.102/24 brd 10.0.200.255 scope global ovirtmgmt
> inet6 fe80::225:90ff:fe75:87c2/64 scope link
> valid_lft forever preferred_lft forever
>
> # ip a s eth0.200
> 6: eth0.200@eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc
> noqueue state UP
> link/ether 00:25:90:75:87:c2 brd ff:ff:ff:ff:ff:ff
> inet6 fe80::225:90ff:fe75:87c2/64 scope link
> valid_lft forever preferred_lft forever
>
> I tried the following yesterday:
> Copy virtual disk from GlusterFS storage to local disk of host and
> create a new vm with virt-manager which loads ovirtmgmt disk. I could
> reach my engine over the ovirtmgmt bridge (so bridge must be working).
>
> I also started libvirtd with Option -v and I saw the following in
> libvirtd.log when trying to start ovirt engine:
> 2014-04-22 14:18:25.432+0000: 8901: debug : virCommandRunAsync:2250 :
> Command result 0, with PID 11491
> 2014-04-22 14:18:25.478+0000: 8901: debug : virCommandRun:2045 : Result
> exit status 255, stdout: '' stderr: 'iptables v1.4.7: goto 'FO-vnet0' is
> not a chain
>
> So it could be that something is broken in my hosted-engine network. Do
> you have any clue how I can troubleshoot this?
>
>
> Thanks,
> René
>
>
> >
> > ----- Original Message -----
> >> From: "René Koch" <rkoch(a)linuxland.at>
> >> To: "Martin Sivak" <msivak(a)redhat.com>
> >> Cc: users(a)ovirt.org
> >> Sent: Tuesday, April 22, 2014 1:46:38 PM
> >> Subject: Re: [ovirt-users] hosted engine health check issues
> >>
> >> Hi,
> >>
> >> I rebooted one of my ovirt hosts today and the result is now that I
> >> can't start hosted-engine anymore.
> >>
> >> ovirt-ha-agent isn't running because the lockspace file is missing
> >> (sanlock complains about it).
> >> So I tried to start hosted-engine with --vm-start and I get the
> >> following errors:
> >>
> >> ==> /var/log/sanlock.log <==
> >> 2014-04-22 12:38:17+0200 654 [3093]: r2 cmd_acquire 2,9,5733 invalid
> >> lockspace found -1 failed 0 name 2851af27-8744-445d-9fb1-a0d083c8dc82
> >>
> >> ==> /var/log/messages <==
> >> Apr 22 12:38:17 ovirt-host02 sanlock[3079]: 2014-04-22 12:38:17+0200 654
> >> [3093]: r2 cmd_acquire 2,9,5733 invalid lockspace found -1 failed 0 name
> >> 2851af27-8744-445d-9fb1-a0d083c8dc82
> >> Apr 22 12:38:17 ovirt-host02 kernel: ovirtmgmt: port 2(vnet0) entering
> >> disabled state
> >> Apr 22 12:38:17 ovirt-host02 kernel: device vnet0 left promiscuous mode
> >> Apr 22 12:38:17 ovirt-host02 kernel: ovirtmgmt: port 2(vnet0) entering
> >> disabled state
> >>
> >> ==> /var/log/vdsm/vdsm.log <==
> >> Thread-21::DEBUG::2014-04-22
> >> 12:38:17,563::libvirtconnection::124::root::(wrapper) Unknown
> >> libvirterror: ecode: 38 edom: 42 level: 2 message: Failed to acquire
> >> lock: No space left on device
> >> Thread-21::DEBUG::2014-04-22
> >> 12:38:17,563::vm::2263::vm.Vm::(_startUnderlyingVm)
> >> vmId=`f26dd37e-13b5-430c-b2f2-ecd098b82a91`::_ongoingCreations released
> >> Thread-21::ERROR::2014-04-22
> >> 12:38:17,564::vm::2289::vm.Vm::(_startUnderlyingVm)
> >> vmId=`f26dd37e-13b5-430c-b2f2-ecd098b82a91`::The vm start process failed
> >> Traceback (most recent call last):
> >> File "/usr/share/vdsm/vm.py", line 2249, in _startUnderlyingVm
> >> self._run()
> >> File "/usr/share/vdsm/vm.py", line 3170, in _run
> >> self._connection.createXML(domxml, flags),
> >> File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py",
> >> line 92, in wrapper
> >> ret = f(*args, **kwargs)
> >> File "/usr/lib64/python2.6/site-packages/libvirt.py", line 2665, in
> >> createXML
> >> if ret is None:raise libvirtError('virDomainCreateXML() failed',
> >> conn=self)
> >> libvirtError: Failed to acquire lock: No space left on device
> >>
> >> ==> /var/log/messages <==
> >> Apr 22 12:38:17 ovirt-host02 vdsm vm.Vm ERROR
> >> vmId=`f26dd37e-13b5-430c-b2f2-ecd098b82a91`::The vm start process
> >> failed#012Traceback (most recent call last):#012 File
> >> "/usr/share/vdsm/vm.py", line 2249, in _startUnderlyingVm#012
> >> self._run()#012 File "/usr/share/vdsm/vm.py", line 3170, in _run#012
> >> self._connection.createXML(domxml, flags),#012 File
> >> "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 92,
> >> in wrapper#012 ret = f(*args, **kwargs)#012 File
> >> "/usr/lib64/python2.6/site-packages/libvirt.py", line 2665, in
> >> createXML#012 if ret is None:raise libvirtError('virDomainCreateXML()
> >> failed', conn=self)#012libvirtError: Failed to acquire lock: No space
> >> left on device
> >>
> >> ==> /var/log/vdsm/vdsm.log <==
> >> Thread-21::DEBUG::2014-04-22
> >> 12:38:17,569::vm::2731::vm.Vm::(setDownStatus)
> >> vmId=`f26dd37e-13b5-430c-b2f2-ecd098b82a91`::Changed state to Down:
> >> Failed to acquire lock: No space left on device
> >>
> >>
> >> No space left on device is nonsense as there is enough space (I had this
> >> issue last time as well where I had to patch machine.py, but this file
> >> is now Python 2.6.6 compatible).
> >>
> >> Any idea what prevents hosted-engine from starting?
> >> ovirt-ha-broker, vdsmd and sanlock are running btw.
> >>
> >> Btw, I can see in log that json rpc server module is missing - which
> >> package is required for CentOS 6.5?
> >> Apr 22 12:37:14 ovirt-host02 vdsm vds WARNING Unable to load the json
> >> rpc server module. Please make sure it is installed.
> >>
> >>
> >> Thanks,
> >> René
> >>
> >>
> >>
> >> On 04/17/2014 10:02 AM, Martin Sivak wrote:
> >>> Hi,
> >>>
> >>>>>> How can I disable notifications?
> >>>
> >>> The notification is configured in /etc/ovirt-hosted-engine-ha/broker.conf
> >>> section notification.
> >>> The email is sent when the key state_transition exists and the string
> >>> OldState-NewState contains the (case insensitive) regexp from the value.
> >>>
> >>>>>> Is it intended to send out these messages and detect that ovirt engine
> >>>>>> is down (which is false anyway), but not to restart the vm?
> >>>
> >>> Forget about emails for now and check the
> >>> /var/log/ovirt-hosted-engine-ha/agent.log and broker.log (and attach them
> >>> as well btw).
> >>>
> >>>>>> oVirt hosts think that hosted engine is down because it seems that
> >>>>>> hosts
> >>>>>> can't write to hosted-engine.lockspace due to glusterfs issues (or at
> >>>>>> least I think so).
> >>>
> >>> The hosts think so or can't really write there? The lockspace is managed
> >>> by
> >>> sanlock and our HA daemons do not touch it at all. We only ask sanlock to
> >>> make sure we have a unique server id.
> >>>
> >>>>>> Is it possible or planned to make the whole ha feature optional?
> >>>
> >>> Well the system won't perform any automatic actions if you put the hosted
> >>> engine to global maintenance and only start/stop/migrate the VM manually.
> >>> I would discourage you from stopping agent/broker, because the engine
> >>> itself has some logic based on the reporting.
> >>>
> >>> Regards
> >>>
> >>> --
> >>> Martin Sivák
> >>> msivak(a)redhat.com
> >>> Red Hat Czech
> >>> RHEV-M SLA / Brno, CZ
> >>>
> >>> ----- Original Message -----
> >>>> On 04/15/2014 04:53 PM, Jiri Moskovcak wrote:
> >>>>> On 04/14/2014 10:50 AM, René Koch wrote:
> >>>>>> Hi,
> >>>>>>
> >>>>>> I have some issues with hosted engine status.
> >>>>>>
> >>>>>> oVirt hosts think that hosted engine is down because it seems that
> >>>>>> hosts
> >>>>>> can't write to hosted-engine.lockspace due to glusterfs issues (or at
> >>>>>> least I think so).
> >>>>>>
> >>>>>> Here's the output of vm-status:
> >>>>>>
> >>>>>> # hosted-engine --vm-status
> >>>>>>
> >>>>>>
> >>>>>> --== Host 1 status ==--
> >>>>>>
> >>>>>> Status up-to-date : False
> >>>>>> Hostname : 10.0.200.102
> >>>>>> Host ID : 1
> >>>>>> Engine status : unknown stale-data
> >>>>>> Score : 2400
> >>>>>> Local maintenance : False
> >>>>>> Host timestamp : 1397035677
> >>>>>> Extra metadata (valid at timestamp):
> >>>>>> metadata_parse_version=1
> >>>>>> metadata_feature_version=1
> >>>>>> timestamp=1397035677 (Wed Apr 9 11:27:57 2014)
> >>>>>> host-id=1
> >>>>>> score=2400
> >>>>>> maintenance=False
> >>>>>> state=EngineUp
> >>>>>>
> >>>>>>
> >>>>>> --== Host 2 status ==--
> >>>>>>
> >>>>>> Status up-to-date : True
> >>>>>> Hostname : 10.0.200.101
> >>>>>> Host ID : 2
> >>>>>> Engine status : {'reason': 'vm not running on
> >>>>>> this
> >>>>>> host', 'health': 'bad', 'vm': 'down', 'detail': 'unknown'}
> >>>>>> Score : 0
> >>>>>> Local maintenance : False
> >>>>>> Host timestamp : 1397464031
> >>>>>> Extra metadata (valid at timestamp):
> >>>>>> metadata_parse_version=1
> >>>>>> metadata_feature_version=1
> >>>>>> timestamp=1397464031 (Mon Apr 14 10:27:11 2014)
> >>>>>> host-id=2
> >>>>>> score=0
> >>>>>> maintenance=False
> >>>>>> state=EngineUnexpectedlyDown
> >>>>>> timeout=Mon Apr 14 10:35:05 2014
> >>>>>>
> >>>>>> oVirt engine is sending me 2 emails every 10 minutes with the
> >>>>>> following
> >>>>>> subjects:
> >>>>>> - ovirt-hosted-engine state transition EngineDown-EngineStart
> >>>>>> - ovirt-hosted-engine state transition EngineStart-EngineUp
> >>>>>>
> >>>>>> In oVirt webadmin I can see the following message:
> >>>>>> VM HostedEngine is down. Exit message: internal error Failed to
> >>>>>> acquire
> >>>>>> lock: error -243.
> >>>>>>
> >>>>>> These messages are really annoying as oVirt isn't doing anything with
> >>>>>> hosted engine - I have an uptime of 9 days in my engine vm.
> >>>>>>
> >>>>>> So my questions are now:
> >>>>>> Is it intended to send out these messages and detect that ovirt engine
> >>>>>> is down (which is false anyway), but not to restart the vm?
> >>>>>>
> >>>>>> How can I disable notifications? I'm planning to write a Nagios plugin
> >>>>>> which parses the output of hosted-engine --vm-status and only Nagios
> >>>>>> should notify me, not hosted-engine script.
> >>>>>>
> >>>>>> Is it possible or planned to make the whole ha feature optional? I
> >>>>>> really really really hate cluster software as it causes more trouble
> >>>>>> than standalone machines, and in my case the hosted-engine ha feature
> >>>>>> really causes trouble (and I haven't had a hardware or network outage
> >>>>>> yet, only issues with the hosted-engine ha agent). I don't need any ha
> >>>>>> feature for hosted engine. I just want to run engine virtualized on
> >>>>>> oVirt and if engine vm fails (e.g. because of issues with a host) I'll
> >>>>>> restart it on another node.
> >>>>>
> >>>>> Hi, you can:
> >>>>> 1. edit /etc/ovirt-hosted-engine-ha/{agent,broker}-log.conf and tweak
> >>>>> the logger as you like
> >>>>> 2. or kill ovirt-ha-broker & ovirt-ha-agent services
> >>>>
> >>>> Thanks for the information.
> >>>> So the engine is able to run when ovirt-ha-broker and ovirt-ha-agent aren't
> >>>> running?
> >>>>
> >>>>
> >>>> Regards,
> >>>> René
> >>>>
> >>>>>
> >>>>> --Jirka
> >>>>>>
> >>>>>> Thanks,
> >>>>>> René
> >>>>>>
> >>>>>>
> >>>>>
> >>>> _______________________________________________
> >>>> Users mailing list
> >>>> Users(a)ovirt.org
> >>>> http://lists.ovirt.org/mailman/listinfo/users
> >>>>
> >> _______________________________________________
> >> Users mailing list
> >> Users(a)ovirt.org
> >> http://lists.ovirt.org/mailman/listinfo/users
> >>
>
10 years, 5 months
hosted engine health check issues
by René Koch
Hi,
I have some issues with hosted engine status.
oVirt hosts think that hosted engine is down because it seems that hosts
can't write to hosted-engine.lockspace due to glusterfs issues (or at
least I think so).
Here's the output of vm-status:
# hosted-engine --vm-status
--== Host 1 status ==--
Status up-to-date : False
Hostname : 10.0.200.102
Host ID : 1
Engine status : unknown stale-data
Score : 2400
Local maintenance : False
Host timestamp : 1397035677
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=1397035677 (Wed Apr 9 11:27:57 2014)
host-id=1
score=2400
maintenance=False
state=EngineUp
--== Host 2 status ==--
Status up-to-date : True
Hostname : 10.0.200.101
Host ID : 2
Engine status : {'reason': 'vm not running on this
host', 'health': 'bad', 'vm': 'down', 'detail': 'unknown'}
Score : 0
Local maintenance : False
Host timestamp : 1397464031
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=1397464031 (Mon Apr 14 10:27:11 2014)
host-id=2
score=0
maintenance=False
state=EngineUnexpectedlyDown
timeout=Mon Apr 14 10:35:05 2014
oVirt engine is sending me 2 emails every 10 minutes with the following
subjects:
- ovirt-hosted-engine state transition EngineDown-EngineStart
- ovirt-hosted-engine state transition EngineStart-EngineUp
In oVirt webadmin I can see the following message:
VM HostedEngine is down. Exit message: internal error Failed to acquire
lock: error -243.
These messages are really annoying as oVirt isn't doing anything with
hosted engine - I have an uptime of 9 days in my engine vm.
So my questions are now:
Is it intended to send out these messages and detect that ovirt engine
is down (which is false anyway), but not to restart the vm?
How can I disable notifications? I'm planning to write a Nagios plugin
which parses the output of hosted-engine --vm-status and only Nagios
should notify me, not hosted-engine script.
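Something as simple as this is what I have in mind (untested, and I'm assuming
a healthy engine shows 'health': 'good' in the vm-status output):
#!/bin/bash
# check_hosted_engine - minimal Nagios-style wrapper around hosted-engine --vm-status
STATUS=$(hosted-engine --vm-status 2>/dev/null)
if echo "$STATUS" | grep -q "'health': 'good'"; then
    echo "OK - hosted engine reported healthy"
    exit 0
elif echo "$STATUS" | grep -q "stale-data"; then
    echo "WARNING - stale data reported for at least one host"
    exit 1
else
    echo "CRITICAL - hosted engine not reported healthy"
    exit 2
fi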
Is it possible or planned to make the whole ha feature optional? I
really really really hate cluster software as it causes more trouble
than standalone machines, and in my case the hosted-engine ha feature
really causes trouble (and I haven't had a hardware or network outage
yet, only issues with the hosted-engine ha agent). I don't need any ha
feature for hosted engine. I just want to run engine virtualized on
oVirt and if engine vm fails (e.g. because of issues with a host) I'll
restart it on another node.
Thanks,
René
--
Best Regards
René Koch
Senior Solution Architect
============================================
LIS-Linuxland GmbH
Brünner Straße 163, A-1210 Vienna
Phone: +43 1 236 91 60
Mobile: +43 660 / 512 21 31
E-Mail: rkoch(a)linuxland.at
============================================
10 years, 5 months
Fwd: [RHSA-2014:0421-01] Moderate: qemu-kvm-rhev security update
by Sven Kieske
Hi list,
if you do not already monitor some security lists:
You are strongly encouraged to update your qemu-kvm
packages, especially on CentOS :)
See below for details
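On a plain CentOS 6 host the update boils down to roughly this (the CentOS
package names differ from the qemu-kvm-rhev builds listed in the advisory
below):
yum update qemu-kvm qemu-img
# then shut down and start all guests again so they run on the patched binaries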
-------- Original Message --------
Subject: [RHSA-2014:0421-01] Moderate: qemu-kvm-rhev security update
Date: Tue, 22 Apr 2014 17:52:38 +0000
From: <bugzilla(a)redhat.com>
To: <rhsa-announce(a)redhat.com>, <rhev-watch-list(a)redhat.com>
=====================================================================
Red Hat Security Advisory
Synopsis: Moderate: qemu-kvm-rhev security update
Advisory ID: RHSA-2014:0421-01
Product: Red Hat Enterprise Virtualization
Advisory URL: https://rhn.redhat.com/errata/RHSA-2014-0421.html
Issue date: 2014-04-22
CVE Names: CVE-2014-0142 CVE-2014-0143 CVE-2014-0144
CVE-2014-0145 CVE-2014-0146 CVE-2014-0147
CVE-2014-0148 CVE-2014-0150
=====================================================================
1. Summary:
Updated qemu-kvm-rhev packages that fix several security issues are now
available for Red Hat Enterprise Virtualization.
The Red Hat Security Response Team has rated this update as having
Moderate
security impact. Common Vulnerability Scoring System (CVSS) base scores,
which give detailed severity ratings, are available for each vulnerability
from the CVE links in the References section.
2. Relevant releases/architectures:
RHEV Agents (vdsm) - x86_64
3. Description:
KVM (Kernel-based Virtual Machine) is a full virtualization solution for
Linux on AMD64 and Intel 64 systems. The qemu-kvm-rhev package
provides the
user-space component for running virtual machines using KVM in
environments
managed by Red Hat Enterprise Virtualization Manager.
Multiple integer overflow, input validation, logic error, and buffer
overflow flaws were discovered in various QEMU block drivers. An attacker
able to modify a disk image file loaded by a guest could use these
flaws to
crash the guest, or corrupt QEMU process memory on the host, potentially
resulting in arbitrary code execution on the host with the privileges of
the QEMU process. (CVE-2014-0143, CVE-2014-0144, CVE-2014-0145,
CVE-2014-0147)
A buffer overflow flaw was found in the way the virtio_net_handle_mac()
function of QEMU processed guest requests to update the table of MAC
addresses. A privileged guest user could use this flaw to corrupt QEMU
process memory on the host, potentially resulting in arbitrary code
execution on the host with the privileges of the QEMU process.
(CVE-2014-0150)
A divide-by-zero flaw was found in the seek_to_sector() function of the
parallels block driver in QEMU. An attacker able to modify a disk image
file loaded by a guest could use this flaw to crash the guest.
(CVE-2014-0142)
A NULL pointer dereference flaw was found in the QCOW2 block driver in
QEMU. An attacker able to modify a disk image file loaded by a guest could
use this flaw to crash the guest. (CVE-2014-0146)
It was found that the block driver for Hyper-V VHDX images did not
correctly calculate BAT (Block Allocation Table) entries due to a missing
bounds check. An attacker able to modify a disk image file loaded by a
guest could use this flaw to crash the guest. (CVE-2014-0148)
The CVE-2014-0143 issues were discovered by Kevin Wolf and Stefan Hajnoczi
of Red Hat, the CVE-2014-0144 issues were discovered by Fam Zheng, Jeff
Cody, Kevin Wolf, and Stefan Hajnoczi of Red Hat, the CVE-2014-0145 issues
were discovered by Stefan Hajnoczi of Red Hat, the CVE-2014-0150 issue was
discovered by Michael S. Tsirkin of Red Hat, the CVE-2014-0142,
CVE-2014-0146, and CVE-2014-0147 issues were discovered by Kevin Wolf of
Red Hat, and the CVE-2014-0148 issue was discovered by Jeff Cody of
Red Hat.
All users of qemu-kvm-rhev are advised to upgrade to these updated
packages, which contain backported patches to correct these issues. After
installing this update, shut down all running virtual machines. Once all
virtual machines have shut down, start them again for this update to take
effect.
4. Solution:
Before applying this update, make sure all previously released errata
relevant to your system have been applied.
This update is available via the Red Hat Network. Details on how to
use the Red Hat Network to apply this update are available at
https://access.redhat.com/site/articles/11258
5. Bugs fixed (https://bugzilla.redhat.com/):
1078201 - CVE-2014-0142 qemu: crash by possible division by zero
1078212 - CVE-2014-0148 Qemu: vhdx: bounds checking for block_size and
logical_sector_size
1078232 - CVE-2014-0146 Qemu: qcow2: NULL dereference in qcow2_open()
error path
1078846 - CVE-2014-0150 qemu: virtio-net: buffer overflow in
virtio_net_handle_mac() function
1078848 - CVE-2014-0147 Qemu: block: possible crash due signed types or
logic error
1078885 - CVE-2014-0145 Qemu: prevent possible buffer overflows
1079140 - CVE-2014-0143 Qemu: block: multiple integer overflow flaws
1079240 - CVE-2014-0144 Qemu: block: missing input validation
6. Package List:
RHEV Agents (vdsm):
Source:
ftp://ftp.redhat.com/pub/redhat/linux/enterprise/6Server/en/RHEV/SRPMS/qe...
x86_64:
qemu-img-rhev-0.12.1.2-2.415.el6_5.8.x86_64.rpm
qemu-kvm-rhev-0.12.1.2-2.415.el6_5.8.x86_64.rpm
qemu-kvm-rhev-debuginfo-0.12.1.2-2.415.el6_5.8.x86_64.rpm
qemu-kvm-rhev-tools-0.12.1.2-2.415.el6_5.8.x86_64.rpm
These packages are GPG signed by Red Hat for security. Our key and
details on how to verify the signature are available from
https://access.redhat.com/security/team/key/#package
7. References:
https://www.redhat.com/security/data/cve/CVE-2014-0142.html
https://www.redhat.com/security/data/cve/CVE-2014-0143.html
https://www.redhat.com/security/data/cve/CVE-2014-0144.html
https://www.redhat.com/security/data/cve/CVE-2014-0145.html
https://www.redhat.com/security/data/cve/CVE-2014-0146.html
https://www.redhat.com/security/data/cve/CVE-2014-0147.html
https://www.redhat.com/security/data/cve/CVE-2014-0148.html
https://www.redhat.com/security/data/cve/CVE-2014-0150.html
https://access.redhat.com/security/updates/classification/#moderate
8. Contact:
The Red Hat security contact is <secalert(a)redhat.com>. More contact
details at https://access.redhat.com/security/team/contact/
Copyright 2014 Red Hat, Inc.
--
rhev-watch-list mailing list
rhev-watch-list(a)redhat.com
https://www.redhat.com/mailman/listinfo/rhev-watch-list
10 years, 5 months
Re: [ovirt-users] Ovirt + GLUSTER
by Jeremiah Jahn
Nothing too complicated.
SL 6x and 5
8 vm hosts running on a hitachi blade symphony.
25 Server guests
15+ desktop windows guests
3x 12TB storage servers (All 1TB based raid 10 SSDs) 10Gbs PTP
between 2 of the servers, with one geolocation server offsite.
Most of the server images/LUNs are exported through 4Gbps FC cards
from the servers using LIO, except the desktop machines, which are
attached directly to the gluster storage pool since normal users can
create them. Various servers use the gluster system directly for
storing a large number of documents and providing public access, to
the tune of about 300 to 400 thousand requests per day. We provide
aggregation, public access, e-filing, e-payments, and case
management software to most of the Illinois circuit court system.
Gluster has worked like a champ.
On Tue, Apr 22, 2014 at 8:05 AM, Ovirt User <ldrt8789(a)gmail.com> wrote:
> What type of configuration and use case?
>
> Il giorno 22/apr/2014, alle ore 14:53, Jeremiah Jahn <jeremiah(a)goodinassociates.com> ha scritto:
>
>> I am.
>>
>> On Mon, Apr 21, 2014 at 1:50 PM, Joop <jvdwege(a)xs4all.nl> wrote:
>>> Ovirt User wrote:
>>>>
>>>> Hello,
>>>>
> >>>> is anyone using ovirt with glusterFS as a storage domain in a production
> >>>> environment?
>>>>
>>>>
>>>
>>> Not directly production but almost. Having problems?
>>>
>>> Regards,
>>>
>>> Joop
>>>
>>>
>>> _______________________________________________
>>> Users mailing list
>>> Users(a)ovirt.org
>>> http://lists.ovirt.org/mailman/listinfo/users
>
10 years, 5 months
Ovirt + GLUSTER
by Ovirt User
Hello,
Is anyone using oVirt with GlusterFS as a storage domain in a production environment?
Thanks
Lukas
10 years, 5 months