Two host cluster without hyperconverged
by Göker Dalar
Hello everyone,
I would like some advice on this topic.
I have two servers with the same capabilities and 8 identical physical disks
per node. I want to set up a cluster with redundant storage. I don't have a
third server for a Gluster hyperconverged setup. How should I build this
setup?
Thanks in advance,
Göker
3 years, 4 months
Gluster Heal Issue
by Christian Reiss
Hey folks,
in our production setup with 3 nodes (HCI) we took one host down
(maintenance, stop gluster, poweroff via ssh/ovirt engine). Once it was
back up, Gluster had 2k healing entries, which went down to 2 within
about 10 minutes.
Those two give me a headache:
[root@node03:~] # gluster vol heal ssd_storage info
Brick node01:/gluster_bricks/ssd_storage/ssd_storage
<gfid:a121e4fb-0984-4e41-94d7-8f0c4f87f4b6>
<gfid:6f8817dc-3d92-46bf-aa65-a5d23f97490e>
Status: Connected
Number of entries: 2
Brick node02:/gluster_bricks/ssd_storage/ssd_storage
Status: Connected
Number of entries: 0
Brick node03:/gluster_bricks/ssd_storage/ssd_storage
<gfid:a121e4fb-0984-4e41-94d7-8f0c4f87f4b6>
<gfid:6f8817dc-3d92-46bf-aa65-a5d23f97490e>
Status: Connected
Number of entries: 2
No paths, only gfid. We took down node2, so it does not have the file:
[root@node01:~] # md5sum /gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6
75c4941683b7eabc223fc9d5f022a77c  /gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6
[root@node02:~] # md5sum /gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6
md5sum: /gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6: No such file or directory
[root@node03:~] # md5sum /gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6
75c4941683b7eabc223fc9d5f022a77c  /gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6
The other two files are md5-identical.
These flags are identical, too:
[root@node01:~] # getfattr -d -m . -e hex /gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6
getfattr: Removing leading '/' from absolute path names
# file: gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.ssd_storage-client-1=0x0000004f0000000100000000
trusted.gfid=0xa121e4fb09844e4194d78f0c4f87f4b6
trusted.gfid2path.d4cf876a215b173f=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f38366461303238392d663734662d343230302d393238342d3637386537626437363139352e31323030
trusted.glusterfs.mdata=0x010000000000000000000000005e349b1e000000001139aa2a000000005e349b1e000000001139aa2a000000005e34994900000000304a5eb2
getfattr: Removing leading '/' from absolute path names
# file: gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.ssd_storage-client-1=0x0000004f0000000100000000
trusted.gfid=0xa121e4fb09844e4194d78f0c4f87f4b6
trusted.gfid2path.d4cf876a215b173f=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f38366461303238392d663734662d343230302d393238342d3637386537626437363139352e31323030
trusted.glusterfs.mdata=0x010000000000000000000000005e349b1e000000001139aa2a000000005e349b1e000000001139aa2a000000005e34994900000000304a5eb2
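For readers following along, here is a minimal sketch (not part of the original post) for reading the hex values above. It assumes the usual AFR layout, where a trusted.afr.<volume>-client-N value holds three big-endian 32-bit pending counters (data, metadata, entry), and that a trusted.gfid2path.* value is ASCII text of the form "<parent-gfid>/<basename>". The function names are hypothetical helpers, not Gluster tooling.

```python
import struct


def _hex_to_bytes(hex_value):
    """Accept getfattr-style '0x...' strings and return raw bytes."""
    if hex_value.startswith("0x"):
        hex_value = hex_value[2:]
    return bytes.fromhex(hex_value)


def decode_afr(hex_value):
    """Split a 12-byte trusted.afr.* value into its three pending counters.

    Assumption: the value is three big-endian unsigned 32-bit integers in
    the order data, metadata, entry.
    """
    data, metadata, entry = struct.unpack(">III", _hex_to_bytes(hex_value))
    return {"data": data, "metadata": metadata, "entry": entry}


def decode_gfid2path(hex_value):
    """Decode a trusted.gfid2path.* value into (parent_gfid, basename).

    Assumption: the value is ASCII '<parent-gfid>/<basename>'.
    """
    text = _hex_to_bytes(hex_value).decode("ascii")
    parent, _, name = text.partition("/")
    return parent, name


if __name__ == "__main__":
    # Value of trusted.afr.ssd_storage-client-1 from the output above.
    print(decode_afr("0x0000004f0000000100000000"))
    # -> {'data': 79, 'metadata': 1, 'entry': 0}
```

Under these assumptions, both surviving copies record 79 pending data operations and 1 pending metadata operation against client-1, which is presumably the second brick in the volume (the one that was taken down). Decoding the gfid2path value the same way gives a parent GFID plus a basename, which can help map the bare GFID back to a path.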
Now, I don't dare simply proceed without some advice.
Does anyone have a clue how to resolve this issue? The second file shows
the same problem as this one.
Have a great weekend!
-Chris.
--
with kind regards,
mit freundlichen Gruessen,
Christian Reiss
3 years, 4 months
Ovirt-engine-ha cannot see live status of Hosted Engine
by asm@pioner.kz
Good day to all.
I have some issues with oVirt 4.2.6. Here is the main one:
I have two CentOS 7 nodes with the same configuration and the latest oVirt 4.2.6, with the HostedEngine disk on NFS storage.
Some virtual machines are also running fine.
When the HostedEngine is running on one node (srv02.local), everything is fine.
After migrating it to the other node (srv00.local), I see that the agent cannot check the liveliness of the HostedEngine. After a few minutes the HostedEngine reboots, and after some time I see the same situation again. After migrating it back to the other node (srv02.local), everything looks OK.
Output of the hosted-engine --vm-status command while the HostedEngine is on the srv00 node:
--== Host 1 status ==--
conf_on_shared_storage : True
Status up-to-date : True
Hostname : srv02.local
Host ID : 1
Engine status : {"reason": "vm not running on this host", "health": "bad", "vm": "down_unexpected", "detail": "unknown"}
Score : 0
stopped : False
Local maintenance : False
crc32 : ecc7ad2d
local_conf_timestamp : 78328
Host timestamp : 78328
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=78328 (Tue Sep 18 12:44:18 2018)
host-id=1
score=0
vm_conf_refresh_time=78328 (Tue Sep 18 12:44:18 2018)
conf_on_shared_storage=True
maintenance=False
state=EngineUnexpectedlyDown
stopped=False
timeout=Fri Jan 2 03:49:58 1970
--== Host 2 status ==--
conf_on_shared_storage : True
Status up-to-date : True
Hostname : srv00.local
Host ID : 2
Engine status : {"reason": "failed liveliness check", "health": "bad", "vm": "up", "detail": "Up"}
Score : 3400
stopped : False
Local maintenance : False
crc32 : 1d62b106
local_conf_timestamp : 326288
Host timestamp : 326288
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=326288 (Tue Sep 18 12:44:21 2018)
host-id=2
score=3400
vm_conf_refresh_time=326288 (Tue Sep 18 12:44:21 2018)
conf_on_shared_storage=True
maintenance=False
state=EngineStarting
stopped=False
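As a side note for anyone triaging similar output, here is a minimal sketch (not from the original post) that parses the text above and lists hosts where the VM is reported as up but the engine health check is not good. It only relies on the "Key : value" layout shown here; parse_vm_status and stuck_hosts are hypothetical helper names, not part of ovirt-hosted-engine-ha.

```python
import json
import re

# "Key : value" lines; the key itself never contains a colon.
FIELD = re.compile(r"^\s*([^:]+?)\s*:\s*(.*)$")


def parse_vm_status(text):
    """Return one dict per '--== Host N status ==--' block."""
    hosts = []
    current = None
    for line in text.splitlines():
        if line.strip().startswith("--== Host"):
            current = {}
            hosts.append(current)
            continue
        match = FIELD.match(line)
        if current is None or match is None:
            continue
        key, value = match.group(1), match.group(2).strip()
        if key == "Hostname":
            current["hostname"] = value
        elif key == "Score":
            current["score"] = int(value)
        elif key == "Engine status":
            try:
                current["engine"] = json.loads(value)
            except ValueError:
                # Not JSON (for example stale data); leave it out.
                pass
    return hosts


def stuck_hosts(hosts):
    """Hostnames where the VM is up but the engine health is not 'good'."""
    return [
        h.get("hostname")
        for h in hosts
        if h.get("engine", {}).get("vm") == "up"
        and h["engine"].get("health") != "good"
    ]
```

Fed the status text above, stuck_hosts would report srv00.local, the host whose engine status shows "failed liveliness check" while the VM itself is up.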
Log agent.log from srv00.local:
MainThread::INFO::2018-09-18 12:40:51,749::states::779::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..
MainThread::INFO::2018-09-18 12:40:52,052::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 3400)
MainThread::INFO::2018-09-18 12:41:01,066::states::779::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..
MainThread::INFO::2018-09-18 12:41:01,374::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 3400)
MainThread::INFO::2018-09-18 12:41:11,393::state_machine::169::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh) Global metadata: {'maintenance': False}
MainThread::INFO::2018-09-18 12:41:11,393::state_machine::174::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh) Host srv02.local.pioner.kz (id 1): {'conf_on_shared_storage': True, 'extra': 'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=78128 (Tue Sep 18 12:40:58 2018)\nhost-id=1\nscore=0\nvm_conf_refresh_time=78128 (Tue Sep 18 12:40:58 2018)\nconf_on_shared_storage=True\nmaintenance=False\nstate=EngineUnexpectedlyDown\nstopped=False\ntimeout=Fri Jan 2 03:49:58 1970\n', 'hostname': 'srv02.local.pioner.kz', 'alive': True, 'host-id': 1, 'engine-status': {'reason': 'vm not running on this host', 'health': 'bad', 'vm': 'down_unexpected', 'detail': 'unknown'}, 'score': 0, 'stopped': False, 'maintenance': False, 'crc32': 'e18e3f22', 'local_conf_timestamp': 78128, 'host-ts': 78128}
MainThread::INFO::2018-09-18 12:41:11,393::state_machine::177::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh) Local (id 2): {'engine-health': {'reason': 'failed liveliness check', 'health': 'bad', 'vm': 'up', 'detail': 'Up'}, 'bridge': True, 'mem-free': 12763.0, 'maintenance': False, 'cpu-load': 0.0364, 'gateway': 1.0, 'storage-domain': True}
MainThread::INFO::2018-09-18 12:41:11,393::states::779::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..
MainThread::INFO::2018-09-18 12:41:11,703::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 3400)
MainThread::INFO::2018-09-18 12:41:21,716::states::779::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..
MainThread::INFO::2018-09-18 12:41:22,020::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 3400)
MainThread::INFO::2018-09-18 12:41:31,033::states::779::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..
MainThread::INFO::2018-09-18 12:41:31,344::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 3400)
As we can see, the agent thinks the HostedEngine is still powering up. I cannot do anything about it. I have already reinstalled the srv00 node many times without success.
Once I even had to uninstall the ovirt* and vdsm* packages. One more interesting point: after running only "yum install http://resources.ovirt.org/pub/yum-repo/ovirt-release42.rpm" on this node, I tried to install the node from the engine web interface with the "Deploy" action. The installation was unsuccessful until I installed ovirt-hosted-engine-ha on the node. I don't see anything in the documentation saying that this is needed before installing new hosts. This is just for information and checking. After installing ovirt-hosted-engine-ha, the node was installed with HostedEngine support. But the main issue has not changed.
Thanks in advance for help.
BR,
Alexandr
3 years, 4 months
Reimport VMs after lost Engine with broken Backup
by Vinícius Ferrão
Hello,
I'm dealing with a lost hosted engine. For reasons unknown the backup is broken, and I've tried everything: redeploying with the backup file, and deploying a new engine and then restoring the backup. I changed the HE storage domain in both cases just to be sure. I also reinstalled one of the hosts just to be safe, but nothing worked.
So I just deployed a brand new engine and now I want to import the VMs back.
My scenario right now is:
ovirt1 is brand new, with the new engine VM in a new storage domain.
ovirt2 has production VMs running without any issue.
So my questions right now:
* What happens if I just add the ovirt2 machine to the new engine? Will it reboot, or will it add everything back: storage, networks, etc.?
My plan right now would be:
* Reconfigure the DC and the cluster.
* Re-add all the networks.
* Reattach the storage domains.
* Recreate all the VMs, attaching their disks.
Is there anything I can do better to reduce the work?
Thanks,
3 years, 4 months
Re: High level network advice request
by Richard Nilsson
Thanks so much for your reply Robert!
I like your setup a lot; that's where I'm going too, actually. But right now I have only one node, and I'm trying to learn a very basic setup with just the one for the moment (because I have it running after years of trying! It will take a few weeks and another motherboard / rebuild before I have the second node, I'll get there soon).
I've just learned (I think) that I can't sync new logical networks with the host, because I can't put the only host in maintenance mode... I thought that there might be a way with the CLI and restarts and all that, but there's no point, since I will have another node in a few weeks or months :)
That's okay. I'm trying to work out why I can't access my new test server from the WAN. I use split DNS with pfSense and HAProxy reverse redirects. I can get to the server test pages from the LAN via the pfSense DNS resolution (LAN), but the reverse redirects are not working from the WAN. I don't know what next step to take to debug the problem. The engine is accessible from the WAN, so I think it should work for the VM server, which is also on the default oVirt management network and uses all defaults like the hosted engine. I suspect that there is a security setting on the engine, the logical network, or maybe the server?
What should I check next?
My single node is in the same condition, which may be instructive to a noob like me... I can reach the node from the LAN but not from the WAN. So the engine is a special case. Do I need to create certificates on the VM webserver?
I'd like to see if I can set up a NextCloud server after trying a SuiteCRM server. But I started with Fedora 31 Server and a very basic LAMP stack to limit variables...
Thanks in advance. Let me know if I can ever help you with anything! I'm an Architect, but a real one; not IT but bricks and all that :)
These are the links:
engine.metrodesignoffice.com
mdowebserver.metrodesignoffice.com
3 years, 4 months
oVirt VMs status down after host reboot
by Eugène Ngontang
Hi all,
I've set up an infrastructure with oVirt, using a self-hosted engine.
I use some Ansible scripts from my virtualization host (the physical
machine) to bootstrap the hosted engine and create a set of virtual
machines on which I deploy a k8s cluster.
The deployment goes well, and everything is OK.
Now I'm doing some reboot tests, and when I reboot the physical server,
only the hosted-engine VM is up after the reboot; the rest of the VMs, and
thus the k8s cluster, are down.
Has anyone here ever experienced this issue? What can cause it, and how can
I automate virtual machine startup in RHEV/oVirt?
Thanks.
Regards,
Eugene
--
LesCDN <http://lescdn.com>
engontang(a)lescdn.com
------------------------------------------------------------
*Men need a leader, and a leader needs men! The habit does not make the
monk, but when people see you, they judge you!*
3 years, 4 months
Update 4.4
by Dirk Streubel
Hello,
I use version 4.4 for testing and I wanted to do an update.
This is the result:
LANG=C engine-setup
...
[ INFO ] Checking for product updates...
[ ERROR ] Yum [u'ovirt-engine-backend-4.4.0-0.0.master.20200122090542.git5f0a359.el7.noarch requires java-client-kubevirt >= 0.1.0']
[ INFO ] Yum Performing yum transaction rollback
[ ERROR ] Failed to execute stage 'Environment customization': [u'ovirt-engine-backend-4.4.0-0.0.master.20200122090542.git5f0a359.el7.noarch requires java-client-kubevirt >= 0.1.0']
[ INFO ] Stage: Clean up
Log file is located at /var/log/ovirt-engine/setup/ovirt-engine-setup-20200127193352-x6p04t.log
[ INFO ] Generating answer file '/var/lib/ovirt-engine/setup/answers/20200127193405-setup.conf'
[ ERROR ] Failed to execute stage 'Clean up': must be unicode, not str
[ INFO ] Stage: Pre-termination
[ INFO ] Stage: Termination
[ ERROR ] Execution of setup failed
So, I found this:
https://repo1.maven.org/maven2/org/ovirt/java-client-kubevirt/java-client...
Do I have to install a .jar to make the engine update work, or what is the
best way to do the update?
Dirk
3 years, 4 months
High level network advice request
by Richard Nilsson
High level network advice request :)
I have a self-hosted engine deployed on a node, oVirt v. 4.3. I am testing, but I don't understand the big idea of how to set up oVirt networking for hosted / engine-managed virtual servers. I would like to host a few virtual servers for things like Next/OwnCloud, SuiteCRM, NethServer or others.
For example, I know exactly how to set up a virtual machine with a CentOS / LAMP stack on a Fedora host: I can make a network bridge for the VM with the Fedora CLI, then use HAProxy (or Squid) as a reverse-redirect server to allow WAN access to the VM server using FQDNs.
What is a good strategy for hosting a webserver on oVirt? Using the default oVirt management network for the virtual server machines doesn't seem like a best practice?
Should I make a new logical network for the virtual servers? Do I need to configure bridges for the machines? It looks like bridges and virtual NICs are automatically configured when I make the network and virtual machines, is that right?
Is it the usual or typical practice that one oVirt logical network uses only one network bridge to one physical NIC? Would all of the KVM guests on the logical network share the same single bridge of that particular network? I'm not sure what the big idea should be; what is a best practice?
I wonder, should I bond several physical NICs, then point the bridge for a new, dedicated logical network for webservers to the bonded NICs? There is more than a little new vocabulary for me to onboard for oVirt / virtual / logical networks... I will greatly appreciate any top-level / best-practice advice, and I thank you in advance!
3 years, 4 months
Remembering Lars Kurth at FOSDEM
by Tal Nisan
The Virtualization community that is gathering at FOSDEM would like to
share its plans to remember the life and accomplishments of our valued
friend and colleague, Lars Kurth, who has recently passed away.
The remembrance will be at 09:45 Sunday Feb. 2 in the Virtualization
Devroom (H.1309). We invite all to attend who would like to share in their
memories of Lars and celebrate his life.
3 years, 4 months
Assign permissions from within the VM portal?
by nicolas@devels.es
Hi,
We're testing version 4.3.8, we're planning to upgrade to this version
in production as currently we're still using 4.1.9.
In 4.1.9, users could grant permissions on their created VMs to other
users from within the VM portal, however I can't find this option on
version 4.3.8.
Permissions granted to users so they can create and handle their VMs are
VmCreator and DiskProfileUser on the DataCenter.
Is there a way to allow users to grant permissions on their VMs to other
users in the VM portal?
Thanks
3 years, 4 months