You have 4 spare disks, so you could pull them from your raidz pool and use
them to create a temporary pool parallel to the existing one, then use zfs
send/receive to migrate the data. This shouldn't take much time if you are
not using huge drives.
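A rough sketch of that suggestion, using the four spare device IDs listed in the zpool status further down in this thread; the temporary pool name and snapshot names are made up, and the spares must be released from z2pool first:

```shell
# Release the four spares from the existing pool (device IDs taken
# from the zpool status output quoted in this thread).
zpool remove z2pool c0t5000C50041A6B737d0 c0t5000C50041AC3F07d0 \
                    c0t5000C50041AD48DBd0 c0t5000C50041ADD727d0

# Build a temporary striped-mirror pool from them (pool name is made up).
zpool create tmppool \
  mirror c0t5000C50041A6B737d0 c0t5000C50041AC3F07d0 \
  mirror c0t5000C50041AD48DBd0 c0t5000C50041ADD727d0

# Copy everything over, then send a small incremental delta while
# writes are quiesced, so the cutover window stays short.
zfs snapshot -r z2pool@mig1
zfs send -R z2pool@mig1 | zfs receive -F tmppool
zfs snapshot -r z2pool@mig2
zfs send -R -i @mig1 z2pool@mig2 | zfs receive -F tmppool
```

The two-pass send keeps the downtime to the duration of the final incremental delta rather than a full copy.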
On 22.04.2015 at 11:54, Maikel vd Mosselaar wrote:
Yes, we are aware of that; the problem is that it's running production, so
it's not very easy to change the pool.
On 04/22/2015 11:48 AM, InterNetX - Juergen Gotteswinter wrote:
> I expect that you are aware of the fact that you only get the write
> performance of a single disk in that configuration? I would drop that
> pool configuration, drop the spare drives, and go for a mirror pool.
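For illustration, such a mirror layout over the same six data disks would look like this (the pool name is a placeholder; the device IDs are the ones from the status output quoted below):

```shell
# Three mirrored pairs striped together; random-write IOPS scale with
# the number of vdevs, versus one vdev's worth for the raidz1 layout.
zpool create newpool \
  mirror c0t5000C5004172A87Bd0 c0t5000C50041A59027d0 \
  mirror c0t5000C50041A592AFd0 c0t5000C50041A660D7d0 \
  mirror c0t5000C50041A69223d0 c0t5000C50041A6ADF3d0
```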
>
>> On 22.04.2015 at 11:39, Maikel vd Mosselaar wrote:
>>   pool: z2pool
>>  state: ONLINE
>>   scan: scrub canceled on Sun Apr 12 16:33:38 2015
>> config:
>>
>>         NAME                       STATE     READ WRITE CKSUM
>>         z2pool                     ONLINE       0     0     0
>>           raidz1-0                 ONLINE       0     0     0
>>             c0t5000C5004172A87Bd0  ONLINE       0     0     0
>>             c0t5000C50041A59027d0  ONLINE       0     0     0
>>             c0t5000C50041A592AFd0  ONLINE       0     0     0
>>             c0t5000C50041A660D7d0  ONLINE       0     0     0
>>             c0t5000C50041A69223d0  ONLINE       0     0     0
>>             c0t5000C50041A6ADF3d0  ONLINE       0     0     0
>>         logs
>>           c0t5001517BB2845595d0    ONLINE       0     0     0
>>         cache
>>           c0t5001517BB2847892d0    ONLINE       0     0     0
>>         spares
>>           c0t5000C50041A6B737d0    AVAIL
>>           c0t5000C50041AC3F07d0    AVAIL
>>           c0t5000C50041AD48DBd0    AVAIL
>>           c0t5000C50041ADD727d0    AVAIL
>>
>> errors: No known data errors
>>
>>
>> On 04/22/2015 11:17 AM, Karli Sjöberg wrote:
>>> On Wed, 2015-04-22 at 11:12 +0200, Maikel vd Mosselaar wrote:
>>>> Our pool is configured as raidz1 with a ZIL (a normal SSD); the sync
>>>> parameter is on the default setting ("standard"), so sync is on.
>>> # zpool status ?
>>>
>>> /K
>>>
>>>> When the issue happens, the oVirt event viewer does indeed show
>>>> latency warnings. Not always, but most of the time this is followed
>>>> by an I/O storage error linked to random VMs, which are paused when
>>>> that happens.
>>>>
>>>> All the nodes use mode 4 bonding. The interfaces on the nodes don't
>>>> show any drops or errors. I checked 2 of the VMs that were paused the
>>>> last time it happened; they do have dropped packets on their
>>>> interfaces.
>>>>
>>>> We don't have a subscription with Nexenta (anymore).
>>>>
>>>> On 04/21/2015 04:41 PM, InterNetX - Juergen Gotteswinter wrote:
>>>>> On 21.04.2015 at 16:19, Maikel vd Mosselaar wrote:
>>>>>> Hi Juergen,
>>>>>>
>>>>>> The load on the nodes rises far above 200 during the event. Load on
>>>>>> the Nexenta stays normal, and there is nothing strange in the
>>>>>> logging.
>>>>> ZFS + NFS could still be the root of this. Is your pool configuration
>>>>> raidzX or mirror, with or without a ZIL? And is the sync parameter of
>>>>> the exported ZFS subvolume kept at its default, "standard"?
>>>>>
>>>>>
>>>>> http://christopher-technicalmusings.blogspot.de/2010/09/zfs-and-nfs-perfo...
>>>>>
>>>>>
>>>>>
>>>>> Since oVirt reacts very sensitively to storage latency (it throws VMs
>>>>> into an unresponsive or unknown state), it might be worth a try to do
>>>>> "zfs set sync=disabled pool/volume" to see if this changes things.
>>>>> But be aware that this makes the NFS export vulnerable to data loss
>>>>> in case of power loss etc., comparable to async NFS on Linux.
>>>>>
>>>>> If disabling the sync setting helps and you don't use a separate ZIL
>>>>> flash drive yet, adding one would very likely help to get rid of
>>>>> this.
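The two suggestions above as commands; the dataset name below is a placeholder for whatever subvolume is actually exported:

```shell
# Temporarily disable synchronous writes on the exported dataset
# (trades power-loss safety for latency; for testing only).
zfs set sync=disabled z2pool/nfsvol   # dataset name is a placeholder
zfs get sync z2pool/nfsvol            # verify the setting took effect

# Revert once the test is done:
zfs set sync=standard z2pool/nfsvol

# If the test helped, a dedicated log (ZIL) device is the safer
# long-term fix:
zpool add z2pool log <ssd-device>
```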
>>>>>
>>>>> Also, if you run a subscribed version of Nexenta, it might be
>>>>> helpful to involve them.
>>>>>
>>>>> Do you see any messages about high latency in the oVirt events
>>>>> panel?
>>>>>
>>>>>> For our storage interfaces on our nodes we use bonding in mode 4
>>>>>> (802.3ad) 2x 1Gb. The nexenta has 4x 1Gb bond in mode 4 also.
>>>>> This should be fine, as long as no node uses mode 0 / round robin,
>>>>> which would lead to out-of-order TCP packets. The interfaces
>>>>> themselves don't show any drops or errors, neither on the VM hosts
>>>>> nor on the switch itself?
>>>>>
>>>>> Jumbo Frames?
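On the CentOS 6 nodes, both points can be checked quickly; the interface name and the storage IP below are assumptions:

```shell
# Verify the bond really negotiated 802.3ad (mode 4) and all slaves
# are up.
grep -E 'Bonding Mode|MII Status|Slave Interface' /proc/net/bonding/bond0

# Error/drop counters on the bond itself.
ifconfig bond0 | grep -E 'errors|dropped'

# If jumbo frames are enabled, the MTU must match end to end; send a
# max-size unfragmented packet to the storage to prove it
# (8972 = 9000-byte MTU minus 28 bytes of IP + ICMP headers):
ping -M do -s 8972 192.168.1.10   # storage IP is a placeholder
```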
>>>>>
>>>>>> Kind regards,
>>>>>>
>>>>>> Maikel
>>>>>>
>>>>>>
>>>>>> On 04/21/2015 02:51 PM, InterNetX - Juergen Gotteswinter wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> how about load, latency, or strange dmesg messages on the Nexenta?
>>>>>>> Are you using bonded Gbit networking? If yes, which mode?
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>> Juergen
>>>>>>>
>>>>>>> On 20.04.2015 at 14:25, Maikel vd Mosselaar wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> We are running oVirt 3.5.1 with 3 nodes and a separate engine.
>>>>>>>>
>>>>>>>> All on CentOS 6.6:
>>>>>>>> 3 x nodes
>>>>>>>> 1 x engine
>>>>>>>>
>>>>>>>> 1 x storage nexenta with NFS
>>>>>>>>
>>>>>>>> For multiple weeks we have been experiencing issues where our
>>>>>>>> nodes cannot access the storage at random moments (at least that's
>>>>>>>> what the nodes think).
>>>>>>>>
>>>>>>>> When the nodes complain about unavailable storage, the load rises
>>>>>>>> to over 200 on all three nodes, which makes all running VMs
>>>>>>>> inaccessible. During this process the oVirt event viewer shows
>>>>>>>> some I/O storage error messages; when this happens, random VMs get
>>>>>>>> paused and are not resumed anymore (this happens almost every
>>>>>>>> time, but not all of the VMs get paused).
>>>>>>>>
>>>>>>>> During the event we tested the accessibility of the storage from
>>>>>>>> the nodes and it looks like it works normally; at least we can do
>>>>>>>> a normal "ls" on the storage without any delay in showing the
>>>>>>>> contents.
>>>>>>>>
>>>>>>>> We tried multiple things that we thought might be causing this
>>>>>>>> issue, but nothing has worked so far:
>>>>>>>> * rebooting storage / nodes / engine
>>>>>>>> * disabling offsite rsync backups
>>>>>>>> * moving the biggest VMs with the highest load to a different
>>>>>>>> platform outside of oVirt
>>>>>>>> * checking the wsize and rsize on the NFS mounts; storage and
>>>>>>>> nodes are correct according to the "NFS troubleshooting page" on
>>>>>>>> ovirt.org
>>>>>>>>
>>>>>>>> The environment is running in production, so we are not free to
>>>>>>>> test everything.
>>>>>>>>
>>>>>>>> I can provide log files if needed.
>>>>>>>>
>>>>>>>> Kind Regards,
>>>>>>>>
>>>>>>>> Maikel
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Users mailing list
>>>>>>>> Users@ovirt.org
>>>>>>>> http://lists.ovirt.org/mailman/listinfo/users