Zsh Mailing List Archive
Re: [PATCH] [[:blank:]] only matches on SPC and TAB
- X-seq: zsh-workers 42782
- From: Oliver Kiddle <okiddle@xxxxxxxxxxx>
- To: Zsh hackers list <zsh-workers@xxxxxxx>
- Subject: Re: [PATCH] [[:blank:]] only matches on SPC and TAB
- Date: Tue, 15 May 2018 21:06:01 +0200
- In-reply-to: <20180514155131.GC7263@chaz.gmail.com>
- References: <20180513212553.GA29028@chaz.gmail.com> <CAKc7PVDyrTMsmBSEDcMC=CNVCjOnEDVtywRYA0=UnNCBpF=7JQ@mail.gmail.com> <20180514063611.GA7263@chaz.gmail.com> <CGME20180514064505epcas3p1b2f178c595fc9bb962e4094e296ba699@epcas3p1.samsung.com> <20180514064431.GB7263@chaz.gmail.com> <firstname.lastname@example.org> <20180514123425.GA19631@chaz.gmail.com> <email@example.com> <20180514155131.GC7263@chaz.gmail.com>
Stephane Chazelas wrote:
> > It wouldn't be ridiculous to change the documentation for this case and
> > require "unsetopt multibyte" for strict byte-by-byte comparisons, which
> > is already how it works in the vast majority of other cases.
> But note that here it's not about multibyte vs singlebyte but
> whether [:blank:] honours the locale like the other POSIX
> character classes (alpha, punct...) do.
For consistency with the other character classes, I think the best option
is to follow POSIX and the other shells and have [:blank:] call iswblank().
That is, apply the patch plus whatever change the documentation needs to
I can't see it actually breaking scripts in practice. We do at least
have the option of using [$' \t'] if we want, and could add [[:BLANK:]]
or similar if needed. It does seem wrong for non-breaking spaces to be
matched, but that's an issue for NetBSD or whatever.
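As a rough illustration of the two behaviours being weighed here (this is
not the patch itself, and the helper names is_spc_or_tab and
is_locale_blank are made up for the example), the strict test and the
locale-aware test could be sketched in C like this:

```c
#include <stdbool.h>
#include <wchar.h>
#include <wctype.h>

/* Strict test: only SPC and TAB count, whatever the locale says.
 * Hypothetical helper, roughly what [$' \t'] expresses in a pattern. */
static bool is_spc_or_tab(wint_t wc)
{
    return wc == L' ' || wc == L'\t';
}

/* Locale-aware test: defer to iswblank(), so the answer depends on the
 * LC_CTYPE locale in effect (set with setlocale() by the caller). */
static bool is_locale_blank(wint_t wc)
{
    return iswblank(wc) != 0;
}
```

Both helpers agree on SPC and TAB. They can only disagree on characters
that the current locale additionally classifies as blank, which is where
the non-breaking space question comes from.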
This isn't as bad as the idiocy of [a-z] matching B-Z.
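For anyone who hasn't run into it, the effect comes from ranges being
evaluated in collation order rather than by code point. A minimal sketch,
assuming a made-up collation sequence that interleaves upper and lower
case (real locales differ in the details, and the names collation,
collation_rank and in_collated_range are purely illustrative):

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical collation order: each upper-case letter sorts just
 * before its lower-case counterpart. */
static const char collation[] =
    "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz";

/* Position of c in the collation sequence, or -1 if it is not listed. */
static int collation_rank(char c)
{
    const char *p = c ? strchr(collation, c) : NULL;
    return p ? (int)(p - collation) : -1;
}

/* True if c falls between lo and hi when compared by collation rank,
 * which is how a range such as [a-z] behaves under such an ordering. */
static bool in_collated_range(char lo, char hi, char c)
{
    int r = collation_rank(c);
    return r >= 0 && collation_rank(lo) <= r && r <= collation_rank(hi);
}
```

With this ordering, in_collated_range('a', 'z', c) is true for every
lower-case letter and for 'B' through 'Z', but false for 'A', because
'A' sorts before 'a'.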
> What a mess!
I also wish POSIX would standardise an alternative to the C locale
that's UTF-8 aware and uses ISO rather than US date formats.