Zsh Mailing List Archive
Messages sorted by: Reverse Date, Date, Thread, Author

UTF-8 non-breaking spaces



> If you're copy-pasting from an edit in browser gmail, for example, it
> has a tendency to insert non-breaking spaces whenever there is more
> than one consecutive space, which the shell interprets as
> non-whitespace and attempts to execute as commands.

The non-breaking space in this case is U+00A0, which in UTF-8 is the
two bytes 0xC2 0xA0, or "\M-B\M- " in bindkey syntax.  The error
message is equally confusing because you still can't see the
non-breaking spaces when "not found" is reported.
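
To make the byte values concrete, here is a minimal sketch that should
run in any POSIX-like shell.  count_args is a hypothetical helper made
up for this illustration, not anything zsh provides.

```shell
# U+00A0 in UTF-8 is the two bytes 0xC2 0xA0 (octal 302 240).
nbsp=$(printf '\302\240')

# bindkey's \M-x notation sets the high bit of x:
# 'B' is 0x42 and ' ' is 0x20.
printf '%x %x\n' $((0x42 | 0x80)) $((0x20 | 0x80))   # prints "c2 a0"

# Hypothetical helper: count the words the shell parser sees in a string.
# eval is used only so the string is parsed like a command line; do not
# pass untrusted input to it.
count_args() {
  eval "set -- $1"
  echo $#
}

count_args "echo hi"          # ordinary space: two words
count_args "echo${nbsp}hi"    # nbsp is not a separator: one word
```

The second call is the situation from the quoted text: the whole string
is taken as a single command name, which is then "not found".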

Handling this is complicated by bracketed-paste, which protects the
non-breaking spaces from (for example) { bindkey -s '\M-B\M- ' ' ' }.

"unsetopt multibyte" does not affect this, but LANG=C results in (for
example):

(In gmail editor)
 echo " " "  "
(Pasted at shell prompt)
% echo " " "<c2><a0> "

That's purely a ZLE display thing: the actual nbsp is still output when
the command executes, but at least you can see what's going on.
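
If you only want to see where the non-breaking spaces are without
switching LANG, something like the following rough sketch could do it.
show_nbsp is a hypothetical name, and the "<c2><a0>" marker just
imitates the display above.

```shell
# The UTF-8 encoding of U+00A0, built portably with octal escapes.
nbsp=$(printf '\302\240')

# Hypothetical helper: print its arguments with every UTF-8 non-breaking
# space replaced by a visible "<c2><a0>" marker.  LC_ALL=C makes sed
# treat the input as plain bytes regardless of the current locale.
show_nbsp() {
  printf '%s\n' "$*" | LC_ALL=C sed "s/$nbsp/<c2><a0>/g"
}

show_nbsp "echo \" \" \"$nbsp \""
```

This only changes what is printed; the string itself is left alone.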

(The non-breaking spaces go back to normal spaces in sent email, I
believe, or at least do so when the message is displayed in gmail;
this is just a "thing" in the browser text editor.)

Similar goofiness can result when copy-pasting from other "smart"
multibyte editors when zsh has a UTF-8 variant in $LANG.

Any good suggestions for how to deal with this in a non-confusing
fashion?  Everything I've thought of (short of hacking up the lexer)
risks corrupting parts of the input that aren't intended to be word
separators (the bindkey -s above has that problem, for example, if
bracketed-paste is disabled).
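
For illustration only, here is a rough sketch of one such approach:
wrapping the bracketed-paste widget so that pasted text is rewritten
before it reaches the buffer.  The function names are hypothetical, it
assumes the pasted text is UTF-8, and it has exactly the problem
described above: a non-breaking space inside a quoted string gets
replaced too.

```shell
# Hypothetical helper: set REPLY to $1 with each UTF-8 non-breaking
# space (bytes 0xC2 0xA0) replaced by an ordinary space.
nbsp_to_space() {
  local nbsp=$'\xc2\xa0'
  REPLY=${1//$nbsp/ }
}

# The widget part only makes sense in zsh (e.g. from ~/.zshrc).
if [ -n "${ZSH_VERSION-}" ]; then
  paste_nbsp_to_space() {
    local pasted REPLY
    # Called from a widget with a parameter name, the builtin widget
    # stores the pasted text in that parameter instead of inserting it.
    zle .bracketed-paste pasted
    nbsp_to_space "$pasted"
    # Unlike the builtin, this sketch does not update the cut buffer
    # or honor a numeric argument.
    LBUFFER+=$REPLY
  }
  # Replace the user-visible bracketed-paste widget with the wrapper.
  zle -N bracketed-paste paste_nbsp_to_space
fi
```

The helper can be tried on its own, for example with
nbsp_to_space "$(printf 'echo\302\240hi')" followed by print -r -- "$REPLY".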



