Zsh Mailing List Archive

Re: compmatch behaviour



On May 19, 11:34am, Peter Stephenson wrote:
}
} Bart Schaefer <schaefer@xxxxxxxxxxxxxxxx> wrote:
} > There are two situations being handled
} > simultaneously here, and maybe the first thing to do is to separate
} > them.  The first situation is where wpat is a correspondence class
} > and we need to select the corresponding position out of lpat.  The
} > second case is where lpat is an equivalence class and we need to try
} > every possible character in the class at line position *lp.
} 
} Hmm... terminology first... Sven's "correspondence class" appears
} to be the one with the "equiv" flag set, i.e. {...}. So here the
} characters are numbered and we are searching for a particular one.

Actually Sven has, again, overloaded something with a similar structure
to serve multiple purposes.  There are two possible cases:
(1) lpat->equiv is false OR wpat does not exist: an equivalence class.
    [every existing character position in lpat->tab is tried at *lp]
(2) wpat exists AND lpat->equiv is true: a correspondence class.
    [the character wpat->tab[*mword] must have a position in lpat->tab]

Case (1) also occurs as a degenerate form of (2), when wpat->tab has
no entry for the current character in mword.  I'm not sure why that's
the correct behaviour.
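
To make the two cases concrete, here is a minimal sketch in Python
rather than the C of the real matcher.  It assumes each pattern's tab
can be modelled as a plain string of characters; the function name
candidates() and its arguments are hypothetical, and returning nothing
when lpat is shorter than wpat is my own guess, not something the
existing code is known to do.

```python
def candidates(lpat_tab, lpat_equiv, wpat_tab, mword_char):
    """Return the characters to try at the current line position.

    lpat_tab / wpat_tab model lpat->tab and wpat->tab as plain strings
    (wpat_tab is None when there is no wpat); lpat_equiv models
    lpat->equiv; mword_char models *mword.
    """
    if wpat_tab is not None and lpat_equiv:
        # Case (2): correspondence class, selected by position.
        pos = wpat_tab.find(mword_char)
        if pos != -1:
            # Assumption: a position past the end of lpat_tab means
            # there is nothing to try.
            return [lpat_tab[pos]] if pos < len(lpat_tab) else []
        # No entry in wpat for this character: fall through to case (1).
    # Case (1): equivalence class, every member is a candidate.
    return list(lpat_tab)
```

With these assumptions, candidates("abc", True, "ABC", "B") selects
only "b", while the same call with lpat_equiv false, or with a
character that is not in wpat_tab, offers all of "a", "b" and "c".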

} However, in my rewrite I want to be able to say "any upper case
} character" so that it can match the corresponding lower case
} character.

If it's only going to match the corresponding lower case character, then
you have [:upper:] in wpat->tab and you need to simulate case (2) above.
If your lookup in lpat->tab returns [:lower:], convert *mword to lower
case and you're done.  I have no idea how you plan to handle something
like [:upper:] mapping to [:digit:], though.  There's a reason Sven
chose to require enumeration: this works more like "tr" than like "sed".
The two classes in case (2) ought to have the same number of values,
because it's the positions in each class that have to match up.
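
The "tr"-like behaviour can be sketched as follows, again in Python
and only as an illustration.  The function correspond(), the NAMED set
and the string representation of the classes are all hypothetical.
Enumerated classes are mapped by position; the one named pair handled
is [:upper:] to [:lower:], by case conversion; any other named pairing
is rejected because it defines no obvious positions.

```python
NAMED = {"[:upper:]", "[:lower:]", "[:digit:]"}  # illustrative subset


def correspond(ch, wclass, lclass):
    """Map ch from wclass to its counterpart in lclass, or None.

    Classes are either one of the NAMED strings or an enumerated
    string of characters such as "abc".
    """
    if wclass == "[:upper:]" and lclass == "[:lower:]":
        # The easy named pair: case conversion supplies the position.
        return ch.lower() if ch.isupper() else None
    if wclass in NAMED or lclass in NAMED:
        # E.g. [:upper:] -> [:digit:]: no positional meaning.
        raise ValueError("no positional mapping between these classes")
    if len(wclass) != len(lclass):
        # Positions must line up, as with tr.
        raise ValueError("classes must have the same number of values")
    pos = wclass.find(ch)
    return lclass[pos] if pos != -1 else None
```

Under those assumptions correspond("b", "abc", "xyz") picks the second
member of the target class, and correspond("Q", "[:upper:]",
"[:lower:]") lowers the character.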

The bit you're worried about, though, is when you have [:upper:] in
lpat->tab and either no wpat or no character in wpat->tab for *mword.
Then you need to try all the possible upper case characters.  Sven's
algorithm seems to be to build every possible combination all the way
out to the end of the line and then compare entire words, discarding
non-matches.  I would think it's possible to try matching the prefix
so far, so that you can short-circuit the rest of the process on a
non-match.


