Zsh Mailing List Archive
Rough Draft of Article on Writing Completion Functions

X-seq: zsh-users 4812
From: John Beppu <beppu@xxxxxxx>
To: zsh-users@xxxxxxxxxx
Subject: Rough Draft of Article on Writing Completion Functions
Date: Thu, 4 Apr 2002 15:49:32 -0800
Mailing-list: contact zsh-users-help@xxxxxxxxxx; run by ezmlm
    If anything strikes you as being wrong, let me know.
    It's probably going to be too long, and some of the
    sidebars will end up going away in the printed version,
    unfortunately.

    You'll be able to see it in the July 2002 issue of
    Linux Magazine.  It'll be interesting to compare this
    version with the version that the editors hack up.


HEAD: Power Tools
DECK: Learning to Write Zsh Completion Functions While Maintaining Your Sanity
AUTHOR: John Beppu

When one decides to start writing completion functions for Zsh, it's
really hard to know where to start, because there are many disparate
sources from which to draw insight into the workings of the
completion system.  There are the many man pages, Peter Stephenson's
I<Zsh User's Guide>, and finally there is the actual source of the
completion functions which you can find by rummaging through the
directories in C<$fpath>.  Each of these sources of information has
its own particular strengths and weaknesses, but there is no
single source of information that will quickly teach you how to
write completion functions.


SUBHEAD: The Pros and Cons of the Various Sources of Information

The most complete source of information is the man pages (zshcompsys
and zshcompwid).  However, trying to learn how to write completion
functions from them is like trying to learn how to write prose by
reading a dictionary straight through.  This is futile due to the
lack of a cohesive narrative that explains things from beginning to
end.  Thus, the man pages are best used as a reference.

The most practical source of information is the source to the shell
functions.  In them, you can see the real coding techniques and
idioms that you will have to learn, and you get to see things from
beginning to end.  The downside is that their authors had no
intention of creating teaching aids, and thus they have very few
comments in them.  This, too, is an incomplete but valuable source
of information.  They are like role models for the young completion
function programmer to look up to.  First you'll mimic what they
do, and when you mature and gain an understanding of what you're
doing, you will (without a doubt) have your own unique way of doing
things.

The one source of information that had a chance to get it right was
the I<Zsh User's Guide>.  It provides many examples followed by
explanations, but what it lacks is an example that takes a command
and then writes a completion function for it, from start to finish.
It also starts talking about some very advanced material without
preparing the reader for this onslaught.  However, it is a living
document, so let's hope that it matures and grows sensitive to what
people really want from such a document so that it can effectively
satisfy those needs.

The great tragedy of Zsh is that its developers actually made it very easy to
write completion functions, but you'd never know it by just
reading the documentation.  But no worries -- there's going to be a
happy ending.  :-)


SUBHEAD: The Basic Mechanics of the Completion System

If you look at the actual source of the completion
functions (by digging through C<$fpath>), you might notice
something.  Do you see how often the C<_arguments> function is used?
It's everywhere, and that's because it's easily the most important
function in the arsenal of functions made available to authors of
completion functions.  This is the function you must focus the
majority of your attention on.

The C<_arguments> function, like so many other auxiliary functions
that come with Zsh, is a wrapper around the C<compadd> builtin.
Ultimately, it is C<compadd> that takes a list of information and
feeds it to the core of the completion system so that Zsh can
display the results of a completion request on the terminal.  The
other command to be aware of is C<compdef>, a shell function (defined
by C<compinit>) which is used to bind completion functions to
commands.  See the sidebar on C<compadd> and C<compdef> to see them
being used in combination.

You don't need to know a lot more than this to effectively write
completion functions, so without further ado, let's look at an
example that takes a command (I<figlet> in this case), and describes
how the completion function for it was programmed.  In case you
don't know, I<figlet> is a cute little program for taking text
and rendering each letter using ASCII art fonts.  


SUBHEAD: A Completion Function for FIGlet

Look over the C<_figlet> function in I<Listing 1> and study its
contents for a moment.  Note how structurally simple it is.  There
are hardly any conditional statements at all, and that's because all
the hard work has been done for you already.  Pause for a moment to be
thankful for that, and then get ready.  (See I<Preparation> for details.)

We start by creating a file called C<_figlet> and putting it
somewhere in our C<$fpath>.  The underscore is necessary simply
because the C<compinit> function which initializes the completion
system looks in the C<$fpath> directories for files that have
leading underscores.  Then, based on what the first line of the file
is, C<compinit> will take various actions.

For completion functions, the first line of the file should say
C<#compdef> followed by the name of the command(s) this function
will perform completions for.  There are other things you can say on
the first line, but they are rarely used in practice.

Next, on lines 3 through 6, we initialize a few variables.
C<$opt_args>, C<$context>, C<$state>, and C<$line> are variables
that Zsh's completion system has special knowledge of.  C<$opt_args>
is an associative array that will have command line options like
C<-d> or C<-f> as its keys and parameters to those options (if any)
as its values.  C<$state> is a scalar variable that will be used by
C<_arguments>' state mechanism.  The other two special variables,
C<$context> and C<$line>, won't be used directly in this function,
but C<_arguments> will be using them behind the scenes.  The only
reason they are declared local in this C<_figlet> function is to
prevent namespace pollution.  (Zsh uses dynamic scoping for its
variables rather than lexical scoping.)
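
To see why the C<local> declarations matter, here is a small sketch of
dynamic scoping that you can paste into a shell.  The function names
(C<set_state>, C<leaky>, and C<tidy>) are made up for illustration;
C<set_state> merely stands in for what C<_arguments> does when it
assigns to C<$state>.

```shell
# set_state stands in for _arguments: it assigns to "state"
# without declaring it.  (Hypothetical name, for illustration only.)
set_state() {
  state=fonts
}

# No local declaration here, so the assignment lands in the
# global scope and is still there after leaky returns.
leaky() {
  set_state
}

# With a local declaration, set_state writes to tidy's own copy,
# which disappears when tidy returns.
tidy() {
  local state
  set_state
  echo "inside tidy: $state"
}

tidy
echo "after tidy: ${state:-unset}"
leaky
echo "after leaky: ${state:-unset}"
```

In a fresh shell, the value set inside C<tidy> does not survive the
call, while the one set through C<leaky> does.  That leftover global is
the namespace pollution that the C<local> line in C<_figlet> avoids.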

There's one other variable, C<$fontdir>, and it is unique to the
C<_figlet> function.  Its purpose is to hold the directory where all
the figlet fonts are stored.  We can find out what this directory is
by executing C<figlet -I2>, but notice how we don't do things so
simply on line 6.  Instead we ask the C<_call_program> function to
execute C<figlet -I2> for us, and we also redirect C<STDERR> to
C</dev/null>.  Whenever you want to save the output of an external
command, you should get into the habit of using C<_call_program>,
because it lets advanced users override the actual program that is
called.  Doing this for C<_figlet> is actually a very contrived
thing to do, but it's something that completion function writers
should be aware of.
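
To make the override concrete, here is a rough sketch of what an
advanced user might put in I<~/.zshrc>.  The path
C</opt/figlet/bin/figlet> is invented for illustration, and the context
pattern is an assumption about how C<_call_program> looks up the
C<command> style for the C<path> tag used on line 6, so check
C<man zshcompsys> before relying on it.

```shell
# ~/.zshrc fragment (zsh only).
# Tell the completion system which command to run whenever _figlet
# asks _call_program for the "path" tag.  The style value replaces
# the whole command line, so the -I2 argument is repeated here.
# /opt/figlet/bin/figlet is a made-up location.
zstyle ':completion:*:figlet:*:path' command /opt/figlet/bin/figlet -I2
```

Had line 6 called C<figlet> directly, a setting like this would have
had no effect on it.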

With the initialization phase of our function behind us, we're
finally ready to call C<_arguments>.  The genius of C<_arguments> is
that it reduces the task of writing completion functions to an
exercise in taking human-readable documentation and turning it into
something machine-readable.  If you have the man page for I<figlet>
handy, look at it, and then look at lines 9 through 34.  Can you see
the parallels between the two?

Let's start by looking at lines 9 through 12 which describe the options
for line justification.  Each of these lines has 3 distinct parts.

  - The exclusion list which is everything between parentheses ( )
  - The option
  - The description which is between square brackets [ ]

The purpose of the exclusion list becomes clear when you ask yourself,
"Does it make sense to be able to be left justified and center justified
at the same time?"  No, it's illogical, but Zsh doesn't know that, so
you have to tell it explicitly.  To use line 12 as a specific example,
it is saying that if there's a C<-r> on the command line, do not provide
C<-x>, C<-l>, or C<-c> as completions from now on.  Exclusion lists are
a very useful feature, but you don't always need them, so they are
purely optional.
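
Here is a minimal sketch of exclusion lists in isolation.  The command
C<mytool> and its options are hypothetical; only the shape of the specs
is taken from I<Listing 1>.

```shell
# Completion for a made-up command "mytool".  -v and -q exclude
# each other; -n has no exclusion list and is always offered.
_mytool() {
  _arguments -s \
    '(-q)-v[print extra detail]' \
    '(-v)-q[print nothing but errors]' \
    '-n[dry run]'
}

# Register the function only when the completion system is loaded.
if command -v compdef >/dev/null 2>&1; then
  compdef _mytool mytool
fi
```

With this loaded in an interactive Zsh, typing C<mytool -v -> and
pushing tab should offer C<-n> but no longer C<-q>.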

That's all you have to know for simple single-letter options, and if you look
closely, that just enabled us to cover lines 9 through 25.

However, on line 26, we encounter a new kind of option: one that takes an
argument.  The C<-w> option tells I<figlet> how many characters wide the
output should be, and it takes an integer as an argument.  There are
only 2 differences here.  The first difference is that the option part
has a + after it, so it now says C<-w+>.  The + means that the argument
may either follow the option directly in the same word (C<-w40>) or be
given as the next word (C<-w 40>).  The second difference is the
additional colon-separated string after the description.  Its presence
is what tells C<_arguments> that the option takes an argument, and it
is used as a hint when the user presses tab.  However, the hint is only
visible when you've set your verbosity up as described in
I<Preparation>.
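
The + is one of several suffixes that C<_arguments> understands.  The
sketch below lines them up side by side for a hypothetical command
called C<argdemo>; the option letters are invented, and each spec ends
in a single-space action, which the man page describes as the way to
show the hint without generating any matches.

```shell
# Four ways to say where an option's argument may appear.
# "argdemo" and its options are made up for illustration.
_argdemo() {
  _arguments \
    '-a[argument is the next word, as in -a VALUE]:value: ' \
    '-b-[argument is glued on, as in -bVALUE]:value: ' \
    '-c+[either -cVALUE or -c VALUE]:value: ' \
    '-d=[either -d=VALUE or -d VALUE]:value: '
}

# Register the function only when the completion system is loaded.
if command -v compdef >/dev/null 2>&1; then
  compdef _argdemo argdemo
fi
```

The + form is a reasonable default for single-letter options that are
parsed getopt-style, which is presumably why I<Listing 1> uses it
throughout.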

Line 29 adds another variation.  Here we have an option whose
argument is easily predictable.  There are only 6 possible
arguments that C<-I> can take, so after the hint, we list those 6
arguments in parentheses.

Line 30 adds yet another variation.  The C<-d> option wants
a directory as an argument, so we use the C<_path_files> function to
build the list for us.  Like C<_arguments>, C<_path_files> is also a
front-end to C<compadd>.

Line 31 is the last variation C<_figlet> is going to need.  Sometimes,
building a completion list for an option's arguments is non-trivial, so
you might want to (or have to) handle it with custom code.  That's when
we use the I<state mechanism>.  See where it says, "->fonts"?  That
means, "Let me handle it, and set the C<$state> variable to 'fonts'."
Before this, C<_arguments> was practically doing all the work, and the
flow of control never went past line 34.  

However, when we use the state mechanism, we're going to fall through
and hit line 36 to adjust the C<$fontdir> directory if necessary, and
then on line 38, we're going to hit the case statement.  Based on the
value of the C<$state> variable, we're either going to build a
completion list for fonts or control files.  In both cases, we're going
to use the C<_files> function which appropriately enough builds
completion lists for files.  Using the C<-W> option, we root the search
for completions in whatever C<$fontdir> is, and using the C<-g> option,
we specify a glob pattern that limits which files are returned in the
completion list.   In fact, the glob pattern is the only difference
between font completion and control file completion, and even the glob
pattern is only different by one letter.  What a waste of code.  Oh
well.  At least it works, and it's not ugly.
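
If the duplication bothers you, one possible way to fold the two
branches together is sketched below.  This is not part of the original
C<_figlet>; the helper name C<_figlet_state_files> is made up, and it
relies on dynamic scoping to see the caller's C<$state> and
C<$fontdir>.

```shell
# A possible stand-in for the case statement in Listing 1.
# Picks the file extension from $state, then makes one _files call.
# Returns 1 when $state names nothing we know how to complete.
_figlet_state_files() {
  local ext
  case $state in
    (fonts)    ext=flf ;;
    (controls) ext=flc ;;
    (*)        return 1 ;;
  esac
  _files -W "$fontdir" -g "*${ext}*(:r)"
}
```

Inside C<_figlet>, the whole case statement could then shrink to
C<_figlet_state_files && return 0>.  Whether that is clearer than two
nearly identical lines is a matter of taste.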

So alright, we're done.  Oh, one more thing.  Return 0 when you succeed
and return 1 (or non-zero) when you fail.  That's it.


SUBHEAD: Topic Review and Suggestions for Further Study

Well, that's not really it.  Writing a completion function can be a lot
more complicated (or creative depending on your point of view) than
this.  If you found this too easy, take a look at the completion
functions for I<tar>, I<cvs>, and I<ssh> for starters.  They can teach
you a lot, but your code reading skills have to be fairly strong.

Still, you can do a lot with what you've learned here.  You know:

    - where to put your completion functions
    - how completion functions should start
    - how to use C<_arguments> for
        + single letter options
        + single letter options with arguments
        + single letter options with predictable arguments
        + single letter options with directory arguments
        + single letter options with complex arguments that you need to
                                handle yourself

Anything you don't know can be found by looking at the man pages.
The I<Zsh User's Guide> has some advanced tips hiding in it if you look.
And again, the source of other completion functions is a good place to
look for real-life examples.  Writing your own completion functions
makes the Zsh experience better for you and everyone else (if you
choose to contribute your work back).


BIO: John Beppu <beppu@xxxxxxxx> would like to send a shout-out to Oliver
Kiddle, Felix Rosencrantz, and Bart Schaefer for their invaluable help.


[ BEGIN LISTING 1 - _figlet ]

C<
#compdef figlet

typeset -A opt_args
local context state line
local fontdir
fontdir=$(_call_program path figlet -I2 2>/dev/null)

_arguments -s -S \
  "(-l -c -r)-x[use default justification of font]" \
  "(-x -c -r)-l[left justify]" \
  "(-x -l -r)-c[center justify]" \
  "(-x -l -c)-r[right justify]" \
  "(-S -s -o -W -m)-k[use kerning]" \
  "(-k -s -o -W -m)-S[smush letters together or else!]" \
  "(-k -S -o -W -m)-s[smushed spacing]" \
  "(-k -S -s -W -m)-o[let letters overlap]" \
  "(-k -S -s -o -m)-W[wide spacing]" \
  "(-p)-n[normal mode]" \
  "(-n)-p[paragraph mode]" \
  "(-E)-D[use Deutsch character set]" \
  "(-D)-E[use English character set]" \
  "(-X -R)-L[left-to-right]" \
  "(-L -X)-R[right-to-left]" \
  "(-L -R)-X[use default writing direction of font]" \
  "(-w)-t[use terminal width]" \
  "(-t)-w+[specify output width]:output width (in columns)" \
  "(-k -S -s -o -W)-m+[specify layout mode]:layout mode" \
  "(-I)-v[version]" \
  "(-v)-I+[display info]:info code:(-1 0 1 2 3 4)" \
  "-d+[specify font directory]:font directory:_path_files -/" \
  '-f+[specify font]:font:->fonts' \
  '(-N)-C+[specify control file]:control file:->controls' \
  "(-C)-N[clear controlfile list]" \
  && return 0

(( $+opt_args[-d] )) && fontdir=$opt_args[-d]

case $state in
(fonts)
  _files -W $fontdir -g '*flf*(:r)' && return 0
  ;;
(controls)
  _files -W $fontdir -g '*flc*(:r)' && return 0
  ;;
esac

return 1
>

[ END LISTING 1 ]


[ BEGIN SIDEBAR 1 - compadd and compdef ]

[ JSB: I'd like to leave this sidebar in, but it's optional ]

You can try this example from the command line.

- First, we define a function which uses C<compadd> to feed values
  into the completion system.
- Then we use C<compdef> to bind the function to the command, C<f>.
- Finally, we let the completion function go into action by trying
  to execute the C<f> command and pushing tab a lot.  Note that the
  completion system works regardless of whether the C<f> command
  exists or not.

C<
% _f() { compadd a b c x y z }

% compdef _f f

% f [TAB][TAB][TAB][TAB]
>

[ END SIDEBAR 1 ]


[ BEGIN SIDEBAR 2 - A Public Service Announcement ]

For those of you just joining us, Zsh has quite the completion
system.  Your tab key will have answers to practically everything if
you put the following in your I<.zshrc> and start up I<zsh>.

C<
autoload -U compinit
compinit
>

Try running C<compinstall>, too.  It will let you interactively
configure the behaviour of the completion system.

[ END SIDEBAR 2 ]


[ BEGIN SIDEBAR 3 - Preparation ]

B<Verbosity>: When you push tab, Zsh can provide you with very
descriptive completion lists, and having the following lines in your
I<~/.zshrc> is one way to tell Zsh to do so.  (Thanks to Oliver Kiddle
for these.)

C<
zstyle ':completion:*' verbose yes
zstyle ':completion:*:descriptions' format '%B%d%b'
zstyle ':completion:*:messages' format '%d'
zstyle ':completion:*:warnings' format 'No matches for: %d'
zstyle ':completion:*' group-name ''
>

B<Where To Put Your Functions>:  You should set aside a directory
somewhere under C<$HOME> for your personal Zsh functions.  As an example,
assume you decided to put your functions in I<~/fun>.  Your
I<~/.zshrc> should be modified to have something like:

C<
fpath=(~/fun $fpath)
autoload -U ~/fun/*(:t)
>

Doing this will make all the functions you put in I<~/fun> available
to any Zsh session you start up.  Also, by storing your completion
functions in a directory that's in C<$fpath>, they become visible to the 
C<compinit> function which is what initializes the completion system.


B<Reloading Functions>:  In the course of writing your own functions,
you will find yourself having to reload the function over and over for
debugging purposes.  Here's a function to blindly C<unfunction> and then
C<autoload> everything in I<~/fun>.  It might seem wasteful, but in
practice it only takes an instant to run.

C<
r() {
    local f
    f=(~/fun/*(.))
    unfunction $f:t 2> /dev/null
    autoload -U $f:t
}
>

[ JSB: the part about Documentation can be taken out if you hit
  space limitations. ]

B<Documentation>:  Searching for the right piece of documentation can take
longer than you expect, and if you do it repeatedly, it can really wear you
down.  To avoid this unnecessary fatigue, the trick is to do all your
searching at the start of your coding session.  This usually means having a
lot of man pages open at once.  Imagine this:

5 instances of C<man zshexpn> with each instance focused on one of
"Modifiers", "PARAMETER EXPANSION", "Parameter Expansion Flags",
"Globbing Flags", or "Glob Qualifiers".

2 instances of C<man zshcompsys> with one instance on the description
of "_arguments" and another one free to roam.

1 instance of C<man zshcompwid> scrolled down to the part 
on the "compadd" builtin.

You get the idea.  How you organize this information on your desktop is
up to you.  Using multiple terminals in combination with programs like
I<screen> might be a good idea, though.  Also, the same content is
available as HTML or GNU Info manuals if you prefer those formats.
Find out what works for you, and do it.

[ END SIDEBAR 3 ]



[ BEGIN SIDEBAR 4 - Refinements ]

[ JSB: this entire sidebar is optional ]

Oliver Kiddle from the zsh-workers mailing list suggested the following
improvements to I<_figlet>.

B<Adding More Documentation>: On line 29, Zsh is told that the C<-I>
option takes an integer between -1 and 4 as an argument.  It would be
even more helpful if we could say what those numbers meant.  That can
be done by doing:

C<
"(-v)-I+[display info]:info code:((
  -1\:normal\ operation\ \(default\)
  0\:version,\ copyright\ and\ usage\ information
  1\:version\ in\ integer\ format
  2\:default\ font\ directory
  3\:name\ of\ font\ figlet\ would\ use
  4\:output\ width\ in\ columns
))"
>

B<Not Using States>:  By making lines 31 and 32 not use states, line 4 and
lines 34 through 47 can be deleted.  The key to making this work is realizing
that C<$opt_args> is available during the call to I<_arguments> as well as
after.  Line 31 would be replaced by the following.

C<
'-f+[specify font]:font:_files \
    -W ${~opt_args[-d]\:-$fontdir} \
    -g \*flf\*\(\:r\)' \
>

For printing purposes, this line has been split up, but in reality, it should
all be on a single line.  Also, note how it is single-quoted to prevent
C<$opt_args> and C<$fontdir> from being expanded prematurely.

[ END SIDEBAR 4 ]


[ BEGIN SIDEBAR 5 - Resources ]

Zsh
    http://www.zsh.org/

FIGlet
    http://ianchai.50megs.com/figlet.html

Felix Rosencrantz's XML-based Completion Generator
    http://www.geocities.com/f_rosencrantz/xml_completion.htm

[ END SIDEBAR 5 ]
Follow-Ups:
- Re: Rough Draft of Article on Writing Completion Functions
  - From: Peter Stephenson