Zsh Mailing List Archive
PATCH: wordcode

X-seq: zsh-workers 10041
From: Sven Wischnowsky <wischnow@xxxxxxxxxxxxxxxxxxxxxxx>
To: zsh-workers@xxxxxxxxxxxxxx
Subject: PATCH: wordcode
Date: Fri, 10 Mar 2000 10:28:31 +0100 (MET)
Mailing-list: contact zsh-workers-help@xxxxxxxxxxxxxx; run by ezmlm
So...

This patch does three things:

- It makes lookup of autoloaded functions work as suggested by Zefram
  in 9969. I.e. if you have fpath=(foo) and autoload the function bar,
  we now look at (in this order):

  - the digest file foo.zwc (if it exists)
  - the wordcode file foo/bar.zwc
  - the normal file foo/bar

  The youngest of these that actually contains a definition for bar is 
  taken.

- It also makes lookup of other digest files (those not named after a
  directory) work as suggested by Bart, i.e. the .zwc suffix has to be
  given in $fpath (and in that case no further searching for
  directories or other files is done).

- Finally, it implements my first attempt at supporting compiled
  sourced files. Whenever the shell tries to source a file foo, it
  first checks whether there is a file foo.zwc, whether that file is
  younger than foo and a valid wordcode file, and whether it contains
  the code for foo. If all that is true, the wordcode file is used.
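
The decision order in the first and third items can be sketched as a
pair of pure C functions. This is only an illustration modelled on the
checks in the patch below, not the patch code itself: struct candidate,
enum choice, pick_autoload() and use_wordcode_for_source() are
hypothetical names, and the has_def field stands in for whatever
check_dump_file() reports. Note that the comparisons are strict, so a
wordcode file with the same mtime as the original is not used.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical description of one candidate file; not a zsh type. */
struct candidate {
    int exists;    /* non-zero if stat() on the file succeeded */
    time_t mtime;  /* modification time, meaningful only if exists */
    int has_def;   /* non-zero if the wordcode file defines the name */
};

/* USE_PLAIN_OR_NONE means: no wordcode file qualifies, so the caller
 * falls back to the uncompiled file (if there is one). */
enum choice { USE_PLAIN_OR_NONE, USE_DIGEST, USE_WORDCODE };

/* Non-zero if `b' is missing or `a' is strictly newer than `b'. */
static int
newer_or_alone(const struct candidate *a, const struct candidate *b)
{
    return !b->exists || a->mtime > b->mtime;
}

/* Autoload lookup for one $fpath element `foo' and function `bar':
 * digest = foo.zwc, wc = foo/bar.zwc, plain = foo/bar. */
enum choice
pick_autoload(const struct candidate *digest,
              const struct candidate *wc,
              const struct candidate *plain)
{
    assert(digest && wc && plain);

    /* Digest file: must beat both other files and define the name. */
    if (digest->exists &&
        newer_or_alone(digest, wc) &&
        newer_or_alone(digest, plain) &&
        digest->has_def)
        return USE_DIGEST;

    /* Per-function wordcode file: must beat the uncompiled file. */
    if (wc->exists && newer_or_alone(wc, plain) && wc->has_def)
        return USE_WORDCODE;

    return USE_PLAIN_OR_NONE;
}

/* Sourced files: wc = foo.zwc, plain = foo. Returns non-zero if the
 * wordcode file should be executed instead of reading foo. */
int
use_wordcode_for_source(const struct candidate *wc,
                        const struct candidate *plain)
{
    assert(wc && plain);
    return wc->exists && newer_or_alone(wc, plain) && wc->has_def;
}
```

With a digest that is newer than both foo/bar.zwc and foo/bar and that
defines bar, pick_autoload() returns USE_DIGEST. If foo/bar is the
newest of the three, it returns USE_PLAIN_OR_NONE and the uncompiled
file is read.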

With that, it should be possible to safely create, for example, digest 
files for the completion directories with something like

  for i in *(/); do zcompile ${i}.zwc ${i}/_*~*~; done

to create wordcode files for compinit/compdump (probably better to use
the -r option here) and to create wordcode files for one's init files,
which is as simple as:

  cd; zcompile -r .zshrc; zcompile -r .zcompdump; ...

Again, the -r is, of course, only a suggestion. I've even written that 
suggestion in the docs because without -r, if, for example, .zshrc
defines a function, the whole file will stay mapped because that
function still uses the code from the mapping. On the other hand, if
you often start new shells, mapping may be good to have (due to the
sharing of mapped files).

Dunno, should I take the comment about -r for sourced files out of the 
docs again?


Anyway: executing sourced files from wordcode files probably does not
behave exactly like sourcing the file because this is the simplest
possible implementation -- I haven't tried to ensure everything loop() 
does is done for them, too. So, if you compile a sourced file and it
behaves differently from the original file, I'd like to hear about it.

Bye
 Sven

diff -ru ../z.old/Doc/Zsh/builtins.yo Doc/Zsh/builtins.yo
--- ../z.old/Doc/Zsh/builtins.yo	Thu Mar  9 15:46:52 2000
+++ Doc/Zsh/builtins.yo	Fri Mar 10 10:02:09 2000
@@ -33,6 +33,11 @@
 they become the positional parameters; the old positional
 parameters are restored when the var(file) is done executing.
 The exit status is the exit status of the last command executed.
+
+If a file named `var(file)tt(.zwc)' exists, is newer than var(file)
+and is a wordcode created with the tt(zcompile) builtin containing the 
+contents of var(file), that file will be used. This allows to speed up 
+processing of scripts by creating pre-compiled wordcode files for them.
 )
 findex(NOTRANS(:))
 cindex(expanding parameters)
@@ -1286,10 +1291,10 @@
 findex(zcompile)
 cindex(wordcode, creation)
 cindex(compilation)
-xitem(tt(zcompile) [ tt(-U) ] [ tt(-r) | tt(-m) ] var(file) [ var(function) ... ])
+xitem(tt(zcompile) [ tt(-U) ] [ tt(-r) | tt(-m) ] var(file) [ var(name) ... ])
 item(tt(zcompile -t) var(file) [ var(name) ... ])(
 This builtin command can be used to create and display files
-containing the wordcode for functions. In the first form, a wordcode
+containing the wordcode for functions or scripts. In the first form, a wordcode
 file is created. If called with only the var(file) argument, the
 wordcode file has the name `var(file)tt(.zwc)' and will be placed in
 the same directory as the var(file). This will make the wordcode file
@@ -1303,25 +1308,30 @@
 )
 for a description of how autoloaded functions are searched).
 
-If there is at least one var(function) argument, the wordcode for all
-these functions will be put in the created wordcode var(file) (if that 
+If there is at least one var(name) argument, the wordcode for all
+these files will be put in the created wordcode var(file) (if that 
 name does not end in tt(.zwc), this extension is automatically
-appended). Such files containing the code for multiple functions are
-intended to be used as elements of the tt(FPATH)/tt(fpath) special array.
+appended). Such digest files are intended to be used as elements of
+the tt(FPATH)/tt(fpath) special array.
 
-If the tt(-U) option is given, aliases in the var(function)s will not
-be expanded. If the tt(-r) option is given, the function(s) in the
+If the tt(-U) option is given, aliases in the var(name)d files will not
+be expanded. If the tt(-r) option is given, the wordcode in the
 file will be read and copied into the shell's memory when they are
-autoloaded. If the tt(-m) option is given instead, the wordcode file
+used. If the tt(-m) option is given instead, the wordcode file
 will be mapped into the shell's memory. This is done in such a way
 that multiple instances of the shell running on the same host will
-share this mapped function. If neither tt(-r) nor tt(-m) are given,
+share this mapped file. If neither tt(-r) nor tt(-m) are given,
 the tt(zcompile) builtin decides which style is used based on the size 
 of the resulting wordcode file. On some systems it is impossible to
-map wordcode files into memory. On such systems, the functions will
-only be read from the files, independent on the mode selected when the 
+map wordcode files into memory. On such systems, the wordcode will
+only be read from the file, independent on the mode selected when the 
 file was created.
 
+When creating wordcode files for scripts instead of functions, it is
+often better to use the tt(-r) option. Otherwise the whole wordcode
+file will remain mapped if the script defined one or more functions
+even if the rest of the file will not be used again.
+
 In every case, the created file contains two versions of the wordcode, 
 one for big-endian machines and one for small-endian machines. The
 upshot of this is that the wordcode file is machine independent and if 
@@ -1329,12 +1339,12 @@
 (and mapped).
 
 In the second form, with the tt(-t) option, an existing wordcode file is
-tested. Without further arguments, the names of the function files
+tested. Without further arguments, the names of the original files
 used for it are listed. The first line tells the version of the shell
 the file was created with and how the file will be used (mapping or
 reading the file). With arguments, only the return value is set
-to zero if all var(name)s name functiones defined in the file and
-non-zero if at least one var(name) is not contained in the wordcode file.
+to zero if all var(name)s name files contained in the wordcode file and
+non-zero if at least one var(name) is not contained in it.
 )
 findex(zmodload)
 cindex(modules, loading)
diff -ru ../z.old/Doc/Zsh/files.yo Doc/Zsh/files.yo
--- ../z.old/Doc/Zsh/files.yo	Thu Mar  9 15:46:53 2000
+++ Doc/Zsh/files.yo	Fri Mar 10 10:02:09 2000
@@ -46,3 +46,7 @@
 be executed when zsh is invoked with the `tt(-f)' option.
 ifnzman(includefile(Zsh/filelist.yo))
 
+For all of these files pre-compiled wordcode files may be created with
+the tt(zcompile) builtin command. If such a files exists (names like
+the original file plus the tt(.zwc) extension) and it is younger than
+the original file, the wordcode file will be used instead.
diff -ru ../z.old/Doc/Zsh/func.yo Doc/Zsh/func.yo
--- ../z.old/Doc/Zsh/func.yo	Thu Mar  9 15:46:53 2000
+++ Doc/Zsh/func.yo	Fri Mar 10 10:02:10 2000
@@ -33,14 +33,25 @@
 cindex(functions, autoloading)
 A function can be marked as em(undefined) using the tt(autoload) builtin
 (or `tt(functions -u)' or `tt(typeset -fu)').  Such a function has no
-body.  When the function is first executed, each element of the tt(fpath)
-variable will first be searched for a file with the same name as the
-function plus the extension tt(.zwc) and then with the name of the
-function.  The first file will only be used if it was created with the 
-tt(zcompile) builtin command, if it contains the wordcode for the
-function and it is either older than the file with the name of the
-function in the same directory or if such a file does not exist.  The
-usual alias expansion during reading will be suppressed
+body.  When the function is first executed, the definition for it will 
+be searched using the elements of the tt(fpath) variable. For each
+element, the shell looks for three files: the element plus the
+extension tt(.zwc), a file named after the function plus the extension 
+tt(.zwc) in a directory named by the element of tt(fpath) and the name 
+of the function without the extension in the same directory. The
+youngest of these files will be used to get the definition for the
+function. The files with the tt(.zwc) extension should be wordcode
+files created with the tt(zcompile) builtin command. The first one
+(with the name of the element from tt(fpath) plus the extension) is
+normally used to contain the definitions for all functions in the
+directory. The latter is intended to be used for individual wordcode
+files for single functions. But of course it is also possible to
+create any number of wordcode files and put their names (including the 
+extension) in the tt(fpath) variable. In that case these files will be 
+searched for the definition of the function directly without comparing 
+its age to that of other files.
+
+The usual alias expansion during reading will be suppressed
 if the tt(autoload) builtin or its equivalent is given the option
 tt(-U), for wordcode files this has to be decided when creating the
 file with the tt(-U) option of the tt(zcompile) builtin command;
diff -ru ../z.old/Src/init.c Src/init.c
--- ../z.old/Src/init.c	Thu Mar  9 15:46:42 2000
+++ Src/init.c	Fri Mar 10 10:02:10 2000
@@ -890,12 +890,15 @@
 int
 source(char *s)
 {
+    Eprog prog;
     int tempfd, fd, cj, oldlineno;
     int oldshst, osubsh, oloops;
     FILE *obshin;
-    char *old_scriptname = scriptname;
+    char *old_scriptname = scriptname, *us;
 
-    if (!s || (tempfd = movefd(open(unmeta(s), O_RDONLY | O_NOCTTY))) == -1) {
+    if (!s || 
+	(!(prog = try_source_file((us = unmeta(s)))) &&
+	 (tempfd = movefd(open(us, O_RDONLY | O_NOCTTY))) == -1)) {
 	return 1;
     }
 
@@ -908,8 +911,10 @@
     oloops    = loops;           /* stored the # of nested loops we are in    */
     oldshst   = opts[SHINSTDIN]; /* store current value of this option        */
 
-    SHIN = tempfd;
-    bshin = fdopen(SHIN, "r");
+    if (!prog) {
+	SHIN = tempfd;
+	bshin = fdopen(SHIN, "r");
+    }
     subsh  = 0;
     lineno = 1;
     loops  = 0;
@@ -917,14 +922,24 @@
     scriptname = s;
 
     sourcelevel++;
-    loop(0, 0);			/* loop through the file to be sourced        */
+    if (prog) {
+	pushheap();
+	errflag = 0;
+	execode(prog, 1, 0);
+	popheap();
+    } else
+	loop(0, 0);		     /* loop through the file to be sourced        */
     sourcelevel--;
-    fclose(bshin);
-    fdtable[SHIN] = 0;
 
     /* restore the current shell state */
-    SHIN = fd;                       /* the shell input fd                   */
-    bshin = obshin;                  /* file handle for buffered shell input */
+    if (prog)
+	freeeprog(prog);
+    else {
+	fclose(bshin);
+	fdtable[SHIN] = 0;
+	SHIN = fd;		     /* the shell input fd                   */
+	bshin = obshin;		     /* file handle for buffered shell input */
+    }
     subsh = osubsh;                  /* whether we are in a subshell         */
     thisjob = cj;                    /* current job number                   */
     lineno = oldlineno;              /* our current lineno                   */
diff -ru ../z.old/Src/parse.c Src/parse.c
--- ../z.old/Src/parse.c	Thu Mar  9 15:46:43 2000
+++ Src/parse.c	Fri Mar 10 10:02:10 2000
@@ -2252,7 +2252,8 @@
 	    zerrnam(nam, "too few arguments", NULL, 0);
 	    return 1;
 	}
-	if (!(f = load_dump_header(*args))) {
+	if (!(f = load_dump_header(*args)) &&
+	    !(f = load_dump_header(dyncat(*args, FD_EXT)))) {
 	    zerrnam(nam, "invalid dump file: %s", *args, 0);
 	    return 1;
 	}
@@ -2280,7 +2281,9 @@
     if (!args[1])
 	return build_dump(nam, dyncat(*args, FD_EXT), args, ops['U'], map);
 
-    return build_dump(nam, *args, args + 1, ops['U'], map);
+    return build_dump(nam,
+		      (strsfx(FD_EXT, *args) ? *args : dyncat(*args, FD_EXT)),
+		      args + 1, ops['U'], map);
 }
 
 /* Load the header of a dump file. Returns NULL if the file isn't a
@@ -2538,21 +2541,94 @@
 
 #endif
 
-/* See if `dump' is the name of a dump file and it has the definition
- * for the function `name'. If so, return an eprog for it. */
+/* Try to load a function from one of the possible wordcode files for it.
+ * The first argument is a element of $fpath, the second one is the name
+ * of the function searched and the last one is the possible name for the
+ * uncompiled function file (<path>/<func>). */
 
 /**/
 Eprog
-try_dump_file(char *dump, char *name, char *func)
+try_dump_file(char *path, char *name, char *file)
+{
+    Eprog prog;
+    struct stat std, stc, stn;
+    int rd, rc, rn;
+    char *dig, *wc;
+
+    if (strsfx(FD_EXT, path))
+	return check_dump_file(path, name);
+
+    dig = dyncat(path, FD_EXT);
+    wc = dyncat(file, FD_EXT);
+
+    rd = stat(dig, &std);
+    rc = stat(wc, &stc);
+    rn = stat(file, &stn);
+
+    /* See if there is a digest file for the directory, it is younger than
+     * both the uncompiled function file and its compiled version (or they
+     * don't exist) and the digest file contains the definition for the
+     * function. */
+    if (!rd &&
+	(rc || std.st_mtime > stc.st_mtime) &&
+	(rn || std.st_mtime > stn.st_mtime) &&
+	(prog = check_dump_file(dig, name)))
+	return prog;
+
+    /* No digest file. Now look for the per-function compiled file. */
+    if (!rc &&
+	(rn || stc.st_mtime > stn.st_mtime) &&
+	(prog = check_dump_file(wc, name)))
+	return prog;
+
+    /* No compiled file for the function. The caller (getfpfunc() will
+     * check if the directory contains the uncompiled file for it. */
+    return NULL;
+}
+
+/* Almost the same, but for sourced files. */
+
+/**/
+Eprog
+try_source_file(char *file)
+{
+    Eprog prog;
+    struct stat stc, stn;
+    int rc, rn;
+    char *wc, *tail;
+
+    if ((tail = strrchr(file, '/')))
+	tail++;
+    else
+	tail = file;
+
+    if (strsfx(FD_EXT, file))
+	return check_dump_file(file, tail);
+
+    wc = dyncat(file, FD_EXT);
+
+    rc = stat(wc, &stc);
+    rn = stat(file, &stn);
+
+    if (!rc && (rn || stc.st_mtime > stn.st_mtime) &&
+	(prog = check_dump_file(wc, tail)))
+	return prog;
+
+    return NULL;
+}
+
+/* See if `file' names a wordcode dump file and that contains the
+ * definition for the function `name'. If so, return an eprog for it. */
+
+/**/
+static Eprog
+check_dump_file(char *file, char *name)
 {
-    char *file;
     int isrec = 0;
     Wordcode d;
     FDHead h;
     FuncDump f;
 
-    file = (strsfx(FD_EXT, dump) ? dump : dyncat(dump, FD_EXT));
-
 #ifdef USE_MMAP
 
  rec:
@@ -2575,24 +2651,9 @@
 
 #endif
 
-    if (!f && (isrec || !(d = load_dump_header(file)))) {
-	if (!isrec) {
-	    struct stat stc, stn;
-	    char *p = (char *) zhalloc(strlen(dump) + strlen(name) +
-				       strlen(FD_EXT) + 2);
-
-	    sprintf(p, "%s/%s%s", dump, name, FD_EXT);
-
-	    /* Ignore the dump file if it is older than the normal one. */
-	    if (stat(p, &stc) || (!stat(func, &stn) && stn.st_mtime > stc.st_mtime))
-		return NULL;
-
-	    if (!(d = load_dump_header(file = p)))
-		return NULL;
+    if (!f && (isrec || !(d = load_dump_header(file))))
+	return NULL;
 
-	} else
-	    return NULL;
-    }
     if ((h = dump_find_func(d, name))) {
 	/* Found the name. If the file is already mapped, return the eprog,
 	 * otherwise map it and just go up. */
@@ -2698,6 +2759,7 @@
 		dumps = p->next;
 	    munmap((void *) f->addr, f->len);
 	    zclose(f->fd);
+	    zsfree(f->name);
 	    zfree(f, sizeof(*f));
 	}
     }

--
Sven Wischnowsky                         wischnow@xxxxxxxxxxxxxxxxxxxxxxx