Zsh Mailing List Archive

Re: TRAPINT doesn't work reliably



On Thursday, September 26, 2019 2:48 PM, Dennis Schwartz <dennis.schwartz@xxxxxxxxxxxxxx> wrote:

> However, I can only reproduce the bug if I have the following code in my
> `~/.zshrc`:
>
> # Antigen zsh plugins
> if [ -f "/usr/share/zsh-antigen/antigen.zsh" ]; then
>     source "/usr/share/zsh-antigen/antigen.zsh"
>
>     # load some plugins here, but they are not relevant to trigger
>     # the bug
> fi
>
> So, I conditionally `source` another file. Apparently, this is causing
> super weird behavior. Unbelievably, if I open the file `.zshrc` (e.g.,
> vim/gedit) and save the file, I cannot trigger the bug. However, if I
> open the file, but do not save the file, I always trigger the bug.

Okay, so of course that didn't make any sense. Now I know that I can
trigger the bug if (at least) the following conditions have been met:

* On my system (Debian 10), I need to compile zsh with the version
  number from my default Debian installation. So I always do
  `git checkout zsh-5.7.1 -- Config/version.mk` before I compile.
* `.zshrc` needs to contain several function definitions, aliases,
  keybindings, or other configurations.
* `.zshrc` needs to contain a trap on interrupt.
* I suspect that `.zshrc` also needs to contain
  `source "/usr/share/zsh-antigen/antigen.zsh"` (I'm using 2.2.3-2 from
  Debian 10)
* `zsh` needs to be started twice.
  * The first time the bug cannot be triggered.
  * The second time the bug can be triggered by typing a character and
    then hitting TAB to autocomplete. Now hit Ctrl+C to interrupt. The
    bug is triggered.

I suspect that `.zshrc` is read and either zsh or antigen generates some
files based on the loaded configuration. That would explain why the bug
is only triggered after zsh has been executed at least once.

Unfortunately, I cannot easily generate a minimal `.zshrc` that triggers
the bug. If I remove a function definition from my `.zshrc` and replace
it with a bogus function, whether I can trigger the bug depends on the
replacement function's definition. I haven't found a clear pattern,
though. However, I found that I could cause zsh to segfault using the
following Python 3 generated `.zshrc`:

>>> open('/home/USERNAME/.zshrc', 'w').write('function fun() { echo "' + 'a' * (1 << 24) + '" }\nTRAPINT() { print $1; return $(( 128 + $1 )) }\nsource "/usr/share/zsh-antigen/antigen.zsh"')
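
For readability, the one-liner above can be sketched as a small script.
This is only a restatement of the same three lines, not a minimal
reproducer. The names `build_zshrc`, `write_zshrc`, `ANTIGEN` and `TRAP`
are hypothetical, and the target path is left to the caller so that a
working `~/.zshrc` is not overwritten by accident.

```python
from pathlib import Path

# Same pieces as the one-liner above; only the names are new (hypothetical).
ANTIGEN = "/usr/share/zsh-antigen/antigen.zsh"
TRAP = "TRAPINT() { print $1; return $(( 128 + $1 )) }"


def build_zshrc(fill_len: int = 1 << 24) -> str:
    """Return the .zshrc text: a huge function body, the trap, the antigen line."""
    big_function = 'function fun() { echo "' + "a" * fill_len + '" }'
    # Joined with newlines and no trailing newline, like the original string.
    return "\n".join([big_function, TRAP, f'source "{ANTIGEN}"'])


def write_zshrc(target: Path, fill_len: int = 1 << 24) -> int:
    """Write the generated text to `target`; return the characters written."""
    return target.write_text(build_zshrc(fill_len))
```

Calling `write_zshrc(Path.home() / ".zshrc")` would correspond to the
one-liner. A smaller `fill_len` may or may not still trigger the
segfault; that has not been checked here.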

WARNING: The generated `.zshrc` causes zsh to crash even after you
replace it with a 'normal' file again. You first have to run `zsh -f`;
afterwards, you can start `zsh` normally again. I guess this again has
to do with some file being automatically generated by zsh or antigen
which needs to be regenerated. Which file could this be? How can I
easily see which files get loaded on start-up? The file `~/.zcompdump`
remains the same, regardless of whether the bug can be triggered.
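
One way to narrow down which file gets (re)generated is to compare the
state of a directory tree before and after a zsh start. The sketch below
is a generic before/after comparison, not anything zsh or antigen
provides. The helper names `snapshot` and `changed` are hypothetical,
and it assumes that the generated file lives somewhere under the
directory being scanned (for example the home directory).

```python
import os
from pathlib import Path


def snapshot(root: Path) -> dict:
    """Map each non-directory entry under `root` to its (mtime_ns, size)."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                st = path.stat()
            except OSError:
                continue  # vanished, unreadable, or a dangling symlink
            state[str(path)] = (st.st_mtime_ns, st.st_size)
    return state


def changed(before: dict, after: dict) -> list:
    """Return sorted paths that are new or whose (mtime_ns, size) differs."""
    return sorted(p for p, sig in after.items() if before.get(p) != sig)
```

Usage would be `before = snapshot(Path.home())`, then starting and
exiting `zsh`, then `changed(before, snapshot(Path.home()))`. Deleted
files are not reported, and files touched by unrelated programs in the
meantime will show up as well.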


I have run

$ ./configure --enable-zsh-debug --enable-zsh-mem && make && sudo make install
$ valgrind --leak-check=full --log-file=zsh-valgrind.log /usr/local/bin/zsh

to capture the segfault. I cannot be sure that this is the same bug as
the one I experience with the TRAPINT function.

The log file (the memory addresses shift by 0x10 bytes if I compile
without `--enable-zsh-mem`):

> Memcheck, a memory error detector
> Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
> Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
> Command: /usr/local/bin/zsh
> Parent PID: 10371
>
> Invalid read of size 1
>    at 0x4839565: __strncmp_sse42 (vg_replace_strmem.c:651)
>    by 0x14BD28: execfuncdef (exec.c:5286)
>    by 0x140669: execsimple (exec.c:1248)
>    by 0x140A75: execlist (exec.c:1378)
>    by 0x14037F: execode (exec.c:1194)
>    by 0x168151: source (init.c:1460)
>    by 0x168649: sourcehome (init.c:1536)
>    by 0x167D01: run_init_scripts (init.c:1340)
>    by 0x169224: zsh_main (init.c:1754)
>    by 0x11FD44: main (main.c:93)
>  Address 0x584f110 is not stack'd, malloc'd or (recently) free'd
>
>
> Process terminating with default action of signal 11 (SIGSEGV)
>  Access not within mapped region at address 0x584F110
>    at 0x4839565: __strncmp_sse42 (vg_replace_strmem.c:651)
>    by 0x14BD28: execfuncdef (exec.c:5286)
>    by 0x140669: execsimple (exec.c:1248)
>    by 0x140A75: execlist (exec.c:1378)
>    by 0x14037F: execode (exec.c:1194)
>    by 0x168151: source (init.c:1460)
>    by 0x168649: sourcehome (init.c:1536)
>    by 0x167D01: run_init_scripts (init.c:1340)
>    by 0x169224: zsh_main (init.c:1754)
>    by 0x11FD44: main (main.c:93)
>  If you believe this happened as a result of a stack
>  overflow in your program's main thread (unlikely but
>  possible), you can try to increase the size of the
>  main thread stack using the --main-stacksize= flag.
>  The main thread stack size used in this run was 8388608.
>
> HEAP SUMMARY:
>     in use at exit: 62,886 bytes in 919 blocks
>   total heap usage: 1,052 allocs, 133 frees, 105,358 bytes allocated
>
> 1 bytes in 1 blocks are definitely lost in loss record 4 of 360
>    at 0x483577F: malloc (vg_replace_malloc.c:299)
>    by 0x17E8C8: zalloc (mem.c:966)
>    by 0x1B232A: ztrdup (string.c:83)
>    by 0x16724E: setupvals (init.c:1062)
>    by 0x169210: zsh_main (init.c:1749)
>    by 0x11FD44: main (main.c:93)
>
> 2 bytes in 1 blocks are definitely lost in loss record 11 of 360
>    at 0x483577F: malloc (vg_replace_malloc.c:299)
>    by 0x17E8C8: zalloc (mem.c:966)
>    by 0x166AC3: init_term (init.c:805)
>    by 0x19564C: term_reinit_from_pm (params.c:4892)
>    by 0x1956A4: termsetfn (params.c:4912)
>    by 0x18ED1C: assignstrvalue (params.c:2532)
>    by 0x190C74: assignsparam (params.c:3144)
>    by 0x18A805: createparamtable (params.c:867)
>    by 0x167446: setupvals (init.c:1116)
>    by 0x169210: zsh_main (init.c:1749)
>    by 0x11FD44: main (main.c:93)
>
> 4 bytes in 1 blocks are definitely lost in loss record 22 of 360
>    at 0x483577F: malloc (vg_replace_malloc.c:299)
>    by 0x17E8C8: zalloc (mem.c:966)
>    by 0x1CA909: metafy (utils.c:4769)
>    by 0x1CAABE: ztrdup_metafy (utils.c:4826)
>    by 0x18A6E6: createparamtable (params.c:834)
>    by 0x167446: setupvals (init.c:1116)
>    by 0x169210: zsh_main (init.c:1749)
>    by 0x11FD44: main (main.c:93)
>
> 5 bytes in 1 blocks are definitely lost in loss record 26 of 360
>    at 0x483577F: malloc (vg_replace_malloc.c:299)
>    by 0x17E8C8: zalloc (mem.c:966)
>    by 0x1B232A: ztrdup (string.c:83)
>    by 0x166F14: setupvals (init.c:973)
>    by 0x169210: zsh_main (init.c:1749)
>    by 0x11FD44: main (main.c:93)
>
> 8 bytes in 1 blocks are definitely lost in loss record 62 of 360
>    at 0x483577F: malloc (vg_replace_malloc.c:299)
>    by 0x17E8C8: zalloc (mem.c:966)
>    by 0x195CA7: mkenvstr (params.c:5244)
>    by 0x18A862: createparamtable (params.c:871)
>    by 0x167446: setupvals (init.c:1116)
>    by 0x169210: zsh_main (init.c:1749)
>    by 0x11FD44: main (main.c:93)
>
> 9 bytes in 1 blocks are definitely lost in loss record 77 of 360
>    at 0x483577F: malloc (vg_replace_malloc.c:299)
>    by 0x17E8C8: zalloc (mem.c:966)
>    by 0x1B232A: ztrdup (string.c:83)
>    by 0x166F2E: setupvals (init.c:974)
>    by 0x169210: zsh_main (init.c:1749)
>    by 0x11FD44: main (main.c:93)
>
> 9 bytes in 1 blocks are definitely lost in loss record 78 of 360
>    at 0x483577F: malloc (vg_replace_malloc.c:299)
>    by 0x17E8C8: zalloc (mem.c:966)
>    by 0x1B232A: ztrdup (string.c:83)
>    by 0x166F48: setupvals (init.c:975)
>    by 0x169210: zsh_main (init.c:1749)
>    by 0x11FD44: main (main.c:93)
>
> 10 bytes in 1 blocks are definitely lost in loss record 95 of 360
>    at 0x483577F: malloc (vg_replace_malloc.c:299)
>    by 0x17E8C8: zalloc (mem.c:966)
>    by 0x1CA909: metafy (utils.c:4769)
>    by 0x1672C5: setupvals (init.c:1075)
>    by 0x169210: zsh_main (init.c:1749)
>    by 0x11FD44: main (main.c:93)
>
> 15 bytes in 1 blocks are definitely lost in loss record 128 of 360
>    at 0x483577F: malloc (vg_replace_malloc.c:299)
>    by 0x17E8C8: zalloc (mem.c:966)
>    by 0x1B232A: ztrdup (string.c:83)
>    by 0x166F62: setupvals (init.c:976)
>    by 0x169210: zsh_main (init.c:1749)
>    by 0x11FD44: main (main.c:93)
>
> 16 bytes in 1 blocks are definitely lost in loss record 134 of 360
>    at 0x483577F: malloc (vg_replace_malloc.c:299)
>    by 0x17E8C8: zalloc (mem.c:966)
>    by 0x17CC01: pushheap (mem.c:304)
>    by 0x18A6FA: createparamtable (params.c:848)
>    by 0x167446: setupvals (init.c:1116)
>    by 0x169210: zsh_main (init.c:1749)
>    by 0x11FD44: main (main.c:93)
>
> 68 (56 direct, 12 indirect) bytes in 1 blocks are definitely lost in loss record 263 of 360
>    at 0x483577F: malloc (vg_replace_malloc.c:299)
>    by 0x17E8C8: zalloc (mem.c:966)
>    by 0x17EA5E: zshcalloc (mem.c:979)
>    by 0x183E4A: load_module (module.c:2219)
>    by 0x167C3B: run_init_scripts (init.c:1318)
>    by 0x169224: zsh_main (init.c:1754)
>    by 0x11FD44: main (main.c:93)
>
> 81 bytes in 2 blocks are definitely lost in loss record 290 of 360
>    at 0x483577F: malloc (vg_replace_malloc.c:299)
>    by 0x17E8C8: zalloc (mem.c:966)
>    by 0x1B232A: ztrdup (string.c:83)
>    by 0x18A87E: createparamtable (params.c:874)
>    by 0x167446: setupvals (init.c:1116)
>    by 0x169210: zsh_main (init.c:1749)
>    by 0x11FD44: main (main.c:93)
>
> 112 bytes in 4 blocks are definitely lost in loss record 299 of 360
>    at 0x483577F: malloc (vg_replace_malloc.c:299)
>    by 0x17E8C8: zalloc (mem.c:966)
>    by 0x1CA909: metafy (utils.c:4769)
>    by 0x18A7EB: createparamtable (params.c:867)
>    by 0x167446: setupvals (init.c:1116)
>    by 0x169210: zsh_main (init.c:1749)
>    by 0x11FD44: main (main.c:93)
>
> 256 bytes in 1 blocks are definitely lost in loss record 330 of 360
>    at 0x483577F: malloc (vg_replace_malloc.c:299)
>    by 0x17E8C8: zalloc (mem.c:966)
>    by 0x18A675: createparamtable (params.c:829)
>    by 0x167446: setupvals (init.c:1116)
>    by 0x169210: zsh_main (init.c:1749)
>    by 0x11FD44: main (main.c:93)
>
> LEAK SUMMARY:
>    definitely lost: 584 bytes in 18 blocks
>    indirectly lost: 12 bytes in 1 blocks
>      possibly lost: 0 bytes in 0 blocks
>    still reachable: 62,290 bytes in 900 blocks
>         suppressed: 0 bytes in 0 blocks
> Reachable blocks (those to which a pointer was found) are not shown.
> To see them, rerun with: --leak-check=full --show-leak-kinds=all
>
> For counts of detected and suppressed errors, rerun with: -v
> ERROR SUMMARY: 15 errors from 15 contexts (suppressed: 0 from 0)
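
Since the leak records make the log long, the stack of the invalid read
can be pulled out with a few lines of script. This is a rough sketch
with a hypothetical helper name (`first_error_stack`). It assumes frame
lines of the form shown above (`at`/`by`, an address, a function name,
and `file:line` in parentheses) and stops at the first line that does
not look like that.

```python
import re

# Matches valgrind frame lines such as
#   "   by 0x14BD28: execfuncdef (exec.c:5286)"
# regardless of a leading "> " quote marker or "==PID== " prefix.
FRAME = re.compile(r"\b(?:at|by) 0x[0-9A-Fa-f]+: (\S+) \(([^:()]+):(\d+)\)")


def first_error_stack(log_text: str, marker: str = "Invalid read") -> list:
    """Return (function, file, line) frames of the first block after `marker`."""
    frames = []
    in_block = False
    for line in log_text.splitlines():
        if not in_block:
            in_block = marker in line  # frames start on the following line
            continue
        match = FRAME.search(line)
        if match is None:
            break  # first non-frame line ends the stack
        frames.append((match.group(1), match.group(2), int(match.group(3))))
    return frames
```

For the log above, the first frame outside valgrind's `strncmp`
replacement is `execfuncdef (exec.c:5286)`, reached while sourcing
`.zshrc` via `sourcehome`.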



On Friday, September 27, 2019 1:46 PM, Daniel Shahaf <d.s@xxxxxxxxxxxxxxxxxx> wrote:

> Dennis Schwartz wrote on Thu, 26 Sep 2019 17:10 +00:00:
>
> > $ valgrind --leak-check=full --log-file=zsh-valgrind.log /usr/local/bin/zsh
> > /usr/share/zsh-antigen/antigen.zsh:2134: parse error near `\n'
> > $ ll
>
> What's the output of `dpkg -l zsh-antigen`? (I'm looking for the version number.)

Good point. Debian 10 (buster) ships 2.2.3-2, which I'm running.
I believe the bug is triggered in zsh by using this newer version
(Debian 9 ships 1.3.4-1). If I compile and run zsh 5.3.1 (shipped with
Debian 9, where I did not encounter this issue) with `zsh-antigen`
from Debian 10, I can also trigger the bug.


On Friday, September 27, 2019 7:05 PM, Peter Stephenson <p.w.stephenson@xxxxxxxxxxxx> wrote:

> Thanks, this is exactly what I was asking for.

Thanks for quite extensively explaining what's going on!

> Does removing that assignment make a difference?

No, the bug triggers for any TRAPINT function I've tried so far.


I have the feeling we are getting closer to the root cause of the bug.

Cheers,
Dennis


