Zsh Mailing List Archive

New features, uniq(1) alike?



This idea was prompted by the recent "Builtin append() and prepend() ..."
discussion.  The patch below can do what the original poster requested
and a lot more, resembling the functionality of the external command
`uniq'.  There are two places in zsh I thought these features might fit:

First, as new options -q, -d and -U to the print builtin, operating on
its argument list.  This is straightforward and in one respect comparable
to the existing -m option: they might throw away all the arguments.

Secondly, in parameter substitution, I invented the new flags q, d and u.
A little redecoration was required here in order to deal with the
a-perfectly-good-array-reduced-to-one-or-zero-elements problem, which is
admittedly an inconsistency with other flags.  Now, what is this all about:

print -q ... or ... ${(q)foo}     print only the first of repeated words
      -d            ${(d)foo}     print only words that are repeated
      -U            ${(u)foo}     print only words that are not repeated
      -dq           ${(dq)foo}    the sole useful combination of the above.
                                  Throw away all words that are not
                                  repeated, then keep only one of all those
                                  that are; equivalent to `uniq -d|uniq'
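
To make the table concrete, here is a minimal sketch that imitates the
three selections with stock awk, reading one word per line and preserving
input order.  The function names are made up for illustration and are not
part of the patch.  It also assumes that -d on its own keeps every copy of
a repeated word, which is how the -dq description reads.

```shell
# Hypothetical helpers; awk stands in for the proposed builtin behaviour.

# Like the proposed -q / (q): keep only the first of repeated words.
first_only() { awk '!seen[$0]++'; }

# Like the proposed -d / (d): keep words that occur more than once
# (assumption: every copy is kept).
repeated_only() {
    awk '{ count[$0]++; word[NR] = $0 }
         END { for (i = 1; i <= NR; i++) if (count[word[i]] > 1) print word[i] }'
}

# Like the proposed -U / (u): keep words that occur exactly once.
unrepeated_only() {
    awk '{ count[$0]++; word[NR] = $0 }
         END { for (i = 1; i <= NR; i++) if (count[word[i]] == 1) print word[i] }'
}

# The -dq combination: repeated words, one copy of each.
printf '%s\n' a b c c b d | repeated_only | first_only
```

With the input a b c c b d the last line prints b and c, each once.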

There are the following advantages when compared to the external command:

1) of course there is no fork/exec necessary when something like
   `foo=$(print -lo $bar|uniq -u)' gets replaced by `foo=${(uo)bar}'
2) the word delimiter doesn't need to be a newline;
3) the order of elements can be preserved, if desired.  On the other hand,
   the uniquifying can be done faster if the elements are sorted.

Speed is the major drawback of `typeset -U'.  It is hopelessly
inefficient once the array is bigger than a few elements, and there is
no way to tell it that the array is already sorted.  I deal with this by
making the array unique after sorting, and by providing different
functions depending on the sort options -o and -O, or the sort flags
(o) and (O), respectively.
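
The speed argument can be sketched in plain shell.  On sorted input a
duplicate always sits next to its twin, so one comparison per element is
enough, whereas an order-preserving pass has to check each word against
everything kept so far.  Both function names are hypothetical, and this is
not the algorithm from the patch, only an illustration of why knowing the
sort state helps.

```shell
# uniq_sorted: input must already be sorted, one word per line.
# One comparison per element against the previous one.
uniq_sorted() {
    prev= first=1
    while IFS= read -r word; do
        if [ "$first" -eq 1 ] || [ "$word" != "$prev" ]; then
            printf '%s\n' "$word"
        fi
        prev=$word first=0
    done
}

# uniq_unsorted: keeps the input order, but every word is matched against
# the whole list of words kept so far (quadratic in the worst case).
uniq_unsorted() {
    nl='
'
    kept=$nl
    while IFS= read -r word; do
        case $kept in
            *"$nl$word$nl"*) ;;                                   # seen before: skip
            *) printf '%s\n' "$word"; kept=$kept$word$nl ;;       # new: print and remember
        esac
    done
}

printf '%s\n' a a b c c | uniq_sorted
printf '%s\n' b a b c a | uniq_unsorted
```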

A few examples and the patch follow below.  Is it useful?  Is the
shell the right place for it?  Are the algorithms employed too crude?
What do you think?

Regards,
--Thorsten


Find the words that occur in only one of the two arrays:

% foo=(a b c c) bar=(b c d d)
% echo ${(u)=:-${(q)foo} ${(q)bar}}
a d
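
For comparison, roughly the same result can be had from an unpatched shell
with external commands.  This is only a sketch: sym_diff is a made-up
name, and it takes two space-separated word lists rather than arrays.

```shell
# sym_diff LIST1 LIST2: print the words that occur in exactly one list.
sym_diff() {
    {
        # Reduce each list to its distinct words first ...
        printf '%s\n' "$1" | tr -s ' ' '\n' | sort -u
        printf '%s\n' "$2" | tr -s ' ' '\n' | sort -u
    } | sort | uniq -u    # ... so a word seen twice here is in both lists
}

sym_diff 'a b c c' 'b c d d'
```

This prints a and d on separate lines, at the cost of several forks and
of losing the original element order.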


Which commands are in more than one directory of $PATH?

% cmds=( $^path/*(N-x^/:t) ); whence -ap ${(dqo)cmds}
/usr/bin/compress
/bin/compress
/local/home/kaefer/bin/csh
/usr/bin/csh
/bin/csh
...
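
Without the patch, the list of duplicated command names could be sketched
with external commands as follows.  dup_commands is a hypothetical helper.
It takes the colon-separated path as an argument, parses ls output and so
assumes file names without newlines, and only prints the names, leaving
the lookup of their locations to whence.

```shell
# dup_commands PATHSTRING: names of executables that appear in more than
# one directory of the colon-separated PATHSTRING.
dup_commands() {
    printf '%s\n' "$1" | tr ':' '\n' | sort -u |   # each directory once
    while IFS= read -r dir; do
        [ -d "$dir" ] || continue
        ls "$dir" 2>/dev/null | while IFS= read -r name; do
            # Regular, executable files only (symlinks are followed).
            if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
                printf '%s\n' "$name"
            fi
        done
    done | sort | uniq -d                          # names seen at least twice
}

# Usage: dup_commands "$PATH"
```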


Prepend() and append(); these could be done with `typeset -U' as well.
Take the first positional parameter, if present, as the name of an
array and prepend/append the other positional parameters to it,
removing duplicate words as appropriate.

% functions prepend append
prepend () {
        [[ -n $1 ]] && $1=($(eval print -q $@[2,-1] \$$1))
}
append () {
        local tmp
        tmp=($(eval print -q $@[2,-1] \$$1 ))
        [[ -n $1 ]] && $1=($tmp[$#,-1] $@[2,-1])
}


Word frequency count.  Zsh saves a few `tr's and one `sort -u' but
still doesn't obviate the need for external commands :)

% (IFS=$IFS'!"#$%&'\''()*+,-./:;<=>?@[\\]^_`{|}~';
  print -l ${(Ldo):-$(< zsh-RCS/Etc/FAQ)})|uniq -c|sort -r|head -10|cat -n
     1   482 the
     2   228 to
     3   203 is
     4   173 of
     5   173 in
     6   170 a
     7   169 zsh
     8   151 you
     9   149 and
    10   113 for
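
For reference, a conventional all-external pipeline for the same kind of
count might look like the sketch below.  word_freq is a made-up name.
Unlike the example above it splits on anything that is not alphanumeric
and also counts words that occur only once.

```shell
# word_freq [N]: read text on stdin, print the N most frequent words
# (default 10) with their counts, most frequent first.
word_freq() {
    tr -cs '[:alnum:]' '\n' |          # one word per line
    sed '/^$/d' |                      # drop a possible leading empty line
    tr '[:upper:]' '[:lower:]' |       # fold case
    sort | uniq -c |                   # count adjacent duplicates
    sort -rn | head -n "${1:-10}"      # highest counts first
}

printf 'The cat. The dog! the end\n' | word_freq 3
```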


begin 664 b20-uniqpatch.gz
M'XL(",1KP#$"`WIS:#(N-F(R,"UU;FEQ<&%T8V@`W5I;4]M*$GYV?L7`5IWX
M(AM+EFV,P\/F4*<.55F@DE#[0%A*EL98%5D2ND!(PG_?[IX9:23+0$AJ+\=%
MC&;4W=.7KWMZAG2[7784N7M?T]4B]X/,#]/!V@E;YFPVV1O"SX@-AP<CZ\">
MM:R!:;WJ]_N/,I@6,\<'PQ'\O.I6/SAFD^&^,3%M1A.,)?[U*ALP=KQDV8JS
M3\NW9^<?_CRZ^L?QR?F'3\LS%L69'X7,3UG*,X.(UMP)_?`Z9=&2]9@3>B#G
M4Y_Y(;P%.C<*,_XE8T["67KGQ#'W!D`Q^'CV:@<7B!,_S%#T!7#!.$S2X&M\
M<G06G?KN&E]<JE<YC#XMCT,Q^0FD7#"<?R_8\9&+E\2R/':2:QP.!@-V"=3_
M]+,5"R.V#!Q4-V%W.($CD"^,`8Y\S<,L)7U)-^ZQ*`1N?)UF8)^3P$R>Q3D8
ME3*/IV[B+X!J<8_V<'<5P:*&$(Y,RR@(HCMP$?/\Y9(G/'1Y>D"A4^Z'Y_[_
MGOMOO/._=`0:$V(\-B:3H4@("A%.3$<J1.0&'\3#L[]DU_XM!T]'UQS6D>80
M243A2L3@%`=KYS-/49TUNXN2S\QU4HY"0H_''+["++A_U:/(]`3;#2[3$SZH
M^28/_9N<!_?22"<(V`*\$84<!**/7(Z`\%&L[SH!B"F9$[Z.;L%=RR1:5^2R
MP$^S054)3U?"R^,`Q&6Z*E$8W,^E3%*D?'6W\MT5+@VP<Q+&OS@N&`D<H-V6
MQ1'[5:<"?^E6=(G4!=?=,+A!P9H]Y[H]PHT_8(PT!02L(X!GMG+"1\V1J29Q
MXPK<-`44\M6-@GP=I@1#6=/YESC4ZOEXSQHQ$XKY\&"X#QO`OE[_:[1/U_Z1
M.35&EE5"G29&134:O&6^RED=T0+/!DNC)",4]RL85C;W4,`-_#KR4Q<S]DF0
M8EYXJ2A='(*0YH&&1I#FU:3)\#6PL7;`G5O,>(()5"+$<0$.XC`065I0:R'M
M#$K;;\A@KU@ZC71K0(YN#VB1<@V)M%C%CKQF1XWV*5.P"FN6;]BAYUE'+4SQ
M?`>_?H_"6PZ1PX4#GF4\J2^810S*)90S#*X*)R+F0^+NK9QTE3F+@`]695LR
M9:9]8-L'8PO;D@FA:1OQ4["T9D/#FDU42_(WP)6_A`?XP<^WD_-W[PRV&T>Q
MMVNPH<$6?GCE>O1H&>SM\<G5V>G9D<$$(7X_&+#959@Q`W>)^(^KL_?')Q]/
MSSY^$*+H'4GKFU(<$@#7^Z,SVAY/\J%IC>SQ9+H_B_WHU%WW=XN%:EKFZ6J[
MFKBYU_3<9`_"'U54,O?#K[N;$N\T=6!`S^)G-RFMP`"J0,AR\'\0"&I8_G*Q
M4)DG&_R!6^;=A)G3`VMT8(\@[T:S(N\V2<NL&S=FG6V:MF&;EJGRKL4#ZD]:
MJ/$-UOHV[%:I`7M6$O"0!AW8!/RO/%JVW174G2Z,_?`VS9+87<>=N33W0<9\
MK\N<)90;VCBPGA6[GP&5!^J_FP=8`T'Z=;9*67=/,OK+=A2G%Z^CUY?L^W=&
MSZ>O+SNHW3)*VB$[9,,Y"]D;%.G"4Z_74:J#M(OP$BA`+:4W3)!VY"]I^:@`
M^J^UO#2^[V%[SOKGL)#>PQ35?V^C'=%<P(0//,T'Y^B#'=;"E\T>8K_]QG9H
MY$M:%(9.8OU#M">%5@?63Z6%:@FP8$?Z89/G$1;-V!MU6*D:BYM_<Y_68.S-
M+S`0?2I"]IA-5:HG`8M]$/3S.$<-O0C<DS!FC484SS?:\T:<?QKKJHRD^0(:
M4[>V>8_EG8(U+8I(G1!+R/Z!:0)M<PF9F(8]F93])$Z,)Z;**_PD/,N3D+4!
M#YT/'T_/?V]G.?4:[:[7Z;!^XQNW0Q8\H!4]C,H1A\:%5QK*$E`0G\3GU--`
M)P>Q=>X1<7&$)\D$$`L!(2FB9UKG:<86G&(,9V/&W@L5PWR]@-`#)RQ6.3<-
MR!*20H+H"833&4L"3E2%;MJ!R6_P#S]R#K:);C>?RTE,#'@17H/&&%DY+7]!
MQ%D[QX#.6;N;]WKP"$+G#(FSCJ0J*#-!V>WU,DH,@`&4HW87C<Z`2RX$B)D+
MY;60R'?XXD%9!DUJED4&"0=?YJ%PTJ93!UO=\=/>:/`$T+,<8=]L?@:Y`119
ME4)E8.&4G)S2*132/X6XO-]'S^-C=F%")I%8U*O5*KPY;Y"P2+CSN?[BX1DN
M?R_.GMRG"X7JZ2!*]&./`CH=/76H*W2"M!+U6]!^K&HQ.P2'L\^<QY4U`MXO
MI<G-6APSL/CI)XTYP]H*]3""[Y*D>J@::+K)1,.#1T.R>93C,MD>^6S-0[5-
M*?`9K+S!:`(B@H?`2"AJ!J3!W#Q)UDX&KJ1AX*19,=3@"MX..&Q:)*V`&6UD
M.]+?_\+7,E/;NM@Z0CNX$10+=5!WJO:Y0)[8TN1<!9,M7;MB!67:PW\D_W\T
M"A0!_+K50Z!YO+D\:(Z5K@;M"<RU9,$FQ!"5G.X#56ZEPH)646=N5=7%I;JW
MD/DXV^O=2OE:E"&*.46EM5%@4A7YUK/+"A(J>TVC%M'6@UH;EQ*`H/E-#TD.
MI%4(::E)%?L'D;,5AVWNI#6?E:6AZ-N`O4B;EO)*KGFJLCT)R%=1CAC/L`;E
MNC.4H=U,6)G7C)*^+?WZO&+=4O@O'?%X,C"!:P53>/"BV/$\<(=$-E@CH!TG
M'*J9?([2C`:"!EY!JU*,X*4^!/_F@?X2AIWF.VI[:$SL<=EH34=C8VI7&BU<
M?AG`XB)1RDGJZI*-6>QP,XD?)_7AG*)U(]I5*5%@9W]3%P%LZ\C39Z5G`%.'
M\AS<3>-R6"43]I>4T@/:!'FO2E!.S!L]-;5M<,RL]-0^N&Y_/"Z.>BV\XF)P
M>#B@$:FCS#?GQ5RQFPLL$<\-\*A\+9QCECE<\"@&3V-0'GR,/F^@MS;H2R.X
M9@2_=8)&$YJ\M&_/P"G%C1N14Z8P3)#J(1ZS48(%DECXBLY=<O+[(6QFPH-O
MWC!3'#VT[DN$F4XL1<1WV>Z<HC,;CXW9Q"JB\SQ%].5[+UV^R2^F-=DW3&LZ
M*O%CCJ:V8=IFB:!2KW(9/X6=3]OR2X?MN%'L<Z_8PQT1)R"'LMO&40<2D&@*
M=&@2H#Z:!:^\F@">\FI""*A?393W$@+!5#JK8JT7B*W<>&R5/'J!9/<)N2]3
M=D.J.)%#9FD[%<%$BU+K.2':S(P=E1KR+7[:MY'O=2JW+J2X3.Y#PFMMR][@
M?9SU09DE*M(/VO4,6UJE$>+.I,)7-!H%58VHT`^G+H:7))Z>8;\NE96J"9I2
M*TPKM2]M;-V5Q(/$%)'_,B^?[^D9]ZHO`)1M.;\_,\S1<%JMACSSG,QIAP:[
MIYK2TIJ%4-:G_M9D[V]-]GF-2R1X_^4)WF?(W9S@_9]*\*V21R^0[#XA]V7*
MNOH%+,4:!,:!$\XD)E#B.S_\_,Z'8W"&P0CY70`3^,?:=D?L0J8],J'&JPMA
M=6_V->!7T`^[G^^K-_#6$&_@+;R!M\N_?#52EY?PTRU_D;5,`[X*\&$6O6+T
M7P?HS-4NE*<][)O<"^7AR8GQX+2@;S=6[16RG$0>'+(C3]L.H=U_F_N!5]X<
M8!=-_Z,$CQ)<G.G$GX655G+/^U&M?HDFS?X:[8-FXV$E6;%:8I*WNQUVU6Z[
M40B:H<ZLBX6[','1&@"ST!$C]?D[]`_B`H+T@#83_^>&D['(A7/TQIU&<<5+
M!Q$'&UP7OQQI`)QYP`O,B>$X0NYI07S$91YZ9T<>.Q?$`X2+&,IAY08/0TNS
M'7Q+IQK@HI46)$%OC;INV6*+F5`J(JH/)9&<*6]6R?`_G=`+A-'45-[Q!+_`
EL!Q<HMLMHZ.AA&)AVY7N[;\8B\+FRA\"?J79_P9&WE!KKB<``-'4
`
end



