(patch for Bash) regex conditional tests
From: William Park (opengeometry_at_yahoo.ca)
Date: 01/07/04
Date: 7 Jan 2004 09:12:26 GMT
1. I've finally added the regular expression operators =~ and !~ to
the conditional tests, ie.
string =~ regex
string !~ regex
You have to compile with PATTERN_MATCHING defined in order to get
=~ and !~ recognized as "binary operators". They were originally
meant for csh-like glob matching, but [[...]] made that redundant,
so now I'm using them for regex. They can be used in any conditional
expression, ie. [...], [[...]], and 'test'. Eg.
[ abc123 =~ '[0-9]+' ]
[[ abc123 =~ '[0-9]+' ]]
test abc123 =~ '[0-9]+'
Furthermore, substrings which match the parenthesized groups in
'regex' are returned in the array variable SUBMATCH. Eg.
[[ abc123 =~ '(.)([0-9]+)' ]]
will return true, and the array of strings
('c123' 'c' '123')
will be returned in the variable SUBMATCH. The 0'th element is the
whole regex match, and the n'th element is the n'th parenthesized
group in 'regex'.
You can set shell options
nocaseregex -- if you want case-insensitive matching
multilineregex -- if you want line by line matching
Default matching is case-sensitive and does not consider newline
(\n) to be special.
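For readers without the patch, the SUBMATCH example can be approximated in stock Bash. This is a minimal sketch, assuming an unpatched Bash 3.0 or later, where =~ works only inside [[...]] and the groups are stored in BASH_REMATCH instead of SUBMATCH. The function name show_groups is made up for illustration. The regex is passed through a variable because newer Bash versions treat a quoted right-hand side as a literal string.

```bash
# Assumption: unpatched Bash >= 3.0, where the groups land in BASH_REMATCH
# (the patch described in this post uses SUBMATCH instead).
show_groups() {                 # usage: show_groups STRING REGEX
    local string=$1 re=$2 i
    if [[ $string =~ $re ]]; then
        # Index 0 is the whole match, index n is the n'th parenthesized group.
        for i in "${!BASH_REMATCH[@]}"; do
            printf '%s: %s\n' "$i" "${BASH_REMATCH[i]}"
        done
    else
        return 1
    fi
}

show_groups abc123 '(.)([0-9]+)'
```

The three lines printed correspond to the ('c123' 'c' '123') array described above.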
2. I've also added a new builtin command,
match string regex
In fact, this builtin is the backend engine for the conditional
tests above, so that
string =~ regex --> match string regex
string !~ regex --> ! match string regex
More info is available from
help match
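The mapping between the conditional test and the builtin can be mimicked with a shell function. This is only a sketch, assuming an unpatched Bash 3.0 or later: the function below is a hypothetical stand-in, not the builtin from the patch, and it copies BASH_REMATCH into SUBMATCH to imitate the interface described above.

```bash
# Hypothetical stand-in for the patch's 'match' builtin on unpatched Bash >= 3.0.
match() {                       # usage: match STRING REGEX
    [[ $# -ge 2 ]] || return 1  # like the builtin, fail with fewer than two arguments
    if [[ $1 =~ $2 ]]; then
        SUBMATCH=("${BASH_REMATCH[@]}")   # element 0: whole match, n: n'th group
        return 0
    fi
    return 1
}

if match abc123 '([a-z]+)([0-9]+)'; then
    echo "${SUBMATCH[1]} ${SUBMATCH[2]}"
fi
```

Negation then works as in the table above: `! match string regex` plays the role of `string !~ regex`.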
3. Splitting and extracting substrings based on a regular expression.
Eg.
array -e '[0-9]+' x abc123efg456 --> '123' '456'
array -v '[0-9]+' x abc123efg456 --> 'abc' 'efg'
will append all matching (-e) or non-matching (-v) substrings to the
end of array variable 'x'. The shell options 'nocaseregex' and
'multilineregex' have the same effect as in the conditional tests.
This command does not create the variable, so you have to create it
manually before use. But this allows repeated calls to collect
substrings from many arguments. Eg.
array -e '[0-9]+' x "${var[*]}"
array -e '[0-9]+' x "$*"
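The -e and -v behaviour can be sketched in stock Bash with a loop around =~. The helper split_regex below is hypothetical and assumes an unpatched Bash 3.1 or later (for +=). It also assumes a regex without anchors that never matches the empty string, since it locates each match by its text rather than by offsets as regexec(3) would.

```bash
# Hypothetical helper for unpatched Bash >= 3.1; appends to the existing array 'x'.
# Assumes a regex without anchors that never matches the empty string.
split_regex() {                      # usage: split_regex -e|-v REGEX STRING
    local mode=$1 re=$2 s=$3 m before
    while [[ -n $s && $s =~ $re ]]; do
        m=${BASH_REMATCH[0]}
        if [[ -z $m ]]; then break; fi   # empty match: stop (see assumption above)
        before=${s%%"$m"*}               # text in front of the leftmost match
        if [[ $mode == -e ]]; then
            x+=("$m")                    # -e: keep the matching substring
        elif [[ -n $before ]]; then
            x+=("$before")               # -v: keep the text between matches
        fi
        s=${s#*"$m"}                     # continue after the match
    done
    if [[ $mode == -v && -n $s ]]; then
        x+=("$s")                        # -v: trailing non-matching text
    fi
}

x=()                                     # create the array first, as with the patch
split_regex -e '[0-9]+' abc123efg456     # x now holds: 123 456
split_regex -v '[0-9]+' abc123efg456     # x now holds: 123 456 abc efg
```

As with the patched builtin, the array is only appended to, so repeated calls accumulate results.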
4. Sorting, reversing, and re-indexing of an array. Eg.
array -s x --> sort array elements
array -r x --> reverse the order of array elements
array -c x --> collapse the missing array indexes
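For comparison, the same three operations can be written out in stock Bash. This is a rough sketch, assuming an unpatched Bash 3.0 or later; reverse_x and sort_x are hypothetical helpers that work on a global array 'x' whose indexes have already been collapsed to 0..n-1, and sort_x compares values as strings only.

```bash
# Sketch for unpatched Bash >= 3.0; 'x' is an example array with gaps in its indexes.
x=([0]=pear [3]=apple [7]=fig)

x=("${x[@]}")                 # collapse: indexes become 0 1 2
echo "${!x[@]}"               # print the index list

reverse_x() {                 # reverse 'x' in place by swapping values
    local i j tmp
    for ((i = 0, j = ${#x[@]} - 1; i < j; i++, j--)); do
        tmp=${x[i]}
        x[i]=${x[j]}
        x[j]=$tmp
    done
}

sort_x() {                    # insertion sort on string values
    local i j key
    for ((i = 1; i < ${#x[@]}; i++)); do
        key=${x[i]}
        for ((j = i - 1; j >= 0; j--)); do
            [[ ${x[j]} > $key ]] || break
            x[j+1]=${x[j]}    # shift larger values one slot up
        done
        x[j+1]=$key
    done
}

reverse_x; echo "${x[@]}"
sort_x;    echo "${x[@]}"
```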
5. Printing of an array. Eg.
array -j 'ZZZ' x --> join array elements with 'ZZZ' separator
array -i 'abc' x --> print the indexes of all elements equal to 'abc'
array x --> print array elements, one per line
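The printing options are easy to imitate in stock Bash as well. The sketch below assumes an unpatched Bash 3.0 or later; join_x and indexes_of are hypothetical helper names, and both read a global array 'x'.

```bash
# Hypothetical helpers for unpatched Bash >= 3.0, operating on the array 'x'.
join_x() {                    # join_x SEP: print the elements of 'x' joined by SEP
    local sep=$1 out= first=1 e
    for e in "${x[@]}"; do
        if ((first)); then out=$e; first=0; else out=$out$sep$e; fi
    done
    printf '%s\n' "$out"
}

indexes_of() {                # indexes_of VALUE: print each index whose element equals VALUE
    local value=$1 i
    for i in "${!x[@]}"; do
        if [[ ${x[i]} == "$value" ]]; then printf '%s\n' "$i"; fi
    done
}

x=(abc def abc)
join_x ZZZ                    # abcZZZdefZZZabc
indexes_of abc                # 0 and 2, one per line
printf '%s\n' "${x[@]}"       # elements, one per line
```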
6. You can get more info from
help array
help arrayfilter
help arraymap
Here is a summary of the rest of my patch. Most of these features
were posted before.
7. Integer and letter sequence generators. Eg.
a{1..5}b --> a1b a2b a3b a4b a5b
1{a--e}2 --> 1a2 1b2 1c2 1d2 1e2
set -- a b c d e
{1..#} --> 1 2 3 4 5
{1..*} --> $1 $2 $3 $4 $5 --> a b c d e
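A short comparison with stock Bash may help here. The sketch assumes an unpatched Bash 3.0 or later, which has its own {x..y} expansion and spells the letter range with '..' rather than '--'. It has no {1..#} form, so the hypothetical helper count_args below builds that list by hand.

```bash
# Assumption: unpatched Bash >= 3.0, which uses '..' for letter ranges too.
echo a{1..5}b                 # a1b a2b a3b a4b a5b
echo 1{a..e}2                 # 1a2 1b2 1c2 1d2 1e2

count_args() {                # print 1 2 ... $#, like {1..#} in the patch
    local i out=
    for ((i = 1; i <= $#; i++)); do
        out="$out${out:+ }$i"
    done
    printf '%s\n' "$out"
}

set -- a b c d e
count_args "$@"               # 1 2 3 4 5
echo "$@"                     # a b c d e, the result {1..*} expands to
```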
8. Since '%20' is the only URL hexcode that I can remember, I patched
'echo' to convert between URL hexcode and ASCII character. Eg.
echo -U ' ' --> '%20'
echo -u '%20' --> ' '
It's useful for quick char conversion only.
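Without the patched 'echo', the same conversion can be sketched with printf. The function names url_encode and url_decode are hypothetical, the sketch handles ASCII input only, and the choice to leave letters, digits, and '.', '_', '~', '-' unencoded is an assumption of this example, not something stated for the patch.

```bash
# Hypothetical helpers for unpatched Bash; ASCII input only.
url_encode() {                # usage: url_encode STRING
    local s=$1 out= c i
    for ((i = 0; i < ${#s}; i++)); do
        c=${s:i:1}
        case $c in
            [A-Za-z0-9._~-]) out=$out$c ;;                  # assumed safe, keep as is
            *) out=$out$(printf '%%%02X' "'$c") ;;          # "'c" yields the character code
        esac
    done
    printf '%s\n' "$out"
}

url_decode() {                # usage: url_decode STRING
    printf '%b\n' "${1//%/\\x}"   # turn %20 into \x20 and let printf expand it
}

url_encode 'a b'              # a%20b
url_decode 'a%20b'            # a b
```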
9. Reading DOS lines which are CRLF (\r\n) terminated. Eg.
read -D a b
is equivalent to
read
REPLY=${REPLY%$'\r'}
read a b <<< "${REPLY}"
10. Line number and field number, like the NR and NF variables in Awk. Eg.
read -N ... --> NF, NR
will assign the number of IFS fields to 'NF' and assign the number
of lines read so far to 'NR'.
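The bookkeeping that 'read -N' does can be written out by hand. This is a minimal sketch for an unpatched Bash, using 'read -a' to split on IFS; the array name 'fields' is arbitrary.

```bash
# Sketch for unpatched Bash: track NR and NF by hand around 'read -a'.
NR=0
while read -r -a fields; do
    NR=$((NR + 1))            # lines read so far
    NF=${#fields[@]}          # number of IFS-separated fields on this line
    echo "line $NR has $NF fields"
done <<'EOF'
a b c
d e
EOF
```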
11. Multiple for-loop variables. Eg.
for a,b in ...
will assign variables 'a' and 'b' sequentially from the list of
'...' items.
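In stock Bash, the usual way to get a similar effect is to walk the list with 'shift'. The sketch below assumes that 'for a,b in ...' takes consecutive items in pairs; the function name pairs is hypothetical, and a leftover odd item is simply dropped here, which may differ from what the patch does.

```bash
# Sketch for unpatched Bash: consume the list two items at a time.
pairs() {                     # usage: pairs ITEM...
    local a b
    while (($# >= 2)); do
        a=$1 b=$2
        shift 2
        echo "$a=$b"
    done
}

pairs k1 v1 k2 v2             # prints k1=v1 then k2=v2
```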
12. Skipping of positional parameters, array elements, string
characters. Eg.
${*:1~2} --> odd numbered parameters
${@:1~2} --> odd numbered parameters
${array[*]:1~2} --> odd numbered array elements
${array[@]:1~2} --> odd numbered array elements
${string:1~2} --> odd numbered characters
The format of 'x~y' is similar to Sed-style line addressing.
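The 'first~step' selection can be spelled out as a loop in stock Bash. This sketch assumes 1~2 means "start at item 1, then every 2nd item", with characters counted from 1 as the "odd numbered" description suggests; odd_params and odd_chars are hypothetical helper names.

```bash
# Sketch for unpatched Bash: every second item, starting from the first.
odd_params() {                # positional parameters 1, 3, 5, ...
    local i
    for ((i = 1; i <= $#; i += 2)); do
        printf '%s\n' "${!i}"     # indirect expansion: the i'th parameter
    done
}

odd_chars() {                 # characters 1, 3, 5, ... of a string
    local s=$1 out= i
    for ((i = 0; i < ${#s}; i += 2)); do
        out=$out${s:i:1}          # ${s:i:1} is 0-based, so i=0 is character 1
    done
    printf '%s\n' "$out"
}

odd_params a b c d e          # a, c, e on separate lines
odd_chars abcdef              # ace
```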
--
William Park, Open Geometry Consulting, <opengeometry@yahoo.ca>
Linux solution for data management and processing.
diff -ru bash-2.05b/Makefile.in bash/Makefile.in
--- bash-2.05b/Makefile.in 2002-05-31 13:44:23.000000000 -0400
+++ bash/Makefile.in 2004-01-05 16:16:19.000000000 -0500
@@ -400,7 +400,7 @@
$(DEFSRC)/builtin.def $(DEFSRC)/cd.def $(DEFSRC)/colon.def \
$(DEFSRC)/command.def ${DEFSRC}/complete.def \
$(DEFSRC)/declare.def \
- $(DEFSRC)/echo.def $(DEFSRC)/enable.def $(DEFSRC)/eval.def \
+ $(DEFSRC)/echo.def $(DEFSRC)/enable.def $(DEFSRC)/eval.def $(DEFSRC)/array.def \
$(DEFSRC)/exec.def $(DEFSRC)/exit.def $(DEFSRC)/fc.def \
$(DEFSRC)/fg_bg.def $(DEFSRC)/hash.def $(DEFSRC)/help.def \
$(DEFSRC)/history.def $(DEFSRC)/jobs.def $(DEFSRC)/kill.def \
@@ -419,7 +419,7 @@
BUILTIN_OBJS = $(DEFDIR)/alias.o $(DEFDIR)/bind.o $(DEFDIR)/break.o \
$(DEFDIR)/builtin.o $(DEFDIR)/cd.o $(DEFDIR)/colon.o \
$(DEFDIR)/command.o $(DEFDIR)/declare.o \
- $(DEFDIR)/echo.o $(DEFDIR)/enable.o $(DEFDIR)/eval.o \
+ $(DEFDIR)/echo.o $(DEFDIR)/enable.o $(DEFDIR)/eval.o $(DEFDIR)/array.o\
$(DEFDIR)/exec.o $(DEFDIR)/exit.o $(DEFDIR)/fc.o \
$(DEFDIR)/fg_bg.o $(DEFDIR)/hash.o $(DEFDIR)/help.o \
$(DEFDIR)/history.o $(DEFDIR)/jobs.o $(DEFDIR)/kill.o \
@@ -1135,6 +1135,11 @@
builtins/eval.o: command.h config.h ${BASHINCDIR}/memalloc.h error.h general.h xmalloc.h ${BASHINCDIR}/maxpath.h
builtins/eval.o: shell.h syntax.h bashjmp.h ${BASHINCDIR}/posixjmp.h sig.h unwind_prot.h variables.h arrayfunc.h conftypes.h quit.h
builtins/eval.o: dispose_cmd.h make_cmd.h subst.h externs.h ${BASHINCDIR}/stdc.h
+
+builtins/array.o: command.h config.h ${BASHINCDIR}/memalloc.h error.h general.h xmalloc.h ${BASHINCDIR}/maxpath.h
+builtins/array.o: shell.h syntax.h bashjmp.h ${BASHINCDIR}/posixjmp.h sig.h unwind_prot.h variables.h arrayfunc.h conftypes.h quit.h
+builtins/array.o: dispose_cmd.h make_cmd.h subst.h externs.h ${BASHINCDIR}/stdc.h
+
builtins/exec.o: bashtypes.h
builtins/exec.o: command.h config.h ${BASHINCDIR}/memalloc.h error.h general.h xmalloc.h ${BASHINCDIR}/maxpath.h
builtins/exec.o: shell.h syntax.h bashjmp.h ${BASHINCDIR}/posixjmp.h sig.h unwind_prot.h variables.h arrayfunc.h conftypes.h
@@ -1278,6 +1283,9 @@
builtins/echo.o: $(DEFSRC)/echo.def
builtins/enable.o: $(DEFSRC)/enable.def
builtins/eval.o: $(DEFSRC)/eval.def
+
+builtins/array.o: $(DEFSRC)/array.def
+
builtins/exec.o: $(DEFSRC)/exec.def
builtins/exit.o: $(DEFSRC)/exit.def
builtins/fc.o: $(DEFSRC)/fc.def
diff -ru bash-2.05b/braces.c bash/braces.c
--- bash-2.05b/braces.c 2002-05-06 13:50:40.000000000 -0400
+++ bash/braces.c 2004-01-03 16:40:26.000000000 -0500
@@ -64,6 +64,8 @@
static char **array_concat ();
#endif
+#include "chartypes.h" /* needed for ISLOWER() and ISUPPER() */
+
/* Return an array of strings; the brace expansion of TEXT. */
char **
brace_expand (text)
@@ -161,22 +163,195 @@
ADVANCE_CHAR (amble, alen, j);
}
- if (!amble[j])
- {
- free (amble);
- free (preamble);
- result[0] = savestring (text);
- return (result);
- }
+ if (!amble[j]) {
+ /*************************************************************************
+ * Okay, found a standalone brace expression without ','. If the amble
+ * contains 'a..b' expression, where 'a' and 'b' are positive integers,
+ * then replace it with 'a,a+1,...,b' (if a < b) or 'a,a-1,...,b' (if a >
+ * b), and give it back to shell for a normal expansion. If 'a' or 'b'
+ * has leading '0', then zero pad the numbers. The format size is the
+ * maximum size of 'a' or 'b'. This is brace version of 'seq a b'.
+ *
+ * If 'a' or 'b' is a regular shell variable (not positional parameter or
+ * array element), then replace it with its value $a or $b. If 'a' or 'b'
+ * starts with '!', then indirect substitution will be tried, similar to
+ * ${!a} or ${!b}. In any case, if the final 'a..b' is pure number, then
+ * generate the usual integer sequence. This is brace version of 'seq $a
+ * $b' or 'seq ${!a} ${!b}'.
+ *
+ * If 'a' or 'b' is '#', then replace it with value $# and generate
+ * integer sequence as usual. If 'a' or 'b' is '*', then replace it with
+ * value $#, and generate parameter sequence by putting '${}' around the
+ * integers to indicate positional parameter. However, expansion is done
+ * only if there are parameters (ie. $# >= 1). If there is no parameter,
+ * then don't replace it. This is brace version of 'seq a $#', 'seq $#
+ * b', and $*.
+ *
+ * If the expression is 'a--b', where 'a' and 'b' are strings of same
+ * size, then generate string sequence. Characters must be both lowercase
+ * or both uppercase. So, {a--c} is same as {a,b,c} and {A--C} is same as
+ * {A,B,C}, and {Aa--Bb} is same as {Aa,Ab,...,Az,Ba,Bb}.
+ *
+ * Otherwise, return the original string back to shell as is, like before.
+ *
+ * --William Park <opengeometry@yahoo.ca>
+ */
+ char *a, *b, *t;
+ int dollarflag, zeropad, compareflag;
+ size_t i, end, n, size;
+ intmax_t x, y;
+
+ if (t = strstr (amble, "--")) {
+ a = substring (amble, 0, t - amble);
+ b = substring (amble, t - amble + 2, alen);
+ if (strlen (a) == 0 || strlen (a) != strlen (b)) {
+ free (a);
+ free (b);
+ goto Original_Code;
+ }
+ size = strlen (a);
+ n = 1;
+ for (i = 0; i < size; i++) {
+ if (! (ISLOWER (a[i]) && ISLOWER (b[i]) || ISUPPER (a[i]) && ISUPPER (b[i]))) {
+ free (a);
+ free (b);
+ goto Original_Code;
+ }
+ if (a[i] != b[i] || n > 1)
+ if (n == 1)
+ n = abs (b[i] - a[i]) + 1; /* first position */
+ else
+ n *= 26; /* max number: 26^{size} */
+ }
+
+ /* By this point, 'a' and 'b' are strings of equal size.
+ */
+ tack = strvec_create (n + 1);
+ n = 0;
+ do {
+ tack[n++] = savestring (a);
+ if ((compareflag = strcmp (a, b)) == 0) {
+ tack[n] = (char *)NULL;
+ break;
+ }
+ else if (compareflag < 0) {
+ for (i = size - 1; i >= 1 && (a[i] == 'Z' || a[i] == 'z'); i--)
+ a[i] -= 25; /* back to 'A' or 'a' */
+ ++a[i];
+ }
+ else if (compareflag > 0) {
+ for (i = size - 1; i >= 1 && (a[i] == 'A' || a[i] == 'a'); i--)
+ a[i] += 25; /* back to 'Z' or 'z' */
+ --a[i];
+ }
+ } while (1);
+ }
+ else if (t = strstr (amble, "..")) {
+ a = substring (amble, 0, t - amble);
+ b = substring (amble, t - amble + 2, alen);
+ dollarflag = zeropad = 0;
+
+ if (legal_identifier (a) && (t = get_string_value (a))) {
+ free (a);
+ a = savestring (t);
+ }
+ else if (*a == '!') {
+ if (legal_identifier (a + 1) && (t = get_string_value (a + 1)))
+ if (legal_identifier (t) && (t = get_string_value (t))) {
+ free (a);
+ a = savestring (t);
+ }
+ }
+ else if ((*a == '#' || *a == '*') && a[1] == '\0') {
+ if (*a == '*')
+ dollarflag = 1;
+ if (n = number_of_args ()) {
+ free (a);
+ a = itos (n);
+ }
+ }
+
+ if (legal_identifier (b) && (t = get_string_value (b))) {
+ free (b);
+ b = savestring (t);
+ }
+ else if (*b == '!') {
+ if (legal_identifier (b + 1) && (t = get_string_value (b + 1)))
+ if (legal_identifier (t) && (t = get_string_value (t))) {
+ free (b);
+ b = savestring (t);
+ }
+ }
+ else if ((*b == '#' || *b == '*') && b[1] == '\0') {
+ if (*b == '*')
+ dollarflag = 1;
+ if (n = number_of_args ()) {
+ free (b);
+ b = itos (n);
+ }
+ }
+
+ /* By this point, 'a' and 'b' must be all numbers. If not, then exit
+ * per original code. Check for empty string explicitly, because
+ * all_digits() returns 1 if string is empty (crazy!).
+ */
+ if (!(*a && all_digits (a) && legal_number (a, &x) && x >= 0
+ && *b && all_digits (b) && legal_number (b, &y) && y >= 0)) {
+ free (a);
+ free (b);
+ goto Original_Code;
+ }
+
+ i = x;
+ end = y;
+ n = abs (end - i) + 1;
+ size = (strlen (a) > strlen (b)) ? strlen (a) : strlen (b);
+ if (strlen (a) > 1 && *a == '0' || strlen (b) > 1 && *b == '0')
+ zeropad = 1;
+
+ tack = strvec_create (n + 1);
+ n = 0;
+ do {
+ t = (char *)xmalloc (size + 3 + 1); /* ${number} or number */
+ if (dollarflag)
+ sprintf (t, "${%d}", (int) i);
+ else if (zeropad)
+ sprintf (t, "%0*d", (int) size, (int) i);
+ else
+ sprintf (t, "%d", (int) i);
+ tack[n++] = t;
+ if (i == end) {
+ tack[n] = (char *)NULL;
+ break;
+ }
+ else if (i < end)
+ ++i;
+ else if (i > end)
+ --i;
+ } while (1);
+ }
+ else {
+Original_Code:
+ free (amble); /* original code */
+ free (preamble);
+ result[0] = savestring (text);
+ return (result);
+ }
+
+ free (a);
+ free (b);
+ goto New_Tack;
+ }
#endif /* SHELL */
- postamble = &text[i + 1];
-
tack = expand_amble (amble, alen);
+New_Tack:
result = array_concat (result, tack);
free (amble);
strvec_dispose (tack);
+ postamble = &text[i + 1];
+
tack = brace_expand (postamble);
result = array_concat (result, tack);
strvec_dispose (tack);
diff -ru bash-2.05b/builtins/Makefile.in bash/builtins/Makefile.in
--- bash-2.05b/builtins/Makefile.in 2002-04-23 09:24:23.000000000 -0400
+++ bash/builtins/Makefile.in 2004-01-05 16:16:34.000000000 -0500
@@ -108,7 +108,7 @@
DEFSRC = $(srcdir)/alias.def $(srcdir)/bind.def $(srcdir)/break.def \
$(srcdir)/builtin.def $(srcdir)/cd.def $(srcdir)/colon.def \
$(srcdir)/command.def $(srcdir)/declare.def $(srcdir)/echo.def \
- $(srcdir)/enable.def $(srcdir)/eval.def $(srcdir)/getopts.def \
+ $(srcdir)/enable.def $(srcdir)/eval.def $(srcdir)/array.def $(srcdir)/getopts.def \
$(srcdir)/exec.def $(srcdir)/exit.def $(srcdir)/fc.def \
$(srcdir)/fg_bg.def $(srcdir)/hash.def $(srcdir)/help.def \
$(srcdir)/history.def $(srcdir)/jobs.def $(srcdir)/kill.def \
@@ -125,7 +125,7 @@
OFILES = builtins.o \
alias.o bind.o break.o builtin.o cd.o colon.o command.o \
- common.o declare.o echo.o enable.o eval.o evalfile.o \
+ common.o declare.o echo.o enable.o eval.o array.o evalfile.o \
evalstring.o exec.o \
exit.o fc.o fg_bg.o hash.o help.o history.o jobs.o kill.o let.o \
pushd.o read.o return.o set.o setattr.o shift.o source.o \
@@ -225,6 +225,9 @@
echo.o: echo.def
enable.o: enable.def
eval.o: eval.def
+
+array.o: array.def
+
exec.o: exec.def
exit.o: exit.def
fc.o: fc.def
@@ -365,6 +368,14 @@
eval.o: $(topdir)/subst.h $(topdir)/externs.h
eval.o: $(topdir)/shell.h $(topdir)/syntax.h $(topdir)/unwind_prot.h $(topdir)/variables.h $(topdir)/conftypes.h
eval.o: $(BASHINCDIR)/maxpath.h
+
+array.o: $(topdir)/command.h ../config.h $(BASHINCDIR)/memalloc.h
+array.o: $(topdir)/error.h $(topdir)/general.h $(topdir)/xmalloc.h
+array.o: $(topdir)/quit.h $(topdir)/dispose_cmd.h $(topdir)/make_cmd.h
+array.o: $(topdir)/subst.h $(topdir)/externs.h
+array.o: $(topdir)/shell.h $(topdir)/syntax.h $(topdir)/unwind_prot.h $(topdir)/variables.h $(topdir)/conftypes.h
+array.o: $(BASHINCDIR)/maxpath.h
+
exec.o: $(topdir)/bashtypes.h
exec.o: $(topdir)/command.h ../config.h $(BASHINCDIR)/memalloc.h
exec.o: $(topdir)/error.h $(topdir)/general.h $(topdir)/xmalloc.h
diff -ru bash-2.05b/builtins/array.def bash/builtins/array.def
--- bash-2.05b/builtins/array.def 2004-01-06 14:13:49.000000000 -0500
+++ bash/builtins/array.def 2004-01-06 21:10:11.000000000 -0500
@@ -0,0 +1,785 @@
+This file is array.def, from which is created array.c.
+It implements the builtin "array" in Bash.
+
+$PRODUCES array.c
+
+
+/* Copied from ./eval.def */
+
+#include <config.h>
+#if defined (HAVE_UNISTD_H)
+# ifdef _MINIX
+# include <sys/types.h>
+# endif
+# include <unistd.h>
+#endif
+
+#include "../shell.h"
+#include "bashgetopt.h"
+#include "common.h"
+
+
+/* My code begins... */
+
+#include <sys/types.h> /* for regex */
+#include <regex.h> /* for regex */
+
+int regex_ignore_case = 0;
+int regex_match_newline = 0;
+
+
+/*******************************************************************************
+ * Command-line version of regex conditional test
+ * string =~ regex
+ * string !~ regex
+ * It's equivalent to Awk match() function,
+ * match (string, regex, SUBMATCH)
+ *
+ * --William Park <opengeometry@yahoo.ca>
+ */
+
+$BUILTIN match
+$FUNCTION match_builtin
+$SHORT_DOC match string regex
+Command-line version of regex conditional test,
+ string =~ regex --> match string regex
+ string !~ regex --> ! match string regex
+Return success if 'string' contains 'regex' pattern. Also, return array
+variable SUBMATCH containing substrings which match parenthesized groups
+in 'regex'. It's equivalent to Awk match() function,
+ match (string, regex, SUBMATCH)
+$END
+
+
+int
+match_builtin (list)
+ WORD_LIST *list;
+{
+ SHELL_VAR *var;
+ char *string, *regex, *t;
+ regoff_t a, b;
+ size_t i, n;
+ int rcode;
+
+ regex_t preg; /* size_t preg.re_nsub; */
+ regmatch_t *pmatch;
+ int cflag, eflag;
+
+ if (no_options (list))
+ return (EX_USAGE);
+ list = loptend; /* skip over possible `--' */
+
+ if (list == 0) /* 0 argument */
+ return (EXECUTION_FAILURE);
+
+ string = list->word->word;
+ list = list->next;
+ if (list == 0) /* 1 argument: match string */
+ return (EXECUTION_FAILURE);
+
+ regex = list->word->word;
+ cflag = REG_EXTENDED;
+ if (regex_ignore_case) cflag |= REG_ICASE;
+ if (regex_match_newline) cflag |= REG_NEWLINE;
+ if (regcomp (&preg, regex, cflag) != 0) {
+ builtin_error ("`%s': illegal regex in regcomp()", regex);
+ /* no regfree() here: preg is undefined after a failed regcomp() */
+ return (EXECUTION_FAILURE);
+ }
+
+ n = preg.re_nsub;
+ pmatch = (regmatch_t *) xmalloc ((n+1) * sizeof (regmatch_t));
+ eflag = 0;
+ if (regexec (&preg, string, n+1, pmatch, eflag) != 0) {
+ rcode = EXECUTION_FAILURE;
+
+ } else if ((var = find_or_make_array_variable ("SUBMATCH", 1)) == 0) {
+ rcode = EXECUTION_FAILURE; /* readonly or noassign */
+
+ } else {
+ rcode = EXECUTION_SUCCESS;
+ array_flush (array_cell (var));
+ for (i = 0; i <= n; i++) {
+ a = pmatch[i].rm_so;
+ b = pmatch[i].rm_eo;
+ if (a >= 0 && b >= 0) {
+ t = substring (string, a, b);
+ array_insert (array_cell (var), i, t);
+ free (t);
+ }
+ }
+ }
+
+ free (pmatch);
+ regfree (&preg);
+ return (rcode);
+}
+
+
+/*******************************************************************************
+ * Emulate Python's map() function.
+ *
+ * --William Park <opengeometry@yahoo.ca>
+ */
+
+$BUILTIN arraymap
+$FUNCTION arraymap_builtin
+$SHORT_DOC arraymap command name [name ...]
+Mimicking Python's map() function, it runs 'command' for each element of
+arrays 'name', ... in parallel. 'command' should take as many positional
+parameters as there are arrays. This is modified version of 'eval'
+builtins, and is equivalent to
+ command "${name[0]}" "${name[0]}" ...
+ command "${name[1]}" "${name[1]}" ...
+ ...
+ command "${name[N]}" "${name[N]}" ...
+where 'N' is the maximum of all indexes. Array elements are referenced by
+index key, starting from 0 to N, not the order of storage. So, there can
+be empty parameters.
+$END
+
+
+int
+arraymap_builtin (list)
+ WORD_LIST *list;
+{
+#if defined (ARRAY_VARS)
+ char *name, *command, *eval_string;
+ arrayind_t i, n;
+ size_t size, eval_len;
+ SHELL_VAR *var;
+ WORD_LIST *t;
+
+ if (no_options (list))
+ return (EX_USAGE);
+ list = loptend; /* skip over possible `--' */
+
+ if (list == 0) /* 0 argument */
+ return (EXECUTION_SUCCESS);
+
+ command = list->word->word; /* no checking */
+
+ list = list->next;
+ if (list == 0) /* 1 argument: arraymap command */
+ return (EXECUTION_SUCCESS);
+
+ /* 2 or more arguments: arraymap command a ... */
+
+ n = 0;
+ size = strlen (command);
+ for (t = list; t != 0; t = t->next) {
+ name = t->word->word;
+ if (legal_identifier (name) == 0) {
+ sh_invalidid (name);
+ return (EXECUTION_FAILURE);
+ }
+
+ var = find_variable (name);
+ if (var == 0 || array_p (var) == 0) {
+ sh_notfound (name);
+ return (EXECUTION_FAILURE);
+ }
+
+ i = array_max_index (array_cell (var));
+ n = (n > i) ? n : i; /* max of all index */
+
+ /* ' "${name[index]}"' --> name + index + 8 */
+ size += strlen (name) + INT_STRLEN_BOUND (intmax_t) + 8;
+ }
+
+ /* command "${name[0]}" "${name[0]}" ...
+ * ...
+ * command "${name[n]}" "${name[n]}" ...
+ */
+ for (i = 0; i <= n; i++) {
+ eval_string = (char *) xmalloc (size + 1);
+
+ strcpy (eval_string, command);
+ eval_len = strlen (eval_string);
+
+ for (t = list; t != 0; t = t->next) {
+ name = t->word->word;
+ sprintf (eval_string + eval_len, " \"${%s[%ld]}\"", name, (long) i);
+ eval_len = strlen (eval_string);
+ }
+
+ /* Note that parse_and_execute () frees the string it is passed. */
+ if (parse_and_execute (eval_string, "arraymap", SEVAL_NOHIST) != EXECUTION_SUCCESS)
+ return (EXECUTION_FAILURE);
+ }
+
+ return (EXECUTION_SUCCESS);
+#endif /* ARRAY_VARS */
+}
+
+
+/*******************************************************************************
+ * Emulate Python's filter() function.
+ *
+ * --William Park <opengeometry@yahoo.ca>
+ */
+
+$BUILTIN arrayfilter
+$FUNCTION arrayfilter_builtin
+$SHORT_DOC arrayfilter filter name
+Mimicking Python's filter() function, it runs 'filter' for each element of
+array 'name'. It returns the array elements, for which 'filter' returns
+success (0).
+$END
+
+
+int
+arrayfilter_builtin (list)
+ WORD_LIST *list;
+{
+#if defined (ARRAY_VARS)
+ char *name, *filter, *eval_string;
+ size_t size;
+ SHELL_VAR *var;
+ ARRAY *a;
+ ARRAY_ELEMENT *ae;
+
+ if (no_options (list))
+ return (EX_USAGE);
+ list = loptend; /* skip over possible `--' */
+
+ if (list == 0) /* 0 argument */
+ return (EXECUTION_SUCCESS);
+
+ filter = list->word->word; /* no checking */
+
+ list = list->next;
+ if (list == 0) /* 1 argument: arrayfilter filter */
+ return (EXECUTION_SUCCESS);
+
+ name = list->word->word; /* 2 arguments: arrayfilter filter name */
+ if (legal_identifier(name) == 0) {
+ sh_invalidid (name);
+ return (EXECUTION_FAILURE);
+ }
+ var = find_variable (name);
+ if (var == 0 || array_p (var) == 0) {
+ sh_notfound (name);
+ return (EXECUTION_FAILURE);
+ }
+
+ /* filter "value"
+ * ...
+ * filter "value"
+ */
+ a = array_cell (var);
+ if (a == 0 || array_empty (a)) return (EXECUTION_SUCCESS); /* do nothing */
+
+ for (ae = element_forw (a->head); ae != a->head; ae = element_forw (ae)) {
+ size = strlen (filter) + strlen(element_value (ae)) + 3;
+
+ eval_string = (char *)xmalloc (size + 1);
+ sprintf (eval_string, "%s \"%s\"", filter, element_value (ae));
+
+ /* Note that parse_and_execute () frees the string it is passed. */
+ if (parse_and_execute (eval_string, "arrayfilter", SEVAL_NOHIST) == EXECUTION_SUCCESS)
+ puts (element_value (ae));
+ }
+
+ return (EXECUTION_SUCCESS);
+#endif /* ARRAY_VARS */
+}
+
+
+/*******************************************************************************
+ * Add some of Python's list/dict functionalities.
+ *
+ * --William Park <opengeometry@yahoo.ca>
+ */
+
+
+$BUILTIN array
+$FUNCTION array_builtin
+$SHORT_DOC array [-src] [-i value] [-j sep] [-evEV regex] name [arg...]
+By default, array values are printed, one per line. Only one option is
+allowed, so the last one takes effect.
+ -i value Print all indexes with string 'value' --> list.index(value), ...
+ -j sep Join all element strings with 'sep' separator --> sep.join(list)
+
+The following operation changes the array in-place.
+ -s Sort on array element's value --> list.sort()
+ -r Reverse the array --> list.reverse()
+ -c Collapse the array, so that there is no missing index
+
+If one or more arguments are present, then the default is to append them
+sequentially to the end of array, mimicking list.append(arg) in Python. If
+-e or -v option is given, then POSIX 'regex' pattern is applied on 'arg',
+using regcomp(3) and regexec(3), and the resulting substrings are added to
+the end of array. If shell option 'nocaseregex' is set, then match is
+case-insensitive. If shell option 'multilineregex' is set, then '.', '^',
+and '$' do not span across \n (newline), so the match is line by line.
+ -e regex Extract 'regex' patterns from 'arg', and append each
+ matching substring. (think egrep -e)
+ --> re.findall(regex,arg), minus null string
+ -v regex Remove regex(7) patterns from 'arg' strings, and append
+ each non-matching substring. (think egrep -v)
+ --> re.split(regex,arg), minus null string
+
+Array variable 'name' is not created to allow for repeated calls. So,
+create it manually.
+$END
+
+
+/* Wrapper around inttostr() in ../lib/sh/itos.c, to convert array index
+ * (arrayind_t) to string. One can use itos(), but it copies string which
+ * requires an extra step of freeing it.
+ */
+static char *
+element_index_to_string (ae)
+ ARRAY_ELEMENT *ae;
+{
+ /* 'static' to survive outside the function, but is not intended for
+ * long term storage.
+ */
+ static char indstr[INT_STRLEN_BOUND(intmax_t) + 1];
+
+ return inttostr (element_index (ae), indstr, sizeof (indstr));
+}
+
+
+/* Copied from array_walk() in ../array.c. For each array element, print its
+ * index key, value, or both index and value, separated by '\t'. Similar to
+ * dict.keys(), dict.values(), and dict.items() in Python.
+ */
+static void
+print_elements (var)
+ SHELL_VAR *var;
+{
+ ARRAY *a;
+ ARRAY_ELEMENT *ae;
+
+ a = array_cell (var);
+ if (a == 0 || array_empty (a)) return; /* do nothing */
+
+ for (ae = element_forw (a->head); ae != a->head; ae = element_forw (ae))
+ printf ("%s\t%s\n", element_index_to_string (ae), element_value (ae));
+}
+
+
+/* Copied from array_walk() in ../array.c. Print index of all array elements
+ * with 'value'.
+ */
+static void
+print_all_indexes_with_value (var, value)
+ SHELL_VAR *var;
+ char *value;
+{
+ ARRAY *a;
+ ARRAY_ELEMENT *ae;
+
+ a = array_cell (var);
+ if (a == 0 || array_empty (a)) return; /* do nothing */
+
+ for (ae = element_forw (a->head); ae != a->head; ae = element_forw (ae))
+ if (strcmp(element_value (ae), value) == 0)
+ puts (element_index_to_string (ae));
+}
+
+
+/* Set array index so that they are from 0 to n-1, where n is the number of
+ * elements that the array has.
+ */
+static void
+array_collapse (var)
+ SHELL_VAR *var;
+{
+ ARRAY *a;
+ ARRAY_ELEMENT *ae;
+ arrayind_t i, n;
+
+ a = array_cell (var);
+ if (a == 0 || array_empty (a)) return; /* do nothing */
+
+ n = array_num_elements (a);
+ ae = a->head;
+ for (i = 0; i < n; i++) {
+ ae = element_forw (ae);
+ element_index (ae) = i;
+ }
+}
+
+
+/* Reverse the array order, by swapping the element values. The index keys are
+ * unchanged. Similar to list.reverse() in Python.
+ */
+static void
+array_reverse (var)
+ SHELL_VAR *var;
+{
+ ARRAY *a;
+ ARRAY_ELEMENT *ae, *be;
+ char *t;
+
+ a = array_cell (var);
+ if (a == 0 || array_empty (a)) return; /* do nothing */
+
+ /* 'ae' goes forward, and 'be' goes backward */
+ for (ae = element_forw (a->head), be = element_back (a->head);
+ ae != a->head && be != a->head && element_index (ae) < element_index (be);
+ ae = element_forw (ae), be = element_back(be))
+ {
+ t = element_value (ae); /* swap the values */
+ element_value (ae) = element_value (be);
+ element_value (be) = t;
+ }
+}
+
+
+/* Sort the array based on each element's value. Similar to list.sort() in
+ * Python.
+ */
+static void
+array_sort (var, flag)
+ SHELL_VAR *var;
+ int flag;
+{
+ ARRAY *a;
+ ARRAY_ELEMENT *ae;
+ char **base; /* array holding pointers to element values */
+ arrayind_t n, i;
+
+ int my_strcmp (x, y) char **x, **y;
+ {
+ return strcmp (*x, *y);
+ }
+ int my_intcmp (x, y) char **x, **y;
+ {
+ int i = atoi (*x);
+ int j = atoi (*y);
+
+ if (i < j) return -1;
+ if (i > j) return 1;
+ return 0;
+ }
+
+ a = array_cell (var);
+ if (a == 0 || array_empty (a)) return; /* do nothing */
+
+ n = array_num_elements (a);
+ base = (char **) xmalloc (n * sizeof (char *));
+ ae = a->head;
+ for (i = 0; i < n; i++) {
+ ae = element_forw (ae);
+ base[i] = element_value (ae);
+ }
+
+ if (integer_p (var))
+ qsort (base, n, sizeof (char *), my_intcmp);
+ else
+ qsort (base, n, sizeof (char *), my_strcmp);
+
+ ae = a->head;
+ for (i = 0; i < n; i++) {
+ ae = element_forw (ae);
+ element_value (ae) = base[i];
+ }
+ free (base);
+}
+
+
+/* Copied from bind_array_variable() in ../arrayfunc.c. Find the last index and
+ * append right after it. Actually, array_insert() in ../array.c inserts it
+ * "before" the head, which is effectively appending it because ARRAY is
+ * a circular linked list. Similar to list.append() in Python.
+ */
+static void
+array_append (var, arg)
+ SHELL_VAR *var;
+ char *arg; /* raw string */
+{
+ char *value;
+ arrayind_t N;
+
+ if (readonly_p (var) || noassign_p (var))
+ err_readonly (var->name);
+ else {
+ N = array_max_index (array_cell (var)); /* -1 if empty */
+ value = make_variable_value (var, arg);
+ if (var->assign_func)
+ (*var->assign_func) (var, value, N+1);
+ else
+ array_insert (array_cell (var), N+1, value);
+ FREE (value);
+ }
+}
+
+
+int
+array_builtin (list)
+ WORD_LIST *list;
+{
+#if defined (ARRAY_VARS)
+ char *name, *arg, *sep, *value, *regex;
+ SHELL_VAR *var;
+ int flag, opt;
+
+ regex_t preg;
+ regmatch_t pmatch[1];
+ char *head, *body, *tail, *t;
+ int cflag, eflag;
+
+ flag = 0;
+ sep = value = (char *)NULL;
+
+ reset_internal_getopt ();
+ while ((opt = internal_getopt (list, "srci:j:e:v:")) != -1) {
+ switch (opt) {
+ case 's':
+ case 'r':
+ case 'c':
+ flag = opt;
+ break;
+ case 'i':
+ flag = opt;
+ value = list_optarg;
+ break;
+ case 'j':
+ flag = opt;
+ sep = list_optarg;
+ break;
+ case 'e':
+ case 'v':
+ flag = opt;
+ regex = list_optarg;
+ break;
+ default:
+ builtin_usage ();
+ return (EX_USAGE);
+ }
+ }
+ list = loptend;
+
+ if (list == 0) /* 0 argument */
+ return (EXECUTION_SUCCESS);
+
+ name = list->word->word; /* first argument */
+ if (legal_identifier(name) == 0) {
+ sh_invalidid (name);
+ return (EXECUTION_FAILURE);
+ }
+ var = find_variable (name);
+ if (var == 0 || array_p (var) == 0) {
+ sh_notfound (name);
+ return (EXECUTION_FAILURE);
+ }
+
+ list = list->next;
+
+ if (list == 0) { /* 1 argument: array [...] name */
+ switch (flag) {
+ case 's': /* array -s name */
+ array_sort (var, flag);
+ break;
+ case 'r': /* array -r name */
+ array_reverse (var);
+ break;
+ case 'c': /* array -c name */
+ array_collapse (var);
+ break;
+ case 'i': /* array -i value name */
+ print_all_indexes_with_value (var, value);
+ break;
+ case 'j': /* array -j sep name */
+ arg = array_to_string (array_cell (var), sep, 0 /* no quoting */);
+ puts (arg);
+ break;
+ default: /* array name */
+ print_elements (var);
+ break;
+ }
+ }
+
+ /* 2 or more arguments. So, we are appending. If 'list == 0' already, then
+ * it falls through.
+ */
+ while (list) {
+ arg = list->word->word; /* array -[ev] regex name arg... */
+ if (flag == 'e' || flag == 'v') {
+ cflag = REG_EXTENDED;
+ if (regex_ignore_case) cflag |= REG_ICASE;
+ if (regex_match_newline) cflag |= REG_NEWLINE;
+ if (regcomp (&preg, regex, cflag) != 0) {
+ builtin_error ("`%s': illegal regex in regcomp()", regex);
+ /* no regfree() here: preg is undefined after a failed regcomp() */
+ return (EXECUTION_FAILURE);
+ }
+
+ head = body = tail = arg;
+ eflag = 0;
+ while (*body && regexec (&preg, body, 1, pmatch, eflag) == 0) {
+ body += pmatch[0].rm_so;
+ tail += pmatch[0].rm_eo;
+ if (body == tail) {
+ body++;
+ tail++;
+ } else {
+ if (flag == 'e' && tail != body) {
+ t = substring (body, 0, tail - body);
+ array_append (var, t);
+ free (t);
+ }
+ if (flag == 'v' && body != head) {
+ t = substring (head, 0, body - head);
+ array_append (var, t);
+ free (t);
+ }
+ head = body = tail;
+ }
+ eflag = REG_NOTBOL;
+ }
+ if (flag == 'v' && *head)
+ array_append (var, head);
+
+ regfree (&preg);
+
+ } else
+ array_append (var, arg); /* append original 'arg' */
+
+ list = list->next;
+ }
+
+ stupidly_hack_special_variables (name);
+ /* fflush (stdout); */
+ return (EXECUTION_SUCCESS);
+#endif /* ARRAY_VARS */
+}
+
+
+/*******************************************************************************
+ * Fully embedded Python. With Python compiled and installed to /usr/local as
+ * usual, Bash can be compiled with
+ *
+ * ./configure
+ * make CFLAGS="-DEMBED_PYTHON -I/usr/local/include/python2.2"
+ * LDFLAGS="-L/usr/local/lib/python2.2 -L/usr/local/lib/python2.2/config
+ * -Xlinker -export-dynamic"
+ * LOCAL_LIBS="-lpython2.2 -lpthread -lutil -lm"
+ *
+ * where '-lpython2.2 -lpthread -lutil -lm' were determined from Python's
+ * Makefile, and '-Xlinker -export-dynamic' were determined from
+ * import distutils.sysconfig
+ * distutils.sysconfig.get_config_var('LINKFORSHARED')
+ * as described in Python documentation for embedding.
+ *
+ * If you don't want Python, then simply do
+ * ./configure
+ * make
+ *
+ * Embedding is no longer needed. You can run a Python script which repeatedly
+ * runs 'exec' statement. Script, called 'coprocess.py', would go something
+ * like
+ *
+ * import sys
+ * fifo_in = sys.argv[1]
+ * fifo_out = sys.argv[2]
+ * while 1:
+ * fin = open(fifo_in, "r")
+ * fout = open(fifo_out, "w")
+ * sys.stdout = fout
+ * exec fin
+ * sys.stdout.flush()
+ * fout.close()
+ * fin.close()
+ *
+ * Then, you can do
+ *
+ * mkfifo in out
+ * python coprocess.py in out &
+ *
+ * echo "print 1.0+2.0" > in
+ * cat out
+ * echo "import math" > in
+ * echo "print math.pi" > in
+ * cat out
+ *
+ * --William Park <opengeometry@yahoo.ca>
+ */
+
+#ifdef EMBED_PYTHON
+#include "Python.h" /* includes <stdio.h> */
+
+$BUILTIN embeddedpython
+$FUNCTION embeddedpython_builtin
+$DEPENDS_ON EMBED_PYTHON
+$SHORT_DOC embeddedpython [-cq] arg...
+Send the command-line arguments to embedded Python. Syntax follows the
+normal Python, ie.
+ python scriptfile
+ python -c "command"
+except that multiple files or strings can be used. By default, the
+arguments are script files, so sequentially send the file contents to
+Python via PyRun_SimpleFile(). With '-c' option, the arguments are command
+strings, so sequentially send the string contents to Python via
+PyRun_SimpleString(). For readability, leading whitespace in the strings
+is removed. '-q' stops the embedded Python via Py_Finalize().
+$END
+
+
+int
+embeddedpython_builtin (list)
+ WORD_LIST *list;
+{
+ char *arg;
+ int opt, cflag, out;
+
+ cflag = 0;
+
+ reset_internal_getopt ();
+ while ((opt = internal_getopt (list, "cq")) != -1) {
+ switch (opt) {
+ case 'c':
+ cflag = 1;
+ break;
+ case 'q':
+ Py_Finalize();
+ break;
+ default:
+ builtin_usage ();
+ return (EX_USAGE);
+ }
+ }
+ list = loptend;
+
+ if (list == 0) /* 0 argument */
+ return (EXECUTION_SUCCESS);
+
+ if (! Py_IsInitialized())
+ Py_Initialize();
+
+ if (cflag) { /* send string */
+ char *t;
+
+ for ( ; list; list = list->next) {
+ arg = list->word->word;
+ while (*arg && spctabnl (*arg) && isifs (*arg))
+ arg++;
+ out = PyRun_SimpleString (arg);
+ if (out)
+ return (EXECUTION_FAILURE);
+ }
+ } else { /* send file */
+ FILE *fd;
+
+ for ( ; list; list = list->next) {
+ arg = list->word->word;
+ fd = fopen (arg, "r");
+ if (fd == NULL) {
+ builtin_error ("cannot open file `%s'", arg);
+ return (EXECUTION_FAILURE);
+ }
+ out = PyRun_SimpleFile (fd, arg);
+ fclose (fd);
+ if (out)
+ return (EXECUTION_FAILURE);
+ }
+ }
+
+ fflush (stdout);
+ return (EXECUTION_SUCCESS);
+}
+#endif /* EMBED_PYTHON */
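
Note on the 'array -e' / 'array -v' loop above: the same scan can be tried outside Bash with plain POSIX regcomp()/regexec(). What follows is only a minimal sketch under assumptions, not code from the patch. The names regex_pieces() and copy_span() are hypothetical, results go into a caller-supplied array instead of a shell array variable, and the nocaseregex/multilineregex options are left out. It also stops on an empty match at the end of the string rather than stepping past the terminator.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Copy the bytes in [from, to) into a fresh NUL-terminated string. */
static char *copy_span(const char *from, const char *to)
{
    size_t n = (size_t)(to - from);
    char *t = malloc(n + 1);

    if (t != NULL) {
        memcpy(t, from, n);
        t[n] = '\0';
    }
    return t;
}

/* regex_pieces() is a hypothetical name, not a function from the patch.
 * flag 'e' collects the matching substrings of 'arg', flag 'v' the text
 * between matches.  At most 'max' malloc'ed strings are stored in 'out'.
 * Returns the number stored, or -1 if 'pattern' does not compile. */
int regex_pieces(const char *pattern, const char *arg, char flag,
                 char **out, int max)
{
    regex_t preg;
    regmatch_t pm[1];
    const char *head = arg, *cur = arg;
    int count = 0, eflag = 0;

    assert(pattern != NULL && arg != NULL && out != NULL);
    if (regcomp(&preg, pattern, REG_EXTENDED) != 0)
        return -1;

    while (*cur && count < max && regexec(&preg, cur, 1, pm, eflag) == 0) {
        const char *start = cur + pm[0].rm_so;
        const char *end = cur + pm[0].rm_eo;

        if (start == end) {          /* empty match */
            if (*start == '\0')
                break;               /* nothing left to step over */
            cur = start + 1;
        } else {
            if (flag == 'e')
                out[count++] = copy_span(start, end);
            else if (flag == 'v' && start != head)
                out[count++] = copy_span(head, start);
            head = cur = end;
        }
        eflag = REG_NOTBOL;          /* later scans do not start a line */
    }
    if (flag == 'v' && *head && count < max)
        out[count++] = copy_span(head, head + strlen(head));

    regfree(&preg);
    return count;
}
```

Calling regex_pieces("[0-9]+", "abc123efg456", 'e', out, 8) stores '123' and '456', and flag 'v' stores 'abc' and 'efg'. The caller frees the strings.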
diff -ru bash-2.05b/builtins/echo.def bash/builtins/echo.def
--- bash-2.05b/builtins/echo.def 2002-03-19 10:45:28.000000000 -0500
+++ bash/builtins/echo.def 2004-01-03 16:30:01.000000000 -0500
@@ -31,10 +31,12 @@
#include <stdio.h>
#include "../shell.h"
+#include "chartypes.h" /* needed for ISXDIGIT() and HEXVALUE() */
+
$BUILTIN echo
$FUNCTION echo_builtin
$DEPENDS_ON V9_ECHO
-$SHORT_DOC echo [-neE] [arg ...]
+$SHORT_DOC echo [-neE] [-uU] [arg ...]
Output the ARGs. If -n is specified, the trailing newline is
suppressed. If the -e option is given, interpretation of the
following backslash-escaped characters is turned on:
@@ -52,6 +54,13 @@
You can explicitly turn off the interpretation of the above characters
with the -E option.
+
+The following options are added:
+ -u converts 2-digit '%NN' URL hexcode into 0xNN ASCII character.
+ To avoid confusion with '\xN' or '\xNN' hexcodes, this option is
+ ignored if -e option is on.
+ -U encodes ASCII characters to '%NN' hexcode, which is inverse of
+ -u option.
$END
$BUILTIN echo
@@ -62,7 +71,7 @@
$END
#if defined (V9_ECHO)
-# define VALID_ECHO_OPTIONS "neE"
+# define VALID_ECHO_OPTIONS "neEuU"
#else /* !V9_ECHO */
# define VALID_ECHO_OPTIONS "n"
#endif /* !V9_ECHO */
@@ -88,6 +97,8 @@
int display_return, do_v9, i, len;
char *temp, *s;
+ int decode_URL = 0; /* convert '%NN' to 0xNN ASCII character */
+
do_v9 = xpg_echo;
display_return = 1;
@@ -124,6 +135,12 @@
case 'E':
do_v9 = 0;
break;
+ case 'u':
+ decode_URL = 1;
+ break;
+ case 'U':
+ decode_URL = 2;
+ break;
#endif /* V9_ECHO */
default:
goto just_echo; /* XXX */
@@ -145,6 +162,33 @@
for (s = temp; len > 0; len--)
putchar (*s++);
}
+
+ /*********************************************************************
+ * Conversion between 2-digit '%NN' URL hexcode and ASCII character,
+ * but only if -e option is not enabled to avoid confusion. Doing
+ * this in C is much easier than shell function, because you need
+ * access to internal binary number. I wrote this because I could
+ * only remember '%20' as URL code for space.
+ *
+ * --William Park <opengeometry@yahoo.ca>
+ */
+ else if (decode_URL == 1) {
+ for (s = temp; *s; s++)
+ if (*s == '%' && ISXDIGIT (s[1]) && ISXDIGIT (s[2])) {
+ putchar (HEXVALUE (s[1]) * 16 + HEXVALUE (s[2]));
+ s += 2;
+ } else
+ putchar (*s);
+ } else if (decode_URL == 2) {
+ char hexchar[] = "0123456789abcdef";
+
+ for (s = temp; *s; s++) {
+ putchar ('%');
+ putchar (hexchar[((unsigned char) *s >> 4) & 15]); /* upper half */
+ putchar (hexchar[*s & 15]); /* lower half */
+ }
+ }
+
else
printf ("%s", temp);
#if defined (SunOS5)
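
Note on 'echo -u' and 'echo -U' above: the two conversions are easy to check in isolation. The following is a rough standalone sketch, not the patch itself. The names url_decode(), url_encode() and hex_value() are hypothetical, and the functions return malloc'ed strings instead of writing to stdout. As in the patch, the encoder converts every byte, including letters and digits.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Value of one hex digit; the caller has already checked isxdigit(). */
static int hex_value(int c)
{
    return isdigit(c) ? c - '0' : tolower(c) - 'a' + 10;
}

/* Like 'echo -u': replace each '%NN' by the byte 0xNN and copy everything
 * else unchanged.  Returns a malloc'ed string, or NULL if out of memory. */
char *url_decode(const char *src)
{
    const unsigned char *s = (const unsigned char *)src;
    char *out, *d;

    assert(src != NULL);
    d = out = malloc(strlen(src) + 1);   /* output is never longer */
    if (out == NULL)
        return NULL;
    while (*s) {
        if (*s == '%' && isxdigit(s[1]) && isxdigit(s[2])) {
            *d++ = (char)(hex_value(s[1]) * 16 + hex_value(s[2]));
            s += 3;
        } else
            *d++ = (char)*s++;
    }
    *d = '\0';
    return out;
}

/* Like 'echo -U': write every byte as '%NN' with lowercase hex digits.
 * Bytes are read as unsigned char so values above 0x7f encode correctly. */
char *url_encode(const char *src)
{
    static const char hexchar[] = "0123456789abcdef";
    const unsigned char *s = (const unsigned char *)src;
    char *out, *d;

    assert(src != NULL);
    d = out = malloc(3 * strlen(src) + 1);
    if (out == NULL)
        return NULL;
    for (; *s; s++) {
        *d++ = '%';
        *d++ = hexchar[*s >> 4];     /* upper half */
        *d++ = hexchar[*s & 15];     /* lower half */
    }
    *d = '\0';
    return out;
}
```

For example, url_decode("a%20b%2Fc") gives 'a b/c' and url_encode("a b") gives '%61%20%62'. A lone '%' that is not followed by two hex digits is copied through unchanged.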
diff -ru bash-2.05b/builtins/read.def bash/builtins/read.def
--- bash-2.05b/builtins/read.def 2002-03-19 14:33:41.000000000 -0500
+++ bash/builtins/read.def 2004-01-05 17:55:33.000000000 -0500
@@ -23,7 +23,7 @@
$BUILTIN read
$FUNCTION read_builtin
-$SHORT_DOC read [-ers] [-u fd] [-t timeout] [-p prompt] [-a array] [-n nchars] [-d delim] [name ...]
+$SHORT_DOC read [-ers] [-u fd] [-t timeout] [-p prompt] [-a array] [-n nchars] [-d delim] [-DN] [name ...]
One line is read from the standard input, or from file descriptor FD if the
-u option is supplied, and the first word is assigned to the first NAME,
the second word to the second NAME, and so on, with leftover words assigned
@@ -45,6 +45,10 @@
its value is the default timeout. The return code is zero, unless end-of-file
is encountered, read times out, or an invalid file descriptor is supplied as
the argument to -u.
+
+The following options are added:
+ -N creates Awk-style NF and NR shell variables.
+ -D reads DOS lines terminated by '\r\n'.
$END
#include <config.h>
@@ -140,6 +144,9 @@
int rlind;
#endif
+ int awk_NF_NR = 0; /* Awk's NF and NR variables */
+ int dos_EOL = 0; /* read DOS lines which end with '\r\n' */
+
USE_VAR(size);
USE_VAR(i);
USE_VAR(pass_next);
@@ -175,7 +182,7 @@
delim = '\n'; /* read until newline */
reset_internal_getopt ();
- while ((opt = internal_getopt (list, "ersa:d:n:p:t:u:")) != -1)
+ while ((opt = internal_getopt (list, "ersa:d:n:p:t:u:DN")) != -1)
{
switch (opt)
{
@@ -239,6 +246,14 @@
case 'd':
delim = *list_optarg;
break;
+
+ case 'N':
+ awk_NF_NR = 1;
+ break;
+ case 'D':
+ dos_EOL = 1;
+ break;
+
default:
builtin_usage ();
return (EX_USAGE);
@@ -454,6 +469,22 @@
break;
}
input_string[i] = '\0';
+
+ /*****************************************************************************
+ * Read DOS lines which end in '\r\n'. If we are reading by lines (ie. delim
+ * == '\n' and nchars == 0), then remove the extra '\r' at the end of string.
+ * So,
+ * read -D a b c ...
+ * is equivalent to
+ * read
+ * REPLY="${REPLY%$'\r'}"
+ * read a b c ... <<< "$REPLY"
+ * but less typing.
+ *
+ * --William Park <opengeometry@yahoo.ca>
+ */
+ if (dos_EOL && nchars == 0 && delim == '\n' && i > 0 && input_string[i-1] == '\r')
+ input_string[--i] = '\0';
#if 1
if (retval < 0)
@@ -492,6 +523,29 @@
retval = eof ? EXECUTION_FAILURE : EXECUTION_SUCCESS;
+ /*****************************************************************************
+ * Emulation of Awk variables NF and NR. The total number of IFS fields and
+ * number of lines read so far will be assigned to shell variables 'NF' and
+ * 'NR', respectively.
+ *
+ * --William Park <opengeometry@yahoo.ca>
+ */
+ if (awk_NF_NR) {
+ intmax_t n;
+ WORD_LIST *fwlist;
+
+ fwlist = list_string (input_string, ifs_chars, 0);
+ n = list_length ((GENERIC_LIST *)fwlist);
+ bind_var_to_int ("NF", n);
+ dispose_words (fwlist);
+
+ t = get_string_value ("NR");
+ if (t && *t && legal_number (t, &n) && n >= 0)
+ bind_var_to_int ("NR", n + 1);
+ else
+ bind_var_to_int ("NR", 1);
+ }
+
#if defined (ARRAY_VARS)
/* If -a was given, take the string read, break it into a list of words,
an assign them to `arrayname' in turn. */
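
Note on 'read -D' and 'read -N' above: both come down to small string operations. Here is a simplified sketch of the two steps, with hypothetical names strip_trailing_cr() and count_fields(). One assumption to keep in mind: count_fields() treats every separator character like whitespace, so runs of separators collapse. Bash's own list_string() gives non-whitespace IFS characters special treatment, which this sketch does not try to copy.

```c
#include <assert.h>
#include <string.h>

/* Like 'read -D': drop one trailing '\r' left over from a DOS line ending.
 * Returns the new length of 'line'. */
size_t strip_trailing_cr(char *line)
{
    size_t n;

    assert(line != NULL);
    n = strlen(line);
    if (n > 0 && line[n - 1] == '\r')
        line[--n] = '\0';
    return n;
}

/* Like NF under 'read -N': count the fields in 'line' that are separated
 * by runs of characters from 'ifs' (simplified, see the note above). */
int count_fields(const char *line, const char *ifs)
{
    int nf = 0;

    assert(line != NULL && ifs != NULL);
    line += strspn(line, ifs);           /* skip leading separators */
    while (*line) {
        nf++;
        line += strcspn(line, ifs);      /* skip the field itself */
        line += strspn(line, ifs);       /* skip separators after it */
    }
    return nf;
}
```

With the buffer "a b c\r", strip_trailing_cr() leaves "a b c" and returns 5, and count_fields() on the result with " \t\n" as separators returns 3. NR is just a counter that goes up by one per successful call.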
diff -ru bash-2.05b/builtins/shopt.def bash/builtins/shopt.def
--- bash-2.05b/builtins/shopt.def 2002-04-04 14:21:32.000000000 -0500
+++ bash/builtins/shopt.def 2004-01-05 17:11:53.000000000 -0500
@@ -62,6 +62,10 @@
extern int cdspelling, expand_aliases;
extern int check_window_size;
extern int glob_ignore_case;
+
+extern int regex_ignore_case; /* for regex */
+extern int regex_match_newline; /* for regex */
+
extern int hup_on_exit;
extern int xpg_echo;
@@ -139,6 +143,13 @@
{ "no_empty_cmd_completion", &no_empty_command_completion, (shopt_set_func_t *)NULL },
#endif
{ "nocaseglob", &glob_ignore_case, (shopt_set_func_t *)NULL },
+
+ /*****************************************************************************
+ * For case-insensitive regex --William Park <opengeometry@yahoo.ca>
+ */
+ { "nocaseregex", &regex_ignore_case, (shopt_set_func_t *)NULL },
+ { "multilineregex", &regex_match_newline, (shopt_set_func_t *)NULL },
+
{ "nullglob", &allow_null_glob_expansion, (shopt_set_func_t *)NULL },
#if defined (PROGRAMMABLE_COMPLETION)
{ "progcomp", &prog_completion_enabled, (shopt_set_func_t *)NULL },
diff -ru bash-2.05b/execute_cmd.c bash/execute_cmd.c
--- bash-2.05b/execute_cmd.c 2002-03-18 13:24:22.000000000 -0500
+++ bash/execute_cmd.c 2004-01-06 18:05:26.000000000 -0500
@@ -1527,15 +1527,49 @@
SHELL_VAR *old_value = (SHELL_VAR *)NULL; /* Remember the old value of x. */
#endif
- if (check_identifier (for_command->name, 1) == 0)
- {
- if (posixly_correct && interactive_shell == 0)
+ /*****************************************************************************
+ * Enable multiple loop variables in for-loop, with syntax
+ * for a,b,c,... in list; do
+ * ...
+ * done
+ * where no space is allowed around ',' (comma) because only one word is
+ * parsed. List items are sequentially assigned to the loop variables 'a',
+ * 'b', 'c', etc. If the list runs out of items, then the last iteration will
+ * run with '' (null) assigned to the leftover variables.
+ *
+ * --William Park <opengeometry@yahoo.ca>
+ */
+ int multi_variables;
+ WORD_LIST *list_of_for_variables, *fv;
+
+ multi_variables = 0;
+
+ if (xstrchr (for_command->name->word, ',') != NULL) { /* split 'a,b,c,...' */
+ char *t;
+
+ multi_variables = 1;
+ list_of_for_variables = word_split (for_command->name, ",");
+ identifier = list_of_for_variables->word->word;
+ }
+ /*
+ * Check if a, b, c, ... are legal shell variables.
+ */
+ if (multi_variables) {
+ for (fv = list_of_for_variables; fv; fv = fv->next)
+ if (check_identifier (fv->word, 1) == 0)
+ goto Exit_by_Original_Code;
+ } else {
+ if (check_identifier (for_command->name, 1) == 0) /* original code */
{
- last_command_exit_value = EX_USAGE;
- jump_to_top_level (EXITPROG);
+Exit_by_Original_Code:
+ if (posixly_correct && interactive_shell == 0)
+ {
+ last_command_exit_value = EX_USAGE;
+ jump_to_top_level (EXITPROG);
+ }
+ return (EXECUTION_FAILURE);
}
- return (EXECUTION_FAILURE);
- }
+ }
loop_level++;
identifier = for_command->name->word;
@@ -1561,21 +1595,46 @@
{
QUIT;
this_command_name = (char *)NULL;
- v = bind_variable (identifier, list->word->word);
- if (readonly_p (v) || noassign_p (v))
- {
- if (readonly_p (v) && interactive_shell == 0 && posixly_correct)
- {
- last_command_exit_value = EXECUTION_FAILURE;
- jump_to_top_level (FORCE_EOF);
- }
- else
+
+ /* Assign list items into a, b, c, ...
+ */
+ if (multi_variables) {
+ for (fv = list_of_for_variables; fv; fv = fv->next) {
+ identifier = fv->word->word;
+ if (list) {
+ /*
+ * Go to the next item in the list, only if there are more
+ * variables to assign. If finished assigning, then leave the
+ * incrementing for the next iteration.
+ */
+ v = bind_variable (identifier, list->word->word);
+ if (fv->next)
+ list = list->next;
+ } else /* no more items */
+ v = bind_variable (identifier, "");
+ if (readonly_p (v) || noassign_p (v))
+ goto Exit_by_Original_Code_2;
+ }
+
+ } else {
+ v = bind_variable (identifier, list->word->word); /* original code */
+ if (readonly_p (v) || noassign_p (v))
{
- run_unwind_frame ("for");
- loop_level--;
- return (EXECUTION_FAILURE);
+Exit_by_Original_Code_2:
+ if (readonly_p (v) && interactive_shell == 0 && posixly_correct)
+ {
+ last_command_exit_value = EXECUTION_FAILURE;
+ jump_to_top_level (FORCE_EOF);
+ }
+ else
+ {
+ run_unwind_frame ("for");
+ loop_level--;
+ return (EXECUTION_FAILURE);
+ }
}
- }
+ }
+
retval = execute_command (for_command->action);
REAP ();
QUIT;
@@ -1592,6 +1651,8 @@
if (continuing)
break;
}
+
+ if (multi_variables && list == 0) break;
}
loop_level--;
@@ -1612,6 +1673,8 @@
}
#endif
+ if (multi_variables)
+ dispose_words (list_of_for_variables);
dispose_words (releaser);
discard_unwind_frame ("for");
return (retval);
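
Note on the multi-variable for-loop above: the assignment rule (take the list in groups, pad the last group with '') can be stated as a small standalone function. This is only an illustrative sketch. assign_loop_vars() is a hypothetical name, and it works on plain C string arrays rather than on WORD_LIST and shell variables.

```c
#include <assert.h>
#include <stddef.h>

/* Compute the values for iteration 'iter' (counting from 0) of
 *     for a,b,c,... in items
 * 'vars' receives 'nvars' pointers: the next 'nvars' items, with "" for
 * any variable left over when the list runs out.
 * Returns 1 if the loop body should run, 0 if the items are used up. */
int assign_loop_vars(const char **items, size_t nitems,
                     const char **vars, size_t nvars, size_t iter)
{
    size_t base, i;

    assert(items != NULL && vars != NULL && nvars > 0);
    base = iter * nvars;             /* first item of this iteration */
    if (base >= nitems)
        return 0;
    for (i = 0; i < nvars; i++)
        vars[i] = (base + i < nitems) ? items[base + i] : "";
    return 1;
}
```

With the five items 1 2 3 4 5 and two variables, iterations 0 and 1 get (1,2) and (3,4), iteration 2 gets (5,''), and iteration 3 reports that the loop is finished.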
diff -ru bash-2.05b/subst.c bash/subst.c
--- bash-2.05b/subst.c 2004-01-03 00:10:57.000000000 -0500
+++ bash/subst.c 2004-01-06 16:34:05.000000000 -0500
@@ -4115,6 +4115,11 @@
t = temp ? savestring (temp) : savestring ("");
t1 = dequote_string (t);
free (t);
+#if defined (ARRAY_VARS)
+ if (valid_array_reference (name))
+ assign_array_element (name, t1);
+ else
+#endif
bind_variable (name, t1);
free (t1);
return (temp);
@@ -4359,7 +4364,7 @@
#if defined (ARRAY_VARS)
case VT_ARRAYVAR:
a = (ARRAY *)value;
- len = array_num_elements (a) + 1;
+ len = array_num_elements (a);
break;
#endif
}
@@ -4475,6 +4480,9 @@
char *temp, *val, *tt;
SHELL_VAR *v;
+ int skip_like_sed;
+ intmax_t x, y;
+
if (value == 0)
return ((char *)NULL);
@@ -4484,15 +4492,58 @@
if (vtype == -1)
return ((char *)NULL);
- r = verify_substring_values (val, substr, vtype, &e1, &e2);
- if (r <= 0)
- return ((r == 0) ? &expand_param_error : (char *)NULL);
+ /*****************************************************************************
+ * Check for Sed-style 'x~y' skipping, where 'x' and 'y' are non-negative
+ * integers. Eg.
+ * ${*:1~2}
+ * ${@:1~2}
+ * ${array[*]:1~2}
+ * ${array[@]:1~2}
+ * ${string:1~2}
+ * all give every other positional parameter, array element, and string
+ * character, respectively, starting at 1.
+ *
+ * Whitespace is not allowed, in order to differentiate from a valid
+ * arithmetic bitwise negation (~).
+ *
+ * --William Park <opengeometry@yahoo.ca>
+ */
+ skip_like_sed = 0;
+ tt = xstrchr (substr, '~');
+ if (tt) {
+ *tt++ = '\0';
+ if (*substr && all_digits (substr) && legal_number (substr, &x) && x >= 0 &&
+ *tt && all_digits (tt) && legal_number (tt, &y) && y >= 0)
+ skip_like_sed = 1;
+ tt[-1] = '~'; /* restore the original string */
+ }
+ if (! skip_like_sed) { /* original code */
+ r = verify_substring_values (val, substr, vtype, &e1, &e2);
+ if (r <= 0)
+ return ((r == 0) ? &expand_param_error : (char *)NULL);
+ }
switch (vtype)
{
case VT_VARIABLE:
case VT_ARRAYMEMBER:
- tt = substring (val, e1, e2);
+ if (skip_like_sed) {
+ size_t i, n;
+ char *s;
+
+ n = strlen (val);
+ if (n == 0)
+ return (char *)NULL;
+
+ s = tt = (char *)xmalloc (n + 1);
+ if (y <= 0)
+ y = 1; /* don't want infinite loop */
+ for (i = x; i < n; i += y)
+ *s++ = val[i];
+ *s = '\0';
+ } else
+ tt = substring (val, e1, e2); /* original code */
+
if (vtype == VT_VARIABLE)
FREE (val);
if (quoted & (Q_DOUBLE_QUOTES|Q_HERE_DOCUMENT))
@@ -4501,8 +4552,38 @@
temp = tt ? quote_escapes (tt) : (char *)NULL;
FREE (tt);
break;
+
case VT_POSPARMS:
- tt = pos_params (varname, e1, e2, quoted);
+ if (skip_like_sed) {
+ WORD_LIST *out, *plist, *p;
+
+ plist = list_rest_of_args ();
+ if (plist == 0)
+ return (char *)NULL;
+
+ out = (WORD_LIST *)NULL;
+ for (p = plist; p; p = p->next) {
+ while (p && --x > 0)
+ p = p->next;
+ if (p == 0)
+ break;
+ out = make_word_list (make_bare_word (p->word->word), out);
+ x = y; /* for next time */
+ }
+ out = REVERSE_LIST (out, WORD_LIST *);
+
+ if (varname[0] == '*') /* copied from pos_params() */
+ tt = (quoted & (Q_HERE_DOCUMENT|Q_DOUBLE_QUOTES)) ?
+ string_list_dollar_star (quote_list (out)) : string_list (out);
+ else
+ tt = string_list ((quoted & (Q_HERE_DOCUMENT|Q_DOUBLE_QUOTES)) ?
+ quote_list (out) : out);
+
+ dispose_words (out);
+ dispose_words (plist);
+ } else
+ tt = pos_params (varname, e1, e2, quoted); /* original code */
+
if ((quoted & (Q_DOUBLE_QUOTES|Q_HERE_DOCUMENT)) == 0)
{
temp = tt ? quote_escapes (tt) : (char *)NULL;
@@ -4513,7 +4594,29 @@
break;
#if defined (ARRAY_VARS)
case VT_ARRAYVAR:
- tt = array_subrange (array_cell (v), e1, e2, quoted);
+ if (skip_like_sed) {
+ ARRAY *out, *a;
+ ARRAY_ELEMENT *ae;
+
+ a = array_cell (v);
+ if (a == 0 || array_empty (a) || x > array_num_elements (a))
+ return (char *)NULL;
+
+ out = array_create ();
+ x++; /* only for first time, since array starts at 0 */
+ for (ae = element_forw (a->head); ae != a->head; ae = element_forw (ae)) {
+ while (ae != a->head && --x > 0)
+ ae = element_forw (ae);
+ if (ae == a->head)
+ break;
+ array_insert (out, element_index (ae), element_value (ae));
+ x = y; /* for next time */
+ }
+ tt = array_to_string (out, " ", quoted);
+ array_dispose (out);
+ } else
+ tt = array_subrange (array_cell (v), e1, e2, quoted); /* original code */
+
if ((quoted & (Q_DOUBLE_QUOTES|Q_HERE_DOCUMENT)) == 0)
{
temp = tt ? quote_escapes (tt) : (char *)NULL;
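
Note on the 'x~y' skipping above: for the plain string case the selection is a strided copy. A minimal sketch follows, assuming a hypothetical helper stride_chars() that returns a malloc'ed result. Unlike the patch, it returns an empty string rather than NULL when nothing is selected. As in the patch, the offset 'x' is counted from 0 for strings and a step of 0 is treated as 1.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Like ${string:x~y}: copy val[x], val[x+y], val[x+2y], ... into a new
 * string.  Returns a malloc'ed string, or NULL if out of memory. */
char *stride_chars(const char *val, size_t x, size_t y)
{
    size_t n, i;
    char *out, *s;

    assert(val != NULL);
    n = strlen(val);
    if (y == 0)
        y = 1;                       /* avoid an infinite loop */
    s = out = malloc(n + 1);         /* result is never longer than val */
    if (out == NULL)
        return NULL;
    for (i = x; i < n; i += y)
        *s++ = val[i];
    *s = '\0';
    return out;
}
```

For example, stride_chars("abcdefg", 1, 2) gives 'bdf' and stride_chars("abcdefg", 0, 3) gives 'adg'.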
diff -ru bash-2.05b/test.c bash/test.c
--- bash-2.05b/test.c 2002-02-28 10:54:47.000000000 -0500
+++ bash/test.c 2004-01-06 20:55:59.000000000 -0500
@@ -507,6 +507,35 @@
}
}
+ /*****************************************************************************
+ * To enable '=~' or '!~' as binary operator,
+ * string =~ regex
+ * string !~ regex
+ * compile with PATTERN_MATCHING. This is just a frontend to builtin command
+ * match string regex
+ * in builtins/array.def.
+ *
+ * --William Park <opengeometry@yahoo.ca>
+ */
+#if defined (PATTERN_MATCHING)
+ else if ((op[0] == '=' || op[0] == '!') && op[1] == '~' && op[2] == '\0') {
+ WORD_LIST *list;
+ int rcode;
+
+ list = (WORD_LIST *)NULL;
+ list = make_word_list (make_bare_word (arg1), list);
+ list = make_word_list (make_bare_word (arg2), list);
+ list = REVERSE_LIST (list, WORD_LIST *);
+ rcode = match_builtin (list);
+ dispose_words (list);
+
+ if (op[0] == '=')
+ return (rcode == EXECUTION_SUCCESS); /* =~ */
+ else
+ return (rcode == EXECUTION_FAILURE); /* !~ */
+ }
+#endif
+
return (FALSE); /* should never get here */
}
@@ -530,7 +559,13 @@
#if defined (PATTERN_MATCHING)
if ((w[0] == '=' || w[0] == '!') && w[1] == '~' && w[2] == '\0')
{
- value = patcomp (argv[pos], argv[pos + 2], w[0] == '=' ? EQ : NE);
+ /*************************************************************************
+ * I want regex matching, not Csh version of '==' or '!='.
+ * --William Park <opengeometry@yahoo.ca>
+ *
+ * value = patcomp (argv[pos], argv[pos + 2], w[0] == '=' ? EQ : NE);
+ */
+ value = binary_test (w, argv[pos], argv[pos + 2], 0);
pos += 3;
return (value);
}
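
Note on '=~' and '!~' above: the test is a thin wrapper around one regex match, and the SUBMATCH array is just the regmatch_t offsets turned into strings. The sketch below shows that idea with POSIX regex only. It is not the patch's match builtin. regex_match(), regex_test() and MAX_GROUPS are hypothetical names, and treating an invalid regex as false for both operators is an assumption of this sketch.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

#define MAX_GROUPS 10   /* arbitrary limit chosen for this sketch */

/* Returns 1 if 'string' matches the extended regex 'pattern', 0 if it
 * does not, -1 if the pattern is invalid.  On a match, sub[0] holds the
 * whole match and sub[n] the n'th parenthesized group (NULL if that
 * group took no part); '*nsub' is the number of slots filled. */
int regex_match(const char *string, const char *pattern,
                char *sub[MAX_GROUPS], size_t *nsub)
{
    regex_t preg;
    regmatch_t pm[MAX_GROUPS];
    size_t i, n;
    int rc;

    assert(string != NULL && pattern != NULL && sub != NULL && nsub != NULL);
    *nsub = 0;
    if (regcomp(&preg, pattern, REG_EXTENDED) != 0)
        return -1;
    rc = regexec(&preg, string, MAX_GROUPS, pm, 0);
    if (rc == 0) {
        n = preg.re_nsub + 1;
        if (n > MAX_GROUPS)
            n = MAX_GROUPS;
        for (i = 0; i < n; i++) {
            sub[i] = NULL;
            if (pm[i].rm_so >= 0) {
                size_t len = (size_t)(pm[i].rm_eo - pm[i].rm_so);
                sub[i] = malloc(len + 1);
                if (sub[i] != NULL) {
                    memcpy(sub[i], string + pm[i].rm_so, len);
                    sub[i][len] = '\0';
                }
            }
        }
        *nsub = n;
    }
    regfree(&preg);
    return rc == 0;
}

/* 'op' is "=~" or "!~"; the submatches are discarded here. */
int regex_test(const char *op, const char *string, const char *pattern)
{
    char *sub[MAX_GROUPS];
    size_t nsub, i;
    int m = regex_match(string, pattern, sub, &nsub);

    for (i = 0; i < nsub; i++)
        free(sub[i]);
    if (m < 0)
        return 0;                    /* assumption: bad regex is false */
    return (op[0] == '=') ? m : !m;
}
```

Matching 'abc123' against '(.)([0-9]+)' fills the slots with 'c123', 'c' and '123', the same strings as the SUBMATCH example at the top of this post.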