Re: retain_quoted v0.003

From: Chris F.A. Johnson (cfajohnson_at_gmail.com)
Date: 10/11/05


Date: Tue, 11 Oct 2005 17:09:15 -0400

On 2005-10-11, John Kelly wrote:
> retain_quoted v0.003
>
> What is it?
>
> A sed script which escapes blanks and backslashes inside quoted
> strings.
[snip]
> There is yet a problem, though.
>
> As mentioned, read does not understand quoted strings. So inside the
> variable holding the value found in column 4, we have a pair of quotes
> surrounding the string. To the user, this is unacceptable, because he
> wanted the quotes only for preserving the literal value of the string,
> not for accompanying it into the variable. The empty string in column
> 2 confirms this notion.
>
> To solve the complete problem, we need an unquote function, which may
> be forthcoming any day now. :-)

   I thought that had already been dealt with.

> In the meantime, here is "retain_quoted," the sed script. It begins
> with version 0.003, superseding the formerly named "sed escapes" which
> ended with version 0.002.
>
> Version 0.003 has significant improvements. It handles quoted strings
> like the shell, so double quotes can be escaped inside a double quoted
> string. And it's remarkably more efficient, handling whole substrings
> instead of one character at a time.
>
> It's commented to be understood, even though it is "sed." And I'm not
> joking. :-D

    This script actually confirms my dislike of sed for anything much
    beyond simple search and replace. I'd prefer to write this in C.

    The following program is not thoroughly tested and may need
    tweaking, but it only took a few minutes to write:

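#include <stdio.h>

int main(void)
{
  int c;
  int inq = 0;      /* 1 while inside a double-quoted string */
  int escaped = 0;  /* 1 if the previous character was an unescaped backslash */

  while ( (c = getchar()) != EOF ) {
    switch (c) {
      case '"':
        if ( escaped == 1 ) escaped = 0;  /* \" neither opens nor closes a string */
        else inq = !inq;
        break;

      case ' ':
        if ( inq == 1 ) putchar( '\\' );  /* escape blanks inside quotes */
        escaped = 0;
        break;

      case '\\':
        ++escaped;
        break;

      default:
        escaped = 0;  /* a backslash only escapes the character right after it */
        break;
    }
    putchar( c );
    if ( escaped >= 2 ) escaped = 0;      /* \\ is a literal backslash */
  }
  return 0;
}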

> #!/bin/sed -rf
>
> # I, John Kelly, the author of this original work, hereby release it to
> # the public domain. Do with it what you wish, except complain; it has
> # no warranty of any kind. Such effected Tuesday, October 11, 2005.
>
> # append priming newline
> s/$/\n/
>
> : scan_string
> /^\n/b exit
>
> # find opening quote
> /^(([^"'\n]*([^\"'\n]|\\"|\\'))*)("|')(.*)/{
>
> # rotate it to the front
> s//\4\5\1/
>
> # find closing single quote
> /^(')[^'\n]*\1/{
> s//&\n/
> b retain_quoted
> }
>
> # find closing double quote
> /^(")(([^"\n]*([^\"\n]|\\"))*)\1/{
> s//&\n/
> b retain_quoted
> }
>
> # did not find closing quote
> b unquoted
>
> : retain_quoted
> h # save string
> s/(.).*/\1/ # isolate quote character
> x # swap it for string
> G # append it to string
> s/(.*).(.)/\1\2/ # remove extraneous newline
> h # save string
> s/^(.)([^\n]*)\1\n.*/\2/ # isolate quotation
> s/[\[:blank:]]/\\&/g # insert backslash escapes
> x # swap quotation for string
> s/^([^\n]*)\n// # strip old quotation from string
> x # swap string for quotation
> G # append string to quotation
> s/^([^\n]*)\n(.*)(.)/\3\1\3\n\2/ # restore quotation marks
>
> s/^([^\n]*)\n(.*)/\2\1/ # rotate quoted to the rear
> b scan_string
> }
>
> : unquoted
> s/^([^\n]*)(.*)/\2\1/ # rotate unquoted to the rear
>
> : exit
> # remove priming newline
> s/^.//

-- 
    Chris F.A. Johnson                     <http://cfaj.freeshell.org>
    ==================================================================
    Shell Scripting Recipes: A Problem-Solution Approach, 2005, Apress
    <http://www.torfree.net/~chris/books/cfaj/ssr.html>

