Re: Implementation errors in strtol()

From: Joerg Wunsch (freebsd-current_at_uriah.heep.sax.de)
Date: 01/20/05

    Date: Thu, 20 Jan 2005 23:21:37 +0100
    To: current@FreeBSD.ORG
    
    

    As Andrey Chernov wrote:

    > Errno may be set, in case of error, to an undocumented value. That's
    > how I read it, but I may be missing something.

    I read that a bit differently.

    > > Still, my major point was that "0x" sequences are falsely rejected as

    > It clearly should be rejected with EINVAL in case base == 16,
    > because 0 alone is not a valid hex sequence

    Nope. "0" alone is a completely valid hexadecimal number,
    representing the value 0. Conversion has to start at the 0 (as it is
    not invalid), and to stop at the x. The string "0x" simply means
    there is *no* optional 0x prefix, but just a number 0 without a prefix
    (followed by a letter that cannot be converted, so it has to be
    passed back as part of the final string).
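
    To make that reading concrete, here is a minimal sketch of the
    prefix rule for base 16. The helper name hex_consumed is
    hypothetical (it is not a libc function); it only computes how many
    characters a base-16 conversion would consume under this reading,
    and it is not taken from any existing strtol() implementation.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only: returns the number of
 * characters of s that a base-16 conversion would consume if "0x" is
 * treated as a prefix only when a hex digit follows it.
 */
size_t
hex_consumed(const char *s)
{
        const char *p = s;
        const char *digits;

        /* Skip leading white space, then an optional sign. */
        while (isspace((unsigned char)*p))
                p++;
        if (*p == '+' || *p == '-')
                p++;

        /* "0x" only counts as a prefix if a hex digit follows it. */
        if (p[0] == '0' && (p[1] == 'x' || p[1] == 'X') &&
            isxdigit((unsigned char)p[2]))
                p += 2;

        /* Consume the run of hex digits. */
        digits = p;
        while (isxdigit((unsigned char)*p))
                p++;

        /* No digits at all: nothing is converted, nothing is consumed. */
        return (p == digits) ? 0 : (size_t)(p - s);
}
```

    With this rule, "0x" consumes one character (the 0), "-0x" consumes
    two, and "0x1f" consumes all four.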

    > > conversion errors, and that strings consisting solely of a plus or
    > > minus sign should not throw an error either, as I read the C standard.

    > +- may produce EINVAL, as POSIX says.

    Where?

    Again, I'd value the C standard higher than Posix. C says they form a
    valid subject sequence. Thus, no error may be flagged.

    I don't have Posix at hand, but SUSPv2 completely follows the C
    standard, with the only addition that EINVAL might be flagged in the
    case of a conversion error. A subject sequence consisting of a sign
    only (+ or -) is not an empty subject sequence, so no conversion
    error is permissible. This implies EINVAL must not be set.

    > In general please don't forget that strtol(), atol() etc. are
    > supposed to parse user input and _detect_ syntax errors, it is their
    > purpose. If they do not do it, or do it only halfway, each program is
    > forced to use its own parser instead.

    This is no excuse for violating standards. If a user feels that a
    single sign must not be interpreted as a valid 0, they indeed have to
    apply their own checks. It is of no use to them if FreeBSD's
    strtoul (erroneously) flags it as an error, since no
    standard-compliant implementation would fulfill that expectation
    anyway.
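
    Such a check does not have to depend on errno at all. A minimal
    sketch, assuming a hypothetical wrapper named parse_long_strict
    (not part of any libc) and return codes chosen purely for
    illustration: it looks at the end pointer, which is left at the
    start of the string whenever no digits were converted.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical wrapper, for illustration only. Returns 0 and stores the
 * value in *out on success, -1 if s holds no digits or has trailing
 * characters, and -2 if the value is out of range for a long.
 */
int
parse_long_strict(const char *s, int base, long *out)
{
        char *end;
        long v;

        errno = 0;
        v = strtol(s, &end, base);

        /* No digits consumed: covers "", "+" and "-". */
        if (end == s)
                return -1;
        /* Reject leftovers such as the "x" in "0x". */
        if (*end != '\0')
                return -1;
        /* Overflow and underflow are reported through ERANGE. */
        if (errno == ERANGE)
                return -2;

        *out = v;
        return 0;
}
```

    A caller that wants a lone sign rejected gets that result whether
    or not the underlying strtol() sets EINVAL.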

    As a demonstration, consider this test program:

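    #include <stdlib.h>
    #include <stdio.h>
    #include <errno.h>

    /* Inputs with no digits at all, or a "0x" with no hex digit after it. */
    const char *s[] = {
            "", "+", "-", "0x", "-0x"
    };

    int
    main(void)
    {
            size_t i;
            char *p;
            long l;

            for (i = 0; i < sizeof s / sizeof s[0]; i++) {
                    /* Clear errno so a stale value is not reported. */
                    errno = 0;
                    l = strtol(s[i], &p, 16);
                    /* p - s[i] is a ptrdiff_t; cast it to match %d. */
                    printf("\"%s\" -> %ld, len %d, errno %d\n",
                        s[i], l, (int)(p - s[i]), errno);
            }
            return 0;
    }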

    Below are the results for Solaris 8, FreeBSD 5, Linux 2.x, and HP-UX
    10.20, in that order.

    helios% ./foo
    "" -> 0, len 0, errno 0
    "+" -> 0, len 0, errno 0
    "-" -> 0, len 0, errno 0
    "0x" -> 0, len 1, errno 0
    "-0x" -> 0, len 2, errno 0
    j@uriah 1259% ./foo
    "" -> 0, len 0, errno 22
    "+" -> 0, len 0, errno 22
    "-" -> 0, len 0, errno 22
    "0x" -> 0, len 0, errno 22
    "-0x" -> 0, len 0, errno 22
    j@lux 344% ./foo
    "" -> 0, len 0, errno 0
    "+" -> 0, len 0, errno 0
    "-" -> 0, len 0, errno 0
    "0x" -> 0, len 1, errno 0
    "-0x" -> 0, len 2, errno 0
    j@king 105% ./foo
    "" -> 0, len 0, errno 0
    "+" -> 0, len 0, errno 0
    "-" -> 0, len 0, errno 0
    "0x" -> 0, len 0, errno 0
    "-0x" -> 0, len 0, errno 0

    It's quite obvious that every other system differs from FreeBSD here.
    (OK, HP-UX doesn't throw EINVAL at all, even for clearly inconvertible
    strings. But then, it's a pretty old system, more than ten years old.)

    As the Posix/SUSPv2 standard says ``may set to EINVAL'', I'd even go
    with the majority and not set EINVAL for an empty string even though
    it technically constitutes a conversion error according to the C
    standard. It's quite pointless to handle a single plus or minus sign
    differently than an empty string, and as both the C and Posix/SUSP
    standards mandate that +/- must not cause a conversion error, just
    don't flag the error for the empty string either.

    -- 
    cheers, J"org               .-.-.   --... ...--   -.. .  DL8DTL
    http://www.sax.de/~joerg/                        NIC: JW11-RIPE
    Never trust an operating system you don't have sources for. ;-)
    
