Re: How can I obtain 2 and 4-byte data types?

From: Eric Sosman (Eric.Sosman_at_sun.com)
Date: 08/25/03

    Date: Mon, 25 Aug 2003 12:26:36 -0400
    
    

    "Dr. David Kirkby" wrote:
    >
    > There is some numerical code which I wrote in C
    > http://atlc.sourceforge.net/
    >
    > The code is very portable. It did (until yesterday) work on every UNIX
    > box I could lay my hands on (Suns, HP's, Dec Alpha, IBM running every
    > operating system I could find - Solaris, HP-UX, Tru64, AIX, various
    > linux distros etc). It has even been run on a Sony Playstation 2 under
    > NetBSD !!!
    >
    > Yesterday I found a portability issue I can't seem to get around and
    > would like some advice. The machine it won't run on is a Cray Y-MP
    > supercomputer, which has limited data types.
    >
    > sizeof(char)=1
    > sizeof(short)=8
    > sizeof(int)=8
    > sizeof(long)=8
    >
    > The code needs to write Bitmap files. The header of the bitmap needs
    > numbers of 2 and 4 bytes. They need to be written in little-endian
    > format (as on Intel x86). Until yesterday, I was using chars,
    > shorts, and ints and having routines to convert between big and little
    > endian. However, all my code falls to pieces if there is no data type
    > with 2 and 4 bytes.
    >
    > Any suggestions how to create a data type of 2 and 4 bytes AND
    > manipulate that data type in a way one can add and subtract with it,
    > and still write it in big and little endian??

        typedef struct { unsigned int data : 16; } twobyte;
        typedef struct { unsigned int data : 32; } fourbyte;

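        This limits the values to 16 and 32 bits, although the
    structs themselves may still occupy more than two and four
    bytes. It doesn't solve the endianness issues, but you
    apparently already know how to deal with those.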
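
    As a rough illustration of what the bit-field typedefs buy you,
    here is a minimal sketch. The helper twobyte_add is hypothetical
    (it is not part of the original code), and the 32-bit field
    assumes unsigned int has at least 32 bits, as it does on the Cray.

```c
/* The two typedefs from the post. The bit-field width limits the
   range of values; it does not fix the storage size of the struct. */
typedef struct { unsigned int data : 16; } twobyte;
typedef struct { unsigned int data : 32; } fourbyte; /* needs int >= 32 bits */

/* Hypothetical helper: add two 16-bit quantities. Storing the sum
   into the unsigned bit-field reduces it modulo 65536. */
twobyte twobyte_add(twobyte a, twobyte b)
{
    twobyte r;
    r.data = a.data + b.data;
    return r;
}
```

    Arithmetic wraps the way a real 16-bit unsigned type would, but
    sizeof(twobyte) is usually the size of an unsigned int, so the
    struct still cannot be written to the file as two raw bytes.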

    > I know I can do
    >
    > char foo[2], foobar[4];
    >
    > but that is not very convenient.
    >
    > This is the sort of data structure I need to read and write, as it's at
    > the beginning of a bitmap file. There are 54-bytes (0x36) of header on
    > a bitmap file, before the data for the actual pixels follows.
    >
    > int16 indicates a 2-byte integer, int32 a 4 byte integer, neither of
    > which the Cray has.
    >
    > struct Bitmap_File_Head_Struct
    > {
    > char zzMagic[2]; /* 00 First two bytes contain BM to indicate
    > bitmap file*/
    > int32 bfSize; /* offset 02 */
    > int16 zzHotX; /* offset 06 */
    > int16 zzHotY; /* offset 08 */
    > int32 bfOffs; /* offset 0A */
    > int32 biSize; /* 0E */
    > } Bitmap_File_Head;
    >
    > struct Bitmap_Head_Struct
    > {
    > int32 biWidth; /* 12 */
    > int32 biHeight; /* 16 */
    > int16 biPlanes; /* 1A */
    > int16 biBitCnt; /* 1C */
    > int32 biCompr; /* 1E */
    > int32 biSizeIm; /* 22 */
    > int32 biXPels; /* 26 */
    > int32 biYPels; /* 2A */
    > int32 biClrUsed; /* 2E */
    > int32 biClrImp; /* offset 32 */
    > /* header ends at 0x36 */
    > } Bitmap_Head;

        You've fallen victim to a trap of the C language. The
    struct data type *looks* like a way to arrange data so as to
    match an externally-defined data layout, but it really isn't
    suited to the purpose. One problem is that the compiler can
    insert padding bytes after any element of a struct (including
    the last), breaking the mapping between the external and
    internal data formats. (It's a little surprising that you
    managed to get this to work on such a wide variety of systems.
    For example, many systems will insert two padding bytes after
    the zzMagic element, messing up all the following offsets.)
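
    One way to see the padding for yourself is to ask the compiler
    where it put things. This is only a sketch: it substitutes the
    C99 types int32_t and int16_t for the poster's int32 and int16
    (those exact-width types are optional, and a machine like the
    Cray would not provide them), and the two function names are
    made up for the illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the poster's struct, with int32_t/int16_t assumed in
   place of the original int32/int16 typedefs. */
struct Bitmap_File_Head_Struct {
    char    zzMagic[2];
    int32_t bfSize;
    int16_t zzHotX;
    int16_t zzHotY;
    int32_t bfOffs;
    int32_t biSize;
};

/* Padding bytes inserted between zzMagic and bfSize; the file format
   expects bfSize at offset 2. */
size_t padding_before_bfSize(void)
{
    return offsetof(struct Bitmap_File_Head_Struct, bfSize) - 2;
}

/* Padding anywhere in the struct; the on-disk layout is 18 bytes. */
size_t total_padding(void)
{
    return sizeof(struct Bitmap_File_Head_Struct) - 18;
}
```

    If either function returns anything but zero, writing the struct
    with a single write() call will not produce the layout the file
    format expects.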

        The way to do this sort of thing portably is to read
    the header as an array of bytes and then compute the struct
    elements' values from those bytes "by hand." On output, you
    generate byte values from struct values and write the byte
    array. For example,

            unsigned char buff[18];
            struct Bitmap_File_Head_Struct head;

            /* input */
            read (fd, buff, sizeof buff);
            head.zzMagic[0] = buff[0];
            head.zzMagic[1] = buff[1];
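            /* widen each byte before shifting, so nothing overflows a
               signed int when the top bit of buff[5] is set */
            head.bfSize = (unsigned long) buff[2]
                | ((unsigned long) buff[3] << 8)
                | ((unsigned long) buff[4] << 16)
                | ((unsigned long) buff[5] << 24);
            head.zzHotX = buff[6] | ((unsigned) buff[7] << 8);
            head.zzHotY = buff[8] | ((unsigned) buff[9] << 8);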
            ...

            /* output */
            buff[0] = head.zzMagic[0];
            buff[1] = head.zzMagic[1];
            buff[2] = head.bfSize;
            buff[3] = head.bfSize >> 8;
            buff[4] = head.bfSize >> 16;
            buff[5] = head.bfSize >> 24;
            buff[6] = head.zzHotX;
            buff[7] = head.zzHotX >> 8;
            buff[8] = head.zzHotY;
            buff[9] = head.zzHotY >> 8;
            ...
            write (fd, buff, sizeof buff);

        Now, this looks like a Royal Pain, and it is undeniably
    tedious. You need suffer the tedium only once, though, because
    you can just create a pair of routines to read and write each
    chunk of formatted data and then call them as needed. As an
    interesting side-effect, note that all the endianness issues
    have suddenly vanished, poof! into thin air: the code above
    works on big-endian, little-endian, middle-endian, and tight-
    endian machines alike. Your need for exact-size integer types
    has also vanished: you no longer care exactly how many bits are
    in the bfSize element of the struct (as long as it's at least 32),
    because any extra bits that might be there just don't matter. [*]
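
    The "pair of routines" might look something like the sketch
    below. All the names here (get_le16, put_le32, struct file_head,
    unpack_file_head and so on) are hypothetical, and the header
    members are declared as plain unsigned int and unsigned long on
    the assumption that all values are non-negative.

```c
/* Hypothetical helper names; none of these appear in the original code. */

/* Assemble little-endian values one byte at a time. */
unsigned int get_le16(const unsigned char *p)
{
    return (unsigned int) p[0] | ((unsigned int) p[1] << 8);
}

unsigned long get_le32(const unsigned char *p)
{
    return (unsigned long) p[0]
        | ((unsigned long) p[1] << 8)
        | ((unsigned long) p[2] << 16)
        | ((unsigned long) p[3] << 24);
}

/* Split values into little-endian bytes; higher bits are dropped. */
void put_le16(unsigned char *p, unsigned int v)
{
    p[0] = (unsigned char) (v & 0xFF);
    p[1] = (unsigned char) ((v >> 8) & 0xFF);
}

void put_le32(unsigned char *p, unsigned long v)
{
    p[0] = (unsigned char) (v & 0xFF);
    p[1] = (unsigned char) ((v >> 8) & 0xFF);
    p[2] = (unsigned char) ((v >> 16) & 0xFF);
    p[3] = (unsigned char) ((v >> 24) & 0xFF);
}

/* In-memory header: each member only has to be wide enough. */
struct file_head {
    char          zzMagic[2];
    unsigned long bfSize;
    unsigned int  zzHotX;
    unsigned int  zzHotY;
    unsigned long bfOffs;
    unsigned long biSize;
};

#define FILE_HEAD_BYTES 18   /* size of this header in the file */

/* Convert FILE_HEAD_BYTES bytes read from the file into a struct. */
void unpack_file_head(struct file_head *h, const unsigned char *buff)
{
    h->zzMagic[0] = (char) buff[0];
    h->zzMagic[1] = (char) buff[1];
    h->bfSize = get_le32(buff + 2);
    h->zzHotX = get_le16(buff + 6);
    h->zzHotY = get_le16(buff + 8);
    h->bfOffs = get_le32(buff + 10);
    h->biSize = get_le32(buff + 14);
}

/* Convert a struct into FILE_HEAD_BYTES bytes ready to be written. */
void pack_file_head(const struct file_head *h, unsigned char *buff)
{
    buff[0] = (unsigned char) h->zzMagic[0];
    buff[1] = (unsigned char) h->zzMagic[1];
    put_le32(buff + 2, h->bfSize);
    put_le16(buff + 6, h->zzHotX);
    put_le16(buff + 8, h->zzHotY);
    put_le32(buff + 10, h->bfOffs);
    put_le32(buff + 14, h->biSize);
}
```

    The caller reads FILE_HEAD_BYTES bytes into an unsigned char
    array and hands it to unpack_file_head, or fills the array with
    pack_file_head and writes it out. Nothing in these routines
    depends on sizeof(short), sizeof(int), or the host's byte order.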

        [*] Well, all right, it's not *quite* as easy as all that.
            If the file format involves numbers that can be negative
            you need to do more work than I've shown above to handle
            the signs properly. But the extra work isn't enormous,
            and you can simply ignore it if all values are known
            to be non-negative.
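
    For the signed case, one possible approach is to assemble the
    value as unsigned first and convert afterwards. The sketch below
    assumes the file stores 32-bit two's complement numbers and that
    long can hold every such value; the function names are
    hypothetical.

```c
/* Hypothetical helper: interpret the low 32 bits of u as a two's
   complement number, without relying on the host's representation. */
long s32_from_u32(unsigned long u)
{
    u &= 0xFFFFFFFFUL;
    if (u & 0x80000000UL)
        return -(long) (0xFFFFFFFFUL - u) - 1;   /* negative value */
    return (long) u;
}

/* Hypothetical helper: the reverse direction. Converting a negative
   long to unsigned long is defined to wrap, so masking the result
   leaves the 32-bit two's complement pattern. */
unsigned long u32_from_s32(long v)
{
    return (unsigned long) v & 0xFFFFFFFFUL;
}
```

    A value built from four bytes as in the input example above
    would go through s32_from_u32 before being stored in a signed
    member, and through u32_from_s32 before being split into bytes
    for output.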

    -- 
    Eric.Sosman@sun.com
    
