Re: Bad performance on alpha? (make buildworld)

From: Chuck Swiger (cswiger_at_mac.com)
Date: 02/25/04

    Date: Wed, 25 Feb 2004 15:35:56 -0500
    To: freebsd-performance@freebsd.org, freebsd-alpha@freebsd.org
    
    

    David O'Brien wrote:
    > On Wed, Feb 25, 2004 at 12:19:15AM -0500, Chuck Swiger wrote:
    >>>Maybe in theory, but not necessarily in practice.
    >>
    >>It's been a few years since I'd written a compiler, but my viewpoint isn't
    >>based entirely on theory.
    [ ... ]
    >> Your technical description is accurate, but the points you are making here
    >> seem to support my argument, rather than contradict what I said. :-)
    >
    > You're assuming you're writing a compiler targeting _1_ specific
    > architecture.

    No, sir, I certainly do not make such an assumption.

    Most optimization techniques are architecture-independent: liveness analysis,
    CSE (common subexpression elimination), dead code elimination, moving
    invariants out of loops, branch threading, algebraic identities, and
    strength reduction. These optimizations are most commonly done on the
    three-address intermediate code that portable compilers (PCC, GCC) typically
    use before target platform code generation is actually performed.
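
    As a rough illustration of two of these transformations, here is a minimal
    C sketch of loop-invariant code motion and strength reduction done by hand.
    The function names and the expression being summed are made up for the
    example; this is not taken from GCC, and a real compiler would do this on
    its intermediate representation rather than on C source.

```c
#include <stddef.h>

/* Naive form: a * b is recomputed on every iteration even though it
 * never changes, and i * 8 costs a multiply per iteration. */
long sum_naive(const long *v, size_t n, long a, long b)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += v[i] * (a * b) + (long)i * 8;
    }
    return total;
}

/* Hand-optimized form of the same computation.
 * - a * b is hoisted out of the loop (invariant code motion).
 * - i * 8 is replaced by a running sum (strength reduction). */
long sum_optimized(const long *v, size_t n, long a, long b)
{
    const long scale = a * b;
    long step = 0;
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += v[i] * scale + step;
        step += 8;
    }
    return total;
}
```

    Nothing in either transformation depends on the target CPU, which is the
    sense in which they are architecture-independent.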

    There are a few additional optimizations which are architecture-specific, such
    as instruction scheduling and peephole/template optimizations, but these
    generally make much less difference to performance than the
    architecture-independent optimizations mentioned above. On some platforms,
    though, they can make enough difference that a second pass at CSE or
    instruction rescheduling against the target assembly code is worth doing.
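
    For readers who have not seen one, a peephole pass can be sketched as a
    sliding window over already-generated instructions. The toy instruction
    set below (struct insn, the OP_* opcodes) is entirely hypothetical and is
    not any real target or GCC's internal form; it only shows the shape of the
    technique, and it assumes the memory slots are ordinary (non-volatile)
    storage.

```c
#include <stddef.h>

/* Hypothetical toy instruction set, for illustration only. */
enum op { OP_LOAD, OP_STORE, OP_ADD, OP_NOP };

struct insn {
    enum op op;
    int reg;   /* register number */
    int addr;  /* memory slot; ignored by OP_ADD and OP_NOP */
};

/* Peephole rule: a load that immediately follows a store of the same
 * register to the same slot is redundant, because the register already
 * holds that value. Such loads are rewritten to OP_NOP.
 * Returns the number of instructions rewritten. */
size_t peephole_redundant_load(struct insn *code, size_t n)
{
    size_t rewrites = 0;
    for (size_t i = 0; i + 1 < n; i++) {
        if (code[i].op == OP_STORE && code[i + 1].op == OP_LOAD &&
            code[i].reg == code[i + 1].reg &&
            code[i].addr == code[i + 1].addr) {
            code[i + 1].op = OP_NOP;
            rewrites++;
        }
    }
    return rewrites;
}
```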

    > It doesn't matter what is possible, what matters is what
    > GCC does. Please go analysis GCC and report the deficiencies. I
    > personally would love to know what they are, and how to make GCC do
    > better on non-x86 platforms.

    I agree that what GCC does matters, not theories.

    I don't have access to Alpha hardware, which is a barrier although not an
    insuperable one. I'd do better considering SPARC or PPC hardware, which I
    actually have available to me. Still, I won't use this as an excuse:

    A quick look suggests that Alpha code generation is deficient in dealing with
    unsigned integers because the architecture uses a "sign extended" format to
    store and convert 32-bit unsigned ints (aka "longwords") into the 64-bit
    (aka "quadword") registers. Dealing with unsigned ints smaller than 32 bits
    very probably is also slow because the Alpha requires memory accesses to be
    aligned to the operand size.
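
    To make the sign-extension point concrete, here is a small C model of what
    a 64-bit register would hold for a 32-bit value kept in sign-extended form,
    and the extra masking needed before that value can be used as a true
    unsigned 64-bit quantity. The function names are invented for this sketch,
    and it models the register contents only; it says nothing about what code
    GCC actually emits.

```c
#include <stdint.h>

/* Register image of a 32-bit value in sign-extended form: bits 63..32
 * are copies of bit 31, whether or not the C type is unsigned.
 * Written with unsigned arithmetic so the result is fully defined. */
uint64_t canonical_longword(uint32_t value)
{
    uint64_t x = value;
    return (x ^ UINT64_C(0x80000000)) - UINT64_C(0x80000000);
}

/* Using the value as an unsigned 64-bit quantity (an array index, say)
 * needs the upper half cleared first, which is an extra operation. */
uint64_t zero_extend_longword(uint64_t reg)
{
    return reg & UINT64_C(0xFFFFFFFF);
}
```

    Any unsigned int with bit 31 set therefore needs that second step before
    it can take part in 64-bit arithmetic.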

    [ "The Alpha does not directly support byte-level operations such as
    transferring single bytes between memory and registers. In principle, we could
    use the instructions already presented to realize byte-level manipulations, but
    a large amount of shifting and masking would be required. For example,
    consider the C operation *dest = *src, where both dest and src are of type
    (char *). This operation must read the single byte pointed to by src and
    update the single byte pointed to by dest. Without special byte manipulation
    instructions, this simple operation requires 17 Alpha instructions!" ]

    Supposedly, the ldq_u and stq_u instructions are the right way to handle
    byte-level memory access, and it would be worth looking at how well GCC
    uses these opcodes when dealing with chars and shorts.
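
    The general shape of a byte access built from quadword operations can be
    modelled in C as below. This is a sketch under stated assumptions: memory
    is modelled as an array of 64-bit words with little-endian byte numbering,
    the helper names are made up, and the comments map each step to the Alpha
    instruction I believe plays that role (ldq_u, extbl, mskbl, insbl, stq_u).
    It is not the sequence GCC generates.

```c
#include <stddef.h>
#include <stdint.h>

/* Read one byte: fetch the enclosing quadword, then extract the byte. */
uint8_t load_byte(const uint64_t *mem, size_t addr)
{
    uint64_t quad = mem[addr / 8];              /* ldq_u: low 3 address bits ignored */
    unsigned shift = (unsigned)(addr % 8) * 8;  /* bit position of the byte */
    return (uint8_t)(quad >> shift);            /* extbl: shift down, keep 8 bits */
}

/* Write one byte: read-modify-write of the enclosing quadword. */
void store_byte(uint64_t *mem, size_t addr, uint8_t value)
{
    unsigned shift = (unsigned)(addr % 8) * 8;
    uint64_t quad = mem[addr / 8];              /* ldq_u */
    quad &= ~(UINT64_C(0xFF) << shift);         /* mskbl: clear the target byte */
    quad |= (uint64_t)value << shift;           /* insbl + or: place the new byte */
    mem[addr / 8] = quad;                       /* stq_u */
}

/* The *dest = *src case for char pointers, expressed with the helpers. */
void copy_byte(uint64_t *mem, size_t dest, size_t src)
{
    store_byte(mem, dest, load_byte(mem, src));
}
```

    Even in this compact form a single byte store is a load, a mask, an insert
    and a store, which is why char-heavy code is a plausible place to look.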

    Some of these issues cannot be addressed by changes to the compiler: I suspect
    that FreeBSD's derivation from, and focus on, the x86 architecture means it
    uses a lot of int8 or int16 values which are fast on Intel hardware, whereas
    using int32 or int64 representations would actually prove much faster on the
    Alpha than using smaller-sized quantities.

    -- 
    -Chuck
    
