Re: why mainframes are still used?
From: Mike Bartman (omni_at_foolie.omniphile.com)
Date: 09/09/04
- Next message: Mike Bartman: "Re: why mainframes are still used?"
- Previous message: Roy Omond: "Re: freeware's (5.0) mailcount from a system account? (orphaned .mai files)"
- In reply to: Michael Kraemer: "Re: why mainframes are still used?"
- Next in thread: Michael Kraemer: "Re: why mainframes are still used?"
- Reply: Michael Kraemer: "Re: why mainframes are still used?"
- Reply: JF Mezei: "Re: why mainframes are still used?"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: Thu, 09 Sep 2004 11:40:46 -0400
On Thu, 9 Sep 2004 12:01:08 +0000 (UTC), m.kraemer@gsi.de (Michael
Kraemer) wrote:
>In article <e3muj0h1bntovb5cm2ge9pi3dv1df28prg@4ax.com>, Mike Bartman <omni@foolie.omniphile.com> writes:
>>
>> I don't know what they had in a 3090. The X/MP 48 had 4 CPUs, and the
>> 890 MFLOPS number was using all of them (it was done by one of my
>> co-workers for a pre-sales demo using customer-supplied Fortran
>> code...with some mods by us). The "48" in an X/MP's model name tells
>> you the CPU count and memory size: 48 = 4 CPUs and 8 million 64-bit
>> words of memory. The one I worked on at NRL was a 12.
>
>Now this sounds more reasonable to me.
>A single CPU doing around 250 MFLOPS.
>I further assume that this was on highly vectorized code.
>Given the vector speedup of 10 you quoted below leaves
>a single CPU with 25 MFLOPS scalar performance.
>According to you a 3090 does also 25 MFLOPs scalar.
>So where exactly is the difference ?
1) We don't know how many CPUs the 3090 was using.
Since I'm not an IBM person, I have no idea if it was a multi-CPU
system or not.
2) You are comparing one system's slowest capabilities to the other's
fastest.
Making the Indy car run in first gear to make the go kart look less
pitiful isn't really much of a demonstration for the go kart.
See the difference?
>No doubt Cray's were "real" supercomputers and faster than IBM's
>but not by your 890/25 ratio. IIRC you could buy a 6-CPU 3090 with
>vector features, capable of doing several 100 MFLOPs on good vectorized
>code. *This* is the number to be compared with your 890.
Then IBM probably should have advertised that setup instead. I'm
basing the 25 MFLOPS figure on an IBM ad I read at the time (late 80s)
which claimed this 25 MFLOPS machine was a supercomputer. Since I was
working for Cray and using real supercomputers on a daily basis at the
time, I found it laughable.
I wasn't in sales, but I heard about our competition all the time in
meetings. Fujitsu, Hitachi and ETA were mentioned frequently. IBM
never came up. Those who were in the market for supercomputers came
to the above named companies, and generally had each one prove that
they were fastest for the work to be done, then bought that one. This
was usually Cray. If IBM had a real supercomputer they'd have been in
the running, and I'd have heard about them in those meetings. That
didn't happen.
>And I highly doubt that IBM praised their barebone scalar 3090 as being
>supercomputer, it was certainly meant including a VF.
The IBM ad I read claimed that the 3090 was a supercomputer and that
it could do 25 MFLOPS. Whether that is "praising" it or not I'll
leave up to you.
>> >The same logic, however, holds also for vectorized vs scalar code,
>> >so your Cray was only worth its money for highly ( maybe >> 80% )
>> >vectorizable code.
>>
>> Perhaps, but vector processing is really common in science and
>> engineering (lots of array operations), which is where supercomputers
>> are needed. With business processing it's usually more a case of I/O
>> speed than raw CPU power that matters. You are usually doing simple
>> things to lots of data, not complex things to a limited dataset.
>
>I think you can't generalize this.
>Certainly there are such codes working on large arrays with high degree
>of vectorization, but there are as well such with lower degree,
>for which an IBM solution was more cost effective (and sure cost is a point !).
>And there are scientific problems, e.g. in particle physics, which can't
>be vectorized at all, but run happily on parallel architectures.
Of course. It's not whether codes exist, but what ratio of processing
a given area requires. Business needs less vector, science and
engineering need more. Business users might occasionally be able to
make use of vector capabilities, but probably not often enough to
warrant the cost. Science and engineering folks will sometimes need
scalar only, but their need for vector is frequent enough to make it
worth the cost.
For those problems that are nearly 100% parallel, more CPUs are always
better. I believe I said up front that such problems exist.
Cray Research was really aware of what their customers needed in terms
of processing...it took an average of 2 years to sell a machine, and
the salesperson was in constant contact with the customers concerning
their actual needs the whole time. This information went into
planning future designs. Seymour stated on more than one occasion
that Cray's customers could make use of a maximum of 15 CPUs on almost
all codes run. The economic balance point, where cost and utility
met, was at 4 CPUs...hence the X/MP and Cray 2 limit of 4 CPUs.
Apparently this point changed later, after I left the company, as I
believe the Y/MP could take up to 15 CPUs. The point is that
apparently most people who needed supercomputers ran a lot of vector
code that wasn't very parallel. For those who ran almost all parallel
code, there were massively parallel processors available, and they
were usually cheaper than a Cray (half to a third the cost, but not
capable of running very fast unless the problem was almost entirely
parallel). Some places, like NRL (Naval Research Lab), got both sorts
of machines, and used each where it did best.
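
The trade-off above, where the payoff from vector hardware or extra CPUs depends on how much of a code can actually use them, is commonly modeled with Amdahl's law. That name and the sketch below are an added illustration, not something stated in the thread. The 10x vector speedup and the 4-CPU count come from the discussion; the fractions tried are purely illustrative assumptions.

```python
def amdahl_speedup(fraction: float, factor: float) -> float:
    """Overall speedup when `fraction` of the runtime is sped up by `factor`.

    The rest of the runtime (1 - fraction) is assumed to run at the
    original speed. This is Amdahl's law.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if factor <= 0.0:
        raise ValueError("factor must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / factor)


VECTOR_FACTOR = 10.0  # vector-vs-scalar speedup quoted in the thread
CPU_COUNT = 4.0       # CPUs in an X/MP 48

# Illustrative fractions only; real codes would have to be measured.
for fraction in (0.5, 0.8, 0.95, 1.0):
    vec = amdahl_speedup(fraction, VECTOR_FACTOR)
    par = amdahl_speedup(fraction, CPU_COUNT)
    print(f"{fraction:4.0%} usable: vector x{vec:5.2f}, 4 CPUs x{par:5.2f}")
```

Under these assumptions a code that is only 50% vectorizable gains less than 2x from a 10x vector unit, which is consistent with sites reserving their Crays for highly vectorized work.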
>> Cray also had multiprocessing down pat...spreading a given program
>> across several CPUs with little or no need for programmers to worry
>> about it...the compilers took care of it automatically. If there was
>> something in the code that would prevent this from being done, the
>> compilers would note it in the listing.
>>
>> Cray also had a really efficient OS (COS), with some hardware support
>> for doing things like figuring out which software interrupts to
>> service next (for system service calls...dispatch was one
>> instruction...it was based on a bit vector). If VMS is a Cadillac
>> with ride comfort and lots of gadgets, and MS-DOS is a go kart for
>> playing around with in your spare time, then COS was a top fuel
>> dragster...anything that didn't help it go fast was stripped off, and
>> comfort wasn't a factor considered in the design.
>>
>> Even when running scalar only a Cray was faster than most
>> computers...that 8.5 ns clock rate, separate ALUs that could run at
>> the same time in many cases (allowing instruction execution overlap if
>> coded correctly, and the compilers were pretty good at doing that),
>> and a bunch of other techniques since copied by the better single chip
>> CPUs, along with use of ECL, short path lengths, no busses to conflict
>> (all point-to-point wiring to connect modules), etc., and the result
>> was the fastest computers on the planet for many years. Neatest
>> looking ones too...both the "bench seat" Cray 1 and X/MPs and the
>> "aquarium" Cray 2 types.
>
>Sure, but what a waste of CPU cycles !
Excuse me? How so?
>I remember that on some supercomputer sites they required codes to have
>a high degree of vectorization in order to be allowed to use their
>expensive Crays.
That makes sense. If your system is overloaded you get rid of those
that can't really make best use of it first, and leave it to those who
can.
>And again, a 8.5 ns clock rate is not orders of magnitude different
>from a 3090 or any other supercomputer at that time.
True, but the other speed features were pretty rare...that's why Cray
sold machines, even at the prices they charged. If you needed speed, the
Cray was it, and speed costs money...though in some environments NOT
having speed costs more. That's why people paid what Cray asked.
For example, oil companies. They were a big part of Cray's market.
They did things like taking echo sounding data and calculating
geologic strata from it. They'd then simulate various drilling and
extraction techniques to see which would work best on the given
geologic structure. One potential customer of Cray at one time was
using IBM 370 machines to do this stuff. It took them over 24 hours
to do a run on a single dataset. As part of their decision making,
they set up their codes on a test machine at Cray. When their guy
called the Cray salesman to find out the results so they could see if
they matched what they'd gotten on their existing machine, the results
files had been accidentally deleted by an operator (it was a test
machine... ;-). "Drat, I'll call you again this time tomorrow
then..." said the unhappy oil guy. "No problem, hang on and I'll
re-run them for you...just take a few minutes." said the Cray
salesman. 15 minutes later they had results to compare...and the oil
guy was seriously amazed at the speed difference of the machine they
were contemplating buying.
At another company they'd run extraction technique simulations on part
of the field on the north coast of Alaska (estimated at 20 billion barrels) and
reached a conclusion. Then they bought a Cray, and during the 30 day
acceptance testing they decided to re-run their simulation, but at a
higher resolution than they'd been able to run it on their old
machine. They found that a different extraction technique would get
them 2% more oil out of the ground than the one they'd have gone with
based on their older runs. 2% of 20 billion barrels at
$20-$30/barrel? That's 8 to 12 billion dollars. That Cray paid for
itself many times over before it was even accepted. The Cray sales
guy did try to renegotiate the deal so we got 2% of their 2%, but was
turned down. ;-)
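
The arithmetic behind that figure can be checked with a short sketch. All inputs are the numbers quoted above (20 billion barrels, 2% extra recovery, $20-$30 per barrel); the variable names are just illustrative.

```python
# Inputs taken from the anecdote above; integers avoid float rounding.
FIELD_BARRELS = 20_000_000_000      # estimated size of the field
EXTRA_RECOVERY_PERCENT = 2          # gain from the better technique
PRICE_LOW_USD, PRICE_HIGH_USD = 20, 30

extra_barrels = FIELD_BARRELS * EXTRA_RECOVERY_PERCENT // 100
value_low = extra_barrels * PRICE_LOW_USD
value_high = extra_barrels * PRICE_HIGH_USD

print(f"extra oil: {extra_barrels:,} barrels")
print(f"value: ${value_low:,} to ${value_high:,}")
```

This gives 400 million extra barrels, worth 8 to 12 billion dollars at the quoted prices.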
>> >The fraction of such codes is probably even less
>> >than that of parallel codes. Which might be a reason why we don't
>> >see that many vector machines anymore these days (apart from Earth Simulator maybe).
>>
>> If you consider all software, maybe. In the science and engineering
>> area I doubt that that's true. Arrays and operations on them are very
>> common. There's a reason the Livermore Loops and Whetstone benchmarks
>> were heavy on them.
>
>Again, don't generalize. Today's more realistic performance measures (eg SpecFP)
>aren't vector centric anymore, AFAIK.
Who's generalizing here? Seems you are. The Livermore Loops were
entirely realistic. They were taken from code actually run at
Livermore Labs, as a representation of what was actually in use on a
daily basis. The Whetstone benchmark was from a similar exercise
using engineering codes from many places. That's not general, that's
specific, and based on real world use. Where did SpecFP come from?
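
For readers who haven't seen them, here is a minimal sketch of the kind of array loop in question, modeled on the first Livermore kernel (the "hydro fragment"). The original is Fortran; this Python version, its function name, and the sample inputs are illustrative assumptions, not the benchmark code itself.

```python
def hydro_fragment(y, z, q, r, t):
    """Compute x[k] = q + y[k] * (r * z[k + 10] + t * z[k + 11]) for each k.

    No iteration depends on the result of another, which is the property
    that lets a vector machine process many elements per instruction.
    """
    n = len(y)
    if len(z) < n + 11:
        raise ValueError("z needs at least len(y) + 11 elements")
    return [q + y[k] * (r * z[k + 10] + t * z[k + 11]) for k in range(n)]


# Small made-up inputs, just to show the call.
y = [1.0, 2.0, 3.0]
z = [float(i) for i in range(14)]
x = hydro_fragment(y, z, q=0.5, r=2.0, t=1.0)
print(x)
```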
-- Mike B.
----------------------------------------------------------------
To reply via e-mail, remove the 'foolie.' from the address.
I'm getting sick of all the SPAM...
----------------------------------------------------------------