Re: Efficiency Question: Large Arrays vs. Indexed Files on Alphas
From: Hoff Hoffman (hoff_at_hp.nospam)
Date: 11/20/03
- Next message: Jack Peacock: "Re: Efficiency Question: Large Arrays vs. Indexed Files on Alphas"
- Previous message: Cheryl Hoefelmeyer: "Efficiency Question: Large Arrays vs. Indexed Files on Alphas"
- In reply to: Cheryl Hoefelmeyer: "Efficiency Question: Large Arrays vs. Indexed Files on Alphas"
- Next in thread: Jack Peacock: "Re: Efficiency Question: Large Arrays vs. Indexed Files on Alphas"
Date: Thu, 20 Nov 2003 22:59:40 GMT
In article <72a72e76.0311201438.3bdafc33@posting.google.com>, hoefelmeyer@hotmail.com (Cheryl Hoefelmeyer) writes:
:We have a GS80 and an ES40, not clustered, each running OpenVMS 7.3...
I would suggest running the OpenVMS V7.3 XFC V2.0 ECO or later.
:The program will operate on three very large files, and at one point I
:can either
Please elaborate on your particular view of "very large files".
10K records is rather small, for instance.
:(1) decide to read some fields of some records into an array and
:search for information sequentially through the array, or
Sequential searches are slow. (Obviously.)
Depending on what you are up to, you might be able to lay out one
or more RMS indexed files, most likely with multiple keys present
in each file.
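
As a rough illustration of the difference, here is a sketch in plain
Python. It is not RMS and makes no RMS calls; it only contrasts a
sequential scan with a lookup through a primary key and an alternate
key. The record layout and the field names part_no, customer and qty
are made up for the example.

```python
# Hypothetical records; the field names are assumptions for illustration only.
records = [
    {"part_no": i, "customer": f"CUST{i % 100:03d}", "qty": i % 7}
    for i in range(10_000)
]

def sequential_find(recs, part_no):
    """Scan the array front to back; cost grows with the number of records."""
    for rec in recs:
        if rec["part_no"] == part_no:
            return rec
    return None

# Primary key: unique, so each key value maps to exactly one record.
by_part_no = {rec["part_no"]: rec for rec in records}

# Alternate key: duplicates allowed, so each key value maps to a list.
by_customer = {}
for rec in records:
    by_customer.setdefault(rec["customer"], []).append(rec)

def keyed_find(part_no):
    """Go straight to the record through the primary-key index."""
    return by_part_no.get(part_no)
```

Both functions return the same record; the scan may visit all 10,000
entries to get there, while the keyed lookup does not depend on where
the record sits in the array.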
:(2) read the information into another file and access it each time via
:single read.
:The number of elements or records in this circumstance is not expected
:to exceed 10,000.
How big is a record here? (Depending on what you are up to, systems
using files and RMS access will keep most everything in cache -- if
you have enough memory, XFC will keep most everything cached for you.)
:Elsewhere in the program, I can either
:(1) maintain records that will eventually be written to an output file
:in an array because they may need to be updated some number of times
:before a final record is written, or
:(2) write the first record to the output file and just update each
:time it is necessary.
Or you can use RMS global buffers and/or XFC caching to manage and
maintain this for you.
:The first option would call for reading sequentially through a very
:large array to find the proper record to update each time a new record
:is added. The second calls for 1-3 file operations per each record
:added.
You might have one to three file calls, but these might not map
to an equivalent number of disk I/O operations. Additionally,
RMS indexed file searches are binary in nature, and -- when the
global buffers are sized appropriately -- the index trees are
kept in memory. XFC provides additional capabilities here, too.
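
To put a number on "binary in nature", here is a small Python sketch
that counts the probes a binary search needs over a million sorted
keys. An RMS index is organized in buckets and levels rather than as
one flat sorted list, so treat this purely as an illustration of why
the lookup cost grows so slowly; the key values are made up.

```python
import math

def binary_search(sorted_keys, target):
    """Return (index, probes); index is -1 when the key is absent."""
    lo, hi = 0, len(sorted_keys) - 1
    probes = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_keys[mid] == target:
            return mid, probes
        if sorted_keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, probes

# 1,000,000 sorted, made-up key values (even numbers only).
keys = list(range(0, 2_000_000, 2))

index, probes = binary_search(keys, 1_234_568)

# Upper bound on probes for n sorted keys: floor(log2(n)) + 1.
worst_case = math.floor(math.log2(len(keys))) + 1
```

For a million keys the worst case works out to 20 probes, against an
average of half a million comparisons for a sequential scan of the
same data.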
: The number of records maintained here is on the order of
:1,000,000.
A million records is slightly more serious, but still not particularly
large.
:So, for each of these, which is the best option in general?
I'd tend to see if I could keep the whole mess in one or more RMS
indexed files, and use RMS global buffers and XFC caching. RMS is
good at searches and caching, and it's easier and quicker to use
existing code.
The real question here is one of application design and application
requirements, and your proposed design is rather close to that of a
database application. MySQL might be an option here, or a commercial
database package could potentially be pressed into service as well.
Performance requirements are another obvious consideration. If the
cost of writing and tuning the code outweighs the performance benefit,
a "dumber" and slower design can be a better choice. If performance
is a more central issue, then other design considerations come into
play, and there is a corresponding incentive to spend (more) on the
design and on the coding and tuning efforts. You can tune RMS file
access as well: there are some comparatively easy ways to tune
performance, such as increasing the allocation size and extend
size, and upping the buffer sizes...
---------------------------- #include <rtfaq.h> -----------------------------
For additional information, please see the OpenVMS FAQ -- www.hp.com/go/openvms/faq
--------------------------- pure personal opinion ---------------------------
Hoff (Stephen) Hoffman OpenVMS Engineering hoff[at]hp.com