SUMMARY: ES80 memory usage questions

From: Örjan Petersson (orjan.petersson_at_logcode.com)
Date: 02/10/05

  • Next message: Christian BIACHE: "tu0 and tu1 not working after reboot on DS10 / 4.0F PK8"
    Date: Thu, 10 Feb 2005 10:56:57 +0100
    To: tru64-unix-managers@ornl.gov
    
    

    I did not get any answers from the list, but we found the answer to the first question ourselves, with some help from HP support.

    The reason we get a few hundred page faults per second on the idle node in our cluster is envmond running with the default
    ENVMON_MONITOR_PERIOD=60. Shutting it down or increasing ENVMON_MONITOR_PERIOD makes the machine behave normally. We also noticed
    this behavior on a 2-CPU ES47, also running 5.1B.
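
A minimal sketch for checking the fault rate before and after changing the envmond setting, assuming the `vmstat 1` layout shown in Q1 of the original message. The helper names (`parse_count`, `fault_rates`) are illustrative, and the decimal meaning of the K/M/G suffixes is an assumption. The first data row is skipped because it appears to be the cumulative total since boot (27M in the sample) rather than a per-interval value.

```python
# Assumption: vmstat abbreviates large counters with decimal K/M/G suffixes.
SUFFIX = {"K": 10**3, "M": 10**6, "G": 10**9}


def parse_count(token):
    """Convert a vmstat counter such as '39K' or '243' to an integer."""
    if token[-1] in SUFFIX:
        return int(float(token[:-1]) * SUFFIX[token[-1]])
    return int(token)


def fault_rates(vmstat_text):
    """Return the per-interval 'fault' values, skipping the first data row."""
    fault_idx = None
    values = []
    for line in vmstat_text.splitlines():
        # Tolerate mail-quoted output by dropping leading '>' and spaces.
        fields = line.lstrip("> ").split()
        if "fault" in fields:
            fault_idx = fields.index("fault")  # locate the column by header
            continue
        if fault_idx is None or len(fields) <= fault_idx:
            continue
        if not fields[0].isdigit():
            continue  # not a data row
        values.append(parse_count(fields[fault_idx]))
    return values[1:]  # first row looks like a since-boot total


# Sample copied from the standby node output in Q1.
SAMPLE = """\
  r   w   u  act free wire fault  cow zero react  pin pout  in  sy  cs us sy id
  6 329 185  39K 902K  71K   27M   2M  18M     0   4M    0  4K 990  7K  0  5 95
  7 332 185  39K 902K  71K   243   40  210     0   71    0  3K 509  8K  0  4 95
  6 329 185  39K 902K  71K    36   25  183     0   43    0  3K 333  7K  0  8 92
  7 329 184  39K 902K  71K   277   25  183     0   43    0  3K 491  7K  0  2 98
  6 329 185  39K 902K  71K   277    0    0     0    0    0  3K 492  6K  0  1 99
"""

rates = fault_rates(SAMPLE)
mean_rate = sum(rates) / len(rates)
print(rates, mean_rate)
```

Running the same function on output captured after stopping envmond or raising ENVMON_MONITOR_PERIOD gives a direct before/after comparison.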

    We have made some progress on the other two questions but have no definite answers yet. Any hints would be appreciated.

    -- 
    Örjan Petersson, Logcode SARL 
    > -----Original Message-----
    > From: tru64-unix-managers-owner@ornl.gov [mailto:tru64-unix-managers-owner@ornl.gov] On Behalf Of
    > Örjan Petersson
    > Sent: mercredi 12 janvier 2005 23:12
    > To: tru64-unix-managers@ornl.gov
    > Subject: ES80 memory usage questions
    > 
    > Hi,
    > 
    > I have a few questions on the memory usage on our servers, 2 clustered 4 CPU ES80 with
    > 8 GB RAM running Tru64 5.1B. The cluster is running a telecom billing system in
    > failover mode. Everything runs on one machine and the other just hangs around doing
    > basically nothing. The system is based around an Oracle 8.1.7 database with a 2 GB SGA.
    > The sum of the RSS of all running processes is 2-3 GB.
    > 
    > Q1: vmstat on the standby machine indicates roughly 275 page faults/s even though
    > the machine is idle most of the time. What could explain that?
    > 
    > eppix@eppix01> vmstat 1
    > Virtual Memory Statistics: (pagesize = 8192)
    >   procs      memory        pages                            intr       cpu
    >   r   w   u  act free wire fault  cow zero react  pin pout  in  sy  cs us sy id
    >   6 329 185  39K 902K  71K   27M   2M  18M     0   4M    0  4K 990  7K  0  5 95
    >   7 332 185  39K 902K  71K   243   40  210     0   71    0  3K 509  8K  0  4 95
    >   6 329 185  39K 902K  71K    36   25  183     0   43    0  3K 333  7K  0  8 92
    >   7 329 184  39K 902K  71K   277   25  183     0   43    0  3K 491  7K  0  2 98
    >   6 329 185  39K 902K  71K   277    0    0     0    0    0  3K 492  6K  0  1 99
    > 
    > 
    > Q2: On the active machine we see a very high page fault count for one of the RADs,
    > 54G versus 360-380M for the others (the machine has been up since Jan. 5). What are
    > the possible explanations for this? Could it have something to do with the Oracle
    > SGA shared memory segment, which might be on that RAD? Is it possible to verify
    > this hypothesis?
    > 
    > eppix@eppix02> vmstat -R
    > Virtual Memory Statistics: (pagesize = 8192)
    >       procs              memory                                 pages                              intr        cpu
    >  RAD  r   w   u  st  sw  act actv actu acti free wire wirv wiru fault  cow zero react  pin pout   in  sy  cs us sy id
    >    0  5 272 138   0   0 187K  64K  61K  61K 2320  58K  25K    0  376M  36M 217M   15M  41M   1M  405  8K  6K 34 18 48
    >    1  4 176  17   0   0 204K  42K  71K  90K 2031  48K  22K    0   53G  46M 284M    9M  60M 740K 602M  5G  1G 27 34 39
    >    2  4 155  20   0   0 180K  49K  65K  65K 8577  65K  22K    0  380M  33M 213M    8M  40M 843K 263M  5G  2G 32 16 52
    >    3  4 263  21   0   0 185K  49K  34K 101K 2019  67K  17K    0  367M  35M 195M    7M  42M 678K 229K  4G  2G 31 15 54
    >   -- 17 866 196   0   0 758K 205K 232K 319K  14K 239K  87K    0   54G 151M 909M   40M 185M   3M   1G 20G 11G 31 21 48
    > 
    > 
    > Q3: (Related to Q1?) I do not have the impression that we are short on memory, but
    > on the active machine we see a couple of thousand page faults/s. Any ideas? Could
    > it have something to do with the ES80 being a NUMA machine?
    > 
    > eppix@eppix02> vmstat 1
    > Virtual Memory Statistics: (pagesize = 8192)
    >   procs      memory        pages                            intr       cpu
    >   r   w   u  act free wire fault  cow zero react  pin pout  in  sy  cs us sy id
    >  27 853 194 764K  12K 235K   54G 151M 910M   40M 185M   3M  1K 32K 18K 31 21 48
    >  19 871 197 767K 9447 236K  8352  387 2982     0  144    0  5K 45K 19K 41 43 16
    >  22 875 195 768K 7389 236K  6430  795 4347     0  677    0  6K 51K 23K 44 41 15
    >  18 873 197 764K  12K 236K  3241  690 5266     0  423    0 10K 75K 31K 46 38 16
    >  21 872 195 765K  10K 236K  2300  143 2544     0  234    0 11K 71K 30K 60 36  4
    >  27 867 194 767K 9366 236K  4620  285 3802     0  119    0 11K 62K 24K 60 34  6
    > 
    > --
    > Best regards,
    > Örjan Petersson, Logcode SARL, currently with Ericsson Algeria
    > 24 bis, rue Voltaire, FR-42270 St-Priest-en-Jarez, France
    > Phone: +33-4.77.79.65.14, Mobile: +33-6.62.25.37.94, In Algeria: +213-61.699.745
    > 
    > 
    > --
    > Örjan Petersson, Logcode SARL
    > 
    > 
    > 
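
To put a number on the imbalance described in Q2, the per-RAD `fault` counters from the `vmstat -R` output can be turned into shares of the total. This is only a sketch: the helper names are illustrative, the counters are copied by hand from the quoted output, and the K/M/G suffixes are assumed to be decimal and rounded by vmstat, so the result is approximate.

```python
# Assumption: vmstat abbreviates large counters with decimal K/M/G suffixes.
SUFFIX = {"K": 10**3, "M": 10**6, "G": 10**9}


def parse_count(token):
    """Convert a vmstat counter such as '376M' or '53G' to an integer."""
    if token[-1] in SUFFIX:
        return int(float(token[:-1]) * SUFFIX[token[-1]])
    return int(token)


def fault_shares(raw_counts):
    """Map each RAD to its fraction of all page faults since boot."""
    counts = {rad: parse_count(value) for rad, value in raw_counts.items()}
    total = sum(counts.values())
    return {rad: count / total for rad, count in counts.items()}


# 'fault' column per RAD, copied from the vmstat -R output in Q2.
faults_by_rad = {0: "376M", 1: "53G", 2: "380M", 3: "367M"}

shares = fault_shares(faults_by_rad)
hottest = max(shares, key=shares.get)
for rad in sorted(shares):
    print(f"RAD {rad}: {shares[rad]:.1%}")
print("hottest RAD:", hottest)
```

With these rounded inputs, one RAD accounts for nearly all of the faults counted since boot, which is the asymmetry the question asks about. The calculation does not by itself say whether the SGA segment is the cause.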
    
