Re: AIX Performance issue
From: Khurram Khan (khurram.khan_at_qict.net)
Date: 01/30/04
- Next message: mark taylor: "Re: How to find the hardware series in AIX?"
- Previous message: Steve Nottingham: "Re: sysback alternative?"
- In reply to: Scott Richardson: "Re: AIX Performance issue"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: 30 Jan 2004 02:47:12 -0800
Hi Scott,
Thanks for your detailed response; I highly appreciate your efforts.
The information you requested is given below.
You do not mention how much memory,
512MB
how much swap space is configured, and
Total Paging Space Percent Used
512MB 35%
what other processes may be running on the system, or
Just one application, which is based on Oracle.
what/how backups are accomplished.
This is how we take the backup:
> ---------------------------------------
> database backup script
> -----------------------------------------
> DATE=`date +"%d%m%H%M"`
> export_file=express${DATE}.dmp
> su - ora732 -c 'mknod /livedb/backup/'${export_file}' p'
> su - ora732 -c 'cat /livedb/backup/'${export_file}' | compress > /livedb/backup/'${export_file}'.Z &'
> su - ora732 -c 'exp system/manager file=/livedb/backup/'${export_file}' full=y log=/livedb/backup/express-backup.log > /dev/null 2>&1'
> su - ora732 -c 'rm -f /livedb/backup/'${export_file}''
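The backup script relies on a named pipe so that the export is compressed on the fly and never lands on disk uncompressed. The pattern can be sketched in isolation as below. This is only an illustration under stated assumptions: it uses mkfifo and gzip instead of mknod and compress, and produce_dump is a hypothetical stand-in for the real "exp ... file=<pipe>" command.

```shell
#!/bin/sh
# Sketch of the named-pipe export pattern with a stand-in producer.
# Assumptions: mkfifo and gzip are available; produce_dump is a
# hypothetical placeholder for the real export command.

produce_dump() {
    # Stand-in for the database export: writes its output to the path in $1.
    printf 'row1\nrow2\nrow3\n' > "$1"
}

export_via_pipe() {
    # $1 = backup directory, $2 = base name of the dump file
    dir=$1
    pipe=$dir/$2
    mkfifo "$pipe" || return 1

    # Start the compressor first; it blocks until a writer opens the pipe.
    gzip -c < "$pipe" > "$pipe.gz" &
    zpid=$!

    # Run the producer and remember its exit status.
    produce_dump "$pipe"
    rc=$?

    # Wait for the compressor so the .gz file is complete before returning.
    # Note: if the producer never opens the pipe, this wait would block.
    wait "$zpid" || rc=1

    rm -f "$pipe"
    return "$rc"
}
```

Calling export_via_pipe /some/dir demo.dmp leaves /some/dir/demo.dmp.gz behind and removes the pipe. The sketch waits for the compressor and returns a non-zero status if either side fails; the script as posted does neither, which may or may not matter for the problem discussed here.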
You also do not mention what else may have changed recently
on the system since this problem has surfaced:
A few months ago, IBM staff used nmon, an IBM tool, to collect
system details on reads, writes, and process utilization.
Apart from that, nothing that I can recall.
AIX 4.3.2 ?
4.3.2.0
Same for Oracle 7.3.4?
Oracle7 Server Release 7.3.4.0.0
These are both extremely old versions of the Operating
System and RDBMS.
Well, that is a dilemma: our application VAR binds us to the same
version of Oracle for at least 2 more years. I am working to upgrade
the AIX version, but before that I need to establish the main cause
of the problem. Is there a need to upgrade the hardware, or is it
something else?
Is this a home grown application?
NO
Is there a VAR involved who developed/sold and supports this
application, in this database, on this Operating System?
This application was developed by a US-based company, and they
provide its support. They have tested almost everything related to
the application but did not find any problem.
The local IBM AIX support rep says this is a network-related issue
that is causing the high %usr and %sys on the AIX server. For your
reference, I have included the latest sar and entstat output below;
I would appreciate your comments.
The problem is that Oracle and the application VAR believe the
problem lies with AIX or the hardware, whereas local IBM support
believes it is a network issue.
I am working with the DPMonitor people to get their full trial
version for testing.
AIX aix1 3 4 0041B35A4C00 01/30/04
14:11:34 %usr %sys %wio %idle
14:11:35 36 64 0 0
14:11:36 35 65 0 0
14:11:37 40 60 0 0
14:11:38 47 53 0 0
14:11:39 63 37 0 0
14:11:40 34 66 0 0
14:11:41 38 62 0 0
14:11:42 31 68 0 1
14:11:43 36 63 0 1
14:11:44 37 63 0 0
Average 40 60 0 0
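Since sar prints its Average line rounded to whole percent, a small awk filter can recompute the averages from the individual samples. The sketch below assumes the four-column layout shown above (time, %usr, %sys, %wio, %idle); sar_avg is a name chosen for illustration, not an AIX command.

```shell
#!/bin/sh
# sar_avg: read sar CPU output on stdin and print unrounded averages.
# Assumes the layout: HH:MM:SS %usr %sys %wio %idle
sar_avg() {
    awk '
        # Data rows: a HH:MM:SS timestamp followed by integer columns.
        # The header row and the Average row do not match and are skipped.
        $1 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ && $2 ~ /^[0-9]+$/ {
            usr += $2; sys += $3; wio += $4; idle += $5; n++
        }
        END {
            if (n == 0) { print "no samples"; exit 1 }
            printf "samples=%d usr=%.1f sys=%.1f wio=%.1f idle=%.1f\n",
                   n, usr / n, sys / n, wio / n, idle / n
        }
    '
}
```

Typical use would be "sar 1 10 | sar_avg". For the ten samples above it reports usr=39.7 and sys=60.1, which matches the rounded Average line of 40 and 60.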
/ >entstat -d ent1
-------------------------------------------------------------
ETHERNET STATISTICS (ent1) :
Device Type: IBM 10/100 Mbps Ethernet PCI Adapter (23100020)
Hardware Address: 00:04:ac:9e:6e:59
Elapsed Time: 0 days 2 hours 3 minutes 8 seconds
Transmit Statistics:                          Receive Statistics:
--------------------                          -------------------
Packets: 42807                                Packets: 67798
Bytes: 7301205                                Bytes: 4816352
Interrupts: 193                               Interrupts: 67382
Transmit Errors: 0                            Receive Errors: 0
Packets Dropped: 0                            Packets Dropped: 0
                                              Bad Packets: 0
Max Packets on S/W Transmit Queue: 15
S/W Transmit Queue Overflow: 0
Current S/W+H/W Transmit Queue Length: 1
Broadcast Packets: 44                         Broadcast Packets: 11672
Multicast Packets: 0                          Multicast Packets: 0
No Carrier Sense: 0                           CRC Errors: 0
DMA Underrun: 0                               DMA Overrun: 0
Lost CTS Errors: 0                            Alignment Errors: 0
Max Collision Errors: 0                       No Resource Errors: 0
Late Collision Errors: 0                      Receive Collision Errors: 0
Deferred: 0                                   Packet Too Short Errors: 0
SQE Test: 0                                   Packet Too Long Errors: 0
Timeout Errors: 0                             Packets Discarded by Adapter: 0
Single Collision Count: 0                     Receiver Start Count: 0
Multiple Collision Count: 0
Current HW Transmit Queue Length: 1
General Statistics:
-------------------
No mbuf Errors: 0
Adapter Reset Count: 0
Driver Flags: Up Broadcast Running Simplex AlternateAddress 64BitSupport
IBM 10/100 Mbps Ethernet PCI Adapter Specific Statistics:
------------------------------------------------
Chip Version: 25
RJ45 Port Link Status : up
Media Speed Selected: Auto negotiation
Media Speed Running: 100 Mbps Full Duplex
Receive Pool Buffer Size: 384
Free Receive Pool Buffers: 128
No Receive Pool Buffer Errors: 69
Inter Packet Gap: 96
Adapter Restarts due to IOCTL commands: 0
Packets with Transmit collisions:
1 collisions: 0 6 collisions: 0 11 collisions: 0
2 collisions: 0 7 collisions: 0 12 collisions: 0
3 collisions: 0 8 collisions: 0 13 collisions: 0
4 collisions: 0 9 collisions: 0 14 collisions: 0
5 collisions: 0 10 collisions: 0 15 collisions: 0
Excessive deferral errors: 0x0
/ >
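To put the entstat counters in perspective, they can be converted to per-second rates over the elapsed time and to a percentage of the packets received. The helpers below are a minimal sketch; per_second and percent_of are hypothetical names, and the figures they produce are plain arithmetic on the counters above, not a diagnosis.

```shell
#!/bin/sh
# per_second: divide a counter by an elapsed time given in h, m, s.
per_second() {
    # $1 = counter, $2 = hours, $3 = minutes, $4 = seconds
    awk -v c="$1" -v h="$2" -v m="$3" -v s="$4" 'BEGIN {
        t = h * 3600 + m * 60 + s
        if (t <= 0) { print "bad interval"; exit 1 }
        printf "%.2f\n", c / t
    }'
}

# percent_of: express counter $1 as a percentage of total $2.
percent_of() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        if (b <= 0) { print "bad total"; exit 1 }
        printf "%.3f\n", 100 * a / b
    }'
}
```

With an elapsed time of 2 hours 3 minutes 8 seconds (7388 seconds), "per_second 67798 2 3 8" gives 9.18 received packets per second and "per_second 42807 2 3 8" gives 5.79 transmitted packets per second. "percent_of 69 67798" gives 0.102, so the 69 "No Receive Pool Buffer Errors" amount to roughly 0.1% of the packets received.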
Thanks & Regards,
Khurram Khan
"Scott Richardson" <CheetahFTL@attbi.com> wrote in message news:<4AYRb.172548$I06.1718340@attbi_s01>...
> "Khurram Khan" <khurram.khan@qict.net> wrote in message
> news:a23090a4.0401280522.78307f1e@posting.google.com...
> > Hi all,
> >
> > We are using an F50 server running AIX 4.3.2, with Oracle database
> > 7.3.4 installed on it. We have approximately 90 users on a 100BaseT
> > network; however, only about 30 users normally access the AIX server
> > through the Oracle application. Everything ran fine for the last 3
> > years, but for the past month we have occasionally (about once a
> > week) hit a problem where the system does not allow users to commit
> > anything in the Oracle application. We have checked all
> > possibilities in Oracle and did not find any error.
> >
> > Most recently, while the error was occurring, I ran the sar and
> > iostat commands; their output is given below.
> >
> > / >sar 1 10
> >
> > AIX aix1 3 4 0041B35A4C00 01/28/04
> >
> > 02:29:56 %usr %sys %wio %idle
> > 02:29:57 87 13 0 0
> > 02:29:58 87 13 0 0
> > 02:29:59 89 11 0 0
> > 02:30:00 89 11 0 0
> > 02:30:01 79 21 0 0
> > 02:30:02 86 14 0 0
> > 02:30:03 79 21 0 0
> > 02:30:04 84 16 0 0
> > 02:30:05 85 15 0 0
> > 02:30:06 85 15 0 0
> >
> > Average 85 15 0 0
> > / >iostat 10 2
> >
> > tty:   tin    tout   avg-cpu:   % user   % sys   % idle   % iowait
> >        5.1     6.8                38.0    11.2     42.4        8.5
> >
> > Disks: % tm_act Kbps tps Kb_read Kb_wrtn
> > hdisk0 5.0 27.5 5.9 38467844 37423398
> > hdisk1 19.7 362.0 39.2 927759935 71088344
> > cd0 0.0 0.0 0.0 0 0
> >
> > tty:   tin    tout   avg-cpu:   % user   % sys   % idle   % iowait
> >        5.3    72.0                80.0    20.0      0.0        0.0
> >
> > Disks: % tm_act Kbps tps Kb_read Kb_wrtn
> > hdisk0 10.5 60.7 13.9 464 152
> > hdisk1 16.0 433.1 52.3 4296 96
> > cd0 0.0 0.0 0.0 0 0
> > / >
> >
> >
> > In most cases, the daily database backup was in progress when the
> > error occurred. However, the backup runs daily while the error
> > occurs only occasionally, and we have been taking the backup with
> > the same procedure for a long time.
> >
> > ---------------------------------------
> > database backup script
> > -----------------------------------------
> > DATE=`date +"%d%m%H%M"`
> > export_file=express${DATE}.dmp
> > su - ora732 -c 'mknod /livedb/backup/'${export_file}' p'
> > su - ora732 -c 'cat /livedb/backup/'${export_file}' | compress > /livedb/backup/'${export_file}'.Z &'
> > su - ora732 -c 'exp system/manager file=/livedb/backup/'${export_file}' full=y log=/livedb/backup/express-backup.log > /dev/null 2>&1'
> > su - ora732 -c 'rm -f /livedb/backup/'${export_file}''
> > ----------------------------------------
> > During peak working hours I ran the sar command again and got the
> > following output:
> > / >sar 1 10
> >
> > AIX aix1 3 4 0041B35A4C00 01/28/04
> >
> > 16:04:36 %usr %sys %wio %idle
> > 16:04:37 75 25 0 0
> > 16:04:38 76 24 0 0
> > 16:04:39 79 21 0 0
> > 16:04:40 68 32 0 0
> > 16:04:41 78 22 0 0
> > 16:04:42 83 17 0 0
> > 16:04:43 73 25 2 0
> > 16:04:44 78 22 0 0
> > 16:04:45 77 23 0 0
> > 16:04:46 73 27 0 0
> >
> > Average 76 24 0 0
> > / >
> >
> > After 4 minutes I ran the same command again and got the following output:
> >
> > / >sar 1 10
> >
> > AIX aix1 3 4 0041B35A4C00 01/28/04
> >
> > 16:08:59 %usr %sys %wio %idle
> > 16:09:00 46 23 32 0
> > 16:09:01 30 55 15 0
> > 16:09:02 30 49 21 0
> > 16:09:03 42 17 41 0
> > 16:09:04 41 21 38 0
> > 16:09:05 35 27 38 0
> > 16:09:06 37 26 37 0
> > 16:09:07 32 22 46 0
> > 16:09:08 52 16 32 0
> > 16:09:09 48 40 12 0
> >
> > Average 39 30 31 0
> > /
> >
> >
> > sar output after 2 hours:
> > / >sar 1 10
> >
> > AIX aix1 3 4 0041B35A4C00 01/28/04
> >
> > 18:08:31 %usr %sys %wio %idle
> > 18:08:32 25 12 4 59
> > 18:08:33 27 12 2 59
> > 18:08:34 21 11 0 68
> > 18:08:35 36 6 6 52
> > 18:08:36 45 5 0 50
> > 18:08:37 29 9 1 61
> > 18:08:38 26 3 5 66
> > 18:08:39 42 7 1 50
> > 18:08:40 22 9 1 68
> > 18:08:41 22 8 8 62
> >
> > Average 30 8 3 60
> > / >
> >
> >
> > I would highly appreciate it if any of you could advise on how to
> > rectify this issue or identify the root cause of the problem.
> >
> > Thanks,
> >
> > Khurram
>
> Hello Khurram,
> I have read your post on comp.unix.aix with interest.
>
> You do not mention how much memory, how much swap
> space is configured, and what other processes may be running
> on the system, or what/how backups are accomplished. You
> also do not mention what else may have changed recently
> on the system since this problem has surfaced: Perhaps more
> users, even if only minimal increase? A new user application
> or process? Perhaps the scripts and commands you're running
> to try to discover the problem may be contributing to the
> problem? Who knows.
>
> AIX 4.3.2 ? Is it possible the AIX OS needs to be updated,
> to address possible OS bugs? Same for Oracle 7.3.4?
> These are both extremely old versions of the Operating
> System and RDBMS. Is this a home grown application?
> Is there a VAR involved who developed/sold and supports
> this application, in this database, on this Operating System?
>
> It is many times difficult to gain a comprehensive picture of
> everything that may be going on within a system of this size,
> over time, by running some OS commands, or scripts, to try to
> gain insight as to what the problem or problems could be. It
> could very well be that these scripts and commands, and the
> overhead they require to gather such information, may
> contribute to, or exacerbate, the problem or issue causing the
> performance problem(s).
>
> May I suggest you consider running an extremely low overhead
> process that tracks ALL system parameters and metrics, not only
> at the AIX OS level, but also at the ORACLE RDBMS level.
> This low-overhead process is a DPMonitor Performance Agent.
>
> It runs on your AIX/Oracle Server, and keeps track of all that
> goes on with extremely low-overhead, low-level kernel calls,
> over time, 24 hours a day, 7 days a week, and it saves all
> data in a very small-footprint, compressed format file.
>
> This compressed format data file generated by the DPMonitor
> Performance Agent is sent to a DPMonitor Performance Explorer
> Console process, that reads the compressed format file, and
> generates easy to read, colorful, dynamically scaling graphs that
> clearly show exactly what is going on within your system, around
> the clock, over time. Information such as this is extremely critical
> in helping identify, and then address/resolve, the issues which affect
> your application server platform's operational dynamics, at both
> the AIX OS level, and at the Oracle RDBMS level.
>
> You may go to the www.deltek.us website and check out the
> DPMonitor.
>
> I have used this Performance Monitor product at several sites,
> across numerous applications, databases, and operating systems.
> It helps easily point out problem areas on its graphs that can be
> easily understood even by non-technical management types, and
> furthermore, after you take action to correct and resolve the
> identified problem, you can help prove that the action you took to
> address/resolve the problem actually did solve the problem. If it
> perchance does not, then you have a graph-like map to show you
> what other issue has now popped up, which you can attack or
> address and resolve.
>
> Performance tuning, as you may know, is often like peeling an
> onion; cutting through the layers to find the real root cause of a
> problem, and addressing it at its source. Often, application
> platform performance bottleneck issues mask themselves
> as other problems.
>
> I wish you much luck in finding the root cause of your platform's
> performance issue, and hope you consider the DPMonitor to help
> take a lot of the mystery out of finding and resolving these problems.
>
> Regards,
> Scott Richardson
> Sr Systems Engineering Consultant
> Marlborough, MA USA