Re: NIM Troubleshooting

From: Dohhhh (whynot_at_verizon.net)
Date: 05/11/05

    Date: Wed, 11 May 2005 21:38:58 GMT
    
    

    I agree with what Ed suggested. Also, if the client is still bootable,
    let it come back up. While it's coming up, go to the NIM master and
    reset the client, using "force = yes" where indicated (both fields).
    What you're "resetting" is the set of resources allocated to the
    client: showmount -e on the master should list them before the reset
    and should no longer show them exported once it has succeeded. Once
    reset, remove the client (from the master). Then, on the client
    (assuming it's back up at this point), run ifconfig -au (or
    netstat -in), take note of the interface being used for the NIM
    activity, then:

    - rm /etc/nim* (removes the niminfo file)
    - smit nim -> configure (fill in the respective fields using the
       info obtained from the ifconfig/netstat command above; this
       rebuilds the /etc/niminfo file)
    - once done, F3 back to the install screen
       (if you exit smit, do smit nim -> install)
    - select the respective options/choices...don't forget to
       change the "Accept License" field to "yes" and select the
       appropriate "response file", aka bosinst.data resource...you
       DO have the correct "response" file set up, yes?!
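
For anyone who prefers the command line to smit, the sequence above can be sketched roughly as follows. This is a dry run that only prints the commands. The nim and niminit invocations are my understanding of the CLI equivalents (check the man pages for your AIX level), and CLIENT, MASTER, PIF, platform and netboot_kernel are placeholders borrowed from the lsnim output in the quoted post, not a confirmed fix.

```shell
#!/bin/sh
# Dry-run sketch: prints each command instead of executing it.
# All values below are assumptions taken from the quoted post.
CLIENT=client_bummer      # NIM object name as defined on the master
MASTER=sucker             # hostname of the NIM master
PIF=en0                   # interface noted from ifconfig -au / netstat -in

run() { echo "+ $*"; }    # replace the echo with "$@" to run for real

# On the NIM master: force-reset, release resources, remove the definition.
master_steps() {
    run nim -Fo reset "$CLIENT"
    run nim -o deallocate -a subclass=all "$CLIENT"
    run showmount -e                      # should no longer list the client
    run nim -o remove "$CLIENT"
}

# On the client: drop the stale niminfo and re-register with the master.
client_steps() {
    run rm /etc/niminfo
    run niminit -a name="$CLIENT" -a master="$MASTER" -a pif_name="$PIF" \
        -a platform=chrp -a netboot_kernel=mp
}

master_steps
client_steps
```

The install itself (license acceptance, bosinst.data selection) is left to smit as described above.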

    The issue, from what I've seen, is on the master: it holds on to the
    last known interface/adapter resource name (e.g., entX), and if
    anything changes on the client...well <G>...the above should fix that,
    since you're "registering" the node/system from the client.
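
One way to spot a stale definition before redoing everything is to compare the MAC address in the master's if1 attribute with the one the client is actually sending from. The sketch below is only an illustration: the sample values are copied from the quoted post (on a live system they would come from lsnim -l and from tcpdump or the SMS screen), and the helper names norm_mac and if1_mac are mine.

```shell
#!/bin/sh
# Sample values copied from the quoted post (assumptions for illustration).
if1_line=' if1 = master_net bummer.wherever.gov 000629ecc2ea'
wire_mac='00:06:29:ec:c2:ea'

# norm_mac: drop colons and lowercase, so both notations compare equal.
norm_mac() { printf '%s\n' "$1" | tr -d ':' | tr 'A-F' 'a-f'; }

# if1_mac: the MAC is the third value after "if1 =", i.e. awk field 5.
if1_mac() { printf '%s\n' "$1" | awk '{print $5}'; }

if [ "$(norm_mac "$(if1_mac "$if1_line")")" = "$(norm_mac "$wire_mac")" ]; then
    echo "if1 MAC matches the client"
else
    echo "if1 MAC differs from the client"
fi
```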

    If the client ISN'T bootable, are you sure you're configuring the IP
    information for the correct interface? Also, as Ed suggested, it's
    ALWAYS worth confirming the correct interface is set to the correct
    speed/duplex. In my experience, AIX "wants" 100 full duplex on both
    adapter and switch port for the 10/100 adapters; for GB adapters, the
    IEEE spec is to leave both switch port and adapter set to
    autonegotiate.
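
Ed's gateway point can also be sanity-checked with a little arithmetic. The sketch below (plain sh; the helper names ip_to_int and same_subnet are mine, not part of any AIX tool) ANDs each address with the netmask from the SMS "IP Parameters" screen in the quoted post and compares the results.

```shell
#!/bin/sh
# ip_to_int: convert a dotted quad to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS; IFS=.
    set -- $1                 # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# same_subnet A B MASK: succeeds if A and B share a network under MASK.
same_subnet() {
    a=$(ip_to_int "$1"); b=$(ip_to_int "$2"); m=$(ip_to_int "$3")
    [ $(( a & m )) -eq $(( b & m )) ]
}

# Values as shown on the SMS "IP Parameters" screen in the quoted post.
CLIENT=10.100.1.221
SERVER=10.100.1.220
GATEWAY=10.1.1.1
MASK=255.255.0.0

for target in "$SERVER" "$GATEWAY"; do
    if same_subnet "$CLIENT" "$target" "$MASK"; then
        echo "$target: on the client's subnet"
    else
        echo "$target: NOT on the client's subnet"
    fi
done
```

With the values as posted, the server comes out on the client's subnet and the gateway 10.1.1.1 does not, which fits the unanswered "who-has 10.1.1.1" ARP requests in the tcpdump output.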

    Whew...there's more but hopefully, the above helps some <G>

    Paul
    P.S. Most hubs I've worked with are "self managing"; that is, if the
    adapter you're using IS connected to a hub, leave the adapter set for
    autonegotiation.
    ====
    Ed Kwedar wrote:
    > Try changing the following and try again. I realize that you advised you
    > didn't think you had a problem at this layer, but I have seen settings such
    > as yours fail to complete the bootp download of the install kernel.
    >
    > change
    > 1. Gateway to be equal to the NIM server, since you are on the same subnet
    > 2. 100, full duplex from auto if on an enet switch;
    >    100, half duplex if on a hub
    >
    > "countskm" <kcounts@helios.acomp.usf.edu> wrote in message
    > news:1115817784.091943.241820@g44g2000cwa.googlegroups.com...
    >
    >>I have been troubleshooting a NIM install on a 7026-6H1
    >>for a few days and I was wondering if anyone has any
    >>ideas for pointing me in the right direction.
    >>
    >>Attached is my issue written up w/ a summary of the problem,
    >>some known goods, and some detailed information recording
    >>the steps I am performing and the debugging I'm trying.
    >>
    >>Any help appreciated!
    >>
    >>Thanks,
    >>
    >>Kevin Counts
    >>
    >>--------------------------------------------------------------------------
    >>Summary of Problem
    >>--------------------------------------------------------------------------
    >>
    >>While performing a NIM install, we receive an "ARP Request Failed"
    >>error during the BOOTP initialization:
    >>
    >>BOOTP: Waiting 60 seconds for Spanning Tree
    >>LOAD:ARP: do-arp-req failed : 1
    >>LOAD:ARP: do-arp-req failed : 2
    >>LOAD:ARP: do-arp-req failed : 3
    >>LOAD:ARP: do-arp-req failed : 4
    >>LOAD:ARP: do-arp-req failed : 5
    >>LOAD:ARP: ERROR!!! ARP Request Failed, ABORT !20A80002 !
    >>
    >>--------------------------------------------------------------------------
    >>Known Goods
    >>--------------------------------------------------------------------------
    >>
    >> * The client is using a single ethernet connection: IBM 100/10
    >> Ethernet 6:U0.1-P1-I6/E1 with an IP of 10.100.1.221 and MAC of
    >> 00:06:29:ec:c2:ea.
    >>
    >> * Ping to the NIM Master through SMS returns success.
    >>
    >> * The ethernet connection up to the TCP/IP layer is correct because
    >> we can boot the client with a CD install of AIX and telnet into
    >> the NIM Master.
    >>
    >> * The NIM Master's IP is 10.100.1.220 and is on the same /16
    >> subnet.
    >>
    >> * The client is defined in NIM on the NIM Master with the correct
    >> hostname,ip, and mac address as well as /etc/bootptab entry.
    >>
    >> * The BOS Installation is enabled for the client on the NIM
    >> master server.
    >>
    >> * Debugging with /usr/sbin/bootpd -s -d -d -d shows no activity.
    >>
    >> * tcpdump on the NIM master shows only arp activity from the
    >> client's ethernet address but no other traffic.
    >>
    >>--------------------------------------------------------------------------
    >>Detailed Information
    >>--------------------------------------------------------------------------
    >>
    >>NIM Master:
    >>IBM 7026-6H1
    >>sucker.wherever.gov (ip: 10.1.100.220, eth: 00:02:55:6a:3a:b7)
    >>
    >>NIM Client:
    >>IBM 7026-6H1
    >>bummer.wherever.gov (ip: 10.1.100.221, eth: 00:06:29:ec:c2:ea)
    >>
    >>
    >>--------------------------------------------------------------------------
    >>NIM Master Server
    >>--------------------------------------------------------------------------
    >>
    >>IP Parameters:
    >>
    >>sucker# ifconfig -a | grep -v lo0
    >>en0: flags=5e080863,c0<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD,PSEG,CHAIN>
    >> inet 10.100.1.220 netmask 0xffff0000 broadcast 10.100.255.255
    >> inet 127.0.0.1 netmask 0xff000000 broadcast 127.255.255.255
    >> inet6 ::1/0
    >> tcp_sendspace 131072 tcp_recvspace 131072 rfc1323 1
    >>
    >>--------------------------------------------------------------------------
    >>
    >>Ent0 MAC Address:
    >>
    >>sucker# lscfg -vpl ent0 | awk -F. ' /Network Address/ {print $14}'
    >>0002556A3AB7
    >>
    >>--------------------------------------------------------------------------
    >>
    >>NIM Client Defined and Enabled
    >>
    >>sucker# lsnim -l client_bummer
    >>client_bummer:
    >> class = machines
    >> type = standalone
    >> default_res = basic_res_grp
    >> connect = shell
    >> platform = chrp
    >> netboot_kernel = mp
    >> if1 = master_net bummer.wherever.gov 000629ecc2ea
    >> cable_type1 = N/A
    >> Cstate = BOS installation has been enabled
    >> prev_state = ready for a NIM operation
    >> Mstate = not running
    >> boot = boot
    >> bosinst_data = bid_ow
    >> lpp_source = 530lpp_res
    >> mksysb = 5300-00master_sysb
    >> nim_script = nim_script
    >> resolv_conf = master_net_conf
    >> spot = 530spot_res
    >> control = master
    >>
    >>Verify /etc/bootptab:
    >>
    >>sucker# grep -v '^#' /etc/bootptab
    >>bummer.wherever.gov:bf=/tftpboot/bummer.wherever.gov:ip=10.100.1.221:ht=ethernet:ha=000629ecc2ea:sa=10.100.1.220:sm=255.255.0.0:
    >>
    >>--------------------------------------------------------------------------
    >>
    >>Bootp debug
    >>
    >>Running prior to client boot on Nim Master.
    >>
    >>sucker# /usr/sbin/bootpd -s -d -d -d
    >>BOOTPD: bootptab mtime is Mon May 9 16:07:36 2005
    >>BOOTPD: reading "/etc/bootptab"
    >>BOOTPD: read 1 entries from "/etc/bootptab"
    >>BOOTPD: dumped 1 entries to "/etc/bootpd.dump".
    >>
    >>Client is booted but no output is generated.
    >>
    >>--------------------------------------------------------------------------
    >>
    >>tcpdump debug
    >>
    >>Output during client network install:
    >>
    >>sucker# tcpdump ether host 00:06:29:ec:c2:ea
    >>18:53:06.320477 arp who-has bummer.wherever.gov (00:06:29:ec:c2:ea)
    >>tell bummer.wherever.gov
    >>18:53:08.323451 arp who-has bummer.wherever.gov (00:06:29:ec:c2:ea)
    >>tell bummer.wherever.gov
    >>18:53:10.324310 arp who-has bummer.wherever.gov (00:06:29:ec:c2:ea)
    >>tell bummer.wherever.gov hardware #6
    >>18:53:12.327439 arp who-has bummer.wherever.gov (00:06:29:ec:c2:ea)
    >>tell bummer.wherever.gov hardware #6
    >>18:53:14.328325 arp who-has 10.1.1.1 tell bummer.wherever.gov
    >>18:53:16.331555 arp who-has 10.1.1.1 tell bummer.wherever.gov
    >>18:53:18.334751 arp who-has 10.1.1.1 tell bummer.wherever.gov hardware
    >>#6
    >>18:53:20.335656 arp who-has 10.1.1.1 tell bummer.wherever.gov hardware
    >>#6
    >>18:53:22.626316 arp who-has bummer.wherever.gov (00:06:29:ec:c2:ea)
    >>tell bummer.wherever.gov
    >>18:53:24.628592 arp who-has bummer.wherever.gov (00:06:29:ec:c2:ea)
    >>tell bummer.wherever.gov
    >>18:53:26.632117 arp who-has bummer.wherever.gov (00:06:29:ec:c2:ea)
    >>tell bummer.wherever.gov hardware #6
    >>18:53:28.635228 arp who-has bummer.wherever.gov (00:06:29:ec:c2:ea)
    >>tell bummer.wherever.gov hardware #6
    >>18:53:30.636139 arp who-has 10.1.1.1 tell bummer.wherever.gov
    >>18:53:32.637006 arp who-has 10.1.1.1 tell bummer.wherever.gov
    >>18:53:34.638017 arp who-has 10.1.1.1 tell bummer.wherever.gov hardware
    >>#6
    >>18:53:36.641371 arp who-has 10.1.1.1 tell bummer.wherever.gov hardware
    >>#6
    >>18:53:39.182298 arp who-has bummer.wherever.gov (00:06:29:ec:c2:ea)
    >>tell bummer.wherever.gov
    >>18:53:41.183168 arp who-has bummer.wherever.gov (00:06:29:ec:c2:ea)
    >>tell bummer.wherever.gov
    >>18:53:43.186239 arp who-has bummer.wherever.gov (00:06:29:ec:c2:ea)
    >>tell bummer.wherever.gov hardware #6
    >>18:53:45.189302 arp who-has bummer.wherever.gov (00:06:29:ec:c2:ea)
    >>tell bummer.wherever.gov hardware #6
    >>18:53:47.190135 arp who-has 10.1.1.1 tell bummer.wherever.gov
    >>18:53:49.193227 arp who-has 10.1.1.1 tell bummer.wherever.gov
    >>18:53:51.194132 arp who-has 10.1.1.1 tell bummer.wherever.gov hardware
    >>#6
    >>18:53:53.197289 arp who-has 10.1.1.1 tell bummer.wherever.gov hardware
    >>#6
    >>18:53:56.238775 arp who-has bummer.wherever.gov (00:06:29:ec:c2:ea)
    >>tell bummer.wherever.gov
    >>18:53:58.240797 arp who-has bummer.wherever.gov (00:06:29:ec:c2:ea)
    >>tell bummer.wherever.gov
    >>18:54:00.243960 arp who-has bummer.wherever.gov (00:06:29:ec:c2:ea)
    >>tell bummer.wherever.gov hardware #6
    >>18:54:02.247222 arp who-has bummer.wherever.gov (00:06:29:ec:c2:ea)
    >>tell bummer.wherever.gov hardware #6
    >>18:54:04.248034 arp who-has 10.1.1.1 tell bummer.wherever.gov
    >>18:54:06.248864 arp who-has 10.1.1.1 tell bummer.wherever.gov
    >>18:54:08.251807 arp who-has 10.1.1.1 tell bummer.wherever.gov hardware
    >>#6
    >>18:54:10.253125 arp who-has 10.1.1.1 tell bummer.wherever.gov hardware
    >>#6
    >>18:54:14.293767 arp who-has bummer.wherever.gov (00:06:29:ec:c2:ea)
    >>tell bummer.wherever.gov
    >>18:54:16.297584 arp who-has bummer.wherever.gov (00:06:29:ec:c2:ea)
    >>tell bummer.wherever.gov
    >>18:54:18.303495 arp who-has bummer.wherever.gov (00:06:29:ec:c2:ea)
    >>tell bummer.wherever.gov hardware #6
    >>18:54:20.306705 arp who-has bummer.wherever.gov (00:06:29:ec:c2:ea)
    >>tell bummer.wherever.gov hardware #6
    >>18:54:22.304601 arp who-has 10.1.1.1 tell bummer.wherever.gov
    >>18:54:24.307897 arp who-has 10.1.1.1 tell bummer.wherever.gov
    >>18:54:26.309288 arp who-has 10.1.1.1 tell bummer.wherever.gov hardware
    >>#6
    >>18:54:28.310539 arp who-has 10.1.1.1 tell bummer.wherever.gov hardware
    >>#6
    >>
    >>At this point on the client we see the message:
    >>
    >>LOAD:ARP: ERROR!!! ARP Request Failed, ABORT !20A80002 !
    >>
    >>--------------------------------------------------------------------------
    >>NIM Client
    >>--------------------------------------------------------------------------
    >>
    >>From SMS:
    >>
    >> * 3 Remote Initial Program Load Setup
    >> o 1. IP Parameters
    >>
    >>Version M2P050203_condor_M2
    >>(c) Copyright IBM Corp. 2000 All rights reserved.
    >>--------------------------------------------------------------------------
    >>
    >>IP Parameters
    >>
    >>1. Client IP Address [10.100.1.221]
    >>2. Server IP Address [10.100.1.220]
    >>3. Gateway IP Address [10.1.1.1]
    >>4. Subnet Mask [255.255.0.0]
    >>
    >>--------------------------------------------------------------------------
    >>
    >>From SMS:
    >>
    >> * 2. Adapter Parameters
    >> o 1. IBM 100/10 Ethernet 6:U0.1-P1-I6/E1 000629ecc2ea
    >>
    >>Version M2P050203_condor_M2
    >>(c) Copyright IBM Corp. 2000 All rights reserved.
    >>--------------------------------------------------------------------------
    >>
    >>Supported Network Types
    >>IBM 100/10 Ethernet Adapter
    >>
    >>1. auto , auto ( rj45 ) <===
    >>2. 10 , half ( rj45 )
    >>3. 10 , full ( rj45 )
    >>4. 100 , half ( rj45 )
    >>5. 100 , full ( rj45 )
    >>
    >>--------------------------------------------------------------------------
    >>
    >>From SMS:
    >>
    >> * 3 Remote Initial Program Load Setup
    >> o 3. Ping
    >> + 1. IBM 100/10 Ethernet 6:U0.1-P1-I6/E1 000629ecc2ea
    >>
    >>PING: chosen-network-type = ethernet,auto,rj45,auto
    >>PING: client IP = 10.100.1.221
    >>PING: server IP = 10.100.1.220
    >>PING: gateway IP = 10.1.1.1
    >>PING: device /pci@fff7f0a000/pci@b,2/ethernet@1
    >>PING: loc-code U0.1-P1-I6/E1
    >>
    >>PING: Waiting 60 seconds for Spanning Tree ...
    >>PING: Ready to ping:
    >>PING: source hardware address is 0 6 29 ec c2 ea
    >>PING: destination hardware address is 0 2 55 6a 3a b7
    >>PING: source IP address is 10.100.1.221
    >>PING: destination IP address is 10.100.1.220
    >>
    >> .----------------.
    >> | Ping Success. |
    >> `----------------'
    >>
    >>--------------------------------------------------------------------------
    >>
    >>From SMS:
    >>
    >> * 6 MultiBoot
    >> o 3 Select Install Device
    >> + 2 Ethernet ( loc=U0.1-P1-I6/E1 )
    >>
    >>Version M2P050203_condor_M2
    >>(c) Copyright IBM Corp. 2000 All rights reserved.
    >>--------------------------------------------------------------------------
    >>
    >>BOOTP: chosen-network-type = ethernet,auto,rj45,auto
    >>BOOTP: server IP = 10.100.1.220
    >>BOOTP: requested filename =
    >>BOOTP: client IP = 10.100.1.221
    >>BOOTP: client HW addr = 0 6 29 ec c2 ea
    >>BOOTP: gateway IP = 10.1.1.1
    >>BOOTP: device /pci@fff7f0a000/pci@b,2/ethernet@1
    >>BOOTP: loc-code U0.1-P1-I6/E1
    >>BOOTP: Cancel = ctl-C
    >>
    >>BOOTP: Waiting 60 seconds for Spanning Tree
    >>BOOTP:ARP: do-arp-req failed : 1
    >>BOOTP:ARP: do-arp-req failed : 2
    >>BOOTP:ARP: do-arp-req failed : 3
    >>BOOTP:ARP: do-arp-req failed : 4
    >>BOOTP:ARP: do-arp-req failed : 5
    >>BOOTP:ARP: ERROR!!! ARP Request Failed, ABORT !20A80002 !
    >>
    >>Install fails.
    >>
    >>--
    >>Thanks,
    >>
    >>Kevin
    >>
    >
    >
    >

