AIXV53SANBoot
Added by Steve Pittman, last edited by Steve Pittman on Apr 12, 2008

AIX V5.3 boot from SAN

This web page is intended to discuss considerations when booting AIX V5.3 from a LUN on a Storage Area Network.

Before attempting to boot from SAN, confirm that the System p server firmware level is capable of SAN boot. A description of each available firmware level can be accessed from the Microcode downloads web page. Proceed as if downloading microcode, but follow the Desc link associated with a particular firmware level.

The contents of this web page solely reflect the personal views of the authors and do not necessarily represent the views, positions, strategies or opinions of IBM or IBM management. Please use the Add Comment link at the bottom of the page to provide feedback. Note: Until you sign up and log in (using links in the upper right corner of this web page), you will not see the Add Comment link and you can not add a comment.

There are significant advantages when booting from SAN (installing AIX rootvg on LUNs):

  1. advantages conferred by a disk storage subsystem:
    1. better I/O performance due to caching and striping across multiple spindles,
    2. ability to redeploy disk space when a server is retired from service, and
    3. option to use FlashCopy to capture a rootvg backup (but mind the caveats) and
  2. option to move an AIX image from one physical server/LPAR to another.

There are, however, some disadvantages. AIX sysadmins who are aware of the disadvantages, mitigate them, and find disadvantage #3 acceptable can boot from SAN with confidence.

Please help!

If there are other disadvantages not identified here, please use the Add Comment link at the bottom of the page to describe those disadvantages and any ways in which they can be mitigated. Thanks!

  1. If a SAN hardware problem or an AIX defect causes intermittent loss of access to the SAN, there is no way to capture a dump to determine what went wrong. The AIX error log and the AIX dump logical volume are in rootvg. If rootvg is on LUNs and AIX cannot access the LUNs, it cannot write a dump to the SAN.

    Option to mitigate:

    Configure dump space on a SCSI hdisk dedicated to the LPAR (or on a vSCSI disk which is mapped to an internal SCSI disk allocated to a VIO Server LPAR). Extend rootvg onto the SCSI (or vSCSI) hdisk and configure dump space on it. Because this hdisk is not on the SAN, AIX can write a dump even if access to the SAN is lost. And the /var/adm/ras directory (which contains the AIX errlog) can also be allocated on a SCSI (or vSCSI) hdisk, although this seems less important than configuring non-SAN dump space.

  2. It is difficult to update the Multipath Subsystem Device Driver (SDD or SDDPCM) or other Fibre Channel multipath I/O support (e.g., EMC PowerPath or HDS HDLM) when rootvg is on hdisks accessed via multiple Fibre Channel paths, assuming such multipath access is supported for rootvg hdisks.

    Option to mitigate:

    No mitigation is required when using SDDPCM with AIX MPIO. A new version of SDDPCM can be installed while the current version remains in use. An AIX reboot is required to enable use of the new SDDPCM version for rootvg hdisks. (As with any system update, it is very important to preserve the option to fall back to a working version should an update render the AIX image unusable. See a note on the AIX V5.3 software maintenance best practices web page for methods of updating AIX while preserving an option to fall back.)

    When using SDD, boot from a LUN accessed via a single path (hdisk) rather than multiple paths (vpath). Mirror AIX on two such LUNs (accessed via different Fibre adapters) so that AIX can survive the failure of one of the Fibre adapters. This mitigation is not optional. As stated in the Multipath Subsystem Device Driver User's Guide:

    SDD does not support:

    • Multipathing to a system boot device
    • Placing system primary paging devices (for example, /dev/hd6) on an SDD vpath device
    • Configuring SDD vpath devices as system primary or secondary dump devices

    When SDD is configured without multipathing to system boot devices, there is no difficulty updating SDD software to a new level. It seems almost certain that the mitigation appropriate for SDD will work equally well for EMC PowerPath, HDS HDLM, and HP AutoPath. (It is likely that, like SDD, third-party multipath device drivers do not support multipathing to rootvg hdisks.)

    Confirm that the version of SDD being used supports SAN boot. The Multipath Subsystem Device Driver User's Guide says:

    Note: SDDPCM supports ESS devices as SAN boot devices, starting from AIX 5.2I and AIX 5.3A. SDDPCM supports DS8000, DS6000, and SAN Volume Controller devices as SAN boot devices, starting from AIX 5.2L and AIX 5.3D.

    The AIX release numbers specified above require translation:

    Release in User's Guide    AIX maintenance level
    AIX 5.2I                   AIX V5.2 ML05
    AIX 5.2L                   AIX V5.2 ML07
    AIX 5.3A                   AIX V5.3 ML01
    AIX 5.3D                   AIX V5.3 ML03


  3. Running AIX on LUNs can be the source of some very mysterious AIX behavior if the SAN occasionally injects delays into I/O operations, particularly paging operations. (According to a comment below by Jim Carstensen, updating SAN zoning will inject delays in some SANs.) AIX hangs lasting several minutes are a big concern in a cluster (HACMP, VCS, etc.), where there is a risk of data corruption if a node hangs for a long time and then suddenly wakes up. It seems imprudent to boot any cluster node from SAN (or to allocate its paging space on SAN) unless the cluster's shared volume groups are protected by disk reservation locks. (In this context, booting from vSCSI disks mapped to LUNs is equivalent to booting from SAN.) Please note that vSCSI disks don't currently support SCSI-3 persistent reserves, so it is currently impossible to protect (with disk reservation locks) a cluster's shared volume group if that volume group resides on vSCSI disks.

    If a SAN delay persists long enough that a write I/O request times out and fails, and AIX does not crash as a result, there should be concern regarding data and filesystem integrity. While AIX is designed to handle write I/O request failures properly, it is not possible to inject every possible write I/O error in a test environment. Because not every write I/O failure scenario can be tested, an undiscovered AIX software defect could affect data and filesystem integrity when a write I/O failure occurs. Therefore, even if write I/O failures do not cause AIX crashes, such failures must be treated as very serious SAN problems which deserve the greatest possible effort to diagnose and resolve.

  4. There is a risk of accidentally installing AIX on the wrong LUNs or booting a system from the wrong LUNs. These risks can generally be avoided with prudent SAN administration.
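
The mitigation for disadvantage #1 comes down to a few AIX commands: extend rootvg onto the non-SAN hdisk, create a dump logical volume on it, and make that logical volume the primary dump device. The sketch below only builds the command strings so they can be reviewed first; it executes nothing. The hdisk name, the logical volume name dumplv_local, and the default size in physical partitions are assumptions for illustration. Size the dump space from the estimate reported by sysdumpdev -e.

```python
import re
from typing import List


def dump_device_plan(local_hdisk: str,
                     dump_lv: str = "dumplv_local",
                     num_pps: int = 8) -> List[str]:
    """Return the AIX commands (as strings) for a non-SAN primary dump device.

    local_hdisk: a SCSI or vSCSI hdisk that is NOT on the SAN (assumed name).
    dump_lv:     hypothetical logical volume name; AIX limits LV names to 15 chars.
    num_pps:     size in physical partitions; an assumed placeholder value.
    """
    if not re.fullmatch(r"hdisk\d+", local_hdisk):
        raise ValueError(f"not an hdisk name: {local_hdisk!r}")
    if not 0 < len(dump_lv) <= 15:
        raise ValueError("logical volume name must be 1 to 15 characters")
    if num_pps < 1:
        raise ValueError("num_pps must be at least 1")
    return [
        # Add the non-SAN disk to rootvg.
        f"extendvg rootvg {local_hdisk}",
        # Create a logical volume of type sysdump on that disk only.
        f"mklv -t sysdump -y {dump_lv} rootvg {num_pps} {local_hdisk}",
        # Make it the primary dump device, persistently across reboots.
        f"sysdumpdev -P -p /dev/{dump_lv}",
        # List the resulting dump configuration for verification.
        "sysdumpdev -l",
    ]


if __name__ == "__main__":
    for command in dump_device_plan("hdisk5", num_pps=16):
        print(command)
```

Generating the plan as text keeps the review step separate from execution, which matters when the commands change rootvg.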

Please note that rootvg can be placed on a vSCSI disk mapped to a LUN without concern for disadvantage #2. That's because the VIO client (to which the rootvg belongs) does not use (nor need) SDD for multipathing. See Figure 4-29, "Configuration for multiple Virtual I/O Server and IBM ESS" in the Advanced POWER Virtualization on IBM System p5 Redbook (SG24-7940-02), which shows that VIO client rootvg hdisks are configured with "MPIO default PCM failover only" when accessing a LUN through dual VIO Servers.
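
For LPARs that do use SDDPCM directly, the release translation given under disadvantage #2 can be turned into a small lookup. The following is a minimal sketch: it assumes the level is supplied in the "5300-03" form printed by oslevel -r, and the function and table names are hypothetical. It encodes only the minimum levels quoted from the User's Guide, and returns None for AIX versions the quoted note does not mention.

```python
import re
from typing import Optional, Tuple

# Release codes as translated in the text: code -> (AIX version, maintenance level).
RELEASE_CODES = {
    "5.2I": ("5.2", 5),
    "5.2L": ("5.2", 7),
    "5.3A": ("5.3", 1),
    "5.3D": ("5.3", 3),
}

# Minimum release codes per device family, from the quoted User's Guide note.
MIN_RELEASES = {
    "ESS": ("5.2I", "5.3A"),
    "DS8000": ("5.2L", "5.3D"),
    "DS6000": ("5.2L", "5.3D"),
    "SVC": ("5.2L", "5.3D"),  # SAN Volume Controller
}


def parse_oslevel(text: str) -> Tuple[str, int]:
    """Parse a level such as '5300-03' into ('5.3', 3)."""
    match = re.fullmatch(r"(\d)(\d)00-(\d\d)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized level: {text!r}")
    return f"{match.group(1)}.{match.group(2)}", int(match.group(3))


def sddpcm_san_boot_supported(device: str, oslevel: str) -> Optional[bool]:
    """True/False per the quoted minimums; None if the AIX version is not covered."""
    version, ml = parse_oslevel(oslevel)
    for code in MIN_RELEASES[device]:
        min_version, min_ml = RELEASE_CODES[code]
        if version == min_version:
            return ml >= min_ml
    return None


if __name__ == "__main__":
    print(sddpcm_san_boot_supported("DS8000", "5300-02"))
    print(sddpcm_san_boot_supported("ESS", "5300-02"))
```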

Moving an AIX V5.3 rootvg from one physical server/LPAR to another

Note

Moving an AIX rootvg from one server/LPAR to another isn't supported but does work provided the CPU architectures of the source and target are the same and the source and target have identical PCI adapter configurations.

A supported method of copying an AIX rootvg from one server/LPAR to another is to use the AIX mksysb command.

Please note that if AIX is shut down on one server/LPAR and booted up on another, AIX will come up using Fibre Channel adapters which have different WWPNs than those on which it ran earlier. LUN masking (and probably SAN zoning) must be changed to accommodate the new WWPNs. The Ethernet MAC addresses will change, too. And unless the source and target LPARs have identical PCI adapter configurations, AIX may have difficulty mapping existing IP addresses to Ethernet adapters in the target server/LPAR.
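
The checks described above (same CPU architecture, identical PCI adapter configuration, and new WWPNs that need LUN masking and zoning changes) can be collected before a move. The sketch below assumes the inventories have already been gathered by hand or by another tool; the LparInventory type, its field names, and the slot labels and WWPNs in the example are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class LparInventory:
    cpu_arch: str                 # e.g. "POWER5"
    pci_adapters: Dict[str, str]  # slot label -> adapter type
    fc_wwpns: Set[str]            # WWPNs of the Fibre Channel adapters


def normalize_wwpn(wwpn: str) -> str:
    """Lower-case a WWPN and drop colons so different notations compare equal."""
    return wwpn.replace(":", "").lower()


def premove_report(source: LparInventory, target: LparInventory) -> Dict[str, object]:
    """Compare source and target before moving a rootvg between them."""
    src = {normalize_wwpn(w) for w in source.fc_wwpns}
    tgt = {normalize_wwpn(w) for w in target.fc_wwpns}
    to_add: List[str] = sorted(tgt - src)
    to_retire: List[str] = sorted(src - tgt)
    return {
        "same_cpu_arch": source.cpu_arch == target.cpu_arch,
        "same_pci_config": source.pci_adapters == target.pci_adapters,
        # WWPNs that LUN masking (and probably zoning) must newly allow.
        "wwpns_to_add": to_add,
        # WWPNs that no longer need access once the move is complete.
        "wwpns_to_retire": to_retire,
    }


if __name__ == "__main__":
    source = LparInventory("POWER5", {"C1": "fc", "C2": "ethernet"},
                           {"10:00:00:00:C9:AA:AA:01"})
    target = LparInventory("POWER5", {"C1": "fc", "C2": "ethernet"},
                           {"10:00:00:00:C9:BB:BB:01"})
    for key, value in premove_report(source, target).items():
        print(key, value)
```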

Please note that if an AIX rootvg is moved from an LPAR on one server to an LPAR on another server, difficulties with DLPAR might be seen in the new server, as described in the Redbooks Technote "Alternate disk install cloning, improper hostname resolution may cause HMC RSCT errors".

Please help!

Please use the Add Comment link at the bottom of the page to inform others of (1) issues not yet documented here which are encountered when moving an AIX rootvg and (2) ways not yet documented here of mitigating issues. Thanks!

Some SANs cause delays while updating zoning, and systems with SAN boot are more likely to notice in errpt a momentary interruption of service.  With multipathing these are not real problems, but making sure lines of communication are open between SAN administrators and Unix admins will reduce headaches of trying to diagnose the errpt.

A point of mitigation during upgrades (either of the OS or the multipathing software) is to use AIX alternate disk install where available. This ensures there is a workable rootvg to roll back to.

Cheers,

Jim

Posted by James Carstensen at Oct 08, 2007 11:39 | Permalink

Just wanted to point out that SAN boot can also be done with EMC disks and Powerpath.

The caveat is that the disks must be EMC.  Powerpath upgrades are easily done by taking the root disk out of Powerpath control before the upgrade and putting it back afterwards.  That does have the disadvantage of extra reboots.

Veritas Storage Foundation also has a SAN boot feature available, but it only allows for a single LVM disk, which means no root clones. This restriction should be gone in early 2008.

Posted by Ginny Cherry at Jan 15, 2008 14:11 | Permalink