AIX MPIO Recommendations
This document outlines Multipath I/O (MPIO) best practices for IBM Power for Google (IP4G) deployments, focusing on actions customers can take to ensure optimal performance and availability.
Configuration
Converge handles the underlying MPIO configuration, including redundant paths, adapter diversity, and fabric management.
You can learn more about the general MPIO configuration from the official IBM documentation:
Customers should understand their application's I/O behavior and select the MPIO policies that best suit its requirements.
Key Considerations:
- Redundant Paths: Converge provides four physical paths to the backend storage, distributed across two VIOS, for enhanced redundancy.
- Dual Fabric Fibre Channel: Converge uses dual fabric Fibre Channel for all paths to minimize single points of failure.
- Pathing Policy: Customers should understand the MPIO pathing policy in use and adjust it if needed. Common options include:
- Round Robin: Distributes I/O requests evenly across available paths.
- Shortest Queue: Directs each I/O request to the path with the fewest outstanding requests.
- Failover Only: Designates a primary path and uses alternative paths only when the primary fails.
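As a rough sketch of how these policies map onto AIX: with the default AIX path control module, the policy is set through the hdisk algorithm attribute, whose values are round_robin, shortest_queue, and fail_over. The helper below translates the names used above into those values. The function name algorithm_for_policy and the device name hdisk0 are illustrative assumptions. Other multipath drivers may use different attribute names or values, so check what your device accepts before changing anything.

```shell
# Map a policy name from this document to the AIX PCM "algorithm" value.
# algorithm_for_policy is an illustrative helper, not an AIX command.
algorithm_for_policy() {
    case "$1" in
        "Round Robin")    echo "round_robin" ;;
        "Shortest Queue") echo "shortest_queue" ;;
        "Failover Only")  echo "fail_over" ;;
        *) echo "unknown policy: $1" >&2; return 1 ;;
    esac
}

# Example usage on AIX (hdisk0 is a placeholder device name):
#   lsattr -El hdisk0 -a algorithm -a reserve_policy   # show current settings
#   lsattr -Rl hdisk0 -a algorithm                     # list values the device accepts
#   chdev -l hdisk0 -a algorithm="$(algorithm_for_policy 'Shortest Queue')"
```

If the disk is in use, chdev may refuse the change. Depending on the AIX level, the change may need to be staged with chdev -P and applied at the next reboot. The round_robin and shortest_queue algorithms typically also require reserve_policy to be no_reserve.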
Monitoring Available Paths
Regularly monitor the status of MPIO paths to proactively identify potential issues. Customers should consider integrating MPIO path status into their monitoring and alerting. Sample commands for monitoring:
lspath:
lspath
This command displays path status for all devices.
lspath -l <device_name>
This command displays all paths associated with a specific device, including their status (for example Enabled, Disabled, Failed, Missing, or Defined).
lsmpio:
lsmpio
This command shows detailed information for all devices and paths, including both the overall status and the more specific path status of each path.
lsmpio -l <device_name>
This command provides detailed information for a specific device.
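One way to feed these commands into monitoring is a small check that reads lspath output and reports any path that is not healthy. The sketch below assumes the default lspath output format of one path per line (status, device, parent adapter) and that healthy paths are reported as Enabled. The function name check_paths is an illustrative assumption.

```shell
# Read `lspath` output (status, device, parent adapter per line) on stdin.
# Print every path whose status is not Enabled, and return non-zero if any exist.
# check_paths is an illustrative helper name, not an AIX command.
check_paths() {
    awk '
        NF >= 3 && $1 != "Enabled" {
            printf "WARNING: %s path via %s is %s\n", $2, $3, $1
            bad++
        }
        END { exit (bad > 0 ? 1 : 0) }
    '
}

# Example usage on AIX, for instance from cron:
#   lspath | check_paths || echo "MPIO path problem on $(hostname)"
```

Because the function only parses text, it can be tested against saved lspath output without touching a live system. Note that it also flags paths that were disabled deliberately.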
Scheduled Maintenance
Converge manages all hardware and VIOS maintenance, including firmware upgrades and network changes. Converge sends notifications in advance of any planned maintenance.
Before Maintenance:
Check Path Status: Use lspath or lsmpio to get a baseline of current path status. This will help identify any discrepancies after maintenance.
Resolve any Down Paths: If any paths are found to be down, fix them before maintenance to avoid an outage. A standard method for doing so is to:
- Find the failed paths using lspath, and note the hdisk and fscsi device for each
- Remove the failed paths using
rmpath -l hdiskX -p fscsiY
- Rediscover all paths using cfgmgr
- Use lspath to verify the path state
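The recovery steps above can be sketched as a helper that reads lspath output and prints the commands to run for each failed path, without executing anything. It assumes the default lspath output format (status, device, parent adapter) and that failed paths are reported as Failed. The function name plan_path_recovery is an illustrative assumption.

```shell
# Read `lspath` output on stdin and print the recovery commands for each
# Failed path, following the steps above. Nothing is executed here.
# plan_path_recovery is an illustrative helper name, not an AIX command.
plan_path_recovery() {
    awk '
        NF >= 3 && $1 == "Failed" {
            key = $2 " " $3
            if (!(key in planned)) {           # one rmpath per device/adapter pair
                planned[key] = 1
                printf "rmpath -l %s -p %s\n", $2, $3
            }
            if (!($2 in disk_seen)) {          # remember each affected disk once
                disk_seen[$2] = 1
                disks[++n] = $2
            }
        }
        END {
            if (n > 0) print "cfgmgr"
            for (i = 1; i <= n; i++) printf "lspath -l %s\n", disks[i]
        }
    '
}

# Example usage on AIX: review the printed commands before running them.
#   lspath | plan_path_recovery
```

Printing the commands rather than running them keeps a person in the loop, which seems prudent for anything that removes paths.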
After Maintenance:
Verify Path Status: Use lspath or lsmpio again to confirm that all paths have recovered and are in the “Enabled” state.
Recover Paths: Sometimes AIX does not automatically recover paths. In these cases, customers should attempt to recover the paths themselves. A standard method for doing so is to:
- Find the failed paths using lspath, and note the hdisk and fscsi device for each
- Remove the failed paths using
rmpath -l hdiskX -p fscsiY
- Rediscover all paths using cfgmgr
- Use lspath to verify the path state
Report Issues: If there are any issues with pathing or storage connectivity after maintenance, promptly report them to Converge for resolution.
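The baseline taken before maintenance and the verification afterwards can be combined into a simple comparison. The sketch below assumes both snapshots were saved as sorted lspath output. The function name compare_path_snapshots and the file names under /tmp are illustrative assumptions.

```shell
# Compare a pre-maintenance path snapshot with a post-maintenance one.
# Both files are expected to hold sorted `lspath` output.
# compare_path_snapshots is an illustrative helper name, not an AIX command.
compare_path_snapshots() {
    if diff "$1" "$2" >/dev/null; then
        echo "All paths match the baseline"
    else
        echo "Path differences (< baseline, > current):"
        diff "$1" "$2" | grep '^[<>]'
        return 1
    fi
}

# Example usage on AIX (file names are placeholders):
#   lspath | sort > /tmp/lspath.before     # before maintenance
#   lspath | sort > /tmp/lspath.after      # after maintenance
#   compare_path_snapshots /tmp/lspath.before /tmp/lspath.after
```

Any differences that remain after attempting recovery are useful detail to include when reporting an issue.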
By following these guidelines and proactively monitoring MPIO paths, customers can ensure the high availability and performance of their applications running on IBM Power for Google.