Diagnosing Multipath IO Timeout

Multipath, SAN, SCSI … etc. might be the most widely used technology and terms in our industry. Nevertheless, missing details would frequently lead to difficulties in explaining curious problems.

For instance, our storage admin migrated old SAN switch to newer ones recently. The procedure would roughly include unplugging and plugging of Fibre Channel lines, which would introduce planned partial path failure. Initially, we believed that path redundancy would make this operation transparent to applications. But in reality, one RAC node was evicted because LMON was hanging on I/O.

— db instance alert file on pntp1dbs02

Sat Mar 24 12:59:25 2018
IPC Send timeout detected. Receiver ospid 5209 [
Sat Mar 24 13:00:25 2018
IPC Send timeout detected. Receiver ospid 5209 [
Sat Mar 24 13:00:38 2018
Received an instance abort message from instance 3
Sat Mar 24 13:00:38 2018
Received an instance abort message from instance 3
Please check instance 3 alert and LMON trace files for detail.
Please check instance 3 alert and LMON trace files for detail.
Sat Mar 24 13:00:38 2018
System state dump requested by (instance=2, osid=5221 (LMS0)), summary=[abnormal instance termination].
System State dumped to trace file /oraapp/oracle/diag/rdbms/ntp1db/NTP1DB2/trace/NTP1DB2_diag_5181_20180324130038.trc
LMS0 (ospid: 5221): terminating the instance due to error 481
Dumping diagnostic data in directory=[cdmp_20180324130038], requested by (instance=2, osid=5221 (LMS0)), summary=[abnormal instance termination].
Sat Mar 24 13:00:43 2018
ORA-1092 : opitsk aborting process
Sat Mar 24 13:00:43 2018
License high water mark = 12
Termination issued to instance processes. Waiting for the processes to exit
Sat Mar 24 13:00:49 2018
Instance termination failed to kill one or more processes
Instance terminated by LMS0, pid = 5221
USER (ospid: 32973): terminating the instance
Sat Mar 24 13:00:54 2018
Termination issued to instance processes. Waiting for the processes to exit
Instance termination failed to kill one or more processes
Instance terminated by USER, pid = 32973

— System State Dump:

SO: 0xff8ca2710, type: 4, owner: 0x1018aa0340, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
proc=0x1018aa0340, name=session, file=ksu.h LINE:12729, pg=0
(session) sid: 628 ser: 1 trans: (nil), creator: 0x1018aa0340
flags: (0x51) USR/- flags_idl: (0x1) BSY/-/-/-/-/-
flags2: (0x409) -/-/INC
DID: , short-term DID:
txn branch: (nil)
edition#: 0 oct: 0, prv: 0, sql: (nil), psql: (nil), user: 0/SYS
ksuxds FALSE at location: 0
service name: SYS$BACKGROUND
Current Wait Stack:
1: waiting for 'Disk file Mirror Read'
fileno=0x0, blkno=0x23, filetype=0x1
wait_id=430554017 seq_num=57276 snap_id=1
wait times: snap=6 min 35 sec, exc=6 min 35 sec, total=6 min 35 sec <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
wait times: max=infinite, heur=6 min 35 sec
wait counts: calls=0 os=0
in_wait=1 iflags=0x520
0: waiting for 'control file sequential read'
file#=0x0, block#=0x23, blocks=0x1
wait_id=430554016 seq_num=57275 snap_id=1
wait times: snap=0.000000 sec, exc=0.000012 sec, total=6 min 35 sec
wait times: max=infinite, heur=6 min 35 sec
wait counts: calls=0 os=0
in_wait=1 iflags=0x15a0

Our environment:

Oracle RAC 11.2.0.4.161018
Redhat Enterprise Linux 7.3 X86_64
Multipath : Device Mapper - Multipath I/O (DM-MP)
Server : Inspur NF5280M4 E5-2620 v4 *2 / 64G mem
SAN Switch : Brocade DCX
Storage : IBM V8800

In order to understand the following questions, I did my homework and found the following documents helpful.

1) Why did LMON hang on I/O for more than 6 minutes?

According to this post:
https://www.flagword.net/2014/06/default-linux-io-multipathd-configuration-and-oracle-rac-caveat/

The timeout exists in several layers:
At SCSI layer defined in /sys/class/scsi_device/h:c:t:l/device/timeout.
FC HBA’s driver layer (in our case it was qla2xxx). Use modinfo to list the current settings.
At dm-multipath or block layer.

So the basic idea is that it’s very hard to predict the exact time it would take to failover and it’s worth trying to fiddle with different timeout settings, i.e. already mentioned and fast_io_fail_tmo, dev_loss_tmo from multipath.conf, as well as to look at the problem from the application’s side and update _asm_hbeatiowait accordingly.

2) How long will multipath path timeout one I/O command in particular?
Essentially this is the same question as the above and it depends.

3) Is redundant multipath truly transparent to applications?
Although it’s probable that temporary failures of paths are tolerable, you DO need to pay attention to your actual workload at the failure time.

4) Default Disktimeout = 200 sec, MissCount = 30 sec in Oracle Clusterware.

If you run into similar questions, you may need to investigate the following information.

cat /etc/multipath.conf

# check fast_io_fail_tmo for each rport

for f in /sys/class/fc_remote_ports/rport-*/fast_io_fail_tmo;
do
d=$(dirname $f);
echo $(basename $d):$(cat $d/node_name):$(cat $f);
done

# check dev_loss_tmo for each rport

for f in /sys/class/fc_remote_ports/rport-*/dev_loss_tmo;
do
d=$(dirname $f);
echo $(basename $d):$(cat $d/node_name):$(cat $f);
done

# check device timeout

multipath -ll | egrep -o 'sd[a-z]{1,2}' | \
awk '{printf("/sys/block/%s/device/timeout\n",$1,$1);}' | xargs -i cat {} | more

# check multipath status

multipath -ll
multipathd show config
multipathd show maps status
multipathd show maps stats

# check Read Link Error Status Block for each rport

for f in /sys/class/fc_remote_ports/rport-*;
do
d=$(basename $f);
echo $d
fcrls --port $d
done

# check host WWPN

awk '{print FILENAME,$0}' /sys/class/scsi_host/host[0-9]/link_state
awk '{print FILENAME,$0}' /sys/class/fc_host/host[0-9]/port_name
awk '{print FILENAME,$0}' /sys/class/fc_host/host[0-9]/port_state

# check zone info for the port on SAN switch

zoneshow | grep -A20 -B20 8b:5e:7e$

# check port info for the WWPN on SAN switch

switchshow | grep -i 8b:5e:7e # host7
switchshow | grep -i 8b:5e:78 # host1

# check port stats on SAN switch

portstatsshow 11/27
portstatsshow 3/27

# log session for the following sessions:
OS log
db alert.log
grid alert.log
grid cssd.log
iostat -xmdz 1
multipath -ll

# disable the specific port on SAN switch

date
portdisable 11/27

# path failed almost immediately
# No information in OS/DB alert/grid alert

# enable the specific port on SAN switch

date
portenable 11/27

# disable the specific port on SAN switch

date
portdisable 3/27

# enable the specific port on SAN switch

date
portenable 3/27

 

Reference (You may find the following documents useful as well):

Multipath is not detecting Path Failures fast enough which can result in system reboot (Doc ID 2281635.1)
Multipath: Queue I/O for N sec Time Before Sending SCSI Error to Device (Doc ID 1632772.1)Low polling_interval ( Multipath ) cause high CPU activity / system hang with high number of LUNs while performing Storage failover test (Doc ID 1684696.1)
ASM diskgroup dismount with “Waited 15 secs for write IO to PST” (Doc ID 1581684.1)
Instance Evicted After LMON to LMON IPC Send timeout Due to Storage Issue (Doc ID 2080029.1)

https://access.redhat.com/solutions/137073
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/storage_administration_guide/scsi-command-timer-device-status
CONTROLLING THE SCSI COMMAND TIMER AND DEVICE STATUS

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this:
search previous next tag category expand menu location phone mail time cart zoom edit close