Ben's
The truth about the disk hotswap feature
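Occasionally, when a backup disk is replaced via hotswap, the newly inserted disk is recognized as sdc instead of sdb,
and we are forced to reboot at night to set it right.
There is a way to correct the device name without rebooting.
I'll explain with a real case that happened today, on nhkotest2-030.
To expand the backup disk, I used hotswap to remove the smaller disk and inserted a 500 GB disk.
But, whoa, it came up as sdc.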
***********************************************************************************************
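★ [Cause] => Presumably the new disk was inserted before the removal process had finished, so the OS assigned it a new device name.
(It appears to take at least 30 seconds to 1 minute after removal before the OS finishes its disk removal process.)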
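Rather than counting seconds by hand, the wait can be scripted. The following is a minimal sketch, assuming that the disk's entry under /sys/block disappears once the kernel has finished the removal. The function name `wait_for_removal`, its timeout argument, and the optional sysfs root argument are illustrative and not part of the original procedure.

```bash
#!/usr/bin/env bash
# Sketch: block until the kernel has dropped a disk's block device entry.
# Usage: wait_for_removal DISK [TIMEOUT_SECONDS] [SYSFS_ROOT]
# Assumption: /sys/block/DISK vanishes when the removal has completed.
wait_for_removal() {
    local disk=$1 timeout=${2:-60} root=${3:-/sys}
    local waited=0
    while [ -e "$root/block/$disk" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "$disk is still registered after ${timeout}s" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    echo "$disk is gone; safe to insert the replacement"
}

# Example: wait_for_removal sdb 60
```

If the function times out, the old device name is still held by the OS, and inserting the new disk at that point would likely reproduce the sdc situation described here.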
## Normal (disk removal log)
Apr 25 14:57:38 nhkotest1-002 kernel: ata2: exception Emask 0x50 SAct 0x0 SErr 0x4090800 action 0xe frozen
Apr 25 14:57:38 nhkotest1-002 kernel: ata2: irq_stat 0x00400040, connection status changed
Apr 25 14:57:38 nhkotest1-002 kernel: ata2: SError: { HostInt PHYRdyChg 10B8B DevExch }
Apr 25 14:57:38 nhkotest1-002 kernel: ata2: hard resetting link
Apr 25 14:57:39 nhkotest1-002 kernel: ata2: SATA link down (SStatus 0 SControl 300)
Apr 25 14:57:39 nhkotest1-002 kernel: ata2: failed to recover some devices, retrying in 5 secs
Apr 25 14:57:44 nhkotest1-002 kernel: ata2: hard resetting link
Apr 25 14:57:44 nhkotest1-002 kernel: ata2: SATA link down (SStatus 0 SControl 300)
Apr 25 14:57:44 nhkotest1-002 kernel: ata2: failed to recover some devices, retrying in 5 secs
Apr 25 14:57:49 nhkotest1-002 kernel: ata2: hard resetting link
Apr 25 14:57:50 nhkotest1-002 kernel: ata2: SATA link down (SStatus 0 SControl 300)
Apr 25 14:57:50 nhkotest1-002 kernel: ata2.00: disabled
Apr 25 14:57:50 nhkotest1-002 kernel: ata2: EH complete
Apr 25 14:57:50 nhkotest1-002 kernel: ata2.00: detaching (SCSI 1:0:0:0)
Apr 25 14:57:50 nhkotest1-002 kernel: sd 1:0:0:0: [sdb] Synchronizing SCSI cache
Apr 25 14:57:50 nhkotest1-002 kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Apr 25 14:57:50 nhkotest1-002 kernel: sd 1:0:0:0: [sdb] Stopping disk
Apr 25 14:57:50 nhkotest1-002 kernel: sd 1:0:0:0: [sdb] START_STOP FAILED
Apr 25 14:57:50 nhkotest1-002 kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
## Abnormal (disk removal log)
Apr 25 15:05:28 nhkotest2-030 kernel: ata2: exception Emask 0x50 SAct 0x0 SErr 0x4090800 action 0xe frozen
Apr 25 15:05:28 nhkotest2-030 kernel: ata2: irq_stat 0x00400040, connection status changed
Apr 25 15:05:28 nhkotest2-030 kernel: ata2: SError: { HostInt PHYRdyChg 10B8B DevExch }
Apr 25 15:05:28 nhkotest2-030 kernel: ata2: hard resetting link
Apr 25 15:05:29 nhkotest2-030 kernel: ata2: SATA link down (SStatus 0 SControl 300)
Apr 25 15:05:29 nhkotest2-030 kernel: ata2: failed to recover some devices, retrying in 5 secs
Apr 25 15:05:34 nhkotest2-030 kernel: ata2: hard resetting link
Apr 25 15:05:40 nhkotest2-030 kernel: ata2: port is slow to respond, please be patient (Status 0x80)
Apr 25 15:05:44 nhkotest2-030 kernel: ata2: COMRESET failed (errno=-16)
Apr 25 15:05:44 nhkotest2-030 kernel: ata2: hard resetting link
Apr 25 15:05:45 nhkotest2-030 kernel: ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Apr 25 15:05:45 nhkotest2-030 kernel: ata2.00: model number mismatch 'ST3250318AS' != 'WDC WD5003ABYX-18WERA0'
Apr 25 15:05:45 nhkotest2-030 kernel: ata2.00: revalidation failed (errno=-19)
Apr 25 15:05:45 nhkotest2-030 kernel: ata2: failed to recover some devices, retrying in 5 secs
Apr 25 15:05:50 nhkotest2-030 kernel: ata2: hard resetting link
Apr 25 15:05:55 nhkotest2-030 kernel: ata2: port is slow to respond, please be patient (Status 0x80)
Apr 25 15:06:00 nhkotest2-030 kernel: ata2: COMRESET failed (errno=-16)
Apr 25 15:06:00 nhkotest2-030 kernel: ata2: hard resetting link
Apr 25 15:06:00 nhkotest2-030 kernel: ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Apr 25 15:06:00 nhkotest2-030 kernel: ata2.00: model number mismatch 'ST3250318AS' != 'WDC WD5003ABYX-18WERA0'
Apr 25 15:06:00 nhkotest2-030 kernel: ata2.00: revalidation failed (errno=-19)
Apr 25 15:06:00 nhkotest2-030 kernel: ata2.00: disabled
Apr 25 15:06:01 nhkotest2-030 kernel: ata2: soft resetting link
Apr 25 15:06:01 nhkotest2-030 kernel: ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Apr 25 15:06:01 nhkotest2-030 kernel: ata2.00: ATA-8: WDC WD5003ABYX-18WERA0, 01.01S02, max UDMA/133
Apr 25 15:06:01 nhkotest2-030 kernel: ata2.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 31/32)
Apr 25 15:06:01 nhkotest2-030 kernel: ata2.00: configured for UDMA/133
Apr 25 15:06:01 nhkotest2-030 kernel: ata2: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0x9 t4
Apr 25 15:06:01 nhkotest2-030 kernel: ata2: irq_stat 0x00400040, connection status changed
Apr 25 15:06:01 nhkotest2-030 kernel: ata2.00: configured for UDMA/133
Apr 25 15:06:01 nhkotest2-030 kernel: ata2: EH complete
Apr 25 15:06:01 nhkotest2-030 kernel: ata2.00: detaching (SCSI 1:0:0:0)
Apr 25 15:06:01 nhkotest2-030 kernel: sd 1:0:0:0: [sdb] Synchronizing SCSI cache
Apr 25 15:06:01 nhkotest2-030 kernel: sd 1:0:0:0: [sdb] Stopping disk ==> removal process still in progress...
Apr 25 15:06:02 nhkotest2-030 kernel: scsi 1:0:0:0: Direct-Access ATA WDC WD5003ABYX-1 01.0 PQ: 0 ANSI: 5 ==> new disk inserted...
Apr 25 15:06:02 nhkotest2-030 kernel: sd 1:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB)
Apr 25 15:06:02 nhkotest2-030 kernel: sd 1:0:0:0: [sdc] Write Protect is off
Apr 25 15:06:02 nhkotest2-030 kernel: sd 1:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Apr 25 15:06:02 nhkotest2-030 kernel: sd 1:0:0:0: [sdc] 976773168 512-byte hardware sectors (500108 MB)
Apr 25 15:06:02 nhkotest2-030 kernel: sd 1:0:0:0: [sdc] Write Protect is off
Apr 25 15:06:02 nhkotest2-030 kernel: sd 1:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Apr 25 15:06:02 nhkotest2-030 kernel: sdc: unknown partition table
Apr 25 15:06:02 nhkotest2-030 kernel: sd 1:0:0:0: [sdc] Attached SCSI disk
Apr 25 15:06:02 nhkotest2-030 kernel: sd 1:0:0:0: Attached scsi generic sg1 type 0
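The telltale sign in the abnormal log is that one SCSI address (1:0:0:0) shows up under two device names, first sdb and then sdc. As a rough sketch, a filter like the one below could flag that pattern in a saved kernel log. The function name `find_renamed_disks` is made up for illustration, and the pattern only assumes the `sd H:C:T:L: [sdX]` line format seen in the logs above.

```bash
#!/usr/bin/env bash
# Sketch: read kernel log lines on stdin and report every SCSI address
# that appears with more than one sdX device name.
find_renamed_disks() {
    # Reduce each matching line to "ADDRESS NAME", e.g. "1:0:0:0 sdb".
    sed -n 's/.*sd \([0-9:]*\): \[\(sd[a-z]*\)\].*/\1 \2/p' |
        sort -u |
        awk '
            {
                if ($1 in names) names[$1] = names[$1] " " $2
                else names[$1] = $2
                count[$1]++
            }
            END {
                for (addr in count)
                    if (count[addr] > 1) print addr ": " names[addr]
            }'
}

# Example: find_renamed_disks < /var/log/messages
```

Fed the abnormal log above, this would report address 1:0:0:0 with both sdb and sdc; fed the normal log, it prints nothing.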
***********************************************************************************************
★ [Solution]
1. Remove the disk that is to be replaced.
2. Identify the host that needs to be rescanned.
[root@nhkotest2-030 0:0:0:0]# ll
total 0
lrwxrwxrwx 1 root root 0 Apr 25 22:49 device -> ../../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/
lrwxrwxrwx 1 root root 0 Apr 25 22:49 subsystem -> ../../../class/scsi_device/
--w------- 1 root root 4096 Apr 25 22:49 uevent
[root@nhkotest2-030 0:0:0:0]# pwd
/sys/class/scsi_device/0:0:0:0
===> This shows that the current master disk is devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0. So host0 must never be touched, right? ^^
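The same lookup can be done per disk name instead of browsing the directory by hand. This is only a sketch: it assumes that /sys/block/DISK/device resolves to a directory named after the SCSI address (H:C:T:L), as in the listing above, and that `readlink -f` from GNU coreutils is available. The function name `host_of_disk` is illustrative.

```bash
#!/usr/bin/env bash
# Sketch: print the SCSI host (e.g. host0) that a disk such as sda sits on.
# Usage: host_of_disk DISK [SYSFS_ROOT]
host_of_disk() {
    local disk=$1 root=${2:-/sys}
    local target addr
    # Resolve the "device" symlink to the real SCSI device directory.
    target=$(readlink -f "$root/block/$disk/device") || return 1
    addr=${target##*/}              # last path component, e.g. 0:0:0:0
    case $addr in
        [0-9]*:*:*:*) echo "host${addr%%:*}" ;;   # first field is the host number
        *) echo "cannot parse SCSI address for $disk" >&2; return 1 ;;
    esac
}

# Example: host_of_disk sda
```

On the server in this example, `host_of_disk sda` would be expected to print host0, which is the host to leave alone.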
3. Run the command below. (The host number can differ depending on the system; here the master disk is on host0 (sda), so the target is host1 (sdb).)
=> The command rescans host1 so that the OS is explicitly made aware that the device is no longer there.
echo "- - -" > /sys/class/scsi_host/host1/scan
Looking at dmesg at this point, it appears that whatever was left over has been cleaned up.
Apr 25 22:35:16 nhkotest2-030 kernel: ata2: soft resetting link
Apr 25 22:35:16 nhkotest2-030 kernel: ata2: SATA link down (SStatus 0 SControl 300)
Apr 25 22:35:16 nhkotest2-030 kernel: ata2: EH complete
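Since the host number differs from system to system, the rescan could be wrapped in a small guard that refuses to touch the host carrying the system disk. The following is a sketch only: the function name `rescan_host` and its arguments are hypothetical, and the refusal simply mirrors the caution above about host0 rather than any kernel requirement.

```bash
#!/usr/bin/env bash
# Sketch: rescan one SCSI host, but never the one named as protected.
# Usage: rescan_host HOST PROTECTED_HOST [SYSFS_ROOT]
rescan_host() {
    local host=$1 protected=$2 root=${3:-/sys}
    local scan="$root/class/scsi_host/$host/scan"
    if [ "$host" = "$protected" ]; then
        echo "refusing to rescan $host: it carries the system disk" >&2
        return 1
    fi
    if [ ! -e "$scan" ]; then
        echo "no such scan file: $scan" >&2
        return 1
    fi
    # "- - -" is a wildcard for channel, target and LUN.
    echo "- - -" > "$scan"
}

# Example (as root): rescan_host host1 host0
```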
4. Insert the new disk, and it is now correctly recognized as sdb.
Apr 25 22:35:55 nhkotest2-030 kernel: ata2: exception Emask 0x10 SAct 0x0 SErr 0x4050002 action 0xa frozen
Apr 25 22:35:55 nhkotest2-030 kernel: ata2: irq_stat 0x00400040, connection status changed
Apr 25 22:35:55 nhkotest2-030 kernel: ata2: SError: { RecovComm PHYRdyChg CommWake DevExch }
Apr 25 22:35:56 nhkotest2-030 kernel: ata2: soft resetting link
Apr 25 22:36:01 nhkotest2-030 kernel: ata2: port is slow to respond, please be patient (Status 0x80)
Apr 25 22:36:03 nhkotest2-030 kernel: ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Apr 25 22:36:03 nhkotest2-030 kernel: ata2.00: ATA-8: WDC WD5003ABYX-18WERA0, 01.01S02, max UDMA/133
Apr 25 22:36:03 nhkotest2-030 kernel: ata2.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 31/32)
Apr 25 22:36:03 nhkotest2-030 kernel: ata2.00: configured for UDMA/133
Apr 25 22:36:03 nhkotest2-030 kernel: ata2: EH complete
Apr 25 22:36:03 nhkotest2-030 kernel: scsi 1:0:0:0: Direct-Access ATA WDC WD5003ABYX-1 01.0 PQ: 0 ANSI: 5
Apr 25 22:36:03 nhkotest2-030 kernel: sd 1:0:0:0: [sdb] 976773168 512-byte hardware sectors (500108 MB)
Apr 25 22:36:03 nhkotest2-030 kernel: sd 1:0:0:0: [sdb] Write Protect is off
Apr 25 22:36:03 nhkotest2-030 kernel: sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Apr 25 22:36:03 nhkotest2-030 kernel: sd 1:0:0:0: [sdb] 976773168 512-byte hardware sectors (500108 MB)
Apr 25 22:36:03 nhkotest2-030 kernel: sd 1:0:0:0: [sdb] Write Protect is off
Apr 25 22:36:03 nhkotest2-030 kernel: sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Apr 25 22:36:03 nhkotest2-030 kernel: sdb: unknown partition table
Apr 25 22:36:03 nhkotest2-030 kernel: sd 1:0:0:0: [sdb] Attached SCSI disk
Apr 25 22:36:03 nhkotest2-030 kernel: sd 1:0:0:0: Attached scsi generic sg1 type 0
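To double-check that sdb really is the new 500G disk, the sector count from the log can be converted to megabytes and compared with the figure the kernel printed. A small sketch, assuming 512-byte sectors as reported in the log and decimal megabytes; the function names `sectors_to_mb` and `disk_size_mb` are illustrative.

```bash
#!/usr/bin/env bash
# Sketch: convert a count of 512-byte sectors to decimal megabytes (rounded).
sectors_to_mb() {
    local sectors=$1
    echo $(( (sectors * 512 + 500000) / 1000000 ))
}

# Read the sector count of a disk from sysfs and convert it.
# Usage: disk_size_mb DISK [SYSFS_ROOT]
disk_size_mb() {
    local disk=$1 root=${2:-/sys}
    sectors_to_mb "$(cat "$root/block/$disk/size")"
}

# Example: sectors_to_mb 976773168
```

For the 976773168 sectors shown above, this gives 500108, matching the "(500108 MB)" in the kernel message.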
***********************************************************************************************