<div dir="ltr"><div><div><div><div><div><div><div><div><div><div><div>Recent testing with smartmontools revealed that one of my recording disks has some bad blocks. I want to find the affected file(s) and move them to another disk so that I can then get the bad blocks replaced by spares. My problem is that the HOWTOs I found deal with EXT2/3 partitions, and I haven't been able to work out how to duplicate their process for tracing block numbers to file(s) on my filesystem. Here's what I have been able to do so far:<br>
<br></div>1) Identify bad blocks:<br># smartctl -l selftest /dev/sde<br>
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen<br>
Home page is <a href="http://smartmontools.sourceforge.net/" target="_blank">http://smartmontools.sourceforge.net/</a><br>
<br>
=== START OF READ SMART DATA SECTION ===<br>
SMART Self-test log structure revision number 1<br>
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error<br>
# 1 Short offline Completed: read failure 90% 10571 1084296743<br></div><br>This shows at least one bad block, 1084296743, present on the disk.<br><br></div>2) Get data on the disk's partitions:<br><br># fdisk -lu /dev/sde<br>
<br>
Disk /dev/sde: 1000.2 GB, 1000204886016 bytes<br>
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors<br>
Units = sectors of 1 * 512 = 512 bytes<br>
Sector size (logical/physical): 512 bytes / 512 bytes<br>
I/O size (minimum/optimal): 512 bytes / 512 bytes<br>
Disk identifier: 0x000db3e7<br>
<br>
Device Boot Start End Blocks Id System<br>
/dev/sde1 63 1953520064 976760001 83 Linux<br><br></div>This shows that the first block in the partition is block 63 on the disk (counting from 0), which is the value used in step 6.<br><br></div>3) Identify partition mountpoint:<br># df -h|grep sde<br>
/dev/sde1 932G 838G 95G 90% /video3<br><br></div>This shows that /dev/sde1 is mounted as /video3.<br><br></div>4) Identify the filesystem type of the partition:<br># grep video3 /etc/fstab<br>
# /video3 was on /dev/sdd1 during installation<br>
<div id=":17r">UUID=d8fb0131-65d1-482e-9afb-c0189353b84b /video3 jfs defaults 0 2<br><br></div><div id=":17r">This shows that the partition is a JFS partition, not an EXT2/3 partition.<br><br></div><div id=":17r">5) Test the blocks near the one of interest for read errors:<br>
# i=1084296730<br>
# while [ $i -lt 1084296750 ]; do<br>
> echo $i<br>
> dd if=/dev/sde of=/dev/null bs=512 count=1 skip=$i<br>
> let i+=1<br>
> done<br>
...snip...<br>
1084296736<br>
dd: reading `/dev/sde': Input/output error<br>
0+0 records in<br>
0+0 records out<br>
0 bytes (0 B) copied, 28.6135 s, 0.0 kB/s<br>
...snip<br>
<br></div><div id=":17r">This shows that blocks 1084296736 through 1084296831 have problems.<br><br></div><div id=":17r">6) Calculate the first affected logical JFS block using this formula:<br></div><div id=":17r">(First Bad Physical Block - First Physical Block of the Partition) * Bytes per Physical Block / Bytes per Logical JFS Block<br>
<br># expr 1084296736 - 63<br>
1084296673<br>
# expr 1084296673 \* 512 / 4096<br>
135537084<br>
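<br>For reference, the same arithmetic as a small shell helper. This is only a sketch: the function name lba_to_fs_block is my own invention, and the example call assumes 512-byte physical blocks (from the fdisk output) and 4096-byte JFS blocks:<br>

```shell
# Convert an absolute disk block number (LBA) to a filesystem block number.
# Hypothetical helper, not part of any tool.
# Usage: lba_to_fs_block LBA PART_START BYTES_PER_DISK_BLOCK BYTES_PER_FS_BLOCK
lba_to_fs_block() {
    # Offset into the partition in bytes, then integer division by the
    # filesystem block size (any remainder is discarded).
    echo $(( ($1 - $2) * $3 / $4 ))
}

# First failing block from step 5, partition start from step 2.
lba_to_fs_block 1084296736 63 512 4096   # prints 135537084
```

<br>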
<br></div><div id=":17r">This shows that the first logical block affected is block 135537084.<br><br></div>7) Translate the logical block number to an inode number, and then translate that inode number into a filename.<br>
<br></div>This is where I get stumped. The HOWTO I found shows how to do this for an EXT2/3 partition using debugfs, so I thought I could do the same thing using jfs_debugfs, but I can't figure out how to get a valid inode value to use with jfs_debugfs to get the filename(s).<br>
<br></div>If I use the 'i' (lowercase i "eye") or 'I' (uppercase I "EYE") options to the display command in jfs_debugfs, it looks like I should translate logical block 135537084 to inode -1635135392 (the di_number output with the 'i' option, or iagnum with the 'I' option) based on the following results:<br>
<br># jfs_debugfs /dev/sde1<br>
jfs_debugfs version 1.1.12, 24-Aug-2007<br>
<br>
Aggregate Block Size: 4096<br>
<br>
> d 135537084 0 i<br>
Block: 135537084 Real Address 0x81421bc000<br>
[1] di_inostamp: 0x1c742f61 [19] di_mtime.tv_nsec: 0xc0749d49<br>
[2] di_fileset: 1243045541 [20] di_otime.tv_sec: 0xdc1f418d<br>
[3] di_number: -1635135392 [21] di_otime.tv_nsec: 0x1f603111<br>
[4] di_gen: -766206507 [22] di_acl.flag: 0x66<br>
[5] di_ixpxd.len: 3235549 [23] di_acl.rsrvd: Not Displayed<br>
[6] di_ixpxd.addr1: 0x54 [24] di_acl.size: 0x40d30d80<br>
[7] di_ixpxd.addr2: 0x06010000 [25] di_acl.len: 5581489<br>
di_ixpxd.address: 360877981696 [26] di_acl.addr1: 0x62<br>
[8] di_size: 0xed13ca900322260a [27] di_acl.addr2: 0xd347078f<br>
[9] di_nblocks: 0xd746af90069cd9ed di_acl.address: 424451442575<br>
[10] di_nlink: -1245156732 [28] di_ea.flag: 0x0b<br>
[11] di_uid: 1070598493 [29] di_ea.rsrvd: Not Displayed<br>
[12] di_gid: 552302989 [30] di_ea.size: 0x941bcb25<br>
[13] di_mode: 0x3882aed2 [31] di_ea.len: 10209947<br>
0127322 l-wu [32] di_ea.addr1: 0xb3<br>
[14] di_atime.tv_sec: 0x69a9690b [33] di_ea.addr2: 0x7456872e<br>
[15] di_atime.tv_nsec: 0x2d9b4085 di_ea.address: 770750973742<br>
[16] di_ctime.tv_sec: 0xdd1dd831 [34] di_next_index: -2110134214<br>
[17] di_ctime.tv_nsec: 0x1562ad04 [35] di_acltype: 0xd7477518<br>
[18] di_mtime.tv_sec: 0x1160055d<br>
- hit Enter to continue, e[x]it -<br>> d 135537084 0 I<br>
Block: 135537084 Real Address 0x81421bc000<br>
[1] agstart: 5338839946511003489 [12] extsmap[0]: 20eb798d<br>
[2] iagnum: -1635135392 [13] extsmap[1]: 3882aed2<br>
[3] inofreefwd: -766206507 [14] extsmap[2]: 69a9690b<br>
[4] inofreeback: 1412521693 [15] extsmap[3]: 2d9b4085<br>
[5] extfreefwd: 100728832 [16] nfreeinos: -585246671<br>
[6] extfreeback: 52569610 [17] nfreeexts: 358788356<br>
[7] iagfree: -317470064 [18] pad: Not Displayed<br>
[8] inosmap[0]: 069cd9ed [19] wmap: Type 'w'<br>
[9] inosmap[1]: d746af90 [20] pmap: Type 'p'<br>
[10] inosmap[2]: b5c86a84 [21] inoext: Type 'i'<br>
[11] inosmap[3]: 3fd0095d<br>
- hit Enter to continue, e[x]it -<br><br>But negative numbers just seem wrong, and of course I get an error when I try to use one as the inode number in the command for displaying inode information. I also found a reference to a bug that might apply, in which JFS mistakenly put a negative error number in the inode field, corrupting the data.<br>
<br></div>Any advice?<br><br></div>Craig.<br></div>