Re: RFC: porting NetBSD fsdb enhancements to FreeBSD



Hi Matt,

On Fri, 28 Oct 2005, 00:59-0400, Matt Emmerton wrote:

Recently I've had to do some low-level surgery on some disks that have gone
bad in order to recover some of the data.
This has required me to zero out blocks on disk, patch up the affected
files, and pull the data off the disks.

I was toying around the with fsdb tool, but couldn't figure out a way to map
blocks to inodes (although the 'blocks' command does the mapping in the
other direction quite nicely.)

Poking around I found that someone has added this functionality (via a
"findblk" command) to NetBSD's fsdb (back in 2003!), which I have grafted
onto a 4.x box here with relative ease.

NetBSD Mailing List Posting:
http://groups.google.com/group/mailing.netbsd.tech.userlevel/browse_thread/thread/18acceb04cf5aadb/2a891d67edf9279%232a891d67edf9279?sa=X&oi=groupsr&start=0&num=3)
NetBSD CVS:
http://cvsweb.netbsd.org/bsdweb.cgi/src/sbin/fsdb/fsdb.c.diff?r1=1.24&r2=1.25&f=h

Is this something that folks would like to see on FreeBSD? I've got
RELENG_5_4 and RELENG_6_0 boxes here in my office so I can whip up the
patches and do some testing in short order.

I think it is a useful functionality. Here is a patch based on NetBSD
code for HEAD, should work for RELENG_5 and RELENG_6 also.

For those who is unfamiliar: findblk command gets up to 32 _disk_
blocks as parameters and tries to find the inode(s) owning these
blocks.

You need to differentiate disk and file system blocks. To a find
a disk block number from the given file system block you need to
obtain fs_fsbtodb constant for the specific file system. E.g. for my
/home:

# dumpfs /dev/ad0s1 | head | grep fsbtodb
frag 8 shift 3 fsbtodb 2

fsbtodb is 2. This is a power of 2. So

disk block = file system block * 2^2 = file system block * 4

Here is a real life example:

# ls -i ~/fsdb/fsdb
1961933 /home/maxim/fsdb/fsdb
^^^^^^^- ------------- - ----- inode number
# ./fsdb/fsdb -r /dev/ad0s1e
** /dev/ad0s1e (NO WRITE)
Examining file system `/dev/ad0s1e'
Last Mounted on /home
current inode: directory
I=2 MODE=40755 SIZE=512
MTIME=Apr 14 02:10:21 2006 [0 nsec]
CTIME=Apr 14 02:11:12 2006 [0 nsec]
ATIME=May 22 00:15:46 2006 [0 nsec]
OWNER=maxim GRP=wheel LINKCNT=9 FLAGS=0 BLKCNT=4 GEN=3d75e23a
fsdb (inum: 2)> inode 1961933
current inode: regular file
I=1961933 MODE=100755 SIZE=1063137
MTIME=May 22 23:51:35 2006 [0 nsec]
CTIME=May 22 23:51:35 2006 [0 nsec]
ATIME=May 22 23:51:34 2006 [0 nsec]
OWNER=maxim GRP=maxim LINKCNT=1 FLAGS=0 BLKCNT=840 GEN=bb75d52
fsdb (inum: 1961933)> blocks
Blocks for inode 1961933:
Direct blocks:
7857760, 7857976, 7858016, 7858120, 7858152, 7858168, 7858176,
7858288, 7858376, 7858392, 7858408, 7858416
Indirect blocks:
7858480, 7858568, 7858584, 7858616, 7858672, 7858824, 7858832, 7859128,
7859144, 7859152, 7859168, 7859176, 7859184, 7859200, 7859352, 7859368,
7859376, 7859384, 7859456, 7859464, 7859472, 7859616, 7859624, 7859632,
7859640, 7857840, 7858136, 7858368, 7858384, 7858400, 7858592, 7859360,
7857912, 7858032, 7858840, 7857920, 7857944, 7857952, 7857992, 7858024,
7858040, 7858128, 7857928, 7857960, 7857968, 7858144, 7858848, 7860456,
7860464, 7860472, 7860480, 7860488, 7860504,
fsdb (inum: 1961933)> findblk 31431680 # 7857920 file system block
31431680: data block of inode 1961933

Testers are welcome!

Index: fsdb.8
===================================================================
RCS file: /home/ncvs/src/sbin/fsdb/fsdb.8,v
retrieving revision 1.28
diff -u -p -r1.28 fsdb.8
--- fsdb.8 12 Feb 2005 23:23:53 -0000 1.28
+++ fsdb.8 22 May 2006 19:31:44 -0000
@@ -28,7 +28,7 @@
.\"
.\" $FreeBSD: src/sbin/fsdb/fsdb.8,v 1.28 2005/02/12 23:23:53 trhodes Exp $
.\"
-.Dd September 14, 1995
+.Dd May 22, 2006
.Dt FSDB 8
.Os
.Sh NAME
@@ -117,6 +117,12 @@ Print out the active inode.
Print out the block list of the active inode.
Note that the printout can become long for large files, since all
indirect block pointers will also be printed.
+.Pp
+.It Cm findblk Ar disk block number ...
+Find the inode(s) owning the specified disk block(s) number(s).
+Note that these are not absolute disk blocks numbers, but offsets from the
+start of the partition.
+.Pp
.It Cm uplink
Increment the active inode's link count.
.Pp
Index: fsdb.c
===================================================================
RCS file: /home/ncvs/src/sbin/fsdb/fsdb.c,v
retrieving revision 1.32
diff -u -p -r1.32 fsdb.c
--- fsdb.c 21 Apr 2006 20:33:16 -0000 1.32
+++ fsdb.c 22 May 2006 19:51:42 -0000
@@ -52,6 +52,13 @@ static const char rcsid[] =

static void usage(void) __dead2;
int cmdloop(void);
+static int compare_blk32(uint32_t *wantedblk, uint32_t curblk);
+static int compare_blk64(uint64_t *wantedblk, uint64_t curblk);
+static int founddatablk(uint64_t blk);
+static int find_blks32(uint32_t *buf, int size, uint32_t *blknum);
+static int find_blks64(uint64_t *buf, int size, uint64_t *blknum);
+static int find_indirblks32(uint32_t blk, int ind_level, uint32_t *blknum);
+static int find_indirblks64(uint64_t blk, int ind_level, uint64_t *blknum);

static void
usage(void)
@@ -129,6 +136,7 @@ CMDFUNC(uplink); /* incr link */
CMDFUNC(downlink); /* decr link */
CMDFUNC(linkcount); /* set link count */
CMDFUNC(quit); /* quit */
+CMDFUNC(findblk); /* find block */
CMDFUNC(ls); /* list directory */
CMDFUNC(rm); /* remove name */
CMDFUNC(ln); /* add name */
@@ -160,6 +168,7 @@ struct cmdtable cmds[] = {
{ "uplink", "Increment link count", 1, 1, FL_WR, uplink },
{ "downlink", "Decrement link count", 1, 1, FL_WR, downlink },
{ "linkcount", "Set link count to COUNT", 2, 2, FL_WR, linkcount },
+ { "findblk", "Find inode owning disk block(s)", 2, 33, FL_RO, findblk},
{ "ls", "List current inode as directory", 1, 1, FL_RO, ls },
{ "rm", "Remove NAME from current inode directory", 2, 2, FL_WR | FL_ST, rm },
{ "del", "Remove NAME from current inode directory", 2, 2, FL_WR | FL_ST, rm },
@@ -415,6 +424,262 @@ CMDFUNCSTART(ls)
return 0;
}

+static int findblk_numtofind;
+static int wantedblksize;
+
+CMDFUNCSTART(findblk)
+{
+ ino_t inum, inosused;
+ uint32_t *wantedblk32;
+ uint64_t *wantedblk64;
+ struct cg *cgp = &cgrp;
+ int c, i, is_ufs2;
+
+ wantedblksize = (argc - 1);
+ is_ufs2 = sblock.fs_magic == FS_UFS2_MAGIC;
+ ocurrent = curinum;
+
+ if (is_ufs2) {
+ wantedblk64 = calloc(wantedblksize, sizeof(uint64_t));
+ if (wantedblk64 == NULL)
+ err(1, "malloc");
+ for (i = 1; i < argc; i++)
+ wantedblk64[i - 1] = dbtofsb(&sblock, strtoull(argv[i], NULL, 0));
+ } else {
+ wantedblk32 = calloc(wantedblksize, sizeof(uint32_t));
+ if (wantedblk32 == NULL)
+ err(1, "malloc");
+ for (i = 1; i < argc; i++)
+ wantedblk32[i - 1] = dbtofsb(&sblock, strtoull(argv[i], NULL, 0));
+ }
+ findblk_numtofind = wantedblksize;
+ /*
+ * sblock.fs_ncg holds a number of cylinder groups.
+ * Iterate over all cylinder groups.
+ */
+ for (c = 0; c < sblock.fs_ncg; c++) {
+ /*
+ * sblock.fs_ipg holds a number of inodes per cylinder group.
+ * Calculate a highest inode number for a given cylinder group.
+ */
+ inum = c * sblock.fs_ipg;
+ /* Read cylinder group. */
+ getblk(&cgblk, cgtod(&sblock, c), sblock.fs_cgsize);
+ memcpy(cgp, cgblk.b_un.b_cg, sblock.fs_cgsize);
+ /*
+ * Get a highest used inode number for a given cylinder group.
+ * For UFS1 all inodes initialized at the newfs stage.
+ */
+ if (is_ufs2)
+ inosused = cgp->cg_initediblk;
+ else
+ inosused = sblock.fs_ipg;
+
+ for (; inosused > 0; inum++, inosused--) {
+ /* Skip magic inodes: 0, WINO, ROOTINO. */
+ if (inum < ROOTINO)
+ continue;
+ /*
+ * Check if the block we are looking for is just an inode block.
+ *
+ * ino_to_fsba() - get block containing inode from its number.
+ * INOPB() - get a number of inodes in one disk block.
+ */
+ if (is_ufs2 ?
+ compare_blk64(wantedblk64, ino_to_fsba(&sblock, inum)) :
+ compare_blk32(wantedblk32, ino_to_fsba(&sblock, inum))) {
+ printf("block %llu: inode block (%d-%d)\n",
+ (unsigned long long)fsbtodb(&sblock,
+ ino_to_fsba(&sblock, inum)),
+ (inum / INOPB(&sblock)) * INOPB(&sblock),
+ (inum / INOPB(&sblock) + 1) * INOPB(&sblock));
+ findblk_numtofind--;
+ if (findblk_numtofind == 0)
+ goto end;
+ }
+ /* Get on-disk inode aka dinode. */
+ curinum = inum;
+ curinode = ginode(inum);
+ /* Find IFLNK dinode with allocated data blocks. */
+ switch (DIP(curinode, di_mode) & IFMT) {
+ case IFDIR:
+ case IFREG:
+ if (DIP(curinode, di_blocks) == 0)
+ continue;
+ break;
+ case IFLNK:
+ {
+ uint64_t size = DIP(curinode, di_size);
+ if (size > 0 && size < sblock.fs_maxsymlinklen &&
+ DIP(curinode, di_blocks) == 0)
+ continue;
+ else
+ break;
+ }
+ default:
+ continue;
+ }
+ /* Look through direct data blocks. */
+ if (is_ufs2 ?
+ find_blks64(curinode->dp2.di_db, NDADDR, wantedblk64) :
+ find_blks32(curinode->dp1.di_db, NDADDR, wantedblk32))
+ goto end;
+ for (i = 0; i < NIADDR; i++) {
+ /*
+ * Does the block we are looking for belongs to the
+ * indirect blocks?
+ */
+ if (is_ufs2 ?
+ compare_blk64(wantedblk64, curinode->dp2.di_ib[i]) :
+ compare_blk32(wantedblk32, curinode->dp1.di_ib[i]))
+ if (founddatablk(is_ufs2 ? curinode->dp2.di_ib[i] :
+ curinode->dp1.di_ib[i]))
+ goto end;
+ /*
+ * Search through indirect, double and triple indirect
+ * data blocks.
+ */
+ if (is_ufs2 ? (curinode->dp2.di_ib[i] != 0) :
+ (curinode->dp1.di_ib[i] != 0))
+ if (is_ufs2 ?
+ find_indirblks64(curinode->dp2.di_ib[i], i,
+ wantedblk64) :
+ find_indirblks32(curinode->dp1.di_ib[i], i,
+ wantedblk32))
+ goto end;
+ }
+ }
+ }
+end:
+ curinum = ocurrent;
+ curinode = ginode(curinum);
+ return 0;
+}
+
+static int
+compare_blk32(uint32_t *wantedblk, uint32_t curblk)
+{
+ int i;
+
+ for (i = 0; i < wantedblksize; i++) {
+ if (wantedblk[i] != 0 && wantedblk[i] == curblk) {
+ wantedblk[i] = 0;
+ return 1;
+ }
+ }
+ return 0;
+}
+
+static int
+compare_blk64(uint64_t *wantedblk, uint64_t curblk)
+{
+ int i;
+
+ for (i = 0; i < wantedblksize; i++) {
+ if (wantedblk[i] != 0 && wantedblk[i] == curblk) {
+ wantedblk[i] = 0;
+ return 1;
+ }
+ }
+ return 0;
+}
+
+static int
+founddatablk(uint64_t blk)
+{
+
+ printf("%llu: data block of inode %d\n",
+ (unsigned long long)fsbtodb(&sblock, blk), curinum);
+ findblk_numtofind--;
+ if (findblk_numtofind == 0)
+ return 1;
+ return 0;
+}
+
+static int
+find_blks32(uint32_t *buf, int size, uint32_t *wantedblk)
+{
+ int blk;
+ for (blk = 0; blk < size; blk++) {
+ if (buf[blk] == 0)
+ continue;
+ if (compare_blk32(wantedblk, buf[blk])) {
+ if (founddatablk(buf[blk]))
+ return 1;
+ }
+ }
+ return 0;
+}
+
+static int
+find_indirblks32(uint32_t blk, int ind_level, uint32_t *wantedblk)
+{
+#define MAXNINDIR (MAXBSIZE / sizeof(uint32_t))
+ uint32_t idblk[MAXNINDIR];
+ int i;
+
+ bread(fsreadfd, (char *)idblk, fsbtodb(&sblock, blk), (int)sblock.fs_bsize);
+ if (ind_level <= 0) {
+ if (find_blks32(idblk, sblock.fs_bsize / sizeof(uint32_t), wantedblk))
+ return 1;
+ } else {
+ ind_level--;
+ for (i = 0; i < sblock.fs_bsize / sizeof(uint32_t); i++) {
+ if (compare_blk32(wantedblk, idblk[i])) {
+ if (founddatablk(idblk[i]))
+ return 1;
+ }
+ if (idblk[i] != 0)
+ if (find_indirblks32(idblk[i], ind_level, wantedblk))
+ return 1;
+ }
+ }
+#undef MAXNINDIR
+ return 0;
+}
+
+static int
+find_blks64(uint64_t *buf, int size, uint64_t *wantedblk)
+{
+ int blk;
+ for (blk = 0; blk < size; blk++) {
+ if (buf[blk] == 0)
+ continue;
+ if (compare_blk64(wantedblk, buf[blk])) {
+ if (founddatablk(buf[blk]))
+ return 1;
+ }
+ }
+ return 0;
+}
+
+static int
+find_indirblks64(uint64_t blk, int ind_level, uint64_t *wantedblk)
+{
+#define MAXNINDIR (MAXBSIZE / sizeof(uint64_t))
+ uint64_t idblk[MAXNINDIR];
+ int i;
+
+ bread(fsreadfd, (char *)idblk, fsbtodb(&sblock, blk), (int)sblock.fs_bsize);
+ if (ind_level <= 0) {
+ if (find_blks64(idblk, sblock.fs_bsize / sizeof(uint64_t), wantedblk))
+ return 1;
+ } else {
+ ind_level--;
+ for (i = 0; i < sblock.fs_bsize / sizeof(uint64_t); i++) {
+ if (compare_blk64(wantedblk, idblk[i])) {
+ if (founddatablk(idblk[i]))
+ return 1;
+ }
+ if (idblk[i] != 0)
+ if (find_indirblks64(idblk[i], ind_level, wantedblk))
+ return 1;
+ }
+ }
+#undef MAXNINDIR
+ return 0;
+}
+
int findino(struct inodesc *idesc); /* from fsck */
static int dolookup(char *name);

%%%

--
Maxim Konovalov
_______________________________________________
freebsd-hackers@xxxxxxxxxxx mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@xxxxxxxxxxx"



Relevant Pages

  • evms plugin for hptraid support <<<pre-alpha>>>
    ... the code is based on the local disk manager plugin and i hope i have got ... +# the GNU General Public License for more details. ... +struct hptdisk { ... +static int loadHPTConf{ ...
    (Linux-Kernel)
  • [ANNOUNCE] Highpoint-Tech Plugin 0.0.1 for EVMS 2.3.0
    ... raid volume. ... the next thing that i want to do is to implement the "whole disk" ... +static int loadHPTConf{ ... +static struct hptraid * hptRAIDScan{ ...
    (Linux-Kernel)
  • [PATCH 3/8] cciss: new disk register/deregister routines
    ... static int cciss_revalidate(struct gendisk *disk); ... static int sendcmd(__u8 cmd, int ctlr, void *buff, size_t size, ... +/* This function will add and remove logical drives from the Logical ...
    (Linux-Kernel)
  • Re: [PATCH 3/8] cciss: new disk register/deregister routines
    ... to fire and try to do some work on a disk that was being deleted. ... static int cciss_revalidate; ... static int sendcmd(__u8 cmd, int ctlr, void *buff, size_t size, ... +/* This function will add and remove logical drives from the Logical ...
    (Linux-Kernel)
  • Re: SG_IO and security
    ... +static int sg_allowed_cmd ... static int pcd_block_media_changed(struct gendisk *disk) ... static int scd_block_ioctl(struct inode *inode, struct file *file, ...
    (Linux-Kernel)