Thread View: gmane.linux.debian.user
5 messages, started by Tomek Kruszona on Tue, 09 Jun 2009 14:16
XFS frequent crashes on PE1950 with perc 5/e and 2xMD1000
#307550
Author: Tomek Kruszona
Date: Tue, 09 Jun 2009 14:16
293 lines
12394 bytes
Hello!

I have a problem with a system in the configuration described in the
subject (Dell PE1950 III + PERC 5/E + 2xMD1000).

The system is running Debian Lenny AMD64 with all available updates.

I have 6 VDs of 2 TB each (for 32-bit system compatibility). Each VD is
an LVM2 PV.

I made an LVM2 volume and formatted it as XFS. Previously only one
MD1000 was connected to the PERC controller.

But two days ago I added the second MD1000, added the new PVs to LVM2,
and extended the XFS filesystem with xfs_growfs.
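
For reference, an expansion like the one just described typically boils
down to the following sequence; the device, VG, and LV names here are
placeholders, not the actual layout on this machine:

  # make the new virtual disk an LVM2 physical volume
  pvcreate /dev/sdX
  # add it to the existing volume group
  vgextend datavg /dev/sdX
  # give the logical volume all of the newly added space
  lvextend -l +100%FREE /dev/datavg/datalv
  # grow the mounted XFS filesystem to fill the enlarged volume
  xfs_growfs /mount/point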

After some time I got a kernel panic like this:

[46925.374954] Filesystem "dm-0": XFS internal error xfs_trans_cancel at line 1163 of file fs/xfs/xfs_trans.c.  Caller 0xffffffffa02b2e82
[46925.374954] Pid: 12269, comm: smbd Not tainted 2.6.26-2-amd64 #1
[46925.374954]
[46925.374954] Call Trace:
[46925.374954]  [<ffffffffa02b2e82>] :xfs:xfs_iomap_write_allocate+0x360/0x385
[46925.374954]  [<ffffffffa02c095e>] :xfs:xfs_trans_cancel+0x55/0xed
[46925.374954]  [<ffffffffa02b2e82>] :xfs:xfs_iomap_write_allocate+0x360/0x385
[46925.374954]  [<ffffffffa02b38f4>] :xfs:xfs_iomap+0x21b/0x297
[46925.374954]  [<ffffffffa02c9637>] :xfs:xfs_map_blocks+0x2d/0x5f
[46925.374954]  [<ffffffffa02ca74e>] :xfs:xfs_page_state_convert+0x2a2/0x54f
[46925.374954]  [<ffffffffa02cab5a>] :xfs:xfs_vm_writepage+0xb4/0xea
[46925.374954]  [<ffffffff802770db>] __writepage+0xa/0x23
[46925.374954]  [<ffffffff802775a0>] write_cache_pages+0x182/0x2b1
[46925.374954]  [<ffffffff802770d1>] __writepage+0x0/0x23
[46925.374954]  [<ffffffff8027770b>] do_writepages+0x20/0x2d
[46925.374954]  [<ffffffff80271900>] __filemap_fdatawrite_range+0x51/0x5b
[46925.374954]  [<ffffffffa02cd2b0>] :xfs:xfs_flush_pages+0x4e/0x6d
[46925.374954]  [<ffffffffa02c5548>] :xfs:xfs_setattr+0x695/0xd28
[46925.374954]  [<ffffffff803b1326>] sock_common_recvmsg+0x30/0x45
[46925.374954]  [<ffffffffa02d08cc>] :xfs:xfs_write+0x6de/0x722
[46925.374954]  [<ffffffffa02cf1e3>] :xfs:xfs_vn_setattr+0x11c/0x13a
[46925.374954]  [<ffffffff802add8f>] notify_change+0x174/0x2f5
[46925.374954]  [<ffffffff80299f09>] do_truncate+0x5e/0x79
[46925.374954]  [<ffffffff8029df53>] sys_newfstat+0x20/0x29
[46925.374954]  [<ffffffff8029a00e>] sys_ftruncate+0xea/0x107
[46925.374954]  [<ffffffff8020beca>] system_call_after_swapgs+0x8a/0x8f
[46925.374954]
[46925.374954] xfs_force_shutdown(dm-0,0x8) called from line 1164 of file fs/xfs/xfs_trans.c.  Return address = 0xffffffffa02c0977
[46925.374954] Filesystem "dm-0": Corruption of in-memory data detected.  Shutting down filesystem: dm-0
[46925.376874] Please umount the filesystem, and rectify the problem(s)
[46934.390143] Filesystem "dm-0": xfs_log_force: error 5 returned.
[47112.408317] Pid: 15211, comm: umount Tainted: G      D    2.6.26-2-amd64 #1
[47112.408317]
[47112.408317] Call Trace:
[47112.408317]  [<ffffffff80234a20>] warn_on_slowpath+0x51/0x7a
[47112.408317]  [<ffffffff802b6c5b>] __mark_inode_dirty+0xe0/0x179
[47112.408317]  [<ffffffff802460ef>] bit_waitqueue+0x10/0x97
[47112.424115]  [<ffffffff802461b4>] wake_up_bit+0x11/0x22
[47112.424196]  [<ffffffff802b6364>] __writeback_single_inode+0x44/0x29d
[47112.424280]  [<ffffffff802b6928>] sync_sb_inodes+0x1b1/0x293
[47112.424362]  [<ffffffff802b6aa4>] sync_inodes_sb+0x9a/0xa6
[47112.424445]  [<ffffffff8029c6ed>] __fsync_super+0xb/0x6f
[47112.424527]  [<ffffffff8029c75a>] fsync_super+0x9/0x16
[47112.424608]  [<ffffffff8029c976>] generic_shutdown_super+0x21/0xee
[47112.424692]  [<ffffffff8029ca50>] kill_block_super+0xd/0x1e
[47112.424773]  [<ffffffff8029cb0c>] deactivate_super+0x5f/0x78
[47112.424855]  [<ffffffff802afe06>] sys_umount+0x2f9/0x353
[47112.424938]  [<ffffffff80221fac>] do_page_fault+0x5d8/0x9c8
[47112.428111]  [<ffffffff8029e0e4>] sys_newstat+0x19/0x31
[47112.428111]  [<ffffffff8031dc73>] __up_write+0x21/0x10e
[47112.428111]  [<ffffffff8020beca>] system_call_after_swapgs+0x8a/0x8f
[47112.428111]
[47112.428111] ---[ end trace ba717a82a77cfd6a ]---
[47112.428111] Filesystem "dm-0": xfs_log_force: error 5 returned.
[47112.428111] Filesystem "dm-0": xfs_log_force: error 5 returned.
[47112.428111] xfs_force_shutdown(dm-0,0x1) called from line 420 of file fs/xfs/xfs_rw.c.  Return address = 0xffffffffa02c8d33
[47112.428111] Filesystem "dm-0": xfs_log_force: error 5 returned.
[47112.428111] Filesystem "dm-0": xfs_log_force: error 5 returned.
[47112.428111] xfs_force_shutdown(dm-0,0x1) called from line 420 of file fs/xfs/xfs_rw.c.  Return address = 0xffffffffa02c8d33
[47112.428177] ------------[ cut here ]------------
[47112.428246] WARNING: at fs/fs-writeback.c:381 __writeback_single_inode+0x44/0x29d()
[47112.428345] Modules linked in: ipmi_devintf ipmi_si ipmi_msghandler ipv6 xfs ext2 mbcache loop snd_pcm snd_timer snd soundcore snd_page_alloc rng_core psmouse i5000_edac iTCO_wdt button pcspkr serio_raw edac_core shpchp pci_hotplug dcdbas evdev reiserfs dm_mirror dm_log dm_snapshot dm_mod raid1 md_mod sg sr_mod cdrom ide_pci_generic ide_core ses enclosure ata_piix sd_mod e1000e megaraid_sas bnx2 firmware_class ata_generic libata dock uhci_hcd ehci_hcd mptsas mptscsih mptbase scsi_transport_sas scsi_mod thermal processor fan thermal_sys
[47112.432208] Pid: 15211, comm: umount Tainted: G      D W    2.6.26-2-amd64 #1
[47112.432294]
[47112.432304] Call Trace:
[47112.432443]  [<ffffffff80234a20>] warn_on_slowpath+0x51/0x7a
[47112.432528]  [<ffffffff80278daa>] pagevec_lookup_tag+0x1a/0x21
[47112.432614]  [<ffffffff80271846>] wait_on_page_writeback_range+0xc8/0x113
[47112.432709]  [<ffffffff802b6c5b>] __mark_inode_dirty+0xe0/0x179
[47112.432794]  [<ffffffff802460ef>] bit_waitqueue+0x10/0x97
[47112.432876]  [<ffffffff802461b4>] wake_up_bit+0x11/0x22
[47112.432959]  [<ffffffff802b6364>] __writeback_single_inode+0x44/0x29d
[47112.433073]  [<ffffffffa02c8d33>] :xfs:xfs_bwrite+0xb0/0xbb
[47112.433161]  [<ffffffffa02b5993>] :xfs:xfs_log_need_covered+0x15/0x8c
[47112.433242]  [<ffffffff802b6928>] sync_sb_inodes+0x1b1/0x293
[47112.433328]  [<ffffffff802b6aa4>] sync_inodes_sb+0x9a/0xa6
[47112.433411]  [<ffffffff8029c75a>] fsync_super+0x9/0x16
[47112.433493]  [<ffffffff8029c976>] generic_shutdown_super+0x21/0xee
[47112.433577]  [<ffffffff8029ca50>] kill_block_super+0xd/0x1e
[47112.433661]  [<ffffffff8029cb0c>] deactivate_super+0x5f/0x78
[47112.433743]  [<ffffffff802afe06>] sys_umount+0x2f9/0x353
[47112.433825]  [<ffffffff80221fac>] do_page_fault+0x5d8/0x9c8
[47112.433908]  [<ffffffff8029e0e4>] sys_newstat+0x19/0x31
[47112.433994]  [<ffffffff8031dc73>] __up_write+0x21/0x10e
[47112.434078]  [<ffffffff8020beca>] system_call_after_swapgs+0x8a/0x8f
[47112.434147]
[47112.434147] ---[ end trace ba717a82a77cfd6a ]---
[47113.504506] Filesystem "dm-0": xfs_log_force: error 5 returned.
[47113.504506] Filesystem "dm-0": xfs_log_force: error 5 returned.
[47113.504506] Filesystem "dm-0": xfs_log_force: error 5 returned.
[47113.504506] Filesystem "dm-0": xfs_log_force: error 5 returned.
[47113.506718] Filesystem "dm-0": xfs_log_force: error 5 returned.
[47113.516457] VFS: Busy inodes after unmount of dm-0. Self-destruct in 5 seconds.  Have a nice day...


I've found some similar issues but no solution :(

Here is my

$ omreport storage vdisk controller=0

output:

List of Virtual Disks on Controller PERC 5/E Adapter (Slot 1)

Controller PERC 5/E Adapter (Slot 1)
ID                  : 0
Status              : Ok
Name                : Array0
State               : Ready
Progress            : Not Applicable
Layout              : RAID-5
Size                : 1,953.12 GB (2097149902848 bytes)
Device Name         : /dev/sdc
Type                : SAS
Read Policy         : Adaptive Read Ahead
Write Policy        : Write Back
Cache Policy        : Not Applicable
Stripe Element Size : 64 KB
Disk Cache Policy   : Disabled

ID                  : 1
Status              : Ok
Name                : Array1
State               : Ready
Progress            : Not Applicable
Layout              : RAID-5
Size                : 1,951.13 GB (2095006613504 bytes)
Device Name         : /dev/sdd
Type                : SAS
Read Policy         : Adaptive Read Ahead
Write Policy        : Write Back
Cache Policy        : Not Applicable
Stripe Element Size : 64 KB
Disk Cache Policy   : Disabled

ID                  : 2
Status              : Ok
Name                : Array2
State               : Ready
Progress            : Not Applicable
Layout              : RAID-5
Size                : 1,953.12 GB (2097151737856 bytes)
Device Name         : /dev/sde
Type                : SAS
Read Policy         : Adaptive Read Ahead
Write Policy        : Write Back
Cache Policy        : Not Applicable
Stripe Element Size : 64 KB
Disk Cache Policy   : Disabled

ID                  : 3
Status              : Ok
Name                : Array3
State               : Ready
Progress            : Not Applicable
Layout              : RAID-5
Size                : 1,953.12 GB (2097151737856 bytes)
Device Name         : /dev/sdf
Type                : SAS
Read Policy         : Adaptive Read Ahead
Write Policy        : Write Back
Cache Policy        : Not Applicable
Stripe Element Size : 64 KB
Disk Cache Policy   : Disabled

ID                  : 4
Status              : Ok
Name                : Array4
State               : Ready
Progress            : Not Applicable
Layout              : RAID-5
Size                : 1,953.12 GB (2097151737856 bytes)
Device Name         : /dev/sdg
Type                : SAS
Read Policy         : Adaptive Read Ahead
Write Policy        : Write Back
Cache Policy        : Not Applicable
Stripe Element Size : 64 KB
Disk Cache Policy   : Disabled

ID                  : 5
Status              : Ok
Name                : Array5
State               : Ready
Progress            : Not Applicable
Layout              : RAID-5
Size                : 1,957.88 GB (2102253060096 bytes)
Device Name         : /dev/sdh
Type                : SAS
Read Policy         : Adaptive Read Ahead
Write Policy        : Write Back
Cache Policy        : Not Applicable
Stripe Element Size : 64 KB
Disk Cache Policy   : Disabled

I was thinking... maybe XFS on LVM2 requires some specific PERC VD
setup? I had the same issue on 32-bit Gentoo with a 2.6.25 kernel and a
single MD1000, but there the problem happened about once a month. Now
it's getting worse: 2 crashes in the last 24 hours :(
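
For what it's worth, the stripe geometry XFS believes it has can at
least be compared against the PERC layout; the values below are purely
illustrative (a 64 KB stripe element and a 6-disk RAID-5), not a
known-good setting for this hardware:

  # show the geometry the existing filesystem was created with
  # (sunit/swidth are reported in filesystem blocks)
  xfs_info /mount/point
  # at mkfs time the alignment can be stated explicitly, for example
  # (only relevant when creating a new filesystem, shown for illustration):
  mkfs.xfs -d su=64k,sw=5 /dev/datavg/datalv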

Any ideas?

Best regards,
Tomek Kruszona

Re: XFS frequent crashes on PE1950 with perc 5/e and 2xMD1000
#307676
Author: Andrew Reid
Date: Wed, 10 Jun 2009 22:00
39 lines
1230 bytes
On Tuesday 09 June 2009 08:16:35 Tomek Kruszona wrote:
> Hello!
>
> I have a problem with a system in the configuration described in the
> subject (Dell PE1950 III + PERC 5/E + 2xMD1000).
>
> The system is running Debian Lenny AMD64 with all available updates.
>
> I have 6 VDs of 2 TB each (for 32-bit system compatibility). Each VD is
> an LVM2 PV.
>
> I made an LVM2 volume and formatted it as XFS. Previously only one
> MD1000 was connected to the PERC controller.
>
> But two days ago I added the second MD1000, added the new PVs to LVM2,
> and extended the XFS filesystem with xfs_growfs.
>
> After some time I got a kernel panic like this:

  This strongly resembles an issue I had on a file server --
I don't have my notes handy, but it had to do with the kernel
interacting badly with a particular motherboard chipset.

  The workaround was to reboot with the "iommu=soft" option
passed to the kernel.
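
  (On a Lenny-era box still using GRUB legacy, that typically means
adding the option to the "# kopt=" line in /boot/grub/menu.lst and
regenerating the menu; the root= value below is only a placeholder.)

  # in /boot/grub/menu.lst, extend the kopt line that update-grub reads:
  #   # kopt=root=/dev/mapper/vg-root ro iommu=soft
  update-grub
  # after rebooting, confirm the option is active:
  cat /proc/cmdline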

  My problem was with an "etch" kernel, and it was my understanding
that newer kernels were not expected to have this problem, so
I may be off-base, but that's my experience.

  It sounds like this is at least an easy thing to try -- I really
wish I could find my notes...

				-- A.

--
Andrew Reid / reidac@bellatlantic.net

Re: XFS frequent crashes on PE1950 with perc 5/e and 2xMD1000
#307677
Author: Kelly Harding
Date: Thu, 11 Jun 2009 03:17
28 lines
921 bytes
2009/6/9 Tomek Kruszona <bloodyscarion@gmail.com>:
> Hello!
>
> I have a problem with a system in the configuration described in the
> subject (Dell PE1950 III + PERC 5/E + 2xMD1000).
>
> The system is running Debian Lenny AMD64 with all available updates.
>
> I have 6 VDs of 2 TB each (for 32-bit system compatibility). Each VD is
> an LVM2 PV.
>
> I made an LVM2 volume and formatted it as XFS. Previously only one
> MD1000 was connected to the PERC controller.
>
> But two days ago I added the second MD1000, added the new PVs to LVM2,
> and extended the XFS filesystem with xfs_growfs.
>

Might be a bit of an obvious thing, but have you tried running memtest
to rule out dodgy memory? Usually when I see anything similar to this I
run a memtest to be sure (on a few occasions it has proven to be the
memory).
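
(On Debian of this era, one common route is to install memtest86+ and
let update-grub add a boot entry for it; the exact boot-menu
integration can vary, so treat this as a sketch rather than a recipe.)

  apt-get install memtest86+
  update-grub
  # reboot, choose the memtest86+ entry from the boot menu, and let it
  # run several complete passes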

Could it also be a driver bug related to multiple MD1000s? Sadly I have
no experience with Dell PERC hardware, so I can't be of any further help.

Kelly

Re: XFS frequent crashes on PE1950 with perc 5/e and 2xMD1000
#307705
Author: Tomek Kruszona
Date: Thu, 11 Jun 2009 18:24
17 lines
670 bytes
Andrew Reid wrote:
>   This strongly resembles an issue I had on a file server --
> I don't have my notes handy, but it had to do with the kernel
> interacting badly with a particular motherboard chipset.
>
>   The workaround was to reboot with the "iommu=soft" option
> passed to the kernel.
>
>   My problem was with an "etch" kernel, and it was my understanding
> that newer kernels were not expected to have this problem, so
> I may be off-base, but that's my experience.
>
>   It sounds like this is at least an easy thing to try -- I really
> wish I could find my notes...
I'll try this option. I just need to wait for the next crash ;)

Re: XFS frequent crashes on PE1950 with perc 5/e and 2xMD1000
#307706
Author: Tomek Kruszona
Date: Thu, 11 Jun 2009 18:25
17 lines
611 bytes
Kelly Harding wrote:
> Might be a bit of an obvious thing, but have you tried running memtest
> to rule out dodgy memory? Usually when I see anything similar to this I
> run a memtest to be sure (on a few occasions it has proven to be the
> memory).
Memory is OK, memtest passed. Moreover, it's happening on more than one
machine.

>
> Could it also be a driver bug related to multiple MD1000s? Sadly I have
> no experience with Dell PERC hardware, so I can't be of any further help.

I don't think so. I had this issue before, when only one MD1000 was
connected to the PERC. It also happens with an LSI 8880EM2 controller.

Best regards
