Dead bubba

helder
Posts: 24
Joined: 17 Jun 2007, 16:21

Post by helder »

Hi,

My Bubba 2 stopped responding yesterday. :cry:

I was able to log in via ssh, but commands such as htop or ps aux would freeze the console. The system load was extremely high (about 59.0) and the system was very unresponsive. I checked vmstat and it showed no activity. Roughly 38 days ago the system froze because hald was using up all the available swap and memory, so I assumed this was the same issue (a buggy hald).

I tried to reboot but the system didn't come up. The LED just kept blinking, so I had to unplug the Bubba...

I was able to log in using the rescue image. There are lots of kernel Oopses in /var/log/kern.log, for instance:

Code:

Sep 29 18:33:35 bubba kernel: Unable to handle kernel paging request for data at address 0x419c01ec
Sep 29 18:33:35 bubba kernel: Faulting instruction address: 0xc00e38d0
Sep 29 18:33:35 bubba kernel: Oops: Kernel access of bad area, sig: 11 [#2]
Sep 29 18:33:35 bubba kernel: MPC831x RDB
Sep 29 18:33:35 bubba kernel: Modules linked in: iptable_filter ip_tables x_tables nfsd nfs_acl exportfs dm_snapshot dm_mirror dm_region_hash dm_log mpc8xxx_wdt usbhid
Sep 29 18:33:35 bubba kernel: NIP: c00e38d0 LR: c00e41b0 CTR: 00000008
Sep 29 18:33:35 bubba kernel: REGS: c0e01cf0 TRAP: 0300   Tainted: G      D     (2.6.32.13)
Sep 29 18:33:35 bubba kernel: MSR: 00009032 <EE,ME,IR,DR>  CR: 84008424  XER: 20000000
Sep 29 18:33:35 bubba kernel: DAR: 419c01ec, DSISR: 20000000
Sep 29 18:33:35 bubba kernel: TASK = c5cf80c0[3990] 'imapd' THREAD: c0e00000

(I've attached the zipped kern.log.)
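
For a log with many entries like the one above, a small script can give a quick overview of how many Oopses were logged and which tasks were running when they hit. This is only a minimal sketch: the function name `summarize_oopses` and both regular expressions are assumptions modelled on the exact line layout shown in the excerpt, so other kernels or syslog formats may need different patterns.

```python
import re
from collections import Counter

# Oops header line, e.g. "kernel: Oops: Kernel access of bad area, sig: 11 [#2]"
# (pattern is an assumption based on the excerpt above)
OOPS_RE = re.compile(r"kernel: Oops: (?P<reason>.+?), sig: (?P<sig>\d+) \[#(?P<count>\d+)\]")
# Task line, e.g. "kernel: TASK = c5cf80c0[3990] 'imapd' THREAD: c0e00000"
TASK_RE = re.compile(r"kernel: TASK = \w+\[(?P<pid>\d+)\] '(?P<name>[^']+)'")


def summarize_oopses(lines):
    """Return (number of Oops header lines, Counter of task names from TASK lines)."""
    oops_total = 0
    tasks = Counter()
    for line in lines:
        if OOPS_RE.search(line):
            oops_total += 1
        match = TASK_RE.search(line)
        if match:
            tasks[match.group("name")] += 1
    return oops_total, tasks


# Two lines copied from the excerpt above, used as a tiny sample input.
sample = [
    "Sep 29 18:33:35 bubba kernel: Oops: Kernel access of bad area, sig: 11 [#2]",
    "Sep 29 18:33:35 bubba kernel: TASK = c5cf80c0[3990] 'imapd' THREAD: c0e00000",
]
total, tasks = summarize_oopses(sample)
print(total, dict(tasks))
```

To run it on a real log, pass an open file instead of the sample list, for example `summarize_oopses(open("kern.log", errors="replace"))`. If the Oopses are spread over many unrelated tasks, that points more towards a hardware or memory problem than towards one misbehaving program.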

When I try to boot, the disk seems to power cycle every 3 to 5 seconds or so, as if it were losing power. I'll open the Bubba next to see if there's a loose cable or something.

Does anybody have suggestions to diagnose this problem? Any help will be very much appreciated!

Thanks,
Helder
Attachments
kern.zip: sample from /var/log/kern.log (20.7 KiB)
helder
Posts: 24
Joined: 17 Jun 2007, 16:21

Re: Dead bubba

Post by helder »

Nothing loose inside my bubba...

I was able to mount the Bubba's disk on another computer, and I'm backing up everything. Every once in a while I get a string of errors:

Code:

Sep 30 22:43:51 rafeirice kernel: [ 2247.439136] ata4.00: exception Emask 0x10 SAct 0x0 SErr 0x90000 action 0xe frozen
Sep 30 22:43:51 rafeirice kernel: [ 2247.439146] ata4: SError: { PHYRdyChg 10B8B }
Sep 30 22:43:51 rafeirice kernel: [ 2247.439155] ata4.00: cmd c8/00:08:e1:c6:6e/00:00:00:00:00/e1 tag 0 dma 4096 in
Sep 30 22:43:51 rafeirice kernel: [ 2247.439157]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x14 (ATA bus error)
Sep 30 22:43:51 rafeirice kernel: [ 2247.439162] ata4.00: status: { DRDY }
Sep 30 22:43:51 rafeirice kernel: [ 2247.439178] ata4: hard resetting link
Sep 30 22:43:52 rafeirice kernel: [ 2248.202556] ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Sep 30 22:43:52 rafeirice kernel: [ 2248.230071] ata4.00: configured for UDMA/100
Sep 30 22:43:52 rafeirice kernel: [ 2248.230090] ata4: EH complete
Sep 30 22:43:52 rafeirice kernel: [ 2248.235268] sd 3:0:0:0: [sdb] 976773168 512-byte hardware sectors (500108 MB)
Sep 30 22:43:52 rafeirice kernel: [ 2248.235527] sd 3:0:0:0: [sdb] Write Protect is off
Sep 30 22:43:52 rafeirice kernel: [ 2248.235533] sd 3:0:0:0: [sdb] Mode Sense: 00 3a 00 00
Sep 30 22:43:52 rafeirice kernel: [ 2248.236081] sd 3:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

Since it seems to only reset the SATA link, I hope this doesn't cause data loss...

This seems to indicate that I have a hard disk problem. I'm thinking that if the disk is damaged mostly in the swap partition, that could produce the kernel Oopses I mentioned in my first message.
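
To judge how often the link drops during a long copy, the libata messages can be tallied per port. The following is a rough sketch, not a diagnostic tool: `summarize_ata_errors` is a hypothetical helper, and its patterns are assumptions taken from the message layout in the log above.

```python
import re
from collections import Counter

# Patterns are assumptions modelled on the libata lines quoted above.
EXCEPTION_RE = re.compile(r"(?P<dev>ata\d+(?:\.\d+)?): exception Emask")
RESET_RE = re.compile(r"(?P<port>ata\d+): hard resetting link")
SERROR_RE = re.compile(r"ata\d+: SError: \{ (?P<flags>[^}]*)\}")


def summarize_ata_errors(lines):
    """Count exceptions per device, hard resets per port, and SError flag names."""
    exceptions = Counter()
    resets = Counter()
    flags = Counter()
    for line in lines:
        match = EXCEPTION_RE.search(line)
        if match:
            exceptions[match.group("dev")] += 1
        match = RESET_RE.search(line)
        if match:
            resets[match.group("port")] += 1
        match = SERROR_RE.search(line)
        if match:
            # Flag names are separated by spaces inside the braces.
            flags.update(match.group("flags").split())
    return exceptions, resets, flags


# Three lines copied from the log above, used as a tiny sample input.
sample = [
    "Sep 30 22:43:51 rafeirice kernel: [ 2247.439136] ata4.00: exception Emask 0x10 SAct 0x0 SErr 0x90000 action 0xe frozen",
    "Sep 30 22:43:51 rafeirice kernel: [ 2247.439146] ata4: SError: { PHYRdyChg 10B8B }",
    "Sep 30 22:43:51 rafeirice kernel: [ 2247.439178] ata4: hard resetting link",
]
exceptions, resets, flags = summarize_ata_errors(sample)
print(dict(exceptions), dict(resets), dict(flags))
```

A steadily growing reset count on one port while other disks on the same machine stay quiet would support the suspicion that the drive (or its cable or power) is at fault, rather than the controller.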

Any ideas?

Thanks,
Helder
tor
Posts: 703
Joined: 06 Dec 2006, 12:24

Re: Dead bubba

Post by tor »

Hi Helder,

I would say that you have a malfunctioning disk. There are a few errors in the kern.log you attached that say the same thing.

Apart from that, it seems you run out of DMA memory quite a lot. Are you running this system with wifi? (We have seen those problems there.)

I would suggest that you contact support@excito.com about this.

/Tor
Co-founder OpenProducts and Ex Excito Developer
helder
Posts: 24
Joined: 17 Jun 2007, 16:21

Re: Dead bubba

Post by helder »

Hi Tor,

It's the disk indeed. I bought another disk and I'm going to install it today. I can't wait for the disk warranty, since this is a very busy bubba!

About the DMA problems, I have no wifi on the server. After the reinstall I'll look through the logs to see if anything shows up.

Thanks,
Helder