taskq Kernel Crashes

Tutton, James James.Tutton at redstone.co.uk
Mon Feb 11 11:03:51 GMT 2008


Cheers Mark,

First to answer your questions.

Started recently?
No. It's been happening on and off since these boxes were built back in
July 07.  Initially I was in the dark as to the cause, as the boxes are
located in a data center, so the resolution was to reboot them and get
them back up.  I have since managed to get an IP KVM on them, so I have
been able to see the errors and start my so far futile efforts to track
down the cause.

How frequent?
It varies from as little as 8 hours, on two occasions I can think of, to
a maximum of about 28 days.  The only common pattern is that most of the
time it happens around the 14/15 day mark.  I would consider a weekly
scheduled reboot if it were always at the fortnight mark, but with the
randomness of the issue that seems a bodged solution at best and is not
guaranteed to avoid the issue either.
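
To put the uptime reported in the panic output on the same scale as the
intervals above, a minimal Python sketch follows.  It assumes the uptime
always appears in the 15d8h9m22s form quoted in the panic message; the
helper names uptime_to_seconds and uptime_to_days are hypothetical,
chosen only for illustration.

```python
import re

# Assumed format: optional days, hours, minutes and seconds fields,
# in that order, e.g. "15d8h9m22s" as quoted in the panic output.
_UPTIME_RE = re.compile(r"^(?:(\d+)d)?(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?$")


def uptime_to_seconds(text):
    """Convert an uptime string such as '15d8h9m22s' to seconds."""
    match = _UPTIME_RE.match(text.strip())
    # Reject strings that do not fit the pattern or contain no fields at all.
    if match is None or not any(match.groups()):
        raise ValueError(f"unrecognised uptime string: {text!r}")
    days, hours, minutes, seconds = (int(g or 0) for g in match.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def uptime_to_days(text):
    """Convert an uptime string to fractional days."""
    return uptime_to_seconds(text) / 86400.0


if __name__ == "__main__":
    # Uptime taken from the panic message quoted in this thread.
    print(round(uptime_to_days("15d8h9m22s"), 2))
```

For the panic quoted in this thread the uptime works out to roughly
15.34 days, which is in line with the 14/15 day mark.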

I found and read all the Google results and would agree that I'm not
alone in this, but I haven't found a clear resolution out there either.

As for the crash dump, I do have at least one, but I am a little unsure
how I should get a kernel.debug to debug against.  The kernel I am
running is FreeBSD 6.2-RELEASE-p9, which came from freebsd-update, but
there doesn't seem to be a debug version of it.  Should I just build my
own kernel from the sources on the machine and run with that?  Would
that remove all the updates and security fixes that are in FreeBSD
6.2-RELEASE-p9, or will freebsd-update have updated the kernel source on
the system as well?

FreeBSD 6.3 is not really an option, as some of the software we run isn't
supported on it yet, although I am chasing this as an alternative.  But
as you say, this may not be resolved there, as I think I have seen at
least one reference to this exact crash on the 7 Beta.

James

-----Original Message-----
From: freebsd-users-admin at uk.freebsd.org
[mailto:freebsd-users-admin at uk.freebsd.org] On Behalf Of Mark Blackman
Sent: 11 February 2008 10:06
To: Tutton, James
Cc: freebsd-users at uk.freebsd.org
Subject: Re: taskq Kernel Crashes


On 11 Feb 2008, at 09:39, Tutton, James wrote:

> Hi,
>
> I have 2 HP DL140 G3 servers, both with identical specifications and
> setups.  The problem I have is that they both seem to encounter kernel
> panics at seemingly random intervals.

Looks like a well-known problem according to Google, but without a
clear resolution.

Started recently? How frequent? I think for this case you might
consider getting a crash dump and putting together a kernel with
symbols, if you have the time and willingness.

http://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug.html

A crash dump might have enough clues to guess what the root issue is,
although I've seen suggestions of ACPI BIOS problems in one case
(firmware needs upgrading).

OTOH, you might try 6.3-RELEASE on the off-chance this bug has already
been tracked down and eliminated, although I'd guess that's not the
case.

- Mark

>  The panic is always the same:
>
> #########################################################
> Kernel trap 12 with interrupts disabled
>
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 2; apic id = 06
> Fault virtual address   = 0x104
> Fault code              = supervisor read, page not present
> Instruction pointer     = 0x20:0xc066c771
> Stack pointer           = 0x28:0xe4f9bc90
> Frame pointer           = 0x28:0xe4f9bc90
> Code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> Processor eflags        = resume, IOPL = 0
> Current process         = 5 (thread taskq)
> Trap number             = 12
> Panic: page fault
> cpuid = 2
> Uptime: 15d8h9m22s
> Cannot dump. No dump device defined
> Automatic reboot in 15 seconds - press a key on console to abort
> Rebooting...
> Cpu_reset: Stopping other CPUs
> #########################################################
>
>
> After the panic the system locks up completely and doesn't respond
> to any keyboard commands. The only way to reboot the system is to
> physically power-cycle it.
>
>
> Other information that people might find useful
>
> #########################################################
> dmesg
> #########################################################
>
>
> Copyright (c) 1992-2007 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>         The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 6.2-RELEASE-p9 #0: Thu Nov 29 04:22:49 UTC 2007
>     root at i386-builder.daemonology.net:/usr/obj/usr/src/sys/SMP
> Timecounter "i8254" frequency 1193182 Hz quality 0
> CPU: Intel(R) Xeon(R) CPU            5140  @ 2.33GHz (2327.51-MHz 686-class CPU)
>   Origin = "GenuineIntel"  Id = 0x6f6  Stepping = 6
>   Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
>   Features2=0x4e3bd<SSE3,RSVD2,MON,DS_CPL,VMX,EST,TM2,<b9>,CX16,<b14>,<b15>,<b18>>
>   AMD Features=0x20100000<NX,LM>
>   AMD Features2=0x1<LAHF>
>   Cores per package: 2
> real memory  = 3488677888 (3327 MB)
> avail memory = 3413889024 (3255 MB)
> ACPI APIC Table: <PTLTD          APIC  >
> FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
>  cpu0 (BSP): APIC ID:  0
>  cpu1 (AP): APIC ID:  1
> ioapic0 <Version 2.0> irqs 0-23 on motherboard
> ioapic1 <Version 2.0> irqs 24-47 on motherboard
> kbd1 at kbdmux0
> ath_hal: 0.9.17.2 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413)
> acpi0: <PTLTD   RSDT> on motherboard
> acpi0: Power Button (fixed)
> Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
> acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1008-0x100b on acpi0
> cpu0: <ACPI CPU> on acpi0
> acpi_throttle0: <ACPI CPU Throttling> on cpu0
> cpu1: <ACPI CPU> on acpi0
> acpi_throttle1: <ACPI CPU Throttling> on cpu1
> acpi_throttle1: failed to attach P_CNT
> device_attach: acpi_throttle1 attach returned 6
> pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
> pci0: <ACPI PCI bus> on pcib0
> pcib1: <ACPI PCI-PCI bridge> at device 2.0 on pci0
> pci1: <ACPI PCI bus> on pcib1
> pcib2: <ACPI PCI-PCI bridge> irq 16 at device 0.0 on pci1
> pci2: <ACPI PCI bus> on pcib2
> pcib3: <ACPI PCI-PCI bridge> irq 16 at device 0.0 on pci2
> pci3: <ACPI PCI bus> on pcib3
> pcib4: <ACPI PCI-PCI bridge> at device 0.3 on pci1
> pci7: <ACPI PCI bus> on pcib4
> mpt0: <LSILogic SAS/SATA Adapter> port 0x2000-0x20ff mem 0xdc210000-0xdc213fff,0xdc200000-0xdc20ffff irq 24 at device 1.0 on pci7
> mpt0: [GIANT-LOCKED]
> mpt0: MPI Version=1.5.14.0
> mpt0: mpt_cam_event: 0x16
> mpt0: Unhandled Event Notify Frame. Event 0x16 (ACK not required).
> mpt0: mpt_cam_event: 0x16
> mpt0: Unhandled Event Notify Frame. Event 0x16 (ACK not required).
> mpt0: mpt_cam_event: 0x16
> mpt0: Unhandled Event Notify Frame. Event 0x16 (ACK not required).
> mpt0: mpt_cam_event: 0x12
> mpt0: Unhandled Event Notify Frame. Event 0x12 (ACK not required).
> mpt0: mpt_cam_event: 0x12
> mpt0: Unhandled Event Notify Frame. Event 0x12 (ACK not required).
> mpt0: mpt_cam_event: 0x16
> mpt0: Unhandled Event Notify Frame. Event 0x16 (ACK not required).
> mpt0: mpt_cam_event: 0x16
> mpt0: Unhandled Event Notify Frame. Event 0x16 (ACK not required).
> mpt0: mpt_cam_event: 0x16
> mpt0: Unhandled Event Notify Frame. Event 0x16 (ACK not required).
> mpt0: mpt_cam_event: 0xb
> mpt0: Unhandled Event Notify Frame. Event 0xb (ACK not required).
> pcib5: <ACPI PCI-PCI bridge> at device 3.0 on pci0
> pci8: <ACPI PCI bus> on pcib5
> pcib6: <ACPI PCI-PCI bridge> at device 4.0 on pci0
> pci12: <ACPI PCI bus> on pcib6
> pcib7: <PCI-PCI bridge> at device 5.0 on pci0
> pci13: <PCI bus> on pcib7
> pcib8: <ACPI PCI-PCI bridge> at device 6.0 on pci0
> pci14: <ACPI PCI bus> on pcib8
> pcib9: <PCI-PCI bridge> at device 7.0 on pci0
> pci15: <PCI bus> on pcib9
> pcib10: <ACPI PCI-PCI bridge> at device 28.0 on pci0
> pci19: <ACPI PCI bus> on pcib10
> bge0: <Broadcom BCM5750 B1, ASIC rev. 0x4101> mem 0xdc300000-0xdc30ffff irq 16 at device 0.0 on pci19
> miibus0: <MII bus> on bge0
> brgphy0: <BCM5750 10/100/1000baseTX PHY> on miibus0
> brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
> bge0: Ethernet address: 00:1b:78:d2:a9:c2
> pcib11: <ACPI PCI-PCI bridge> at device 28.1 on pci0
> pci20: <ACPI PCI bus> on pcib11
> bge1: <Broadcom BCM5750 B1, ASIC rev. 0x4101> mem 0xdc400000-0xdc40ffff irq 17 at device 0.0 on pci20
> miibus1: <MII bus> on bge1
> brgphy1: <BCM5750 10/100/1000baseTX PHY> on miibus1
> brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
> bge1: Ethernet address: 00:1b:78:d2:a9:c3
> uhci0: <UHCI (generic) USB controller> port 0x1800-0x181f irq 23 at device 29.0 on pci0
> uhci0: [GIANT-LOCKED]
> usb0: <UHCI (generic) USB controller> on uhci0
> usb0: USB revision 1.0
> uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
> uhub0: 2 ports with 2 removable, self powered
> uhci1: <UHCI (generic) USB controller> port 0x1820-0x183f irq 23 at device 29.1 on pci0
> uhci1: [GIANT-LOCKED]
> usb1: <UHCI (generic) USB controller> on uhci1
> usb1: USB revision 1.0
> uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
> uhub1: 2 ports with 2 removable, self powered
> uhci2: <UHCI (generic) USB controller> port 0x1840-0x185f irq 23 at device 29.2 on pci0
> uhci2: [GIANT-LOCKED]
> usb2: <UHCI (generic) USB controller> on uhci2
> usb2: USB revision 1.0
> uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
> uhub2: 2 ports with 2 removable, self powered
> ehci0: <EHCI (generic) USB 2.0 controller> mem 0xdc000000-0xdc0003ff irq 23 at device 29.7 on pci0
> ehci0: [GIANT-LOCKED]
> usb3: EHCI version 1.0
> usb3: companion controllers, 2 ports each: usb0 usb1 usb2
> usb3: <EHCI (generic) USB 2.0 controller> on ehci0
> usb3: USB revision 2.0
> uhub3: Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
> uhub3: 6 ports with 6 removable, self powered
> pcib12: <ACPI PCI-PCI bridge> at device 30.0 on pci0
> pci21: <ACPI PCI bus> on pcib12
> pci21: <display, VGA> at device 2.0 (no driver attached)
> isab0: <PCI-ISA bridge> at device 31.0 on pci0
> isa0: <ISA bus> on isab0
> atapci0: <Intel 63XXESB2 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x1860-0x186f at device 31.1 on pci0
> ata0: <ATA channel 0> on atapci0
> ata1: <ATA channel 1> on atapci0
> pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
> acpi_button0: <Power Button> on acpi0
> sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
> sio0: type 16550A
> pmtimer0 on isa0
> orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xc8fff,0xdc000-0xdffff on isa0
> atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
> atkbd0: <AT Keyboard> irq 1 on atkbdc0
> kbd0 at atkbd0
> atkbd0: [GIANT-LOCKED]
> ppc0: parallel port not found.
> sc0: <System console> at flags 0x100 on isa0
> sc0: VGA <16 virtual consoles, flags=0x300>
> sio1: configured irq 3 not in bitmap of probed irqs 0
> sio1: port may not be enabled
> vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
> ukbd0: ServerEngines SE USB Device, rev 1.10/0.01, addr 2, iclass 3/1
> kbd2 at ukbd0
> ums0: ServerEngines SE USB Device, rev 1.10/0.01, addr 2, iclass 3/1
> ums0: 8 buttons and Z dir.
> Timecounters tick every 1.000 msec
> acd0: DVDROM <DV-28E-V/C.AB> at ata0-master UDMA33
> SMP: AP CPU #1 Launched!
> da0 at mpt0 bus 0 target 1 lun 0
> da0: <LSILOGIC Logical Volume 3000> Fixed Direct Access SCSI-2 device
> da0: 300.000MB/s transfers, Tagged Queueing Enabled
> da0: 237464MB (486326272 512 byte sectors: 255H 63S/T 30272C)
> Trying to mount root from ufs:/dev/da0s1a
> WARNING: /var was not properly dismounted
> #########################################################
>
> I have tried updating the kernel and system with freebsd-update and,
> although this applied a number of updates, it hasn't had any effect
> on this issue.
>
> I was about to write the whole thing off as a hardware compatibility
> issue when another box running FreeBSD threw exactly the same error
> on Friday.  The new box is an Intel box and has absolutely nothing
> in common with the other two on the hardware front, so I'm now back
> to looking at the OS for answers.
>
> Any help and advice would be very much appreciated: although I know
> my way round BSD, and Linux even more so, I'm no expert when it comes
> down to kernel-level issues and bugs.
>
> James
>
> **********************************************************************
> DISCLAIMER:
> This correspondence may contain information which is confidential
> or proprietary or both.  Any dissemination, distribution, copying
> or use of this communication without prior permission of the
> addressee is strictly prohibited. If you are not the intended
> recipient you may not disclose, copy or use this information.  If
> you have received this message in error, please contact the sender
> to discuss its return or destruction.
>
> The contents, comments and views contained or expressed within this
> correspondence do not necessarily reflect those of Redstone, its
> subsidiaries, affiliates, associates or sister companies and are
> not intended to create legal relations with the recipient.
>
> Redstone may monitor email traffic data and also the content of
> email for the purposes of security and staff training.
>
> If you would like to know more about Redstone, visit us on the web
> at www.redstone.co.uk or contact our Head Office on 0845-200-2200.
>
> Redstone Managed Solutions Limited
> Registered in England & Wales with Company Number: 03021292
> Registered Office: 80 Great Eastern Street, London EC2A 3RS
> **********************************************************************


------ FreeBSD UK Users' Group  -  Mailing List ------
http://listserver.uk.freebsd.org/mailman/listinfo/freebsd-users



