[Novalug] need help: server freezing -- How to troubleshoot
richard.ertel at gmail.com
Fri Oct 23 09:31:37 EDT 2009
1. you mean beyond doing "apt-get dist-upgrade"? like compiling my own
kernel? that sounds... hard. or have things changed since 1999?
2. currently the 1.0 TB RAID-1 drives are disconnected. just the 1.5TB
LVM drives are connected, and i am copying files to the volume. i need
to throw some reads on there too. can i just copy stuff to /dev/null?
3. that was a Kubuntu 9.10 Beta live USB, just used to make sure i
could boot to usb for future troubleshooting. was only connected for
30 minutes maybe, and is certainly not connected right now.
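On point 2, the "copy stuff to /dev/null" idea amounts to reading every file on the volume and throwing the data away. A minimal sketch of that in Python follows. The function name `read_everything` and the chunk size are made up for illustration and are not from any tool mentioned in this thread. One caveat: files that were just written may be served from the page cache rather than from disk, so reading older or larger data is more likely to exercise the drives.

```python
import os


def read_everything(root, chunk_size=1024 * 1024):
    """Read every regular file under `root` and discard the data.

    Returns (files_read, bytes_read, errors), where errors is a list of
    (path, message) pairs for files that could not be read completely.
    `root` is whatever directory you want to generate read load on.
    """
    files_read = 0
    bytes_read = 0
    errors = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks, FIFOs, sockets, and device nodes.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as fh:
                    while True:
                        chunk = fh.read(chunk_size)
                        if not chunk:
                            break
                        # The data is dropped here; only the count is kept.
                        bytes_read += len(chunk)
                files_read += 1
            except OSError as exc:
                # An I/O error is itself useful diagnostic information.
                errors.append((path, str(exc)))
    return files_read, bytes_read, errors
```

Calling it on the mount point of the LVM volume (wherever that happens to be mounted) and printing the returned tuple shows how much was read before any lockup, and any read errors that came back.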
On Fri, Oct 23, 2009 at 09:23, Bryan J. Smith <b.j.smith at ieee.org> wrote:
> Couple of things ...
> 1. Update your kernel. Looks like you're using a stock Ubuntu kernel.
> 2. I see actions on the /dev/sdb (1TB) drive around some of the freezes.
> Have you tried just the 1.5TB drives on their own, without the 1TB drives?
> I know you said you never had an issue with the 1TB drives, but it can't
> hurt to try the 1TB and 1.5TB pairs on their own.
> 3. What is that 4GB USB flash drive I see sometimes? Are you leaving
> that plugged in 24x7? If so, then try running without it plugged in. Or was
> it just when you plugged something in?
> All in all, this log really doesn't show much, other than some MD
> operations. I guess we'll need to dive into SMART.
> SMART can report statistics and it can do its own diagnostic runs.
> This Linux Journal (LJ) article gives a good intro:
> ----- Original Message ----
> From: Richard Ertel <richard.ertel at gmail.com>
> To: Jay Hart <jhart at kevla.org>
> Cc: Novalug <novalug at calypso2.tux.org>
> Sent: Fri, October 23, 2009 8:42:56 AM
> Subject: Re: [Novalug] need help: server freezing -- How to troubleshoot
> Jay, here is my /var/log/messages: http://pastebin.com/m1414a669
> ugh it's big.
> i was running the system with just the two raid-1 drives last night
> (oct 22), then it locked, so i powered it down and went to bed. turned
> it back on this morning (oct 23) around 6am. anyone looking at this
> log should start there and work backwards to the previous boot, i
> On Thu, Oct 22, 2009 at 21:56, Jay Hart <jhart at kevla.org> wrote:
>> Please post your messages file here. Paul has a good idea, but hardware
>> problems are not always captured in log files if the *Sg&S(# PC locks up prior
>> to entry being written.
>> I have successfully troubleshot hundreds of PCs, and the first thing I always
>> try to do is go bare bones and see if problem still exists, then start adding
>> one thing back into the system at a time. Used this type of method in the
>> Nuclear Navy to great effect.
>> So post your messages file, so we can look it over. The dmesg output from
>> startup would be nice too. If you post dmesg, capture it with the PC fully
>> configured.
>>> Richard Ertel wrote:
>>>> ok, so my fileserver is locking up. seems to always happen, anywhere
>>>> from 1 minute to 4 hours after booting. if i disconnect all four SATA
>>>> hard drives (all for storage) and just have the boot drive (PATA)
>>>> connected, it seems to stay up indefinitely.
>>>> i've run the SATA drives that i thought were problematic through
>>>> Seagate's SeaTools, and they passed all tests.
>>>> i've looked through /var/log/messages for entries when the lockup
>>>> occurred, but nothing looks odd (to me, what do i know?)
>>>> can anyone tell me where to start troubleshooting to get to the bottom of
>>>> this?
>>>> Ubuntu Server 8.04.3, all updates as of this morning.
>>> Rich Ertel,
>>> On the one hand, the responses that you have received from Jay Hart and
>>> Gerald Williams are not bad, indeed, they are good ideas. On the other
>>> hand, that is not the best way to troubleshoot. IMO, blindly guessing as
>>> to the cause (of a problem) is rarely the best way to troubleshoot. When
>>> troubleshooting, you should always _first_ attempt to generate
>>> diagnostic information (DI), and, in order to do that, you must identify
>>> the tools (e.g., software packages) that will help you to generate DI.
>>> Some of these tools are built into a standard Linux distribution, but
>>> others must be installed. I cannot recall the names of any such tools
>>> for HDDs and HDD controllers, but I am certain that they exist.
>>> Paul Bain
>>> Novalug mailing list
>>> Novalug at calypso.tux.org