[Novalug] need help: server freezing -- How to troubleshoot
Dave Ashby
dave.ashby at 1993.usna.com
Fri Oct 23 17:37:07 EDT 2009
Not to send you down a rabbit trail, but when I was looking to upgrade
my HDDs a couple of months back I was debating between the 1 TB and 1.5 TB
Seagate drives. Some of the reviewers indicated that they had issues
with the 1.5 TB drives - generally traced to an issue with the HDD
firmware or the mobo firmware (some reported issues with their NVidia
chipsets and drives > 1 TB in size). Have you checked the reviews for
your 1.5 TB drives to see if other folks were having similar mystery
lockups? I know the Seagate 1.5 TB reviews are full of them.....
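
To match your drives against what the reviewers report, you need the exact model and firmware revision. Here is a rough sketch of how that could be scripted, assuming smartmontools is installed and that `smartctl -i` prints "Device Model:" and "Firmware Version:" lines, as it normally does for ATA/SATA drives. The function names and the sample values are made up for illustration, not taken from any real drive.

```python
import subprocess


def parse_identity(info_text):
    """Pull the model and firmware fields out of `smartctl -i` style output."""
    fields = {}
    for line in info_text.splitlines():
        # Split each "Label: value" line at the first colon only.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {
        "model": fields.get("Device Model"),
        "firmware": fields.get("Firmware Version"),
    }


def drive_identity(device):
    """Run `smartctl -i` on a device (usually needs root) and parse the result."""
    result = subprocess.run(
        ["smartctl", "-i", device],
        capture_output=True, text=True, check=False,
    )
    return parse_identity(result.stdout)


# Hypothetical sample text; the model and firmware strings are placeholders.
SAMPLE = """\
Device Model:     EXAMPLE-1500
Firmware Version: XX00
User Capacity:    1,500,301,910,016 bytes
"""

if __name__ == "__main__":
    print(parse_identity(SAMPLE))
```

On the real box you would call something like `drive_identity("/dev/sdb")` for each drive and compare the firmware string with the revisions mentioned in the reviews.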
-dave
Bryan J. Smith wrote:
> Couple of things ...
>
> 1. Update your kernel. Looks like you're using a stock Ubuntu kernel.
>
> 2. I see actions on the /dev/sdb (1TB) drive around some of the freezes.
> Have you tried just the 1.5TB drives on their own, without the 1TB drives?
> I know you said you never had an issue with the 1TB drives, but it can't
> hurt to try the 1TB and 1.5TB pairs on their own.
>
>
> 3. What is that 4GB USB flash drive I see sometimes? Are you leaving
> that plugged in 24x7? If so, then try running without it plugged in. Or was
> it just when you plugged something in?
>
> All in all, this log really doesn't show much, other than some MD
> operations. I guess we'll need to dive into SMART.
>
> SMART can report statistics and it can do its own diagnostic runs.
> This Linux Journal article gives a good intro:
> http://www.linuxjournal.com/article/6983
>
> ----- Original Message ----
> From: Richard Ertel <richard.ertel at gmail.com>
> To: Jay Hart <jhart at kevla.org>
> Cc: Novalug <novalug at calypso2.tux.org>
> Sent: Fri, October 23, 2009 8:42:56 AM
> Subject: Re: [Novalug] need help: server freezing -- How to troubleshoot
>
> Jay, here is my /var/log/messages: http://pastebin.com/m1414a669
> ugh it's big.
>
> i was running the system with just the two raid-1 drives last night
> (oct 22), then it locked, so i powered it down and went to bed. turned
> it back on this morning (oct 23) around 6am. anyone looking at this
> log should start there and work backwards to the previous boot, i
> guess.
>
> On Thu, Oct 22, 2009 at 21:56, Jay Hart <jhart at kevla.org> wrote:
>
>> Richard,
>>
>> Please post your messages file here. Paul has a good idea, but hardware
>> problems are not always captured in log files if the *Sg&S(# PC locks up before
>> the entry is written.
>>
>> I have successfully troubleshot hundreds of PCs, and the first thing I always
>> try to do is go bare bones and see if the problem still exists, then start adding
>> one thing back into the system at a time. I used this method in the
>> Nuclear Navy to great effect.
>>
>> So post your messages file so we can look it over. The dmesg output from startup
>> would be nice too. If you post the dmesg output, capture it with the PC fully configured.
>>
>> Jay
>>
>>
>>> Richard Ertel wrote:
>>>
>>>> *sigh*
>>>>
>>>> ok, so my fileserver is locking up. seems to always happen, anywhere
>>>> from 1 minute to 4 hours after booting. if i disconnect all four SATA
>>>> hard drives (all for storage) and just have the boot drive (PATA)
>>>> connected, it seems to stay up indefinitely.
>>>>
>>>> i've run the SATA drives that i thought were problematic through
>>>> Seagate's SeaTools, and they passed all tests.
>>>>
>>>> i've looked through /var/log/messages for entries when the lockup
>>>> occurred, but nothing looks odd (to me, what do i know?)
>>>>
>>>> can anyone tell me where to start troubleshooting to get to the bottom of
>>>> this?
>>>>
>>>> Ubuntu Server 8.04.3, all updates as of this morning.
>>>>
>>> Rich Ertel,
>>>
>>> On the one hand, the responses that you have received from Jay Hart and
>>> Gerald Williams are not bad, indeed, they are good ideas. On the other
>>> hand, that is not the best way to troubleshoot. IMO, blindly guessing as
>>> to the cause (of a problem) is rarely the best way to troubleshoot. When
>>> troubleshooting, you should always _first_ attempt to generate
>>> diagnostic information (DI), and, in order to do that, you must identify
>>> the tools (e.g., software packages) that will help you to generate DI.
>>> Some of these tools are built into a standard Linux distribution, but
>>> others must be installed. I cannot recall the names of any such tools
>>> for HDDs and HDD controllers, but I am certain that they exist.
>>>
>>> Sincerely,
>>> Paul Bain
>>>
>>> _______________________________________________
>>> Novalug mailing list
>>> Novalug at calypso.tux.org
>>> http://calypso.tux.org/mailman/listinfo/novalug