[mythtv-users] warning for anyone with western digital green drives
PJR
pjrobinson at metronet.co.uk
Sun Jan 8 00:13:28 UTC 2012
Just for the record this is the output from smartctl for one of the drives:
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.0.0-12-generic] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF INFORMATION SECTION ===
Device Model: WDC WD10EARX-00N0YB0
Serial Number: WD-WCC0S0207719
LU WWN Device Id: 5 0014ee 25b70161a
Firmware Version: 51.0AB51
User Capacity: 1,000,204,886,016 bytes [1.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 8
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Wed Dec 21 10:16:06 2011 GMT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status:  (0x85) Offline data collection activity
                                        was aborted by an interrupting command from host.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (17760) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 174) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x30b5) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f 194   173   051    Pre-fail Always  -           7565
  3 Spin_Up_Time            0x0027 118   116   021    Pre-fail Always  -           7100
  4 Start_Stop_Count        0x0032 100   100   000    Old_age  Always  -           382
  5 Reallocated_Sector_Ct   0x0033 200   200   140    Pre-fail Always  -           0
  7 Seek_Error_Rate         0x002e 200   200   000    Old_age  Always  -           0
  9 Power_On_Hours          0x0032 100   100   000    Old_age  Always  -           252
 10 Spin_Retry_Count        0x0032 100   100   000    Old_age  Always  -           0
 11 Calibration_Retry_Count 0x0032 100   100   000    Old_age  Always  -           0
 12 Power_Cycle_Count       0x0032 100   100   000    Old_age  Always  -           365
192 Power-Off_Retract_Count 0x0032 200   200   000    Old_age  Always  -           252
193 Load_Cycle_Count        0x0032 200   200   000    Old_age  Always  -           1667
194 Temperature_Celsius     0x0022 117   106   000    Old_age  Always  -           30
196 Reallocated_Event_Count 0x0032 200   200   000    Old_age  Always  -           0
197 Current_Pending_Sector  0x0032 200   200   000    Old_age  Always  -           0
198 Offline_Uncorrectable   0x0030 100   253   000    Old_age  Offline -           0
199 UDMA_CRC_Error_Count    0x0032 200   200   000    Old_age  Always  -           0
200 Multi_Zone_Error_Rate   0x0008 200   200   000    Old_age  Offline -           0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure      70%         248        566231424
# 2  Short offline       Completed without error      00%         247        -
# 3  Extended offline    Completed: read failure      70%         242        566231424
# 4  Extended offline    Completed: read failure      70%         221        566231424
# 5  Extended offline    Completed: read failure      70%         215        566231424
# 6  Extended offline    Completed: read failure      70%         213        566231424
# 7  Extended offline    Completed: read failure      70%         206        566231424
# 8  Extended offline    Completed: read failure      70%         194        566231424
# 9  Conveyance offline  Completed without error      00%         189        -
#10  Extended offline    Completed: read failure      70%         187        566231424
#11  Short offline       Completed without error      00%         150        -
#12  Extended offline    Completed: read failure      70%         150        566231424
#13  Short offline       Completed without error      00%         149        -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
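For anyone wanting to reproduce this, the log above comes from ordinary smartmontools commands; a sketch follows. The device name /dev/sdb is a placeholder, and the byte-offset arithmetic assumes the 512-byte logical sector size reported in the information section above.

```shell
# Start an extended self-test and later read back the log shown above
# (placeholder device name, run as root):
#   smartctl -t long /dev/sdb
#   smartctl -l selftest /dev/sdb

# The failing LBA can be converted to a byte offset on the disk by
# multiplying by the logical sector size (512 bytes on this drive):
lba=566231424
sector_size=512
offset=$((lba * sector_size))
echo "first read failure at byte offset $offset"
```

That offset is what you would feed to filesystem tools if you wanted to find out which file, if any, sits on the bad sector.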
On 08/01/12 00:05, PJR wrote:
> The reason I said the discs were 'faulty' was that running
> smartctl on both discs terminated with an error: it reported read
> failures consistently for the same LBA_of_first_error, _not_ because of
> the Load_Cycle_Count value (I mentioned this in my first post to this
> thread, and it is the reason the discs have been returned to the supplier).
> I was just reporting my Load_Cycle_count_values since this seemed to
> be a direction the thread was going and the info might be useful.
>
>
> On 07/01/12 17:28, Simon Hobson wrote:
>> PJR wrote:
>>> My two WDC WD10EARX-00N0YB0 (possibly faulty) drives have:
>>>
>>> 9 Power_on_Hours 190
>>> 193 Load_Cycle_Count 1064
>>>
>>> 9 Power_on_Hours 175
>>> 193 Load_Cycle_Count 1064
>>>
>>> I assume, if they weren't faulty this is not good?
>> Why faulty?
>> The whole essence of this thread is that these drives have (by
>> default) very aggressive power saving. If idle for just 8 seconds
>> they will unload the heads - which I assume means moving them to a
>> safe zone on the disk and lifting them. After this, I suspect there's
>> another fairly short timer before they also spin down the drive.
>>
>> On a typical Unix [like] system, there are frequent disk accesses -
>> checking this, logging that, etc, etc. So what tends to happen is the
>> drive goes idle, unloads the heads, the system accesses it so it
>> loads the heads, rinse and repeat often. So the Load_Cycle_Count
>> which counts these will increment quite rapidly.
>>
>> The suggestion is that it is better to increase the timeout, so under
>> normal use they won't unload the heads very often, if at all.
>> Say you accessed the drive every 30 seconds. On each access, the
>> drive would load the heads, do the access, then 8 seconds later
>> unload the heads again. That's 120 load cycles per hour, or 2880
>> load cycles per day!
>> Increase the timeout to a minute or two, and under the same scenario
>> you'd probably have none.
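Simon's arithmetic can be checked in the shell, and raising the timer is a one-line fix. The 8-second figure is the WD Green idle3 default; idle3ctl (from the idle3-tools package) is one way to change it, though whether a given drive accepts it is an assumption here, and /dev/sdX is a placeholder.

```shell
# Load cycles for a drive accessed every 30 s with an 8 s park timer:
accesses_per_hour=$((3600 / 30))   # each access triggers one head load
per_day=$((accesses_per_hour * 24))
echo "$accesses_per_hour cycles/hour, $per_day cycles/day"

# One way to raise the idle3 timer on WD Green drives (assumes
# idle3-tools is installed and the drive supports the timer;
# the new value takes effect after a power cycle):
#   idle3ctl -g /dev/sdX      # show the current idle3 timer
#   idle3ctl -s 138 /dev/sdX  # set a longer timeout
```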
>>
>>
>> Which reminds me I need to twiddle with some drives I use for
>> backups. I bought a SATA card for the old Mac I use to run
>> Retrospect, but kept getting some rather oddball errors, kernel
>> panics, and so on. I thought the card might be faulty, but then
>> twigged ...
>> I've done a cron job which every minute updates a file on any of the
>> backup disks that are mounted - which prevents the drive going to
>> sleep and spinning down. I suspect the load cycle count will be going
>> up quite nicely on them - up to 1440 per day while the drive isn't
>> actually being accessed by the backup software.
>> What was happening was that the backup software accesses the drive to
>> see what it is, then goes off to find a client - it then scans the
>> client, does a bit of thinking to decide what files need copying, and
>> then starts copying files, by which time the drive has got bored and
>> spun down, resulting in an error and/or kernel panic.
>>
>
>
> _______________________________________________
> mythtv-users mailing list
> mythtv-users at mythtv.org
> http://www.mythtv.org/mailman/listinfo/mythtv-users
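For reference, the keep-awake cron job Simon describes might look like the sketch below. The mount point is a made-up example, not the path he uses.

```shell
# Touch a file on the backup volume once a minute so the drive never
# idles long enough to park its heads or spin down.
#
# crontab entry (hypothetical mount point):
#   * * * * * touch /mnt/backup/.keepalive
#
# The command itself, demonstrated against a stand-in directory:
mountpoint=/tmp/fake-backup-volume   # stand-in for the real mount point
mkdir -p "$mountpoint"
touch "$mountpoint/.keepalive"
ls "$mountpoint/.keepalive"
```

Note that, as Simon says, this also defeats the power saving entirely, so the load-cycle problem is traded for a drive that never sleeps.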