I'm using KDE Neon with the latest version of Plasma. Sometimes I get a warning that my SSD has poor health and may die soon. When I check the SMART stats the drive seems fairly healthy. Is this just a Plasma bug?
I'd recommend making backups either way. I've had an SSD with SMART status "good" die very suddenly before, so don't take any chances!
Yeah, I keep everything important backed up. I'm definitely fine if it dies. I'm just curious really.
This happened to me a few weeks ago and the pain is still fresh. Please tell us your data is safely backed up, OP.
The SMART stats in question and SSD model name would definitely help in answering that question.
It's a relatively recent 1TB Samsung 980.
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-6.5.3-060503-generic] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Number: Samsung SSD 980 1TB
Serial Number: S64ANJ0RA44661N
Firmware Version: 1B4QFXO7
PCI Vendor/Subsystem ID: 0x144d
IEEE OUI Identifier: 0x002538
Total NVM Capacity: 1,000,204,886,016 [1.00 TB]
Unallocated NVM Capacity: 0
Controller ID: 5
NVMe Version: 1.4
Number of Namespaces: 1
Namespace 1 Size/Capacity: 1,000,204,886,016 [1.00 TB]
Namespace 1 Utilization: 553,282,572,288 [553 GB]
Namespace 1 Formatted LBA Size: 512
Namespace 1 IEEE EUI-64: 002538 da11440dac
Local Time is: Thu Oct 5 13:48:48 2023 PDT
Firmware Updates (0x16): 3 Slots, no Reset required
Optional Admin Commands (0x0017): Security Format Frmw_DL Self_Test
Optional NVM Commands (0x0055): Comp DS_Mngmt Sav/Sel_Feat Timestmp
Log Page Attributes (0x0f): S/H_per_NS Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg
Maximum Data Transfer Size: 512 Pages
Warning Comp. Temp. Threshold: 82 Celsius
Critical Comp. Temp. Threshold: 85 Celsius
Namespace 1 Features (0x10): NP_Fields
Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     5.24W       -        -    0  0  0  0        0       0
 1 +     4.49W       -        -    1  1  1  1        0       0
 2 +     2.19W       -        -    2  2  2  2        0     500
 3 -   0.0500W       -        -    3  3  3  3      210    1200
 4 -   0.0050W       -        -    4  4  4  4     1000    9000
Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0
=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
SMART/Health Information (NVMe Log 0x02)
Critical Warning: 0x00
Temperature: 37 Celsius
Available Spare: 100%
Available Spare Threshold: 10%
Percentage Used: 1%
Data Units Read: 8,707,548 [4.45 TB]
Data Units Written: 16,750,179 [8.57 TB]
Host Read Commands: 60,932,777
Host Write Commands: 210,324,713
Controller Busy Time: 348
Power Cycles: 802
Power On Hours: 384
Unsafe Shutdowns: 64
Media and Data Integrity Errors: 1
Error Information Log Entries: 1
Warning Comp. Temperature Time: 2470
Critical Comp. Temperature Time: 0
Temperature Sensor 1: 37 Celsius
Temperature Sensor 2: 47 Celsius
Thermal Temp. 2 Transition Count: 54637
Thermal Temp. 2 Total Time: 114793
Error Information (NVMe Log 0x01, 16 of 64 entries)
No Errors Logged
Media and Data Integrity Errors: 1
is most likely the reason for the alert. As long as this number does not increase, the drive should be fine.
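If you want to keep an eye on that counter rather than rereading the output by hand, here is a minimal Python sketch. It assumes smartctl 7.0 or newer (which can emit JSON with `-j`) and that the drive is `/dev/nvme0`, which is only a guess for your system. The key names `nvme_smart_health_information_log` and `media_errors` are what smartctl uses for NVMe drives as far as I know; the function names are just made up for illustration.

```python
import json
import subprocess


def read_media_errors(smartctl_json: str) -> int:
    """Extract the NVMe 'Media and Data Integrity Errors' counter from smartctl JSON."""
    data = json.loads(smartctl_json)
    log = data["nvme_smart_health_information_log"]
    return int(log["media_errors"])


def media_errors_increased(previous: int, smartctl_json: str) -> bool:
    """Return True if the counter has grown since the last recorded value."""
    return read_media_errors(smartctl_json) > previous


def query_smartctl(device: str = "/dev/nvme0") -> str:
    """Run smartctl (usually needs root) and return its JSON output as text.

    The device path is an assumption; adjust it to your system.
    """
    # No check=True here: smartctl uses non-zero exit bits to flag drive
    # conditions, so a non-zero exit does not necessarily mean the call failed.
    result = subprocess.run(
        ["smartctl", "-j", "-a", device],
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Something like `media_errors_increased(1, query_smartctl())` would then tell you whether the count has moved past the 1 shown above. Storing the last value in a file and running this from a timer is one option, but that part is left out here.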
Be careful with that specific SSD. There's a firmware bug that can switch it to read-only mode, with no fix afterwards. I don't recognize your firmware version, so I don't think it's one of the patched ones (I think the ones whose first character is greater than 4 are patched).
Does your drive do any weird bit packing in its SMART data? Some of mine used to do weird shit like store read errors plus total sectors read packed into one field, which constantly threw errors until I added the proper data format in a config file. Without a specified data format, smartmon just assumed the drive was reporting absolutely huge numbers of errors and threw warnings on every run.
Try googling the specific error message, or your drive model plus the error and see if anything pops up.
Back up your data, use zram, cache to RAM, and compare the TBW (terabytes written) so far to your SSD's rated endurance to get an estimate of how much life it has left.
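For the TBW comparison, a rough Python sketch of the arithmetic using the "Data Units Written" value from the smartctl output above. NVMe reports data units in thousands of 512-byte blocks. The 600 TBW rating is an assumption for the 1TB Samsung 980, so check the datasheet for your exact model; the function names are only illustrative.

```python
# NVMe reports data units as thousands of 512-byte blocks.
BYTES_PER_DATA_UNIT = 512 * 1000


def terabytes_written(data_units_written: int) -> float:
    """Convert the NVMe 'Data Units Written' counter to decimal terabytes."""
    return data_units_written * BYTES_PER_DATA_UNIT / 1e12


def endurance_used_percent(data_units_written: int, rated_tbw: float) -> float:
    """Share of the rated endurance (in TBW) consumed so far, as a percentage."""
    return 100.0 * terabytes_written(data_units_written) / rated_tbw


# Counter taken from the smartctl output in this thread.
# rated_tbw=600.0 is an assumed rating, not something reported by the drive.
written_tb = terabytes_written(16_750_179)
used = endurance_used_percent(16_750_179, rated_tbw=600.0)
print(f"{written_tb:.2f} TB written, {used:.2f}% of rated endurance")
# prints "8.58 TB written, 1.43% of rated endurance"
```

Under that assumed rating, the result lines up with the "Percentage Used: 1%" the drive reports itself, so by write volume alone the drive is barely worn.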