LINUX.ORG.RU
Forum / Talks

Enterprise, senseless and merciless

 



A short list of problems fixed in the latest firmware release for a RAID controller from a well-known vendor. Thank God the server blew up before it was put into production.

http://h20564.www2.hpe.com/hpsc/swd/public/detail?sp4ts.oid=7274906&swIte...

Upgrade Requirement: Recommended - HPE recommends users update to this version at their earliest convenience.

SATA SSD’s were incorrectly marked as worn out.

Poor performance on the first logical volume with drive write cache enabled.

Controller might hang or show as failed in the IML with 20+ SATA SSD’s configured in a RAID6 volume with Smart Path enabled and IO running.

System might hang at POST following a reboot.

System might lockup at POST following a reboot (POST Lockup 0x13)

Duplicate SCSI devices shown in Linux after issuing a bus rescan command.

Recovered paths were not restored when using a SmartArray P742m controller connected to an external drive enclosure and a shared storage external enclosure.

Smart Array P741m controller missing the embedded SmartCache license key.

System might stop responding in response to a rare error (POST Lockup 0x13).

Drives might not appear as HPE authentic drives when used in a zoning configuration.

Deferred drive firmware flashing might hang at POST with some models of drives.

Deleted

Symptoms:

A kernel panic sometimes occurs if an HDLM device is used concurrently by multiple processes that use HDLM devices.

In addition, if a kernel panic occurs, the following message is output to the console log:

[10262.612190] BUG: unable to handle kernel NULL pointer dereference at 0000000000000398

Conditions of occurrence:

This symptom may occur when all of the following conditions are met:

1) A path failure (cable failure) occurs, and recovery is performed.

2) The HDLM device that became accessible in 1) is used by a process (*1).

3) The HDLM device in 2) is used by a process other than the process in 2) (*1).

4) The process in 2) and the process in 3) start concurrently.

Footnotes:

(*1): This includes cases when the HDLM device is used by the following utilities, commands, or functions provided by HDLM:

Utilities:

- dlmcfgmgr (executed with the -r parameter specified)

- dlmcfgmgr (executed with the -i parameter specified)

- dlmchname

- dlmstart

- dlmgetras

- dlmmkinitrd

Commands:

- dlnkmgr online/add

- dracut

Functions:

- Path health check

- Auto failback

Cause:

This problem is due to a defect in HDLM for Linux in the exclusion control function related to the processing that counts the number of times a SCSI device managed by HDLM was opened. As a result, HDLM executes the close processing for SCSI devices that are not opened, and a kernel panic occurs.

Impact to customer operations:

Frequency of problem: Low.

Severity of problem: High.

Recovery from problem:

Restart the server.
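To make the "Cause" paragraph of the advisory above concrete, here is a minimal Python sketch of an open counter with exclusion control. This is not HDLM code: `OpenCounter`, `worker`, and `run` are hypothetical names, and the real driver lives in the kernel. It only shows the general idea: the counter is updated under a lock, and a close on a device that is not open is refused instead of being carried out.

```python
import threading


class OpenCounter:
    """Hypothetical model of a per-device open count (not HDLM's real code)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._count = 0

    @property
    def count(self):
        with self._lock:
            return self._count

    def open(self):
        # The increment happens under the lock, so concurrent opens
        # cannot lose an update.
        with self._lock:
            self._count += 1
            return self._count

    def close(self):
        # Refuse to close a device that is not open, instead of running
        # close processing on it.
        with self._lock:
            if self._count == 0:
                return False
            self._count -= 1
            return True


def worker(counter, rounds):
    # Each worker opens and then closes the device the same number of times.
    for _ in range(rounds):
        counter.open()
        counter.close()


def run(threads=4, rounds=10_000):
    counter = OpenCounter()
    pool = [
        threading.Thread(target=worker, args=(counter, rounds))
        for _ in range(threads)
    ]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return counter.count
```

With the lock in place every open is matched by a close, so `run()` ends with a count of zero. Without it, two processes starting at the same moment could miscount opens, which is the kind of situation the advisory describes.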

Let's compare alerts and changelogs.

cipher ★★★★★
()
In reply to: comment by abraziv_whiskey

Some comparison that is: a satellite and a RAID controller.

Harald ★★★★★
()

At least the data is intact... I nearly went gray reading the release notes for IBM SVC: the firmware version IBM recommended had a bug that could destroy volumes configured with replication and snapshots (including the replica and the snapshot). On the other hand, that version fixed another bug, which by then had already wiped out all the data on our array.

And along the same lines:

Potential data loss scenario when using compressed volumes on SVC and Storwize V7000 running software versions 7.3, 7.4 or 7.5.

Multiple node asserts leading to loss of access to data when changing a volume throttle rate to a value of more than 10000 IOPs or 40MBps

And here are a couple more tasty ones:

Temporary loss of paths for FCoE hosts after 497 days uptime due to FCoE driver timer problem

There is a serious issue on 6.2.0.0 releases and higher that will result in node canisters rebooting after 208 days of continuous uptime since their last power cycle or software upgrade

Although that last bug is inherited from the Linux kernel.
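The two uptime figures quoted above look like counter wraparounds. A quick arithmetic check, under assumptions the release notes themselves do not state: that 497 days corresponds to a 32-bit tick counter running at 100 ticks per second, and that 208 days corresponds to a nanosecond clock value overflowing at 2^54. `days_until_wrap` is a hypothetical helper written for this check.

```python
SECONDS_PER_DAY = 86_400


def days_until_wrap(bits, ticks_per_second):
    """Days until a counter of the given width wraps at the given tick rate."""
    return 2 ** bits / ticks_per_second / SECONDS_PER_DAY


# Assumption: a 32-bit tick counter at 100 Hz.
print(f"{days_until_wrap(32, 100):.1f} days")            # 497.1 days

# Assumption: a nanosecond clock that overflows at 2**54.
print(f"{days_until_wrap(54, 1_000_000_000):.1f} days")  # 208.5 days
```

Both results match the numbers in the release notes, which fits the remark that the 208-day bug comes from the Linux kernel, but the match alone does not prove the root cause.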

user_undefined
()

Oh come on, as if you'd discovered America. A list of fixed bugs.

It can also go like this:

The following table lists open bugs at the time of this Release Note publication.

CSCto19832 OpenLDAP needs to be upgraded or patched
CSCuu92845 9.4.1 - Traceback in ctm_sw_rng.c on 5515 boot
CSCva82479 Doc: ASA SSH config options for addressing slow SCP copy to ASA
CSCuz90696 DOC: Update supported VPN clients on ASA in multi-context mode
CSCva77178 EIGRP does not populate routing table on FPR4K with ASA software
CSCva35990 Traceback on CP Process with H323 inspection, rip h323_service_early_msg
CSCva43992 IKEv2 RA cert auth. Unable to allocate new session. Max sessions reached
CSCva80364 256 Block depletion due to high syslog generation.
CSCva88128 ASA UDP syslog inconsistent no of conn table entries across platforms
CSCva39094 ASA traceback in CLI thread while making MPF changes
CSCva69346 Unable to relay DHCP discover packet from ASA when NAT is matched
CSCuy85511 libxml2 htmlParseNameComplex() Function Denial of Service Vulnerabilit
CSCux85525 XMLSoft libxml2 Encoding Conversion Denial of Service Vulnerability
CSCux85528 XMLSoft libxml2 XML Entity Processing Denial of Service Vulnerability
CSCux85532 XMLSoft libxml2 xmlNextChar Function Memory Corruption Vulnerability
CSCux85527 XMLSoft libxml2 xmlParserInputGrow Function Denial of Service Vulnerab
CSCux85533 XMLSoft libxml2 xmlParseXMLDecl Function Denial of Service Vulnerabili
CSCuz05856 XMLSoft libxml2 xmlStringGetNodeList Function Memory Exhaustion Denial
CSCva31378 ASA crash at Thread Name: rtcli async executor process
CSCuz83966 ASA: Cut Through proxy is not working with TLS1.2
CSCva33271 WebVPN smart tunnels fail when ASA identity cert uses SHA256 signature 

And there's nothing special about it; they'll fix these too someday.

eabi
()

And you don't even get to see the list of bugs the vendor still has sitting unfixed =)

bigbit ★★★★★
()

What was your point?

Pay more money, get fewer problems. Pay less, get more problems.

Buy an EMC storage array for a couple million bucks and you'll have a guarantee that even after a crash they'll get everything working again within 4 hours.

Iron_Bug ★★★★★
()
In reply to: comment by eabi

Oh, libxml2 has plenty of interesting things that don't work the way they should.

cvs-255 ★★★★★
()