Hi, LOR.
After switching to the open-source driver, my video card started hanging under load at random intervals, anywhere from 2 to 20 minutes, sometimes longer. The logs show this:
Oct 2 23:45:38 localhost kernel: [ 864.022715] pcieport 0000:00:02.0: AER: Uncorrected (Non-Fatal) error received: id=0010
Oct 2 23:45:38 localhost kernel: [ 864.022725] pcieport 0000:00:02.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0010(Requester ID)
Oct 2 23:45:38 localhost kernel: [ 864.022728] pcieport 0000:00:02.0: device [8086:2f04] error status/mask=00004000/00000000
Oct 2 23:45:38 localhost kernel: [ 864.022730] pcieport 0000:00:02.0: [14] Completion Timeout (First)
Oct 2 23:45:38 localhost kernel: [ 864.022736] pcieport 0000:00:02.0: AER: Device recovery failed
Oct 2 23:45:38 localhost kernel: [ 864.040235] pcieport 0000:00:02.0: AER: Uncorrected (Non-Fatal) error received: id=0010
Oct 2 23:45:38 localhost kernel: [ 864.040242] pcieport 0000:00:02.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0010(Requester ID)
Oct 2 23:45:38 localhost kernel: [ 864.040244] pcieport 0000:00:02.0: device [8086:2f04] error status/mask=00004000/00000000
Oct 2 23:45:38 localhost kernel: [ 864.040246] pcieport 0000:00:02.0: [14] Completion Timeout (First)
Oct 2 23:45:38 localhost kernel: [ 864.040252] pcieport 0000:00:02.0: AER: Device recovery failed
Oct 2 23:45:38 localhost kernel: [ 864.049196] pcieport 0000:00:02.0: AER: Uncorrected (Fatal) error received: id=0010
Oct 2 23:45:38 localhost kernel: [ 864.049204] pcieport 0000:00:02.0: PCIe Bus Error: severity=Uncorrected (Fatal), type=Transaction Layer, id=0010(Requester ID)
Oct 2 23:45:38 localhost kernel: [ 864.049214] pcieport 0000:00:02.0: device [8086:2f04] error status/mask=00004020/00000000
Oct 2 23:45:38 localhost kernel: [ 864.049216] pcieport 0000:00:02.0: [ 5] Surprise Down Error
Oct 2 23:45:38 localhost kernel: [ 864.049218] pcieport 0000:00:02.0: [14] Completion Timeout (First)
Oct 2 23:45:39 localhost kernel: [ 865.053266] pcieport 0000:00:02.0: AER: Device recovery failed
Oct 2 23:45:48 localhost kernel: [ 874.384757] radeon 0000:02:00.0: ring 0 stalled for more than 10020msec
Oct 2 23:45:48 localhost kernel: [ 874.384761] radeon 0000:02:00.0: GPU lockup (current fence id 0x000000000000de0a last fence id 0x000000000000de10 on ring 0)
After a while the whole system hangs. If I disable DPM, the card still hangs, but the logs show nothing except "ring 0 stalled" and "GPU lockup". Where should I start digging?
The card is a 7950. Kernel vanilla 4.7.6, ati 7.7.1, xorg 1.18.4, mesa 12.0.3.
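For reference, the "error status" values in the log above are bitmasks, and the bracketed numbers on the following lines ([14], [ 5]) are the bits that are set. Below is a minimal Python sketch that decodes them, assuming the standard bit layout of the PCIe AER Uncorrectable Error Status register. The function name decode_aer_status and the table name are made up for illustration and are not part of any kernel or tool API.

```python
# Bit positions of the PCIe AER Uncorrectable Error Status register.
# Assumption: standard layout from the PCI Express Base Specification;
# only the commonly reported bits are listed here.
AER_UNCORRECTABLE_BITS = {
    4: "Data Link Protocol Error",
    5: "Surprise Down Error",
    12: "Poisoned TLP",
    13: "Flow Control Protocol Error",
    14: "Completion Timeout",
    15: "Completer Abort",
    16: "Unexpected Completion",
    17: "Receiver Overflow",
    18: "Malformed TLP",
    19: "ECRC Error",
    20: "Unsupported Request",
}


def decode_aer_status(status_hex):
    """Return (bit, name) pairs for every bit set in a hex status string."""
    status = int(status_hex, 16)
    return [
        (bit, AER_UNCORRECTABLE_BITS.get(bit, "unknown or reserved"))
        for bit in range(32)
        if status & (1 << bit)
    ]


if __name__ == "__main__":
    # The two status values that appear in the log above.
    for value in ("00004000", "00004020"):
        print(value, decode_aer_status(value))
```

The decoded bits match what the kernel prints: 00004000 is only Completion Timeout, and 00004020 adds Surprise Down Error, which is reported at the point where the error becomes Fatal.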