Hi everyone. I'm trying to bring up a fault-tolerant MSSQL cluster in an HA Cluster, following this guide. I'm using two virtual machines running CentOS 8. The disk device is a raw LUN passed through to both machines. When I set up the Always On variant there were no problems, but the failover cluster instance won't come up.
I think the problem is with the ocf::mssql:fci resource. It refuses to start, and there are serious encoding problems in its error output. I built a simple single-node cluster with no resource dependencies, just to get the server to start inside Pacemaker.
Node List:
* Online: [ centos8.local ]
Full List of Resources:
* Resource Group: mssql:
* my_lvm (ocf::heartbeat:LVM-activate): Started centos8.local
* my_fs (ocf::heartbeat:Filesystem): Started centos8.local
* VirtualIP (ocf::heartbeat:IPaddr2): Started centos8.local
* mssqlha (ocf::mssql:fci): Stopped
Failed Resource Actions:
* mssqlha_start_0 on centos8.local 'error' (1): call=24, status='complete', exitreason='2024/10/22 11:12:54 Unexpected error: mssql: \37777777720\37777777643 \37777777720\37777777677\37777777720\37777777676\37777777720\37777777673\37777777721\37777777614\37777777720\37777777667\37777777720\37777777676\37777777720\37777777662\37777777720\37777777660\37777777721\37777777602\37777777720\37777777665\37777777720\37777777673\37777777721\37777777617 \37777777720\37777777675\37777777720\37777777665\37777777721\37777777602 \37777777721\37777777600\37777777720\37777777660\37777777720\37777777667\37777777721\37777777600\37777777720\37777777665\37777777721\37777777610\37777777720\37777777665\37777777720\37777777675\37777777720\37777777670\37777777721\37777777617 \37777777720\37777777675\37777777720\37777777660 \37777777720\37777777662\37777777721\37777777613\37777777720\37777777677\37777777720\37777777676\37777777720\37777777673\37777777720\37777777675\37777777720\37777777665\37777777720\37777777675\37777777720\37777777670\37777777720\37777777665 \37777777721', last-rc-change='2024-10-22 11:12:44 +03:00', queued=0ms, exec=10369ms
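The `\37777777720`-style runs in the exitreason look like UTF-8 bytes that were sign-extended to 32 bits and printed as octal escapes. A minimal Python sketch of how they might be decoded, assuming every escape is exactly 11 octal digits and only its low 8 bits matter (the function name `decode_octal_escapes` is made up for illustration):

```python
import re

# One escape = backslash followed by 11 octal digits (a sign-extended byte).
_ESCAPE = re.compile(r"\\([0-7]{11})")

def decode_octal_escapes(text: str) -> str:
    """Turn '\\37777777720'-style escapes back into the UTF-8 text they encode."""
    out = bytearray()
    pos = 0
    for match in _ESCAPE.finditer(text):
        # Copy any literal text (spaces, ASCII) that sits between escapes.
        out += text[pos:match.start()].encode("utf-8")
        # Keep only the low 8 bits: 0o37777777720 & 0xFF == 0xD0.
        out.append(int(match.group(1), 8) & 0xFF)
        pos = match.end()
    out += text[pos:].encode("utf-8")
    # A truncated trailing byte is replaced rather than raising an error.
    return out.decode("utf-8", errors="replace")

# First few escapes copied from the exitreason above.
sample = (r"\37777777720\37777777643 \37777777720\37777777677"
          r"\37777777720\37777777676\37777777720\37777777673"
          r"\37777777721\37777777614")
print(decode_octal_escapes(sample))  # prints "У поль"
```

Run over the whole exitreason, this should give the start of the same Russian message that appears in plain form in the Pacemaker log, cut off mid-character at the end.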
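The log /var/log/pacemaker/pacemaker.conf contains the following errors (the Russian SQL Server message at the end means "The user does not have permission to perform this action"):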
Oct 22 10:39:10 fci(mssqlha)[399037]: INFO: SQL Server started. PID: 399183; user: mssql; command: /opt/mssql/bin/sqlservr
Oct 22 10:39:11 fci(mssqlha)[399037]: INFO: start: 2024/10/22 10:39:11 fci-helper invoked with hostname [localhost]; port [1433]; credentials-file [/var/opt/mssql/secrets/passwd]; application-name [monitor-mssqlha-start]; connection$
Oct 22 10:39:11 fci(mssqlha)[399037]: INFO: start: 2024/10/22 10:39:11 fci-helper invoked with virtual-server-name [mssqlha]
Oct 22 10:39:11 fci(mssqlha)[399037]: INFO: start: 2024/10/22 10:39:11 From RetryExecute - Attempt 1 to connect to the instance at localhost:1433
Oct 22 10:39:11 fci(mssqlha)[399037]: INFO: start: 2024/10/22 10:39:11 Attempt 1 returned error: Unresponsive or down Unable to open tcp connection with host 'localhost:1433': dial tcp 127.0.0.1:1433: getsockopt: connection refused
Oct 22 10:39:12 fci(mssqlha)[399037]: INFO: start: 2024/10/22 10:39:12 From RetryExecute - Attempt 2 to connect to the instance at localhost:1433
Oct 22 10:39:12 fci(mssqlha)[399037]: INFO: start: 2024/10/22 10:39:12 Attempt 2 returned error: Unresponsive or down Unable to open tcp connection with host 'localhost:1433': dial tcp 127.0.0.1:1433: getsockopt: connection refused
Oct 22 10:39:13 fci(mssqlha)[399037]: INFO: start: 2024/10/22 10:39:13 From RetryExecute - Attempt 3 to connect to the instance at localhost:1433
Oct 22 10:39:13 fci(mssqlha)[399037]: INFO: start: 2024/10/22 10:39:13 Attempt 3 returned error: Unresponsive or down Unable to open tcp connection with host 'localhost:1433': dial tcp 127.0.0.1:1433: getsockopt: connection refused
Oct 22 10:39:14 fci(mssqlha)[399037]: INFO: start: 2024/10/22 10:39:14 From RetryExecute - Attempt 4 to connect to the instance at localhost:1433
Oct 22 10:39:14 fci(mssqlha)[399037]: INFO: start: 2024/10/22 10:39:14 Attempt 4 returned error: Unresponsive or down Unable to open tcp connection with host 'localhost:1433': dial tcp 127.0.0.1:1433: getsockopt: connection refused
Oct 22 10:39:15 fci(mssqlha)[399037]: INFO: start: 2024/10/22 10:39:15 From RetryExecute - Attempt 5 to connect to the instance at localhost:1433
Oct 22 10:39:15 fci(mssqlha)[399037]: INFO: start: 2024/10/22 10:39:15 Attempt 5 returned error: Unresponsive or down Unable to open tcp connection with host 'localhost:1433': dial tcp 127.0.0.1:1433: getsockopt: connection refused
Oct 22 10:39:16 fci(mssqlha)[399037]: INFO: start: 2024/10/22 10:39:16 From RetryExecute - Attempt 6 to connect to the instance at localhost:1433
Oct 22 10:39:16 fci(mssqlha)[399037]: INFO: start: 2024/10/22 10:39:16 Attempt 6 returned error: Unresponsive or down Unable to open tcp connection with host 'localhost:1433': dial tcp 127.0.0.1:1433: getsockopt: connection refused
Oct 22 10:39:17 fci(mssqlha)[399037]: INFO: start: 2024/10/22 10:39:17 From RetryExecute - Attempt 7 to connect to the instance at localhost:1433
Oct 22 10:39:17 fci(mssqlha)[399037]: INFO: start: 2024/10/22 10:39:17 Connected to the instance at localhost:1433
Oct 22 10:39:17 fci(mssqlha)[399037]: INFO: start: ERROR: 2024/10/22 10:39:17 Unexpected error: mssql: У пользователя нет разрешения на выполнение этого действия.
Oct 22 10:39:17 fci(mssqlha)[399037]: ERROR: 2024/10/22 10:39:17 Unexpected error: mssql: У пользователя нет разрешения на выполнение этого действия.
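To make the sequence in this log easier to see, here is a rough Python sketch that counts the refused connection attempts, notes whether fci-helper eventually connected, and collects the distinct "Unexpected error" messages. The regular expressions are based only on the lines shown above, and the `summarize` helper is hypothetical, not part of the resource agent:

```python
import re

# Patterns taken from the wording of the fci-helper lines in the log above.
ATTEMPT_FAILED = re.compile(r"Attempt (\d+) returned error: (.*)")
CONNECTED = re.compile(r"Connected to the instance at (\S+)")
UNEXPECTED = re.compile(r"Unexpected error: (.*)")

def summarize(lines):
    """Summarize fci-helper log lines: failed attempts, connection, errors."""
    summary = {"failed_attempts": 0, "connected_to": None, "errors": []}
    for line in lines:
        if ATTEMPT_FAILED.search(line):
            summary["failed_attempts"] += 1
        match = CONNECTED.search(line)
        if match:
            summary["connected_to"] = match.group(1)
        match = UNEXPECTED.search(line)
        # The same error is logged twice (INFO and ERROR), so deduplicate.
        if match and match.group(1) not in summary["errors"]:
            summary["errors"].append(match.group(1))
    return summary

# Reads the log text from standard input, for example:
#   python3 summarize.py < saved-log-excerpt.txt
if __name__ == "__main__":
    import sys
    print(summarize(sys.stdin))
```

On the excerpt above this should report six refused attempts, a successful connection to localhost:1433, and one distinct error, which suggests the connection itself succeeds and the failure happens afterwards.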
SQL Server itself reports in its log that it is ready to accept connections, but then it shuts down:
2024-10-22 10:39:16.68 spid20s A self-generated certificate was successfully loaded for encryption.
2024-10-22 10:39:16.69 spid20s Server is listening on [ 'any' <ipv6> 1433] accept sockets 1.
2024-10-22 10:39:16.69 spid20s Server is listening on [ 'any' <ipv4> 1433] accept sockets 1.
2024-10-22 10:39:16.70 Server Server is listening on [ ::1 <ipv6> 1434] accept sockets 1.
2024-10-22 10:39:16.70 Server Server is listening on [ 127.0.0.1 <ipv4> 1434] accept sockets 1.
2024-10-22 10:39:16.71 Server Dedicated admin connection support was established for listening locally on port 1434.
2024-10-22 10:39:16.71 spid20s Server is listening on [ ::1 <ipv6> 1431] accept sockets 1.
2024-10-22 10:39:16.72 spid20s Server is listening on [ 127.0.0.1 <ipv4> 1431] accept sockets 1.
2024-10-22 10:39:16.72 spid20s SQL Server is now ready for client connections. This is an informational message; no user action is required.
2024-10-22 10:39:16.82 spid8s Converting database 'model_replicatedmaster' from version 927 to the current version 957.
2024-10-22 10:39:16.82 spid8s Database 'model_replicatedmaster' running the upgrade step from version 927 to version 928.
2024-10-22 10:39:16.83 spid12s 0 transactions rolled back in database 'model' (3:0). This is an informational message only. No user action is required.
2024-10-22 10:39:17.11 spid8s Database 'model_replicatedmaster' running the upgrade step from version 928 to version 929.
2024-10-22 10:39:17.52 spid8s Database 'model_replicatedmaster' running the upgrade step from version 929 to version 930.
2024-10-22 10:39:17.89 spid8s Database 'model_replicatedmaster' running the upgrade step from version 930 to version 931.
2024-10-22 10:39:18.20 spid8s Database 'model_replicatedmaster' running the upgrade step from version 931 to version 932.
2024-10-22 10:39:18.61 spid8s Database 'model_replicatedmaster' running the upgrade step from version 932 to version 933.
2024-10-22 10:39:18.89 spid8s Database 'model_replicatedmaster' running the upgrade step from version 933 to version 934.
2024-10-22 10:39:19.18 spid8s Database 'model_replicatedmaster' running the upgrade step from version 934 to version 935.
2024-10-22 10:39:19.54 spid8s Database 'model_replicatedmaster' running the upgrade step from version 935 to version 936.
2024-10-22 10:39:19.77 spid8s Database 'model_replicatedmaster' running the upgrade step from version 936 to version 937.
2024-10-22 10:39:19.96 spid8s Database 'model_replicatedmaster' running the upgrade step from version 937 to version 938.
2024-10-22 10:39:20.13 spid8s Database 'model_replicatedmaster' running the upgrade step from version 938 to version 939.
2024-10-22 10:39:20.35 spid8s Database 'model_replicatedmaster' running the upgrade step from version 939 to version 940.
2024-10-22 10:39:20.56 spid8s Database 'model_replicatedmaster' running the upgrade step from version 940 to version 941.
2024-10-22 10:39:20.77 spid8s Database 'model_replicatedmaster' running the upgrade step from version 941 to version 942.
2024-10-22 10:39:20.95 spid8s Database 'model_replicatedmaster' running the upgrade step from version 942 to version 943.
2024-10-22 10:39:21.38 spid8s Database 'model_replicatedmaster' running the upgrade step from version 943 to version 944.
2024-10-22 10:39:21.54 spid8s Database 'model_replicatedmaster' running the upgrade step from version 944 to version 945.
2024-10-22 10:39:21.73 spid8s Database 'model_replicatedmaster' running the upgrade step from version 945 to version 946.
2024-10-22 10:39:21.89 spid8s Database 'model_replicatedmaster' running the upgrade step from version 946 to version 947.
2024-10-22 10:39:22.06 spid8s Database 'model_replicatedmaster' running the upgrade step from version 947 to version 948.
2024-10-22 10:39:22.23 spid8s Database 'model_replicatedmaster' running the upgrade step from version 948 to version 949.
2024-10-22 10:39:22.40 spid8s Database 'model_replicatedmaster' running the upgrade step from version 949 to version 950.
2024-10-22 10:39:22.58 spid8s Database 'model_replicatedmaster' running the upgrade step from version 950 to version 951.
2024-10-22 10:39:22.87 spid8s Database 'model_replicatedmaster' running the upgrade step from version 951 to version 952.
2024-10-22 10:39:23.07 spid8s Database 'model_replicatedmaster' running the upgrade step from version 952 to version 953.
2024-10-22 10:39:23.42 spid8s Database 'model_replicatedmaster' running the upgrade step from version 953 to version 954.
2024-10-22 10:39:23.67 spid8s Database 'model_replicatedmaster' running the upgrade step from version 954 to version 955.
2024-10-22 10:39:23.86 spid8s Database 'model_replicatedmaster' running the upgrade step from version 955 to version 956.
2024-10-22 10:39:24.06 spid8s Database 'model_replicatedmaster' running the upgrade step from version 956 to version 957.
When started manually as the mssql user, the server starts correctly and accepts connections, so permissions are not the problem:
su - mssql
/opt/mssql/bin/sqlservr
The firewall and SELinux are disabled.
I don't understand what the problem is. Judging by the logs, SQL Server started and is ready to accept connections, yet ocf::mssql:fci cannot connect to it.
Thanks in advance for any help.