Edit history
Edit by HighMan, (current version):
Nope. The problem is not solved at all.
# patronictl -c /etc/patroni/patroni.yml list pg_cluster
+ Cluster: pg_cluster ------+--------------+---------+----+-----------+
| Member | Host | Role | State | TL | Lag in MB |
+----------+----------------+--------------+---------+----+-----------+
| host-100 | 192.168.12.100 | Leader | running | 5 | |
| host-101 | 192.168.12.101 | Sync Standby | running | 5 | 0 |
| host-102 | 192.168.12.102 | Replica | running | 5 | 0 |
+----------+----------------+--------------+---------+----+-----------+
# systemctl status haproxy.service
....
Jul 01 10:27:03 host-100 haproxy[1568265]: [WARNING] (1568265) : Server postgres_master/host-100 is DOWN, reason: Layer7 invalid response, check duration: 2ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:03 host-100 haproxy[1568265]: [ALERT] (1568265) : sendmsg()/writev() failed in logger #1: No such file or directory (errno=2)
Jul 01 10:27:03 host-100 haproxy[1568265]: [WARNING] (1568265) : Server postgres_master/host-101 is DOWN, reason: Layer7 invalid response, check duration: 5ms. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:03 host-100 haproxy[1568265]: [WARNING] (1568265) : Server postgres_master/host-102 is DOWN, reason: Layer7 invalid response, check duration: 4ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:03 host-100 haproxy[1568265]: [ALERT] (1568265) : proxy 'postgres_master' has no server available!
Jul 01 10:27:03 host-100 haproxy[1568265]: [WARNING] (1568265) : Server postgres_replicas/host-100 is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 10ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:04 host-100 haproxy[1568265]: [WARNING] (1568265) : Server postgres_replicas_sync/host-100 is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 6ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:04 host-100 haproxy[1568265]: [WARNING] (1568265) : Server postgres_replicas_sync/host-102 is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 12ms. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:04 host-100 haproxy[1568265]: [WARNING] (1568265) : Server postgres_replicas_async/host-100 is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 7ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:04 host-100 haproxy[1568265]: [WARNING] (1568265) : Server postgres_replicas_async/host-101 is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 11ms. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
As the patronictl output shows, host-100 is the leader, but this crazy haproxy still reports code 503.
Where it gets that from is a mystery.
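The journal lines above carry two different failure reasons, and they are easy to miss in the wall of text: every server in postgres_master is DOWN with "Layer7 invalid response", while code 503 appears only in the postgres_replicas* backends. Here is a minimal Python sketch that groups the DOWN events by backend so the difference is visible. The function name and the regular expression are my own illustration, and the sample lines are trimmed copies of the log above.

```python
import re
from collections import defaultdict

# Matches "Server <backend>/<server> is DOWN, reason: <reason>[, code: NNN]"
# in an HAProxy log line. The pattern is an illustration, not an official format spec.
DOWN_RE = re.compile(
    r"Server (?P<backend>[\w-]+)/(?P<server>[\w-]+) is DOWN, "
    r"reason: (?P<reason>Layer\d+ [a-z ]+?)(?:, code: (?P<code>\d+))?[,.]"
)

# Trimmed copies of lines from the journal output above.
SAMPLE_LOG = """\
[WARNING] (1568265) : Server postgres_master/host-100 is DOWN, reason: Layer7 invalid response, check duration: 2ms.
[WARNING] (1568265) : Server postgres_master/host-101 is DOWN, reason: Layer7 invalid response, check duration: 5ms.
[WARNING] (1568265) : Server postgres_replicas/host-100 is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 10ms.
[WARNING] (1568265) : Server postgres_replicas_sync/host-102 is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 12ms.
"""


def summarize_down_events(log_text):
    """Group DOWN events by backend as (server, reason) pairs."""
    summary = defaultdict(list)
    for match in DOWN_RE.finditer(log_text):
        reason = match.group("reason")
        if match.group("code"):
            # Keep the HTTP status code next to the reason when HAProxy logged one.
            reason += f" ({match.group('code')})"
        summary[match.group("backend")].append((match.group("server"), reason))
    return dict(summary)


if __name__ == "__main__":
    for backend, events in summarize_down_events(SAMPLE_LOG).items():
        print(backend)
        for server, reason in events:
            print(f"  {server}: {reason}")
```

Feeding it the full output of journalctl -u haproxy.service instead of SAMPLE_LOG should work the same way, as long as the lines keep the format shown above.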
# curl 192.168.12.100:8008/master -v
* Trying 192.168.12.100:8008...
* Connected to 192.168.12.100 (192.168.12.100) port 8008
> GET /master HTTP/1.1
> Host: 192.168.12.100:8008
> User-Agent: curl/8.6.0
> Accept: */*
>
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Server: BaseHTTP/0.6 Python/3.9.18
< Date: Mon, 01 Jul 2024 07:32:38 GMT
< Content-Type: application/json
<
* Closing connection
{"state": "running", "postmaster_start_time": "2024-06-28 17:24:29.485522+03:00", "role": "master", "server_version": 120019, "xlog": {"location": 721420288}, "timeline": 5, "replication": [{"usename": "patroni_replica", "application_name": "host-101", "client_addr": "192.168.12.101", "state": "streaming", "sync_state": "sync", "sync_priority": 1}, {"usename": "patroni_replica", "application_name": "host-102", "client_addr": "192.168.12.102", "state": "streaming", "sync_state": "async", "sync_priority": 0}], "dcs_last_seen": 1719819151, "database_system_identifier": "7385553500470229262", "patroni": {"version": "3.0.2", "scope": "pg_cluster"}}
Code 200!
# curl 192.168.12.101:8008/master -v
* Trying 192.168.12.101:8008...
* Connected to 192.168.12.101 (192.168.12.101) port 8008
> GET /master HTTP/1.1
> Host: 192.168.12.101:8008
> User-Agent: curl/8.6.0
> Accept: */*
>
* HTTP 1.0, assume close after body
< HTTP/1.0 503 Service Unavailable
< Server: BaseHTTP/0.6 Python/3.9.18
< Date: Mon, 01 Jul 2024 07:33:48 GMT
< Content-Type: application/json
<
* Closing connection
{"state": "running", "postmaster_start_time": "2024-06-28 17:23:36.056707+03:00", "role": "replica", "server_version": 120019, "xlog": {"received_location": 721420288, "replayed_location": 721420288, "replayed_timestamp": "2024-06-28 17:44:52.823814+03:00", "paused": false}, "sync_standby": true, "timeline": 5, "dcs_last_seen": 1719819221, "database_system_identifier": "7385553500470229262", "patroni": {"version": "3.0.2", "scope": "pg_cluster"}}
Here 503 is an honest return code, since host-101 is not the leader at the moment.
If you run curl 192.168.12.101:8008/sync -v instead, the return code is 200, because host-101 (192.168.12.101) is the Sync Standby.
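The curl results above follow a simple rule: each Patroni health-check endpoint answers 200 only when the node currently holds the matching role, and 503 otherwise. Here is a rough Python sketch of that rule, working only from the "state", "role" and "sync_standby" fields of the JSON bodies shown above. It is a simplified model and an assumption on my part: real Patroni also takes the leader lock, tags and optional lag parameters into account. The function name expected_status is hypothetical.

```python
import json

# Trimmed versions of the JSON bodies returned by the two curl calls above.
HOST_100 = json.loads('{"state": "running", "role": "master", "timeline": 5}')
HOST_101 = json.loads(
    '{"state": "running", "role": "replica", "sync_standby": true, "timeline": 5}'
)


def expected_status(status, endpoint):
    """Return the HTTP code a node in this state is expected to give for an endpoint.

    Simplified model (assumption): only role, state and sync_standby are considered.
    """
    running = status.get("state") == "running"
    is_leader = status.get("role") in ("master", "primary")
    is_replica = status.get("role") == "replica"
    is_sync = bool(status.get("sync_standby"))
    checks = {
        "/master": is_leader,
        "/replica": is_replica,
        "/sync": is_replica and is_sync,
        "/async": is_replica and not is_sync,
    }
    return 200 if running and checks[endpoint] else 503


if __name__ == "__main__":
    for name, status in (("host-100", HOST_100), ("host-101", HOST_101)):
        for endpoint in ("/master", "/replica", "/sync", "/async"):
            print(name, endpoint, expected_status(status, endpoint))
```

Under this model the 503 answers logged for postgres_replicas/host-100, postgres_replicas_sync/host-100 and postgres_replicas_async/host-101 are what you would expect from a healthy cluster, so only the "invalid response" on postgres_master stands out.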
Edit by HighMan, :
Nope. The problem is not solved at all.
# patronictl -c /etc/patroni/patroni.yml list pg_cluster
+ Cluster: pg_cluster ------+--------------+---------+----+-----------+
| Member | Host | Role | State | TL | Lag in MB |
+----------+----------------+--------------+---------+----+-----------+
| host-100 | 192.168.12.100 | Leader | running | 5 | |
| host-101 | 192.168.12.101 | Sync Standby | running | 5 | 0 |
| host-102 | 192.168.12.102 | Replica | running | 5 | 0 |
+----------+----------------+--------------+---------+----+-----------+
# systemctl status haproxy.service
....
Jul 01 10:27:03 host-100 haproxy[1568265]: [WARNING] (1568265) : Server postgres_master/host-100 is DOWN, reason: Layer7 invalid response, check duration: 2ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:03 host-100 haproxy[1568265]: [ALERT] (1568265) : sendmsg()/writev() failed in logger #1: No such file or directory (errno=2)
Jul 01 10:27:03 host-100 haproxy[1568265]: [WARNING] (1568265) : Server postgres_master/host-101 is DOWN, reason: Layer7 invalid response, check duration: 5ms. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:03 host-100 haproxy[1568265]: [WARNING] (1568265) : Server postgres_master/host-102 is DOWN, reason: Layer7 invalid response, check duration: 4ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:03 host-100 haproxy[1568265]: [ALERT] (1568265) : proxy 'postgres_master' has no server available!
Jul 01 10:27:03 host-100 haproxy[1568265]: [WARNING] (1568265) : Server postgres_replicas/host-100 is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 10ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:04 host-100 haproxy[1568265]: [WARNING] (1568265) : Server postgres_replicas_sync/host-100 is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 6ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:04 host-100 haproxy[1568265]: [WARNING] (1568265) : Server postgres_replicas_sync/host-102 is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 12ms. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:04 host-100 haproxy[1568265]: [WARNING] (1568265) : Server postgres_replicas_async/host-100 is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 7ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:04 host-100 haproxy[1568265]: [WARNING] (1568265) : Server postgres_replicas_async/host-101 is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 11ms. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
As the patronictl output shows, host-100 is the leader, but this crazy haproxy still reports code 503.
Where it gets that from is a mystery.
# curl 192.168.12.100:8008/master -v
* Trying 192.168.12.100:8008...
* Connected to 192.168.12.100 (192.168.12.100) port 8008
> GET /master HTTP/1.1
> Host: 192.168.12.100:8008
> User-Agent: curl/8.6.0
> Accept: */*
>
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Server: BaseHTTP/0.6 Python/3.9.18
< Date: Mon, 01 Jul 2024 07:32:38 GMT
< Content-Type: application/json
<
* Closing connection
{"state": "running", "postmaster_start_time": "2024-06-28 17:24:29.485522+03:00", "role": "master", "server_version": 120019, "xlog": {"location": 721420288}, "timeline": 5, "replication": [{"usename": "patroni_replica", "application_name": "host-101", "client_addr": "192.168.12.101", "state": "streaming", "sync_state": "sync", "sync_priority": 1}, {"usename": "patroni_replica", "application_name": "host-102", "client_addr": "192.168.12.102", "state": "streaming", "sync_state": "async", "sync_priority": 0}], "dcs_last_seen": 1719819151, "database_system_identifier": "7385553500470229262", "patroni": {"version": "3.0.2", "scope": "pg_cluster"}}
Code 200!
# curl 192.168.12.101:8008/master -v
* Trying 192.168.12.101:8008...
* Connected to 192.168.12.101 (192.168.12.101) port 8008
> GET /master HTTP/1.1
> Host: 192.168.12.101:8008
> User-Agent: curl/8.6.0
> Accept: */*
>
* HTTP 1.0, assume close after body
< HTTP/1.0 503 Service Unavailable
< Server: BaseHTTP/0.6 Python/3.9.18
< Date: Mon, 01 Jul 2024 07:33:48 GMT
< Content-Type: application/json
<
* Closing connection
{"state": "running", "postmaster_start_time": "2024-06-28 17:23:36.056707+03:00", "role": "replica", "server_version": 120019, "xlog": {"received_location": 721420288, "replayed_location": 721420288, "replayed_timestamp": "2024-06-28 17:44:52.823814+03:00", "paused": false}, "sync_standby": true, "timeline": 5, "dcs_last_seen": 1719819221, "database_system_identifier": "7385553500470229262", "patroni": {"version": "3.0.2", "scope": "pg_cluster"}}
Here 503 is an honest return code, since host-101 is not the leader at the moment.
Original version by HighMan, :
Nope. The problem is not solved at all.
# patronictl -c /etc/patroni/patroni.yml list pg_cluster
+ Cluster: pg_cluster ------+--------------+---------+----+-----------+
| Member | Host | Role | State | TL | Lag in MB |
+----------+----------------+--------------+---------+----+-----------+
| host-100 | 192.168.12.100 | Leader | running | 5 | |
| host-101 | 192.168.12.101 | Sync Standby | running | 5 | 0 |
| host-102 | 192.168.12.102 | Replica | running | 5 | 0 |
+----------+----------------+--------------+---------+----+-----------+
# systemctl status haproxy.service
....
Jul 01 10:27:03 host-100 haproxy[1568265]: [WARNING] (1568265) : Server postgres_master/host-100 is DOWN, reason: Layer7 invalid response, check duration: 2ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:03 host-100 haproxy[1568265]: [ALERT] (1568265) : sendmsg()/writev() failed in logger #1: No such file or directory (errno=2)
Jul 01 10:27:03 host-100 haproxy[1568265]: [WARNING] (1568265) : Server postgres_master/host-101 is DOWN, reason: Layer7 invalid response, check duration: 5ms. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:03 host-100 haproxy[1568265]: [WARNING] (1568265) : Server postgres_master/host-102 is DOWN, reason: Layer7 invalid response, check duration: 4ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:03 host-100 haproxy[1568265]: [ALERT] (1568265) : proxy 'postgres_master' has no server available!
Jul 01 10:27:03 host-100 haproxy[1568265]: [WARNING] (1568265) : Server postgres_replicas/host-100 is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 10ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:04 host-100 haproxy[1568265]: [WARNING] (1568265) : Server postgres_replicas_sync/host-100 is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 6ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:04 host-100 haproxy[1568265]: [WARNING] (1568265) : Server postgres_replicas_sync/host-102 is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 12ms. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:04 host-100 haproxy[1568265]: [WARNING] (1568265) : Server postgres_replicas_async/host-100 is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 7ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:04 host-100 haproxy[1568265]: [WARNING] (1568265) : Server postgres_replicas_async/host-101 is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 11ms. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
As the patronictl output shows, host-100 is the leader, but this crazy haproxy still reports code 503.
Where it gets that from is a mystery.
# curl 192.168.12.100:8008/master -v
* Trying 192.168.12.100:8008...
* Connected to 192.168.12.100 (192.168.12.100) port 8008
> GET /master HTTP/1.1
> Host: 192.168.12.100:8008
> User-Agent: curl/8.6.0
> Accept: */*
>
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Server: BaseHTTP/0.6 Python/3.9.18
< Date: Mon, 01 Jul 2024 07:32:38 GMT
< Content-Type: application/json
<
* Closing connection
{"state": "running", "postmaster_start_time": "2024-06-28 17:24:29.485522+03:00", "role": "master", "server_version": 120019, "xlog": {"location": 721420288}, "timeline": 5, "replication": [{"usename": "patroni_replica", "application_name": "host-101", "client_addr": "192.168.12.101", "state": "streaming", "sync_state": "sync", "sync_priority": 1}, {"usename": "patroni_replica", "application_name": "host-102", "client_addr": "192.168.12.102", "state": "streaming", "sync_state": "async", "sync_priority": 0}], "dcs_last_seen": 1719819151, "database_system_identifier": "7385553500470229262", "patroni": {"version": "3.0.2", "scope": "pg_cluster"}}
Code 200!
# curl 192.168.12.101:8008/master -v
* Trying 192.168.12.101:8008...
* Connected to 192.168.12.101 (192.168.12.101) port 8008
> GET /master HTTP/1.1
> Host: 192.168.12.101:8008
> User-Agent: curl/8.6.0
> Accept: */*
>
* HTTP 1.0, assume close after body
< HTTP/1.0 503 Service Unavailable
< Server: BaseHTTP/0.6 Python/3.9.18
< Date: Mon, 01 Jul 2024 07:33:48 GMT
< Content-Type: application/json
<
* Closing connection
{"state": "running", "postmaster_start_time": "2024-06-28 17:23:36.056707+03:00", "role": "replica", "server_version": 120019, "xlog": {"received_location": 721420288, "replayed_location": 721420288, "replayed_timestamp": "2024-06-28 17:44:52.823814+03:00", "paused": false}, "sync_standby": true, "timeline": 5, "dcs_last_seen": 1719819221, "database_system_identifier": "7385553500470229262", "patroni": {"version": "3.0.2", "scope": "pg_cluster"}}
Here 503 is an honest return code, since host-101 is not the leader at the moment.