LINUX.ORG.RU

Edit history

Revision by HighMan (current version):

Nope. The problem did not get solved one bit.

# patronictl -c /etc/patroni/patroni.yml list pg_cluster
+ Cluster: pg_cluster ------+--------------+---------+----+-----------+
| Member   | Host           | Role         | State   | TL | Lag in MB |
+----------+----------------+--------------+---------+----+-----------+
| host-100 | 192.168.12.100 | Leader       | running |  5 |           |
| host-101 | 192.168.12.101 | Sync Standby | running |  5 |         0 |
| host-102 | 192.168.12.102 | Replica      | running |  5 |         0 |
+----------+----------------+--------------+---------+----+-----------+

# systemctl status haproxy.service
....
Jul 01 10:27:03 host-100 haproxy[1568265]: [WARNING]  (1568265) : Server postgres_master/host-100 is DOWN, reason: Layer7 invalid response, check duration: 2ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:03 host-100 haproxy[1568265]: [ALERT]    (1568265) : sendmsg()/writev() failed in logger #1: No such file or directory (errno=2)
Jul 01 10:27:03 host-100 haproxy[1568265]: [WARNING]  (1568265) : Server postgres_master/host-101 is DOWN, reason: Layer7 invalid response, check duration: 5ms. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:03 host-100 haproxy[1568265]: [WARNING]  (1568265) : Server postgres_master/host-102 is DOWN, reason: Layer7 invalid response, check duration: 4ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:03 host-100 haproxy[1568265]: [ALERT]    (1568265) : proxy 'postgres_master' has no server available!
Jul 01 10:27:03 host-100 haproxy[1568265]: [WARNING]  (1568265) : Server postgres_replicas/host-100 is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 10ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:04 host-100 haproxy[1568265]: [WARNING]  (1568265) : Server postgres_replicas_sync/host-100 is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 6ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:04 host-100 haproxy[1568265]: [WARNING]  (1568265) : Server postgres_replicas_sync/host-102 is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 12ms. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:04 host-100 haproxy[1568265]: [WARNING]  (1568265) : Server postgres_replicas_async/host-100 is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 7ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:04 host-100 haproxy[1568265]: [WARNING]  (1568265) : Server postgres_replicas_async/host-101 is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 11ms. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.

As the patronictl output shows, host-100 is the leader, but the damn haproxy still reports code 503.

Where it gets that from is a mystery.
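
One way to narrow this down is to group the haproxy DOWN events by backend and reason. Below is a minimal Python sketch, assuming the log format shown above; the names `DOWN_RE`, `summarize` and `SAMPLE` are made up for illustration, and `SAMPLE` holds lines copied from the log in this post.

```python
import re
from collections import defaultdict

# Matches "Server <backend>/<server> is DOWN, reason: <reason>" and, when
# present, the ", code: NNN" part that haproxy adds for "wrong status".
DOWN_RE = re.compile(
    r"Server (?P<backend>[\w-]+)/(?P<server>[\w-]+) is DOWN, "
    r"reason: (?P<reason>[^,]+)(?:, code: (?P<code>\d+))?"
)


def summarize(log_lines):
    """Group DOWN events as {backend: [(server, reason, code_or_None), ...]}."""
    summary = defaultdict(list)
    for line in log_lines:
        match = DOWN_RE.search(line)
        if match is None:
            continue  # ALERT lines and other messages are skipped
        code = int(match.group("code")) if match.group("code") else None
        summary[match.group("backend")].append(
            (match.group("server"), match.group("reason"), code)
        )
    return dict(summary)


# Lines copied from the systemctl output above.
SAMPLE = [
    'Jul 01 10:27:03 host-100 haproxy[1568265]: [WARNING]  (1568265) : '
    'Server postgres_master/host-100 is DOWN, reason: Layer7 invalid response, '
    'check duration: 2ms. 2 active and 0 backup servers left. '
    '0 sessions active, 0 requeued, 0 remaining in queue.',
    "Jul 01 10:27:03 host-100 haproxy[1568265]: [ALERT]    (1568265) : "
    "proxy 'postgres_master' has no server available!",
    'Jul 01 10:27:03 host-100 haproxy[1568265]: [WARNING]  (1568265) : '
    'Server postgres_replicas/host-100 is DOWN, reason: Layer7 wrong status, '
    'code: 503, info: "Service Unavailable", check duration: 10ms. '
    '2 active and 0 backup servers left. '
    '0 sessions active, 0 requeued, 0 remaining in queue.',
]

for backend, events in summarize(SAMPLE).items():
    for server, reason, code in events:
        print(backend, server, reason, code)
```

Run against the full log, this makes one detail easier to see: the postgres_master checks fail with "Layer7 invalid response" and carry no status code, while only the replica backends report "Layer7 wrong status" with code 503.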

# curl 192.168.12.100:8008/master -v
*   Trying 192.168.12.100:8008...
* Connected to 192.168.12.100 (192.168.12.100) port 8008
> GET /master HTTP/1.1
> Host: 192.168.12.100:8008
> User-Agent: curl/8.6.0
> Accept: */*
> 
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Server: BaseHTTP/0.6 Python/3.9.18
< Date: Mon, 01 Jul 2024 07:32:38 GMT
< Content-Type: application/json
< 
* Closing connection
{"state": "running", "postmaster_start_time": "2024-06-28 17:24:29.485522+03:00", "role": "master", "server_version": 120019, "xlog": {"location": 721420288}, "timeline": 5, "replication": [{"usename": "patroni_replica", "application_name": "host-101", "client_addr": "192.168.12.101", "state": "streaming", "sync_state": "sync", "sync_priority": 1}, {"usename": "patroni_replica", "application_name": "host-102", "client_addr": "192.168.12.102", "state": "streaming", "sync_state": "async", "sync_priority": 0}], "dcs_last_seen": 1719819151, "database_system_identifier": "7385553500470229262", "patroni": {"version": "3.0.2", "scope": "pg_cluster"}}

Code 200!

# curl 192.168.12.101:8008/master -v
*   Trying 192.168.12.101:8008...
* Connected to 192.168.12.101 (192.168.12.101) port 8008
> GET /master HTTP/1.1
> Host: 192.168.12.101:8008
> User-Agent: curl/8.6.0
> Accept: */*
> 
* HTTP 1.0, assume close after body
< HTTP/1.0 503 Service Unavailable
< Server: BaseHTTP/0.6 Python/3.9.18
< Date: Mon, 01 Jul 2024 07:33:48 GMT
< Content-Type: application/json
< 
* Closing connection
{"state": "running", "postmaster_start_time": "2024-06-28 17:23:36.056707+03:00", "role": "replica", "server_version": 120019, "xlog": {"received_location": 721420288, "replayed_location": 721420288, "replayed_timestamp": "2024-06-28 17:44:52.823814+03:00", "paused": false}, "sync_standby": true, "timeline": 5, "dcs_last_seen": 1719819221, "database_system_identifier": "7385553500470229262", "patroni": {"version": "3.0.2", "scope": "pg_cluster"}}

Here the 503 is legitimate, since host-101 is not the leader at the moment.
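
The two curl checks above can be repeated for every member and endpoint in one go. This is a rough Python sketch, not a definitive tool: `probe` and `probe_cluster` are hypothetical helper names, and port 8008 is taken from the curl commands in this post.

```python
import urllib.error
import urllib.request

# Bypass any HTTP proxy from the environment so the request hits the node
# directly, the same way a health check would.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def probe(url, timeout=2.0):
    """Return the HTTP status code for a GET on url, or None if unreachable."""
    try:
        with _OPENER.open(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx, but the code itself is what we want.
        return err.code
    except OSError:
        # Connection refused, timeout, DNS failure and similar.
        return None


def probe_cluster(hosts, endpoints, port=8008):
    """Probe every host/endpoint pair; returns {(host, endpoint): code}."""
    return {
        (host, endpoint): probe(f"http://{host}:{port}/{endpoint}")
        for host in hosts
        for endpoint in endpoints
    }
```

For the cluster in this post one would call `probe_cluster(["192.168.12.100", "192.168.12.101", "192.168.12.102"], ["master", "sync"])` and compare the resulting codes with what haproxy logs for the same servers.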

If you instead run curl 192.168.12.101:8008/sync -v, the return code is 200, because host-101 (192.168.12.101) is the Sync Standby.
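
To check which answers are expected from which node, here is a simplified Python model of the Patroni health endpoints. It is a sketch under assumptions: `expected_status` is a made-up name, real Patroni also looks at the leader lock, tags and lag, and the `HOST_102` entry is hypothetical because its JSON is not shown in this post. `HOST_100` and `HOST_101` are trimmed copies of the JSON above.

```python
def expected_status(endpoint, node):
    """Return 200 or 503 for a health endpoint, given a node's status JSON."""
    running = node.get("state") == "running"
    is_leader = running and node.get("role") in ("master", "primary")
    is_replica = running and node.get("role") == "replica"
    is_sync = is_replica and bool(node.get("sync_standby"))
    healthy = {
        "master": is_leader,
        "replica": is_replica,
        "sync": is_sync,
        "async": is_replica and not is_sync,
    }[endpoint]
    return 200 if healthy else 503


# Trimmed from the curl output in this post.
HOST_100 = {"state": "running", "role": "master"}
HOST_101 = {"state": "running", "role": "replica", "sync_standby": True}
# Hypothetical: assumed from the "Replica" row in the patronictl table.
HOST_102 = {"state": "running", "role": "replica"}

for name, node in (("host-100", HOST_100), ("host-101", HOST_101), ("host-102", HOST_102)):
    codes = {ep: expected_status(ep, node) for ep in ("master", "replica", "sync", "async")}
    print(name, codes)
```

Under this model every 503 that haproxy logs for the replica backends is the expected answer. The only lines it does not account for are the postgres_master ones, where the leader answers 200 to curl but haproxy reports "Layer7 invalid response".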

Revision by HighMan:

Nope. The problem did not get solved one bit.

# patronictl -c /etc/patroni/patroni.yml list pg_cluster
+ Cluster: pg_cluster ------+--------------+---------+----+-----------+
| Member   | Host           | Role         | State   | TL | Lag in MB |
+----------+----------------+--------------+---------+----+-----------+
| host-100 | 192.168.12.100 | Leader       | running |  5 |           |
| host-101 | 192.168.12.101 | Sync Standby | running |  5 |         0 |
| host-102 | 192.168.12.102 | Replica      | running |  5 |         0 |
+----------+----------------+--------------+---------+----+-----------+

# systemctl status haproxy.service
....
Jul 01 10:27:03 host-100 haproxy[1568265]: [WARNING]  (1568265) : Server postgres_master/host-100 is DOWN, reason: Layer7 invalid response, check duration: 2ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:03 host-100 haproxy[1568265]: [ALERT]    (1568265) : sendmsg()/writev() failed in logger #1: No such file or directory (errno=2)
Jul 01 10:27:03 host-100 haproxy[1568265]: [WARNING]  (1568265) : Server postgres_master/host-101 is DOWN, reason: Layer7 invalid response, check duration: 5ms. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:03 host-100 haproxy[1568265]: [WARNING]  (1568265) : Server postgres_master/host-102 is DOWN, reason: Layer7 invalid response, check duration: 4ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:03 host-100 haproxy[1568265]: [ALERT]    (1568265) : proxy 'postgres_master' has no server available!
Jul 01 10:27:03 host-100 haproxy[1568265]: [WARNING]  (1568265) : Server postgres_replicas/host-100 is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 10ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:04 host-100 haproxy[1568265]: [WARNING]  (1568265) : Server postgres_replicas_sync/host-100 is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 6ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:04 host-100 haproxy[1568265]: [WARNING]  (1568265) : Server postgres_replicas_sync/host-102 is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 12ms. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:04 host-100 haproxy[1568265]: [WARNING]  (1568265) : Server postgres_replicas_async/host-100 is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 7ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:04 host-100 haproxy[1568265]: [WARNING]  (1568265) : Server postgres_replicas_async/host-101 is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 11ms. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.

As the patronictl output shows, host-100 is the leader, but the damn haproxy still reports code 503.

Where it gets that from is a mystery.

# curl 192.168.12.100:8008/master -v
*   Trying 192.168.12.100:8008...
* Connected to 192.168.12.100 (192.168.12.100) port 8008
> GET /master HTTP/1.1
> Host: 192.168.12.100:8008
> User-Agent: curl/8.6.0
> Accept: */*
> 
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Server: BaseHTTP/0.6 Python/3.9.18
< Date: Mon, 01 Jul 2024 07:32:38 GMT
< Content-Type: application/json
< 
* Closing connection
{"state": "running", "postmaster_start_time": "2024-06-28 17:24:29.485522+03:00", "role": "master", "server_version": 120019, "xlog": {"location": 721420288}, "timeline": 5, "replication": [{"usename": "patroni_replica", "application_name": "host-101", "client_addr": "192.168.12.101", "state": "streaming", "sync_state": "sync", "sync_priority": 1}, {"usename": "patroni_replica", "application_name": "host-102", "client_addr": "192.168.12.102", "state": "streaming", "sync_state": "async", "sync_priority": 0}], "dcs_last_seen": 1719819151, "database_system_identifier": "7385553500470229262", "patroni": {"version": "3.0.2", "scope": "pg_cluster"}}

Code 200!

# curl 192.168.12.101:8008/master -v
*   Trying 192.168.12.101:8008...
* Connected to 192.168.12.101 (192.168.12.101) port 8008
> GET /master HTTP/1.1
> Host: 192.168.12.101:8008
> User-Agent: curl/8.6.0
> Accept: */*
> 
* HTTP 1.0, assume close after body
< HTTP/1.0 503 Service Unavailable
< Server: BaseHTTP/0.6 Python/3.9.18
< Date: Mon, 01 Jul 2024 07:33:48 GMT
< Content-Type: application/json
< 
* Closing connection
{"state": "running", "postmaster_start_time": "2024-06-28 17:23:36.056707+03:00", "role": "replica", "server_version": 120019, "xlog": {"received_location": 721420288, "replayed_location": 721420288, "replayed_timestamp": "2024-06-28 17:44:52.823814+03:00", "paused": false}, "sync_standby": true, "timeline": 5, "dcs_last_seen": 1719819221, "database_system_identifier": "7385553500470229262", "patroni": {"version": "3.0.2", "scope": "pg_cluster"}}

Here the 503 is legitimate, since host-101 is not the leader at the moment.

Original version by HighMan:

Nope. The problem did not get solved one bit.

# patronictl -c /etc/patroni/patroni.yml list pg_cluster
+ Cluster: pg_cluster ------+--------------+---------+----+-----------+
| Member   | Host           | Role         | State   | TL | Lag in MB |
+----------+----------------+--------------+---------+----+-----------+
| host-100 | 192.168.12.100 | Leader       | running |  5 |           |
| host-101 | 192.168.12.101 | Sync Standby | running |  5 |         0 |
| host-102 | 192.168.12.102 | Replica      | running |  5 |         0 |
+----------+----------------+--------------+---------+----+-----------+

# systemctl status haproxy.service
....
Jul 01 10:27:03 host-100 haproxy[1568265]: [WARNING]  (1568265) : Server postgres_master/host-100 is DOWN, reason: Layer7 invalid response, check duration: 2ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:03 host-100 haproxy[1568265]: [ALERT]    (1568265) : sendmsg()/writev() failed in logger #1: No such file or directory (errno=2)
Jul 01 10:27:03 host-100 haproxy[1568265]: [WARNING]  (1568265) : Server postgres_master/host-101 is DOWN, reason: Layer7 invalid response, check duration: 5ms. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:03 host-100 haproxy[1568265]: [WARNING]  (1568265) : Server postgres_master/host-102 is DOWN, reason: Layer7 invalid response, check duration: 4ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:03 host-100 haproxy[1568265]: [ALERT]    (1568265) : proxy 'postgres_master' has no server available!
Jul 01 10:27:03 host-100 haproxy[1568265]: [WARNING]  (1568265) : Server postgres_replicas/host-100 is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 10ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:04 host-100 haproxy[1568265]: [WARNING]  (1568265) : Server postgres_replicas_sync/host-100 is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 6ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:04 host-100 haproxy[1568265]: [WARNING]  (1568265) : Server postgres_replicas_sync/host-102 is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 12ms. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:04 host-100 haproxy[1568265]: [WARNING]  (1568265) : Server postgres_replicas_async/host-100 is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 7ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Jul 01 10:27:04 host-100 haproxy[1568265]: [WARNING]  (1568265) : Server postgres_replicas_async/host-101 is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 11ms. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.

As the patronictl output shows, host-100 is the leader, but the damn haproxy still reports code 503.

Where it gets that from is a mystery.

# curl 192.168.12.100:8008/master -v
*   Trying 192.168.12.100:8008...
* Connected to 192.168.12.100 (192.168.12.100) port 8008
> GET /master HTTP/1.1
> Host: 192.168.12.100:8008
> User-Agent: curl/8.6.0
> Accept: */*
> 
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Server: BaseHTTP/0.6 Python/3.9.18
< Date: Mon, 01 Jul 2024 07:32:38 GMT
< Content-Type: application/json
< 
* Closing connection
{"state": "running", "postmaster_start_time": "2024-06-28 17:24:29.485522+03:00", "role": "master", "server_version": 120019, "xlog": {"location": 721420288}, "timeline": 5, "replication": [{"usename": "patroni_replica", "application_name": "host-101", "client_addr": "192.168.12.101", "state": "streaming", "sync_state": "sync", "sync_priority": 1}, {"usename": "patroni_replica", "application_name": "host-102", "client_addr": "192.168.12.102", "state": "streaming", "sync_state": "async", "sync_priority": 0}], "dcs_last_seen": 1719819151, "database_system_identifier": "7385553500470229262", "patroni": {"version": "3.0.2", "scope": "pg_cluster"}}

Code 200!

# curl 192.168.12.101:8008/master -v
*   Trying 192.168.12.101:8008...
* Connected to 192.168.12.101 (192.168.12.101) port 8008
> GET /master HTTP/1.1
> Host: 192.168.12.101:8008
> User-Agent: curl/8.6.0
> Accept: */*
> 
* HTTP 1.0, assume close after body
< HTTP/1.0 503 Service Unavailable
< Server: BaseHTTP/0.6 Python/3.9.18
< Date: Mon, 01 Jul 2024 07:33:48 GMT
< Content-Type: application/json
< 
* Closing connection
{"state": "running", "postmaster_start_time": "2024-06-28 17:23:36.056707+03:00", "role": "replica", "server_version": 120019, "xlog": {"received_location": 721420288, "replayed_location": 721420288, "replayed_timestamp": "2024-06-28 17:44:52.823814+03:00", "paused": false}, "sync_standby": true, "timeline": 5, "dcs_last_seen": 1719819221, "database_system_identifier": "7385553500470229262", "patroni": {"version": "3.0.2", "scope": "pg_cluster"}}

Here the 503 is legitimate, since host-101 is not the leader at the moment.