Commit 53be5136 by luoqi

fix(sync): extend CH retry to ETIMEDOUT and other socket-level network errors (timeouts under high concurrency against the remote DW)

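With concurrency=5, each batch runs 5 tables in parallel → 25 parallel CH queries
hit the remote Alibaba Cloud DW, which the public network cannot sustain
→ read ETIMEDOUT → the whole round aborts fatally (and, as a knock-on effect,
Prisma reports "Engine is not yet connected" during teardown). Root cause: the
transient-error regex in queryJsonWithRetry matches "timeout" but not
"ETIMEDOUT" (which does not contain that substring), so the error was thrown
without any retry.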

Fix: add ETIMEDOUT / ECONNREFUSED / EHOSTUNREACH / ENETUNREACH / EPIPE / "HTTP request
error" / "fetch failed" to the transient allowlist so they are retried with backoff.
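To make the root cause concrete, here is a small sketch that runs the old and new patterns (copied from the diff in this commit) against a few sample messages. The sample messages, apart from "read ETIMEDOUT", are hypothetical illustrations and not taken from real logs.

```typescript
// Patterns copied from the diff in this commit.
const oldTransient =
  /empty|timeout|timed out|ECONNRESET|socket|network|EAI_AGAIN|503|502|too many|reset by peer/i;
const newTransient =
  /empty|timeout|timed out|ETIMEDOUT|ECONNRESET|ECONNREFUSED|EHOSTUNREACH|ENETUNREACH|EPIPE|socket|network|HTTP request error|fetch failed|EAI_AGAIN|503|502|too many|reset by peer/i;

// Sample messages: only "read ETIMEDOUT" comes from the commit message;
// the rest are hypothetical and exist purely for illustration.
const samples: string[] = [
  "read ETIMEDOUT",
  "connect ECONNREFUSED 10.0.0.1:8123",
  "fetch failed",
  "Syntax error: failed at position 12",
];

// Classify each message with both patterns.
const results = samples.map((msg) => ({
  msg,
  oldMatch: oldTransient.test(msg),
  newMatch: newTransient.test(msg),
}));

for (const r of results) {
  console.log(`${r.msg} | old=${r.oldMatch} | new=${r.newMatch}`);
}
```

"ETIMEDOUT" has a "D" between "TIME" and "OUT" and no space, so neither `timeout` nor `timed out` matches it, even with the `i` flag. The deterministic-looking sample stays non-transient under both patterns.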

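Note: concurrency itself is still limited by the DW. Measured: concurrency=3 (15 parallel
queries) is stable, 5 (25 parallel) times out. Treat concurrency=3 as the upper bound for
this host; retries only absorb occasional jitter and are not a license to raise concurrency.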

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
parent d9a77a65
@@ -64,9 +64,11 @@ export class ClickHouseSourceService {
       } catch (err) {
         lastErr = err;
         const msg = err instanceof Error ? err.message : String(err);
-        // Deterministic errors (bad SQL / permissions) are not retried: retrying cannot help, so fail fast
+        // Deterministic errors (bad SQL / permissions) are not retried: retrying cannot help, so fail fast.
+        // ⚠️ "ETIMEDOUT" does not contain the substring "timeout", so it must be listed explicitly (under high
+        // concurrency against the remote DW, socket reads time out, which used to cause a fatal abort of the whole
+        // round instead of a retry). Also adds socket-level network error codes and the CH client's "HTTP request error".
         const transient =
-          /empty|timeout|timed out|ECONNRESET|socket|network|EAI_AGAIN|503|502|too many|reset by peer/i.test(
+          /empty|timeout|timed out|ETIMEDOUT|ECONNRESET|ECONNREFUSED|EHOSTUNREACH|ENETUNREACH|EPIPE|socket|network|HTTP request error|fetch failed|EAI_AGAIN|503|502|too many|reset by peer/i.test(
             msg,
           );
         if (!transient || attempt === maxAttempts) {
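The hunk above shows only the classification step of the retry loop. As a rough sketch of how such a loop might fit together, the following uses a hypothetical `withRetry` helper. It is not the actual `queryJsonWithRetry` implementation: the function name, the default attempt count, and the exponential delay schedule are all assumptions made for illustration.

```typescript
// Transient pattern copied from the new side of the diff.
const TRANSIENT =
  /empty|timeout|timed out|ETIMEDOUT|ECONNRESET|ECONNREFUSED|EHOSTUNREACH|ENETUNREACH|EPIPE|socket|network|HTTP request error|fetch failed|EAI_AGAIN|503|502|too many|reset by peer/i;

// Hypothetical helper (assumption): retries `run` on transient errors only.
// maxAttempts and baseDelayMs defaults are illustrative, not from the source.
async function withRetry<T>(
  run: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 200,
): Promise<T> {
  let lastErr: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await run();
    } catch (err) {
      lastErr = err;
      const msg = err instanceof Error ? err.message : String(err);
      const transient = TRANSIENT.test(msg);
      // Deterministic errors and the final attempt fail fast.
      if (!transient || attempt === maxAttempts) {
        throw err;
      }
      // Assumed backoff schedule: baseDelayMs, then 2x, 4x, ...
      const delayMs = baseDelayMs * 2 ** (attempt - 1);
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  // Only reachable if maxAttempts < 1.
  throw lastErr;
}
```

With this shape, a "read ETIMEDOUT" failure is retried up to `maxAttempts` times, while a SQL or permission error is rethrown on the first attempt.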