Commit 87741f59 by luoqi

fix(deploy): deterministic production deploy script that fixes compose's "container not switched to the new image" defect at the root

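docker compose `up -d --build` has a known defect that keeps recurring across versions (docker/compose#9308, #9259):
the image builds successfully, but the running container is not recreated and keeps running the old image. This project hit it during the 2026-06-10 deployment:
the service container was still on the old image → migrate reported "no pending" → the phone_verified column was never created.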
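
To make the failure mode concrete, here is a minimal sketch of how a stale container could be spotted by hand. The helper names `short_id` and `same_image` are hypothetical and not part of this commit, and the image and container names in the comments assume the compose project is named `pac`.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of this commit) for spotting a stale container.

# short_id strips an optional "sha256:" prefix and keeps the first 12 hex chars.
short_id() {
  local id="${1#sha256:}"
  printf '%s\n' "${id:0:12}"
}

# same_image succeeds only when both IDs are non-empty and identical.
same_image() {
  [[ -n "${1:-}" && "${1:-}" == "${2:-}" ]]
}

# Usage against a real host (requires Docker; names assume project "pac"):
#   want=$(docker image inspect pac-pac-service --format '{{.Id}}')
#   got=$(docker inspect pac-pac-service-1 --format '{{.Image}}')
#   same_image "$want" "$got" || echo "stale: $(short_id "$got") != $(short_id "$want")"
```

Treating two empty IDs as a mismatch is a deliberate choice in this sketch: a failed `docker inspect` should not read as "images match".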

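Fix: production deploys no longer depend on compose's diff heuristics. deploy/deploy-prod.sh runs:
  git pull --ff-only → explicit build → up -d --force-recreate → three hard checks
  (① container image ID == newly built image ID; ② migrate exits 0 and migrate status shows nothing pending;
   ③ service health returns 200 and web returns 200); if any check fails, the script exits non-zero.
The README now points routine updates at the script and warns against running up -d --build directly.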
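
The bounded polling behind check ③ can be sketched as a small reusable helper. This is an illustration only: `retry` and `health_ok` are hypothetical names that do not appear in the script, which inlines an equivalent loop.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this commit): run a command up to
# ATTEMPTS times, sleeping DELAY seconds between failed attempts.
retry() {
  local attempts="$1" delay="$2"
  shift 2
  local i
  for ((i = 1; i <= attempts; i++)); do
    "$@" && return 0                       # success: stop polling
    (( i < attempts )) && sleep "$delay"   # no sleep after the last attempt
  done
  return 1                                 # every attempt failed
}

# Usage, assuming the health endpoint used by the deploy script:
#   health_ok() { curl -fsS -m 5 -o /dev/null http://127.0.0.1:3101/pac/v1/health; }
#   retry 30 2 health_ok || { echo "service health check failed" >&2; exit 1; }
```

Because the helper returns non-zero once the attempts are exhausted, the caller can turn a service that never becomes healthy into a failed deploy, in line with the "any failure exits non-zero" rule above.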

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
parent cfaca4c4
......@@ -32,8 +32,14 @@ curl http://127.0.0.1:3100
Routine operations:
```bash
# Update code + rebuild
git pull && docker compose -f docker-compose.prod.yml --env-file apps/pac-service/.env --env-file apps/pac-web/.env up -d --build
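# Update code + rebuild + verify (recommended: deterministic deploy script)
# ⚠️ Do not run `up -d --build` directly: compose has a known defect (docker/compose#9308, #9259)
# where the image is built but the container is not switched to it (hit on 2026-06-10). The script runs an explicit build +
# force-recreate + hard post-deploy checks (image ID / migrations / health) and fails if any check does not pass.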
bash deploy/deploy-prod.sh
# (Not recommended) manual equivalent
git pull && docker compose -f docker-compose.prod.yml --env-file apps/pac-service/.env --env-file apps/pac-web/.env up -d --build --force-recreate
# View logs
docker compose -f docker-compose.prod.yml logs -f pac-service
......
#!/usr/bin/env bash
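# PAC production deploy (deterministic; does not trust compose's diff heuristics)
#
# Background: docker compose `up -d --build` has a known defect that keeps recurring across
# versions: the image is built, but the running container is not switched to it
# (docker/compose#9308, #9259, etc.; this project hit it on 2026-06-10:
# the service kept running the old image → migrate reported "no pending" → the new column was never created).
# Fix: production deploys do not rely on compose deciding whether to recreate: explicit build → explicit force-recreate
# → hard post-deploy checks (container image ID == new image ID / no pending migrations / health 200); any failure exits non-zero.
#
# Usage (on the server, in ~/pac):
#   bash deploy/deploy-prod.sh            # git pull + build + switch over + verify
#   bash deploy/deploy-prod.sh --no-pull  # skip git pull (when the code is already in place)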
set -euo pipefail
cd "$(dirname "$0")/.."
COMPOSE=(docker compose -f docker-compose.prod.yml
  --env-file apps/pac-service/.env --env-file apps/pac-web/.env)
SERVICES=(pac-migrate pac-service pac-web)
log() { printf '\n\033[1;36m== %s ==\033[0m\n' "$*"; }
die() { printf '\033[1;31mFAIL: %s\033[0m\n' "$*" >&2; exit 1; }
if [[ "${1:-}" != "--no-pull" ]]; then
  log "git pull --ff-only"
  git pull --ff-only origin main
fi
log "HEAD: $(git log --oneline -1)"
log "build images (explicit, not combined with up)"
"${COMPOSE[@]}" build "${SERVICES[@]}"
log "force-recreate (do not trust compose's recreate decision)"
"${COMPOSE[@]}" up -d --force-recreate "${SERVICES[@]}"
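# ── Hard post-deploy checks: any failure = failed deploy ─────────────────────────────
log "Check 1/3: running container image == freshly built image"
# Assumes the default compose project name (the directory basename).
proj=$(basename "$PWD")
for svc in pac-service pac-web; do
  want=$(docker image inspect "${proj}-${svc}" --format '{{.Id}}')
  got=$(docker inspect "${proj}-${svc}-1" --format '{{.Image}}')
  # ${id:7:12} skips the "sha256:" prefix and prints 12 hex characters.
  [[ "$want" == "$got" ]] || die "$svc container is still on the old image (want ${want:7:12} got ${got:7:12}); compose skipped the recreate again"
  echo "  $svc OK (${want:7:12})"
done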
log "Check 2/3: no pending database migrations (migrate container exit 0 and deploy clean)"
mexit=$(docker inspect "${proj}-pac-migrate-1" --format '{{.State.ExitCode}}')
[[ "$mexit" == "0" ]] || die "migrate container exit=$mexit (see docker logs ${proj}-pac-migrate-1)"
docker exec "${proj}-pac-service-1" sh -c 'npx prisma migrate status 2>&1' | tail -3 \
  | grep -qiE 'up to date|database schema is up to date' || die "migrations pending / abnormal status"
echo "  migrate OK"
log "Check 3/3: health + web"
# Poll the health endpoint for up to 30 attempts, 2 seconds apart.
for i in $(seq 1 30); do
  code=$(curl -s -m 5 -o /dev/null -w '%{http_code}' http://127.0.0.1:3101/pac/v1/health || true)
  [[ "$code" == "200" ]] && break
  sleep 2
done
[[ "${code:-}" == "200" ]] || die "service health=$code"
webcode=$(curl -s -m 30 -o /dev/null -w '%{http_code}' http://127.0.0.1:3100/ || true)
[[ "$webcode" == "200" ]] || die "web http=$webcode"
echo "  health 200 / web 200"
log "Deploy succeeded ✅ $(git log --oneline -1)"