Commit cd6a9c37 by luoqi

revert: undo the local-timezone hack for visit dedup (1d2759dc); root cause is a host contract violation, PAC will not add compatibility for it

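Retrospective (corrected by the user): the root cause of 王振中's "2 visits" is not the algorithm but a data-contract violation by the host:
- Contract: every timestamp in this data warehouse is Beijing time (naive). PAC always parses them as Beijing time and stores UTC (PAC is always UTC internally).
- But fact_appointment_out.in_time was stored by the host as a ClickHouse DateTime (a UTC instant, 03:09 UTC = 11:09 Beijing)
  rather than a naive Beijing string. Serialization drops the tz marker, so PAC's normalizeDatetime treats the value as a Beijing string and subtracts another 8 hours, giving 19:09 UTC on the previous day (8 h behind).
- This is a [host contract violation] (the column should hold naive Beijing time but was given a UTC DateTime); the host should fix it.
- PAC does not wedge timezone compatibility into the algorithm layer to accommodate bad data (that would be putting the cart before the horse). UTC-day dedup itself is fine: normal clinic hours
  (Beijing 08:00-24:00 = UTC 00:00-16:00) never cross a UTC day boundary; only in this case was in_time double-converted into the early hours, which wrongly crossed the day boundary.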
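
The double conversion described in the list above can be reproduced with a small sketch. `parseNaiveAsBeijing` is a hypothetical stand-in for PAC's `normalizeDatetime` (its real implementation is not part of this commit), and the year 2024 is an assumption, since the source only gives month, day and time.

```typescript
// Hypothetical stand-in for PAC's normalizeDatetime (the real code is not shown here).
// Assumption: a naive "YYYY-MM-DD HH:mm:ss" string is read as Beijing time (UTC+8)
// and converted to a UTC Date.
function parseNaiveAsBeijing(naive: string): Date {
  const m = /^(\d{4})-(\d{2})-(\d{2})[ T](\d{2}):(\d{2}):(\d{2})$/.exec(naive);
  if (!m) throw new Error(`unexpected datetime format: ${naive}`);
  const [y, mo, d, h, mi, s] = m.slice(1).map(Number);
  // Date.UTC reads the fields as UTC; subtracting 8 hours turns Beijing wall time into UTC.
  // Out-of-range hours (e.g. -5) roll over to the previous day.
  return new Date(Date.UTC(y, mo - 1, d, h - 8, mi, s));
}

const utcDayKey = (date: Date): string => date.toISOString().slice(0, 10);

// Contract-compliant input: Beijing wall time 11:09.
const compliant = parseNaiveAsBeijing('2024-09-26 11:09:00');
// Violating input: the host stored the UTC instant (03:09 UTC) and the tz marker was lost,
// so the value is shifted by 8 hours a second time.
const doubleShifted = parseNaiveAsBeijing('2024-09-26 03:09:00');

console.log(compliant.toISOString());     // 2024-09-26T03:09:00.000Z
console.log(doubleShifted.toISOString()); // 2024-09-25T19:09:00.000Z

// Normal clinic hours (Beijing 08:00-24:00) stay inside a single UTC day.
const firstSlot = parseNaiveAsBeijing('2024-09-26 08:00:00');
const lastSlot = parseNaiveAsBeijing('2024-09-26 23:59:59');
console.log(utcDayKey(firstSlot) === utcDayKey(lastSlot)); // true
```

With contract-compliant input the UTC day key is the same for every event in a clinic day; only the double-shifted value lands on the previous UTC day.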

→ The algorithm goes back to UTC-pure; the in_time double conversion / contract violation is recorded in dw-data-source-issues.md and handed to the host to fix.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
parent 1d2759dc
@@ -6,7 +6,6 @@ import type {
   FeatureExtractorContext,
   PersonaFeatureDraft,
 } from './feature.interface';
-import { localDayKey } from './visit-day.util';
 /**
  * lifecycle_stage lifecycle (B.1.4): rule layer, snapshot
@@ -69,7 +68,7 @@ export class LifecycleStageFeatureExtractor implements FeatureExtractor {
     for (const f of visitFacts) {
       if (!f.occurredAt) continue;
       const t = f.occurredAt.getTime();
-      days.add(localDayKey(f.occurredAt)); // dedup by clinic-local timezone (avoids UTC-midnight crossings inflating the visit count)
+      days.add(f.occurredAt.toISOString().slice(0, 10));
       if (first === null || t < first) first = t;
       if (last === null || t > last) last = t;
     }
......
@@ -11,7 +11,6 @@ import type {
   FeatureExtractorContext,
   PersonaFeatureDraft,
 } from './feature.interface';
-import { localDayKey } from './visit-day.util';
 /**
  * rfm value segmentation (RFM eight quadrants + lifecycle): statistics layer
@@ -127,7 +126,7 @@ export class RfmFeatureExtractor implements FeatureExtractor {
     let firstVisit: Date | null = null;
     for (const f of visitFacts) {
       if (!f.occurredAt) continue;
-      visitDays.add(localDayKey(f.occurredAt)); // dedup by clinic-local timezone (avoids UTC-midnight crossings inflating the frequency)
+      visitDays.add(f.occurredAt.toISOString().slice(0, 10));
       if (!lastVisit || f.occurredAt > lastVisit) lastVisit = f.occurredAt;
       if (!firstVisit || f.occurredAt < firstVisit) firstVisit = f.occurredAt;
     }
......
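-/// Visit "calendar day" dedup utility (shared by the lifecycle visit count and the RFM frequency).
-///
-/// ⚠️ Do not dedup on occurredAt.toISOString() (the UTC date): two events on the same local day that straddle UTC midnight
-/// (e.g. encounter local 09-26 03:09 = UTC 09-25 19:09, treatment local 09-26 11:10 = UTC 09-26 03:10)
-/// are split into two days, 09-25 / 09-26, which inflates the visit count (王振中: 1 actual visit counted as 2).
-/// Taking YYYY-MM-DD in the clinic's local timezone gives the true "calendar visit day".
-export const CLINIC_TZ = 'Asia/Shanghai'; // all pilot clinics are +8; for multi-tenancy, wire platform.pullConfig.timezone into the feature ctx later
-/// Visit-day key in the clinic's local timezone (the en-CA locale formats as YYYY-MM-DD).
-export function localDayKey(date: Date, tz: string = CLINIC_TZ): string {
-  return new Intl.DateTimeFormat('en-CA', { timeZone: tz }).format(date);
-}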
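
For comparison, here is a sketch of how the two day keys behave on the example from the comment above. `dayKeyInZone` is a hypothetical variant of the `localDayKey` helper that assembles the key with `formatToParts`, so it does not depend on how the `en-CA` locale orders its fields; the year 2024 is an assumption.

```typescript
// Sketch only: dayKeyInZone is a hypothetical variant of localDayKey, not the project's code.
function dayKeyInZone(date: Date, timeZone: string): string {
  const parts = new Intl.DateTimeFormat('en-CA', {
    timeZone,
    year: 'numeric',
    month: '2-digit',
    day: '2-digit',
  }).formatToParts(date);
  // Pick the fields by type, so the locale's field order and separators do not matter.
  const get = (type: string): string => parts.find((p) => p.type === type)?.value ?? '';
  return `${get('year')}-${get('month')}-${get('day')}`;
}

const utcDayKey = (date: Date): string => date.toISOString().slice(0, 10);

// The two events from the comment (year assumed).
const encounter = new Date('2024-09-25T19:09:00Z'); // the double-shifted in_time
const treatment = new Date('2024-09-26T03:10:00Z');

const utcDays = new Set([encounter, treatment].map(utcDayKey));
const localDays = new Set([encounter, treatment].map((d) => dayKeyInZone(d, 'Asia/Shanghai')));

console.log(utcDays.size);   // 2
console.log(localDays.size); // 1
```

The local key merges the two events only because a Beijing-time day happens to cover the shifted timestamp. It hides the bad `in_time` rather than correcting it, which is why this commit removes the helper and leaves the fix to the host.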