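# Data Lake Ingestion Checklist vs. HSAP Implementation Gaps
This document compares stages A through E of [DATA_LAKE_CHECKLIST.md](DATA_LAKE_CHECKLIST.md) against the current capabilities of `as_platform`.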
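| Stage | Checklist requirement | Current HSAP status | Gap |
|-------|-----------------------|---------------------|-----|
| **A Upload intake** | zip/directory upload, progress, candidate_id | `POST /api/v1/data/upload/file`; `DatasetCandidate` table; **once analyzed, a candidate can be promoted via `POST .../promote-inbox`** | No unified `lake/staging/` path convention; the progress bar relies on the frontend upload |
| **A** | Isolated staging area | Candidates are written to the DB plus a disk path | The `lake/staging/<project>/<candidate_id>/` directory layout is not enforced |
| **B Automatic analysis** | Asynchronous quality worker after upload | `inspect-upload`, partial catalog refresh | No standalone QualityWorker Job; DMS/Lane reports are not consistently written to `quality.json` |
| **B** | DMS/Lane metrics | Catalog, `catalogDms`, validate scripts | The Catalog already displays sampled metrics (bar/pie/column/radar/split-bar/scatter/density charts), but they are **not** produced automatically on upload |
| **C Review flow** | Automatically submit a review request | `approvals`, `submit` API | Already in place; linked to the labeling-submission register step |
| **C** | Approve/reject conventions | `approve`/`reject` | Already in place |
| **D Versioned ingestion** | Promotion to curated after review | `ingest_incremental`, `register_batch` stage | **The main path lives in ml.py/as.py**, not in a candidate→lake gate |
| **D** | Catalog index update | `GET /catalog` refresh | Already in place |
| **E Operations and safety** | Readable failures, retries | Job queue, approval notes | Partial; upload retries rely on the frontend |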
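
The stage A gap about the unenforced `lake/staging/<project>/<candidate_id>/` layout could be closed with a small path helper. The following is a minimal sketch, not the platform's actual code: the function names `staging_dir` and `follows_layout` and the allowed character set for path segments are assumptions made for illustration.

```python
import re
from pathlib import Path

# Assumption: segments are limited to a conservative character set so that
# values such as ".." or "a/b" cannot escape the staging root.
_SEGMENT = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")


def staging_dir(root: Path, project: str, candidate_id: str) -> Path:
    """Build <root>/<project>/<candidate_id>, rejecting unsafe segments."""
    for label, value in (("project", project), ("candidate_id", candidate_id)):
        if not _SEGMENT.fullmatch(value):
            raise ValueError(f"unsafe {label}: {value!r}")
    return root / project / candidate_id


def follows_layout(path: Path, root: Path) -> bool:
    """Return True when path is exactly two levels below the staging root."""
    try:
        relative = path.resolve().relative_to(root.resolve())
    except ValueError:
        # path is not located under root at all
        return False
    return len(relative.parts) == 2
```

A caller would build the directory with `staging_dir(Path("lake/staging"), "dms", "cand-001")` before writing an upload, and could use `follows_layout` to audit disk paths already stored on existing candidates.
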
## Existing reusable components
- Data candidates: `platform/as_platform/db/models.py` (`DatasetCandidate`)
- Upload API: `server.py` (`upload/file`, `inspect-upload`)
- Review: `audit/queue.py`, `/api/v1/approvals/*`
- Ingestion CLI: `as.py build` / `add` + `ingest_incremental.py`
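## Suggested next milestone (not fully implemented in this consolidated plan)
1. Standardize the staging root directory and the `AS_LAKE_STAGING_ROOT` environment variable
2. Upload completes → enqueue a `quality_analyze` Job → write `quality.json`
3. After a review is approved, call the existing `ingest_incremental` and update the `batch.meta` stage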
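
The three steps above can be sketched end to end as follows. This is an illustration under stated assumptions rather than the platform's implementation: the in-memory `queue.Queue` stands in for the real Job queue, `analyze` is a placeholder that only counts files, the `lake/staging` default and the `"curated"` stage value are guesses, and `ingest` is passed in as a callable because the signature of the existing `ingest_incremental` is not given here.

```python
import json
import os
import queue
from pathlib import Path
from typing import Callable

# Stand-in for the platform Job queue (assumption: the real queue is persistent).
jobs: "queue.Queue[dict]" = queue.Queue()


def on_upload_complete(project: str, candidate_id: str) -> Path:
    """Step 1 + 2a: resolve the staging dir and enqueue a quality_analyze Job."""
    # Hypothetical default; only the variable name comes from the plan.
    root = Path(os.environ.get("AS_LAKE_STAGING_ROOT", "lake/staging"))
    candidate_dir = root / project / candidate_id
    candidate_dir.mkdir(parents=True, exist_ok=True)
    jobs.put({"type": "quality_analyze", "dir": str(candidate_dir)})
    return candidate_dir


def analyze(candidate_dir: Path) -> dict:
    """Placeholder analysis; real DMS/Lane metrics would be computed here."""
    files = [p for p in candidate_dir.rglob("*")
             if p.is_file() and not p.name.startswith("quality.json")]
    return {"file_count": len(files), "status": "analyzed"}


def run_next_job() -> Path:
    """Step 2b: take one Job off the queue and write quality.json."""
    job = jobs.get_nowait()
    if job["type"] != "quality_analyze":
        raise ValueError(f"unexpected job type: {job['type']}")
    candidate_dir = Path(job["dir"])
    out = candidate_dir / "quality.json"
    tmp = candidate_dir / "quality.json.tmp"
    # Write to a temp file and rename so readers never see a partial report.
    tmp.write_text(json.dumps(analyze(candidate_dir), indent=2), encoding="utf-8")
    tmp.replace(out)
    return out


def on_approved(candidate_dir: Path, ingest: Callable[[Path], None],
                batch_meta: dict) -> dict:
    """Step 3: run the injected ingest callable, then advance the stage."""
    ingest(candidate_dir)
    batch_meta["stage"] = "curated"  # assumed stage name
    return batch_meta
```

Passing `ingest` in as a parameter keeps the gate logic testable without touching the lake, and makes it explicit that the stage only advances after ingestion returns without raising.
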
## Acceptance scripts
```bash
bash HSAP/scripts/smoke_manifest_alignment.sh
bash HSAP/scripts/smoke_platform_api.sh
curl -sS -H "Authorization: Bearer $TOKEN" http://127.0.0.1:8787/api/v1/pending/gates
```
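
For use inside a Python test suite, the final `curl` check can be approximated with the standard library. This is a sketch only: the helper names `gates_request` and `check_gates` are hypothetical, and because the response format of `/api/v1/pending/gates` is not described here, the body is returned as raw text.

```python
import urllib.request


def gates_request(base_url: str, token: str) -> urllib.request.Request:
    """Build the authenticated GET request for /api/v1/pending/gates."""
    url = base_url.rstrip("/") + "/api/v1/pending/gates"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def check_gates(base_url: str, token: str, timeout: float = 5.0) -> str:
    """Return the raw response body; 4xx/5xx responses raise HTTPError."""
    request = gates_request(base_url, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example call against a locally running platform (requires a valid token):
#   print(check_gates("http://127.0.0.1:8787", token))
```
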