Commit Graph

9 Commits

Author SHA1 Message Date
df3bf298ee W22: coverage metric + network tests + Tool stream feedback + stdin pipe + session path + dependency check (W22.1-W22.6)
Some checks failed
CI / Determine matrix (push) Has been cancelled
CI / ${{ matrix.os }} / ${{ matrix.build_type }} (push) Has been cancelled
CI / Sanitizer (ASan+UBSan) / ubuntu-24.04 (push) Has been cancelled
CI / Coverage (gcovr) / ubuntu-24.04 (push) Has been cancelled
- W22.1: gcovr 覆盖率度量 + CI coverage job(40% 阈值 warning)
- W22.2: network_plugin 单元测试(parse_headers_json/extract_host_port/SSE/异常保护)
- W22.3: Tool Calling 流式反馈(chat_stream + "[工具调用]/[工具结果]" 状态行)
- W22.4: --prompt stdin pipe(--prompt - 从 stdin 读取)
- W22.5: session 路径健壮化(static 缓存 + mkdir + fallback)
- W22.6: 插件依赖拓扑静态校验(validate_dependencies 循环/缺失检测)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 21:21:24 +08:00
b2b381b9b3 W21: anthropic Stream+Tools + --prompt batch + sanitizer fix + plugin unit tests (W21.1-W21.6)
Some checks failed
CI / Determine matrix (push) Has been cancelled
CI / ${{ matrix.os }} / ${{ matrix.build_type }} (push) Has been cancelled
CI / Sanitizer (ASan+UBSan) / ubuntu-24.04 (push) Has been cancelled
- W21.1: ci-sanitize preset 独立 Linux-clang + ci-threadsan (TSan)
- W21.2: anthropic tool_use content_block 解析 + configure 缓存 tools_json
- W21.3: --prompt 非交互批处理模式
- W21.4: session auto-save 失败告警 + 当前目录 fallback
- W21.5: smoke 补 tool_calls 边界用例 4 块 12 断言
- W21.6: anthropic 11 块 78 CHECK + deepseek 12 块 78 CHECK

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 20:40:58 +08:00
20ead86e88 W20: Tool Calling 闭环 + Stream+Tools + 回归测试 + session auto-save + ASan CI (W20.1-W20.6)
Some checks failed
CI / Determine matrix (push) Has been cancelled
CI / ${{ matrix.os }} / ${{ matrix.build_type }} (push) Has been cancelled
CI / Sanitizer (ASan+UBSan) / ubuntu-24.04 (push) Has been cancelled
- W20.1: CLI tool_calls→execute→result→re-call 循环(5轮上限)
- W20.2: deepseek 流式 tool_calls 增量解析(configure 缓存,无 ABI break)
- W20.3: plugin_loader 回归测试 5 块 32 断言(路径/原子性/mock 日志)
- W20.4: plugin_loader ABI 契约校验(name/version/on_init 字段验证)
- W20.5: ASan/UBSan CMake preset + CI sanitizer job(PR-only Linux)
- W20.6: session auto-save(on_shutdown 写 %APPDATA%/dstalk/session.json)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 20:15:00 +08:00
6f492489c6 W16: close CRITICAL/HIGH findings, integrate metadata gate, complete audit summaries (W16.1-W16.6)
Some checks failed
CI / Determine matrix (push) Has been cancelled
CI / ${{ matrix.os }} / ${{ matrix.build_type }} (push) Has been cancelled
- W16.1 (曹武): F-11.7-1 CLOSED — confirmed W12.4 fix, corrupt binary eliminated
- W16.2 (孙宇): F-11.1-1 FIXED — context_plugin.cpp try/catch on set_max_tokens + on_shutdown
- W16.3 (陈风): F-11.1-2 CLOSED — confirmed W12.1 fix, strdup OOM protection already in place
- W16.4 (胡桐): Integrate check_agents_metadata into refresh_status.py as pre-gate (error→exit 1)
- W16.5 (周岩): Add Findings Summary to W13.3 network audit, register 3 findings
- W16.6 (赵码): Add Findings Summary to W13.1+W13.2 AI audits, register 8 findings (4 already W14-fixed)

Build 0 error, ctest 4/4 pass, metadata check 0 error 0 warning.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 18:45:03 +08:00
0e41c8c6f6 W15: workflow improvements — EXPRESS fast-path, audit→fix closed loop, metadata self-check (W15.1-W15.3)
Some checks failed
CI / Determine matrix (push) Has been cancelled
CI / ${{ matrix.os }} / ${{ matrix.build_type }} (push) Has been cancelled
- W15.1 (杨帆): Add EXPRESS fast-path to §11 state machine (T17/T18, E1-E6 conditions, escalation safety valve)
- W15.2 (王测): Add §14 audit→fix closed loop — findings-registry.md, severity-driven auto-triage, CRITICAL blocking rule
- W15.3 (胡桐): Create scripts/check_agents_metadata.py (5-check: YAML parse, rating range, group/member refs, duplicate IDs)
- Fix YAML orphan bugs in 3 profiles: devops-hu, engineer-sun, security-cao (perf_log entries outside array)
- Pre-fill findings-registry.md with 10 historical findings from W11.1/W11.7 audits

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 18:19:37 +08:00
102cd3e141 Harden plugin runtime: TLS verify, LSP deadlock, path traversal, ABI exception safety (W14)
Some checks failed
CI / Determine matrix (push) Has been cancelled
CI / ${{ matrix.os }} / ${{ matrix.build_type }} (push) Has been cancelled
W14 addresses the five most critical findings from the W13 plugin audits:

- W14.1 network: enable ssl::verify_peer + SSL_set1_host SNI hostname
  verification (fixes TLS bypass, W13.3 CVSS 7.4); add steady_timer DNS
  timeout and bottom-up catch(...) hardening (engineer-zhou)
- W14.2 lsp: fix reader_loop/stop mutex deadlock via stop_nolock/stop_locked
  split (W13.4); wrap 11 vtable/entry functions in try/catch with cv
  notification on reader exit (engineer-sun)
- W14.3 tools: add is_safe_path() rejecting empty/absolute/.. paths before
  file_io calls (fixes path traversal, W13.5 CVSS 7.5); guard g_tools and
  g_session/g_history under mutex; 9 vtable try/catch (security-cao)
- W14.4 host: add fallback plugin search (../plugins/) so binaries run from
  build/tests/ load current DLLs, resolving the W13.6 R2 stale-DLL false
  alarm (architect-lin)
- W14.5 anthropic+deepseek: wrap 12 ABI boundary functions in try/catch with
  log-guard, preventing exceptions from crossing the C ABI (engineer-chen)

Verified: cmake build 0 error 0 warning, ctest 4/4 pass, smoke R2 now
passes naturally.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 12:03:50 +08:00
47082376ef Wave 10: deep audits of 5 unaudited plugins, smoke regression set (W13.1-W13.6)
Some checks failed
CI / Determine matrix (push) Has been cancelled
CI / ${{ matrix.os }} / ${{ matrix.build_type }} (push) Has been cancelled
- W13.1 anthropic_plugin (architect-yang, 497 lines): rated C. 6 C ABI
  functions lack try/catch (§8 violation); my_chat leaks response_body on
  error path; tool_use response silently dropped.
- W13.2 deepseek_plugin (engineer-sun, 486 lines): rated C+. 7 ABI entries
  unprotected including json::parse paths (malformed JSON terminates);
  SSE [DONE] sentinel match brittle; ~55% code overlap with anthropic
  suggests an ai_plugin_base extraction.
- W13.3 network_plugin (qa-wang, 322 lines): rated C. CRITICAL: TLS
  certificate verification fully disabled (set_verify_mode never called,
  default verify_none accepts any cert) — all AI traffic incl. api_key
  is MITM-vulnerable. DNS resolve has no timeout; catch lacks (...).
- W13.4 lsp_plugin (architect-huang, 749 lines): rated C. CRITICAL:
  guaranteed deadlock at L519-526 → L547 (g_lsp_impl_start holds mutex
  then calls g_lsp_impl_stop which re-locks the same non-recursive
  mutex); 7 vtable funcs unprotected; server→client requests dropped.
- W13.5 session+tools (security-cao, 264+251 lines): rated D+/D. Path
  traversal in builtin_file_read/write (zero validation); global
  static state in both plugins lacks mutex (UAF risk); 9 vtable funcs
  lack try/catch.
- W13.6 smoke regression (qa-xu, +193 lines): 4 new cases — context
  max_tokens trim, config dual-store consistency (exposes that W12.2
  merge is incomplete: dstalk_config_set→config_service.get returns
  null), HTTP error path no-crash, repeated init/shutdown cycle.

Verified: cmake build 0 error 0 warning, ctest 4/4 pass.

Top W14 priorities surfaced: TLS verification (W13.3), LSP deadlock
(W13.4), file-tool path traversal (W13.5), config dual-store still
broken (W13.6 R2), shared try/catch wrapper across all AI plugins.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-05-27 09:32:13 +08:00
5766938524 Wave 5+6: plugin ABI hardening, build modernization, ABI/security docs
Some checks failed
CI / Determine matrix (push) Has been cancelled
CI / ${{ matrix.os }} / ${{ matrix.build_type }} (push) Has been cancelled
Wave 5 (9 parallel agents):
- W1.1 atomic diag callback + DLL handle release on shutdown (lin)
- W2.1 unify cross-DLL heap discipline (host->alloc/free/strdup) (chen)
- W2.2 secure_zero api_key on shutdown for deepseek/anthropic (cao)
- W3 CMake modernization: target-based cxx_std_20, dstalk_boost_config
  INTERFACE lib, root-level RUNTIME_OUTPUT_DIRECTORY (hu)
- W4 GitHub Actions CI with dynamic Linux/Windows matrix (ma)
- W5.1 SSE buffer_body to cut peak memory ~67% on 32K streams (zhou)
- W6.1 LSP JSON-RPC frame parser hardened against header reordering (sun)
- W7 smoke test: copy plugin DLLs post-build + Boost.JSON src.hpp fix
  for full 9-plugin load coverage (wang)
- W8.1 README slimmed 398->92, Diataxis docs/ skeleton (deng)

Wave 6 (6 parallel agents):
- W9.1 docs/explanation: architecture + plugin-lifecycle (deng)
- W9.3 log credential leak audit (0 vulns, audit trail in
  docs/explanation/security-logging.md) (cao)
- W9.4 docs/reference/plugin-abi.md - 7-point ABI contract (lin)
- W9.6 CLI /history command + status integration (zhao)
- W9.8 plugin_loader fault tolerance: per-plugin failure no longer
  aborts dstalk_init (huang)
- W9.10 host_api unit tests: tests/host_api_test.cpp, 8 cases (liu)

CEO oversight (preexisting bugs fixed during Wave 5 verification):
- lsp_plugin.cpp:449 forward decl mismatch (int vs void)
- tools_plugin.cpp:109 missing forward decl

Multi-agent collaboration framework:
- agents/WORKFLOW.md: 6-stage protocol, two-tier governance,
  prompt template, technical constraints registry

Build: cmake --build 0 error / 0 warning. Tests: 2/2 100% pass.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-05-27 05:39:10 +08:00
4433218853 Add multi-agent collaboration system with 16-person team and two-tier governance
- agents/README.md documents company principles (first principles + practical
  delivery), 6-stage collaboration flow, and two-tier governance: CEO has
  highest priority and final say; work groups self-govern internally for
  staffing, scheduling, technical choices within CEO-defined boundaries.
- 16 employees recruited to match CPU physical core count, enabling up to
  16 subagents to run in parallel. Each profile.md has independent name,
  background, strengths, weaknesses, and performance log.
- Roles: 1 CEO, 3 architects (lin/yang/huang), 5 engineers (zhao/chen/li/
  zhou/sun), 3 QA (wang/liu/xu), 2 DevOps (ma/hu), 1 designer (zhu),
  1 writer (deng), 1 security (cao).
- Five working groups defined under agents/groups/: grp-quality-core,
  grp-ai-plugins, grp-cli-ux (B3), grp-build-matrix, grp-security-audit.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-05-27 05:13:12 +08:00