eval-audit
hamelsmu · Other
Audit an LLM eval pipeline and surface problems: missing error analysis, unvalidated judges, vanity metrics, etc. Use when inheriting an eval system or when one needs improvement.
npx skills add https://github.com/hamelsmu/evals-skills --skill eval-audit
Stars 1374 · Installs 0