eval-audit

hamelsmu · Other

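Audits an LLM evaluation pipeline and surfaces problems: missing error analysis, unvalidated judges, vanity metrics, and more. Suited to inheriting an eval system or to cases where one needs improvement.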

Audit an LLM eval pipeline and surface problems: missing error analysis, unvalidated judges, vanity metrics, etc. Use when inheriting an eval system, when

npx skills add https://github.com/hamelsmu/evals-skills --skill eval-audit

Stars 1374 · Installs 0
