Release 1.4.12 (2026-04-30)¶
Bug fixes¶
- remove uv from HOST_TOOLS to unblock st-docker-run -- uv run
Documentation¶
- add v2.0 skill rearchitecture spec and pushback review
- add TDD testing harness design spec for skill testing
-
revise TDD harness spec with pushback resolutions Apply all 9 pushback resolutions: make DeepEval role explicit as implementation framework, replace brittle string matching with GEval semantic evaluation, document test environment divergence and integration test milestone, elevate provider flexibility to core requirement, reframe RED-GREEN cycle with skill-gap-first methodology, add Python infrastructure bootstrap section, acknowledge non-deterministic testing honestly, remove overspecified result storage schema, restructure scenario YAML around DeepEval primitives.
-
add TDD testing harness pilot implementation plan
- align spec and plan with alignment review resolutions Apply 4 alignment resolutions: clarify two-model distinction (test subject vs GEval judge) in spec provider section, update Python version to 3.14+ in both documents, add standard-tooling cross-reference to plan Task 1, add GEval threshold calibration note to plan Task 5.
Features¶
- use st-wait-until-green in pr-workflow, add to host tools