Conversation
|
/coder-agents-review |
1 similar comment
|
/coder-agents-review |
There was a problem hiding this comment.
Well-structured cross-cutting streaming feature. The callback chain from nested chatloop through runner/tool/chatd to the websocket is clean, each layer adds exactly one piece of context, and the persistence boundary is correctly maintained (deltas are transient, final result is authoritative). Test coverage spans every layer: unit, integration, visual, and a negative persistence assertion. The generic result_delta / isStreaming infrastructure is reusable if other tools adopt streaming later.
2 P3, 2 P4, 5 Nit. The P3s are about non-happy-path streaming behavior: retry-induced duplicate deltas and missing test coverage for the error-after-partial-deltas path. Everything else is cosmetic or observational.
"'RunAdvisorOptions configures one advisor invocation.' is the kind of comment that would make me flag '// GetUser gets a user' on a hospital chart." (Leorio)
codersdk/chats.go:321-323
Nit [DEREM-6] The StripInternal comment says ResultDelta is produced by "processStepStream in chatloop," but this PR now produces ResultDelta via PublishAdviceDelta in chatd.go:7063. Before this PR the field was never populated so the reference was harmlessly wrong; now it will send someone looking in the wrong file.
Something like: (see processStepStream in chatloop for ArgsDelta, PublishAdviceDelta in chatd for ResultDelta). (Gon)
🤖
🤖 This review was automatically generated with Coder Agents.
| return nil | ||
| }, | ||
| } | ||
| if runOpts.OnAdviceDelta != nil { |
There was a problem hiding this comment.
P3 [DEREM-1] On chatloop retry, the client accumulates duplicate advisor deltas with no reset signal.
RunAdvisor sets PublishMessagePart to forward text deltas via OnAdviceDelta, but never sets OnRetry on the chatloop options. When chatretry.Retry fires (rate limit, provider error), result is reset server-side (chatloop.go:465) and the step function re-runs, but no signal reaches the client. The second attempt's deltas append to the first attempt's accumulated text, producing garbled output (e.g. "Use small s...Use small steps.") in the advisor card during the retry window.
"The final tool result overwrites the accumulated text when it arrives, because
mergeStreamPayloadtakes the value path whenpart.resultis defined. So this is a transient visual glitch, not data corruption. But during the retry window (which includes a backoff delay), the user stares at garbled advisor text." (Hisoka)
Fix: wire OnRetry in RunAdvisor to either publish a reset event the client can use to clear accumulated delta text, or track and reset accumulated state inside the PublishMessagePart closure. (Hisoka)
🤖
| }); | ||
| }); | ||
|
|
||
| it("accumulates tool result deltas until a final result arrives", () => { |
There was a problem hiding this comment.
P3 [DEREM-2] No test exercises the error-after-partial-streaming-deltas path.
If the advisor streams text deltas then the provider errors, the frontend receives result_delta events followed by a final tool-result with is_error: true. The backend path (partial OnAdviceDelta calls followed by chatloop.Run returning an error, which becomes a ResultTypeError JSON result) and the frontend transition (mergeStreamPayload overrides accumulated deltas, isStreaming clears) are both correct by design but neither has a test. A regression in value-override-delta precedence or isStreaming cleanup would be invisible.
"A regression in the value-override-delta precedence or the isStreaming cleanup would cause partial advice text to render alongside the error UI." (Chopper)
A sibling test that streams partial deltas then delivers an error result (both here and in runner_test.go) would close this gap. (Chopper)
🤖
| ); | ||
| const nextToolName = part.tool_name || existing?.name || "Tool"; | ||
| const isFinalResult = part.result !== undefined; | ||
| const isStreaming = isFinalResult |
There was a problem hiding this comment.
P4 [DEREM-3] isStreaming is only cleared when part.result is set; a bare is_error: true part without result would leave the tool stuck in "running" status.
"A tool-result part with is_error: true but no result and no result_delta, arriving after prior deltas set isStreaming: true, would leave the tool in 'running' status forever." (Mafuuu)
No current tool hits this (the advisor always sends a final result alongside errors), so this is about future-proofing. If a future streaming tool terminates with a bare error part, this will surface. Clearing isStreaming when part.is_error is true would close the gap. (Mafuuu)
🤖
| ctx context.Context, | ||
| question string, | ||
| conversationSnapshot []fantasy.Message, | ||
| opts ...RunAdvisorOptions, |
There was a problem hiding this comment.
P4 [DEREM-4] Variadic opts ...RunAdvisorOptions silently discards all but the first element.
Single internal caller today (tool.go), standard Go optional-parameter pattern, and the practical risk is near zero. Four reviewers flagged it independently; the concern is that the variadic signature implies composability that doesn't exist.
A *RunAdvisorOptions pointer would make the single-config contract explicit and cause a compile error on misuse: func (rt *Runtime) RunAdvisor(ctx, question, snapshot, opts *RunAdvisorOptions), nil check instead of length check. (Kite P3, Pariston Note, Ryosuke Note, Knov Note)
🤖
| args: { question: sampleQuestion }, | ||
| result: "Use the smaller diff while the advisor is still responding.", | ||
| }, | ||
| play: async ({ canvasElement }) => { |
There was a problem hiding this comment.
Nit [DEREM-5] RunningWithStreamedAdvice asserts streamed text is visible and the empty-advice fallback is absent, but doesn't assert the initial placeholder ("Reviewing context and preparing guidance.") disappears.
The component uses a ternary so both can't render simultaneously today, but a refactor that splits the ternary into independent conditionals would pass the story while rendering both. Adding expect(canvas.queryByText("Reviewing context and preparing guidance.")).not.toBeInTheDocument() would guard the transition. (Bisky)
🤖
| result?: unknown; | ||
| resultRaw?: string; | ||
| isError: boolean; | ||
| isStreaming?: boolean; |
There was a problem hiding this comment.
Nit [DEREM-7] isStreaming on StreamToolResult is undocumented. Its semantics are non-obvious: the field distinguishes "accumulating deltas" from "final result received." Other boolean fields like isError are self-explanatory; this one requires knowing that result entries can exist in a pre-completion state.
Suggestion: /** True while result deltas are still accumulating before the final result. */ (Gon)
🤖
| "github.com/coder/coder/v2/codersdk" | ||
| ) | ||
|
|
||
| // RunAdvisorOptions configures one advisor invocation. |
There was a problem hiding this comment.
Nit [DEREM-8] RunAdvisorOptions configures one advisor invocation. doesn't mention streaming or callbacks, which is the struct's sole purpose.
Suggestion:
// RunAdvisorOptions carries optional streaming callbacks for a
// single RunAdvisor invocation.(Leorio)
🤖
| result: nextResult.value, | ||
| resultRaw: nextResult.rawText, | ||
| isError: nextIsError, | ||
| ...(isStreaming ? { isStreaming: true } : {}), |
There was a problem hiding this comment.
Nit [DEREM-9] The conditional spread ...(isStreaming ? { isStreaming: true } : {}) relies on the entire object being reconstructed from scratch each time so the old isStreaming: true doesn't survive. A future refactor that merges instead of replaces would silently reintroduce the flag. The test at streamState.test.ts:255 guards this with toBeUndefined(), so it's safe today.
Six reviewers independently noted this pattern. An explicit isStreaming: isStreaming || undefined would make the intent clearer. (Kite Nit, Meruem Note, Nami Note, Pariston Note, Knov Note, Ryosuke Note)
🤖
|
/coder-agents-review |
There was a problem hiding this comment.
All 9 R1 findings addressed in b8b160d. Clean fixes across every layer: retry reset mechanism (OnAdviceReset/ResultReset), pointer options, bare-error handling, and documentation. The panel (9 reviewers) verified each fix against the code. One new P3 surfaced.
1 P3. The PublishAdviceReset wiring at the tool layer lacks a test mirroring the existing PublishAdviceDelta test.
"If someone breaks the PublishAdviceReset wiring (removes the if guard, drops call.ID, swaps the callback), no test fails." (Bisky)
🤖 This review was automatically generated with Coder Agents.
| if opts.PublishAdviceReset != nil { | ||
| runOpts.OnAdviceReset = func() { | ||
| opts.PublishAdviceReset(call.ID) | ||
| } | ||
| } |
There was a problem hiding this comment.
P3 [DEREM-10] No test for PublishAdviceReset wiring at the tool layer.
TestAdvisorToolPublishesAdviceDeltasWithToolCallID proves call.ID is threaded through the PublishAdviceDelta closure (lines 60-62). This structurally identical PublishAdviceReset closure has no corresponding test. If someone removes the call.ID binding here, the chatd.go guard (toolCallID == "") silently suppresses the reset event, and the frontend retains stale accumulated text on retry. The runner test (TestAdvisorRunResetsAdviceDeltasOnRetry) exercises OnAdviceReset directly but does not test the tool layer's call.ID binding.
Sketch: mirror TestAdvisorToolPublishesAdviceDeltasWithToolCallID in tool_test.go but use a StreamFn that returns a 429 on the first call and succeeds on the second. Wire both PublishAdviceDelta and PublishAdviceReset into the tool options. Assert the reset callback was called with "call-1" between the stale and fresh delta callbacks. (Bisky)
🤖
|
/coder-agents-review |
There was a problem hiding this comment.
All 10 findings from rounds 1-2 are fixed and verified. The panel (4 reviewers) and Netero found no new issues in the current head.
R1: 9 findings (2 P3, 2 P4, 5 Nit) on the initial streaming implementation. All addressed in b8b160d.
R2: 1 new P3 (missing reset wiring test). Addressed in 6e5fab8.
R3: Clean. No new findings.
The streaming pipeline is well-structured: nested chatloop deltas flow through runner/tool/chatd to the websocket, retry resets clear stale accumulated text, bare errors correctly terminate streaming state, and deltas never leak into persistence. Test coverage spans unit, tool-layer, integration, frontend state machine, and visual (story).
"The shrinking finding count (9, 1, 0) suggests the feedback loop is working." (Mafu-san)
🤖 This review was automatically generated with Coder Agents.
Stream advisor output into the advisor tool card while the nested advisor call is still running.
This keeps the advisor implementation intentionally advisor-specific: the parent model still receives the same final structured tool result, while the frontend receives transient
tool-result.result_deltaparts to render partial advisor text in the expanded card. The final persisted chat history remains unchanged.Refs CODAGT-322.
Generated by Coder Agents.
Implementation plan
chatloop.RunviaRunAdvisorOptions.OnAdviceDelta.chatadvisor.Toolwith the parent advisor tool call ID.ChatMessagePartTypeToolResultwebsocket parts withResultDeltafromchatd.result_deltato the generated tool-result TypeScript variant.