Skip to content

fix(uploads): direct-to-upload workspace files + shared transport#4407

Merged
waleedlatif1 merged 28 commits intostagingfrom
waleedlatif1/mp4-upload-fail
May 3, 2026
Merged

fix(uploads): direct-to-upload workspace files + shared transport#4407
waleedlatif1 merged 28 commits intostagingfrom
waleedlatif1/mp4-upload-fail

Conversation

@waleedlatif1
Copy link
Copy Markdown
Collaborator

Summary

  • Fix workspace file uploads failing with missing final boundary while parsing FormData on flaky networks (10.5MB MP4 repro). Bytes now go client → S3/Azure directly via presigned PUT (≤50MB) or multipart (>50MB), eliminating the buffered-FormData proxy on the Next worker.
  • Add /api/workspaces/[id]/files/presigned and /register endpoints with cross-tenant key validation, HEAD-verify, and best-effort cleanup on insert failure.
  • Consolidate client transport into lib/uploads/client/direct-upload.ts; migrate KB hook to consume the shared module. Centralize isNetworkError, isAbortError, getFileContentType in file-utils.ts.
  • Add progress UI + abort to the Files tab. Local-dev FormData path retained as fallback when cloud storage isn't configured.

Type of Change

  • Bug fix

Testing

Tested manually with 1MB / 60MB / 4-file drag uploads, mid-upload abort, and disconnect-mid-upload. 60/60 unit tests passing (presigned, register, multipart, direct-upload).

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

@vercel
Copy link
Copy Markdown

vercel Bot commented May 2, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
docs Skipped Skipped May 3, 2026 0:27am

Request Review

@cursor
Copy link
Copy Markdown

cursor Bot commented May 2, 2026

PR Summary

Medium Risk
Changes core file upload paths by introducing direct-to-cloud PUT/multipart and new workspace registration endpoints; mistakes could orphan objects, mis-handle quotas, or allow cross-tenant key misuse despite added validation.

Overview
Moves workspace file uploads to a direct-to-cloud transport. Adds a shared client upload module (lib/uploads/client/direct-upload.ts) that chooses presigned PUT for smaller files and multipart for larger ones, with progress callbacks, abort support, and bounded retries.

Introduces workspace-scoped presign + registration APIs. Adds POST /api/workspaces/[id]/files/presigned (permission check, size ceiling, quota check, local-storage fallback) and POST /api/workspaces/[id]/files/register (validates key belongs to workspace, HEAD-verifies object, idempotent metadata insert, best-effort orphan cleanup, emits audit + analytics on first create).

Hardens multipart + storage primitives. Updates /api/files/multipart completion to accept a unified { partNumber, etag? } parts payload, derive Azure block IDs server-side, and pass provider-specific config through initiate/get-urls/complete/abort. Adds headObject support to storage-service (S3/Blob) and allows generatePresignedUploadUrl to accept a caller-supplied customKey.

UI and tests updated. Workspace Files UI now enforces a 5 GiB per-file limit client-side and shows per-file percent progress; the legacy FormData upload route is capped via MAX_WORKSPACE_FORMDATA_FILE_SIZE (100MB) and returns 413. Adds extensive new tests, plus shared testing mocks for storage-service and PostHog.

Reviewed by Cursor Bugbot for commit c7a6fc9. Configure here.

@gitguardian
Copy link
Copy Markdown

gitguardian Bot commented May 2, 2026

️✅ There are no secrets present in this pull request anymore.

If these secrets were true positive and are still valid, we highly recommend you to revoke them.
While these secrets were previously flagged, we no longer have a reference to the
specific commits where they were detected. Once a secret has been leaked into a git
repository, you should consider it compromised, even if it was deleted immediately.
Find here more information about risks.


🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

Comment thread apps/sim/app/workspace/[workspaceId]/knowledge/hooks/use-knowledge-upload.ts Outdated
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 2, 2026

Greptile Summary

This PR replaces the buffered-FormData proxy path for workspace file uploads with direct-to-storage transfers: presigned PUT for files ≤50MB and chunked multipart for larger files. It adds /api/workspaces/[id]/files/presigned and /register endpoints with cross-tenant key validation, HEAD-based size verification, quota enforcement, and orphan cleanup. Client transport is consolidated into lib/uploads/client/direct-upload.ts, which the KB upload hook now consumes, eliminating ~250 lines of duplicated logic. The FormData fallback is retained for local-dev environments without cloud storage.

  • fetchBatchPresignedData in the KB hook accesses presignedItems[batchPos] without a null guard — a malformed server response (missing files key) would throw a TypeError instead of a handled error.
  • MultipartUploadOptions.workspaceId is required for all contexts including 'knowledge-base'; this implicit contract is not documented and could silently fail for callers that don't supply a valid workspace ID for large KB files.

Confidence Score: 5/5

Safe to merge — all previously identified P1 issues are addressed and no new blocking issues were found.

All P1s from prior review iterations (orphan cleanup, quota inflation via zero-size HEAD fallback, TOCTOU in blob HEAD, multipart abort leaks, presigned URL error surfacing, retry-on-register TOCTOU) appear to have been fixed. The remaining findings are P2: a missing null guard in the KB batch presigned response and an undocumented implicit workspaceId requirement on MultipartUploadOptions. Neither is a runtime blocker under normal server behavior. Test coverage is thorough (60 unit tests, new route tests for both endpoints).

apps/sim/app/workspace/[workspaceId]/knowledge/hooks/use-knowledge-upload.ts (presignedItems null guard), apps/sim/lib/uploads/client/direct-upload.ts (MultipartUploadOptions.workspaceId implicit requirement)

Important Files Changed

Filename Overview
apps/sim/lib/uploads/client/direct-upload.ts New shared upload transport: presigned PUT (≤50MB) + chunked multipart (>50MB) with per-part retry, abort cleanup on every failure path, and timeout management. runUploadStrategy is the new unified entry point for both KB and workspace uploads.
apps/sim/app/api/workspaces/[id]/files/presigned/route.ts New endpoint: issues workspace-scoped presigned PUT URLs with auth, write-permission check, size cap, cloud-storage guard (returns directUploadSupported: false for local dev), and quota check — in that order.
apps/sim/app/api/workspaces/[id]/files/register/route.ts New endpoint: cross-tenant key validation, delegates to registerUploadedWorkspaceFile which HEAD-verifies size, checks quota, and inserts metadata idempotently. Audit + analytics only on created=true path.
apps/sim/lib/uploads/contexts/workspace/workspace-file-manager.ts Adds registerUploadedWorkspaceFile: HEAD-verifies object existence and size, size-cap and quota checks with orphan cleanup on rejection, idempotent re-register guard (existence check precedes quota), and storage-usage increment.
apps/sim/hooks/queries/workspace-files.ts New useUploadWorkspaceFile mutation uses runUploadStrategy with FALLBACK_REQUIRED error falling back to FormData. registerWithRetry adds bounded 3-attempt idempotent retry. parseUploadResponse correctly checks response.ok before JSON parse.
apps/sim/app/api/files/multipart/route.ts Adds workspace-context handling to the existing multipart initiate action: size cap, quota check, workspace key generation. Refactors complete action with typed ClientCompletedPart and shared completeOne helper.
apps/sim/lib/uploads/utils/file-utils.ts Centralizes isNetworkError, isAbortError, getFileContentType — previously duplicated in KB hook. Typo ecconnreseteconnreset corrected.
apps/sim/app/workspace/[workspaceId]/knowledge/hooks/use-knowledge-upload.ts Migrates to shared runUploadStrategy and fetchBatchPresignedData. Removes ~250 lines of duplicated upload logic. presignedItems access in fetchBatchPresignedData lacks a null guard against malformed server responses.
apps/sim/app/workspace/[workspaceId]/files/files.tsx Adds currentPercent to upload progress state, client-side size filter with toast for oversized files, and per-file progress callbacks via onProgress. Clean UI additions.
apps/sim/lib/uploads/shared/types.ts Adds MAX_WORKSPACE_FILE_SIZE (5GiB) and MAX_WORKSPACE_FORMDATA_FILE_SIZE (100MB) — clear separation of the direct-upload ceiling vs. the legacy FormData cap.

Sequence Diagram

sequenceDiagram
    participant Client
    participant PresignedAPI as /workspaces/[id]/files/presigned
    participant S3Blob as S3 / Azure Blob
    participant RegisterAPI as /workspaces/[id]/files/register
    participant MultipartAPI as /api/files/multipart
    participant DB as Database

    alt Small file (≤50MB)
        Client->>PresignedAPI: POST {fileName, contentType, fileSize}
        PresignedAPI-->>Client: {presignedUrl, fileInfo, directUploadSupported}
        Note over Client,PresignedAPI: hasCloudStorage()=false → directUploadSupported:false → FormData fallback
        Client->>S3Blob: PUT file (XHR with progress)
        S3Blob-->>Client: 200 OK
        Client->>RegisterAPI: POST {key, name, contentType}
        RegisterAPI->>S3Blob: HEAD (verify object exists + get size)
        RegisterAPI->>DB: INSERT metadata (idempotent)
        RegisterAPI-->>Client: {success, file}
    else Large file (>50MB) multipart
        Client->>MultipartAPI: POST action=initiate {fileName, workspaceId, context}
        MultipartAPI-->>Client: {uploadId, key, uploadToken}
        Client->>MultipartAPI: POST action=get-part-urls {uploadToken, partNumbers}
        MultipartAPI-->>Client: {presignedUrls[]}
        loop Per chunk (parallel, concurrency=3, with retry)
            Client->>S3Blob: PUT chunk
            S3Blob-->>Client: ETag
        end
        Client->>MultipartAPI: POST action=complete {uploadToken, parts}
        MultipartAPI->>S3Blob: CompleteMultipartUpload
        MultipartAPI-->>Client: {path, key}
        Client->>RegisterAPI: POST {key, name, contentType}
        RegisterAPI->>S3Blob: HEAD (verify)
        RegisterAPI->>DB: INSERT metadata
        RegisterAPI-->>Client: {success, file}
    end
Loading

Reviews (19): Last reviewed commit: "fix(uploads): add timeout/abort to kb ap..." | Re-trigger Greptile

Comment thread apps/sim/lib/uploads/contexts/workspace/workspace-file-manager.ts Outdated
Comment thread apps/sim/hooks/queries/workspace-files.ts Outdated
Comment thread apps/sim/lib/uploads/contexts/workspace/workspace-file-manager.ts Outdated
@waleedlatif1 waleedlatif1 changed the title fix(uploads): direct-to-S3 workspace files + shared transport fix(uploads): direct-to-upload workspace files + shared transport May 2, 2026
@waleedlatif1 waleedlatif1 force-pushed the waleedlatif1/mp4-upload-fail branch from a375aa7 to f0cbdd4 Compare May 2, 2026 17:57
@waleedlatif1 waleedlatif1 force-pushed the waleedlatif1/mp4-upload-fail branch from f0cbdd4 to 75f2c6a Compare May 2, 2026 18:01
…error handling

- Throw immediately on AbortError in KB retry loop (no useless 14s backoff)
- Cleanup S3/Blob object on quota or size-cap rejection in registerUploadedWorkspaceFile
- Enforce MAX_WORKSPACE_FILE_SIZE at registration (defense vs presigned PUT lying about size)
- Handle non-OK / non-JSON responses in workspace-files upload paths
@waleedlatif1 waleedlatif1 force-pushed the waleedlatif1/mp4-upload-fail branch from 4e1299c to 7ab4465 Compare May 2, 2026 18:17
@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@greptile

@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@cursor review

Comment thread apps/sim/lib/uploads/utils/file-utils.ts Outdated
Comment thread apps/sim/lib/uploads/client/direct-upload.ts Outdated
Comment thread apps/sim/lib/uploads/utils/file-utils.ts Outdated
Comment thread apps/sim/lib/uploads/contexts/workspace/workspace-file-manager.ts
@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@greptile

@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@cursor review

@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@cursor review

Comment thread apps/sim/lib/uploads/client/direct-upload.ts Outdated
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit 768fc94. Configure here.

@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@greptile

@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@cursor review

Comment thread apps/sim/lib/uploads/providers/blob/client.ts Outdated
@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@greptile

@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@cursor review

Comment thread apps/sim/app/workspace/[workspaceId]/knowledge/hooks/use-knowledge-upload.ts Outdated
@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@greptile

@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@cursor review

Comment thread apps/sim/lib/uploads/client/direct-upload.ts
@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@greptile

@waleedlatif1
Copy link
Copy Markdown
Collaborator Author

@cursor review

Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit c7a6fc9. Configure here.

@waleedlatif1 waleedlatif1 merged commit af55bad into staging May 3, 2026
13 of 14 checks passed
@waleedlatif1 waleedlatif1 deleted the waleedlatif1/mp4-upload-fail branch May 3, 2026 00:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant