Mastering Chunked File Uploads with FastAPI and Node.js: A Step-by-Step Guide

Introduction

In today's digital age, the ability to seamlessly upload large files is crucial for many online platforms and applications. Whether you're sharing multimedia content, collaborating on projects, or backing up important data, efficient file uploading is essential. However, traditional file upload methods often encounter limitations when dealing with large files, leading to frustration for users and developers alike.

Enter chunked file uploads – a revolutionary approach that breaks large files into smaller, more manageable chunks for transmission. This technique not only overcomes the constraints of conventional uploads but also offers a range of benefits, including increased reliability, improved performance, and enhanced user experience.

In this comprehensive guide, we'll delve into the world of chunked file uploads, exploring what they are, how they work, and why they're becoming the go-to solution for handling large files in today's digital landscape. Whether you're a developer seeking to optimize file upload functionality or a user looking to understand the technology behind seamless uploads, this guide is your roadmap to mastering chunked file uploads. So, let's dive in and uncover the secrets to smoother, more efficient file transmission.

Getting Started

Let's start by uploading files in chunks. First, we read a file and divide it into 1 MB pieces. Each chunk is converted to a blob and sent to the backend server for storage. Along with the blob, we include the file name, the total number of chunks, and the current chunk number. These details are essential for reassembling the chunks on the backend later.
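
To make the arithmetic concrete, here is the same slicing logic sketched in plain Python (illustrative only; the actual client below runs in the browser and uses File.slice):

```python
import math

CHUNK_SIZE = 1024 * 1024  # 1 MB, matching the client code below


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[bytes]:
    """Slice a byte string into fixed-size chunks; the last one may be smaller."""
    return [data[offset:offset + chunk_size] for offset in range(0, len(data), chunk_size)]


payload = b"x" * (2 * CHUNK_SIZE + 100)  # 2 MB plus 100 bytes
chunks = split_into_chunks(payload)
total_chunks = math.ceil(len(payload) / CHUNK_SIZE)
assert len(chunks) == total_chunks == 3
assert b"".join(chunks) == payload  # chunks reassemble losslessly
```

Note that the last chunk is simply whatever remains, which is why the client computes the chunk count with Math.ceil rather than integer division.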

Client-Side Implementation (React)

import { ChangeEvent, FormEvent, useRef, useState } from "react";

function App() {
  // Ref to the file input element
  const fileInputRef = useRef<HTMLInputElement | null>(null);

  // State to store the selected file
  const [selectedFile, setSelectedFile] = useState<File | null>(null);

  // Function to handle file selection
  const handleFileChange = (e: ChangeEvent<HTMLInputElement>) => {
    const files = e.target.files;
    if (files && files.length) {
      setSelectedFile(files[0]);
    }
  };

  // Function to handle file upload
  const handleUpload = async (e: FormEvent<HTMLButtonElement>) => {
    e.preventDefault();
    const file = selectedFile;
    const chunk_size = 1024 * 1024; // Chunk size set to 1 MB
    let offset = 0;
    let chunk_number = 0;
    if (file) {
      // Loop until all chunks are uploaded
      while (offset < file.size) {
        // Slice the file into chunks
        const chunk = file.slice(offset, offset + chunk_size);

        // Create a blob from the chunk
        const chunk_blob = new Blob([chunk], { type: file.type });

        // Create a FormData object to send chunk data
        const formData = new FormData();
        formData.append("file", chunk_blob);
        formData.append("name", file.name);
        formData.append("chunk_number", String(chunk_number));
        formData.append(
          "total_chunks",
          String(Math.ceil(file.size / chunk_size))
        );

        // Send the chunk data to the server using fetch API
        await fetch("http://127.0.0.1:3000/uploads", {
          method: "POST",
          body: formData,
        });

        // Update offset and chunk number for the next iteration
        offset += chunk_size;
        chunk_number += 1;
      }
    }
  };

  return (
    <div className="container">
      <form>
        {/* File selector */}
        <div
          className="fileSelector"
          onClick={() => fileInputRef.current?.click()}
        >
          <img src="/plus.jpg" alt="plus" />
        </div>{" "}
        <br />
        {/* Display selected file name */}
        <span>{selectedFile ? selectedFile.name : "Nothing Selected"}</span>
        {/* Hidden file input */}
        <input
          type="file"
          name="file"
          id="file"
          className="none"
          ref={fileInputRef}
          onChange={handleFileChange}
        />
        <br />
        {/* Upload button */}
        <button type="submit" className="btn" onClick={handleUpload}>
          Upload
        </button>
      </form>
    </div>
  );
}

export default App;

Now we are ready to move to the server side and handle the chunks sent from the client. We will set up the server using Node.js and Express, and use Multer with its memory storage to handle the files. The basic idea is to save each chunk received from the client in a "chunks" folder. To keep files from different users separate, we combine the file name and chunk number into a new filename for temporary storage, in the format filename_chunknumber. Once we receive the final chunk, we invoke another function to merge the chunks into a single file.
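
The naming scheme is simple enough to sketch on its own: each chunk lives at chunk_dir/filename_chunknumber, and the merge reads those paths back in numeric order (illustrative Python; the actual servers follow below):

```python
def chunk_path(chunk_dir: str, filename: str, chunk_number: int) -> str:
    """Temporary path for one chunk, following the filename_chunknumber scheme."""
    return f"{chunk_dir}/{filename}_{chunk_number}"


def merge_order(filename: str, total_chunks: int, chunk_dir: str = "./chunks") -> list[str]:
    """Paths to concatenate, in upload order, once the last chunk arrives."""
    return [chunk_path(chunk_dir, filename, i) for i in range(total_chunks)]


assert merge_order("video.mp4", 3) == [
    "./chunks/video.mp4_0",
    "./chunks/video.mp4_1",
    "./chunks/video.mp4_2",
]
```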

Server-Side Implementation (Node.js)

const express = require("express");
const cors = require("cors");
const multer = require("multer");
const app = express();
const fs = require("fs");
const status = require("http-status");

// Middleware to handle JSON data with a limit of 10mb
app.use(
  express.json({
    limit: "10mb",
  })
);

// Middleware to handle URL encoded data with a limit of 10mb
app.use(
  express.urlencoded({
    extended: true,
    limit: "10mb",
  })
);

// Multer storage configuration for handling file uploads
const storage = multer.memoryStorage();
const upload = multer({ storage: storage });

// Ensure the chunks directory exists before any chunk is written
if (!fs.existsSync("./chunks")) {
  fs.mkdirSync("./chunks");
}

// CORS middleware to allow requests from any origin
app.use(cors({ origin: "*" }));

// Route to check if the server is running
app.get("/", (req, res, next) => {
  return res.send("running");
});

// Function to merge uploaded file chunks
const mergeChunks = async (filename, total_chunks) => {
  const chunkDir = "./chunks";
  const mergedFilePath = "./uploads";

  // Create uploads directory if it doesn't exist
  if (!fs.existsSync(mergedFilePath)) {
    fs.mkdirSync(mergedFilePath);
  }

  // Create a write stream for merging chunks into a single file
  const writeStream = fs.createWriteStream(`${mergedFilePath}/${filename}`);

  // Loop through each chunk and append it to the merged file
  for (let i = 0; i < total_chunks; i++) {
    const chunkFilePath = `${chunkDir}/${filename}_${i}`;
    const chunkBuffer = fs.readFileSync(chunkFilePath);
    writeStream.write(chunkBuffer);
    // Delete the chunk file after merging
    fs.unlinkSync(chunkFilePath);
  }
  writeStream.end();
};

// Route for handling file uploads in chunks
app.post("/uploads", upload.single("file"), async (req, res, next) => {
  const filename = req.body.name;
  const chunk_number = req.body.chunk_number;
  const total_chunks = req.body.total_chunks;
  const isLast = parseInt(chunk_number) + 1 === parseInt(total_chunks);

  // Write the chunk data to a file
  fs.writeFile(
    `./chunks/${filename}_${chunk_number}`,
    req.file.buffer,
    (err) => {
      if (err) {
        console.error(err);
        return res
          .status(status.INTERNAL_SERVER_ERROR)
          .send("Error occurred while writing the file.");
      }
      // If it's the last chunk, merge all chunks into a single file
      if (isLast) {
        mergeChunks(filename, total_chunks);
        return res
          .status(status.OK)
          .send({ message: "File Uploaded", upload_state: "complete" });
      }
      // If it's not the last chunk, acknowledge the upload
      return res
        .status(status.OK)
        .send({ message: "Chunk Uploaded", upload_state: "partial" });
    }
  );
});

// Start the server on port 3000
app.listen(3000, () => {
  console.log("Running On 3000");
});

Server-Side Implementation (FastAPI)

from fastapi import (
    FastAPI,
    File,
    UploadFile,
    Form,
)
from fastapi.responses import JSONResponse
from fastapi import status
import os
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

# Define origins for CORS (Cross-Origin Resource Sharing)
origins = [
    "http://localhost:5173",
    "http://127.0.0.1:5173",
]

# Add CORS middleware to allow cross-origin requests
app.add_middleware(
    CORSMiddleware,
    allow_origins=origins,
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)


# Endpoint for file uploads
@app.post("/uploads")
async def upload_file(
    file: UploadFile = File(...),  # File to be uploaded
    name: str = Form(...),  # Name of the file
    chunk_number: int = Form(0),  # Current chunk number
    total_chunks: int = Form(1),  # Total number of chunks
):
    """
    Handles file uploads with chunked transfer
    (if total_chunks > 1) or single-file upload.

    Raises:
        HTTPException: If a validation error occurs
        (e.g., missing data, invalid file size).
    """
    isLast = (int(chunk_number) + 1) == int(
        total_chunks
    )  # Check if it's the last chunk

    file_name = f"{name}_{chunk_number}"  # Generate a unique file name for the chunk

    # Write the chunk to a file in the 'chunks' directory
    with open(f"./chunks/{file_name}", "wb") as buffer:
        buffer.write(await file.read())
    buffer.close()

    if isLast:  # If it's the last chunk, concatenate all chunks into the final file
        with open(f"./uploads/{name}", "wb") as buffer:
            chunk = 0
            while chunk < total_chunks:
                with open(f"./chunks/{name}_{chunk}", "rb") as infile:
                    buffer.write(infile.read())  # Write the chunk to the final file
                    infile.close()
                os.remove(f"./chunks/{name}_{chunk}")  # Remove the chunk file
                chunk += 1
        buffer.close()
        return JSONResponse(
            {"message": "File Uploaded"}, status_code=status.HTTP_200_OK
        )

    return JSONResponse(
        {"message": "Chunk Uploaded"}, status_code=status.HTTP_200_OK)

Benefits

  1. Resilience to Network Interruptions: Chunked uploads allow the transmission of large files to be broken down into smaller segments. If a network interruption occurs during the upload process, only the affected chunk needs to be retransmitted, rather than the entire file.

  2. Improved User Experience: By providing feedback on the upload progress at the chunk level, users can track the status of their uploads more accurately. This transparency enhances the overall user experience by reducing uncertainty and frustration.

  3. Optimized Resource Utilization: Chunked uploads enable servers to handle large files more efficiently by processing smaller chunks individually. This approach minimizes memory consumption and reduces the likelihood of timeouts or resource exhaustion on the server side.

  4. Scalability: Chunked uploads are inherently scalable, allowing applications to handle simultaneous uploads of multiple large files without overloading the server or impacting performance for other users.

  5. Flexibility: Chunked uploads offer flexibility in handling files of varying sizes. Applications can adjust the chunk size dynamically based on network conditions, server capabilities, or user preferences to optimize upload performance.

  6. Support for Resumable Uploads: With chunked uploads, resuming interrupted uploads becomes easier since only the remaining chunks need to be transmitted. This feature is particularly useful for large files or users with unstable internet connections.

  7. Reduced Memory Footprint: Since chunks are processed individually, memory usage on both the client and server sides is optimized. This is especially beneficial for applications handling multiple concurrent uploads or running on resource-constrained environments.

  8. Compatibility: Chunked uploads are compatible with a wide range of platforms, frameworks, and programming languages, making them a versatile solution for file transfer across different systems and devices.
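
Benefit 6 can be made concrete with a small sketch: if the server reports which chunk numbers it already holds (via an assumed status endpoint, not implemented above), the client only needs to send the remainder:

```python
def chunks_to_resume(total_chunks: int, received: set[int]) -> list[int]:
    """Chunk numbers still to upload after an interruption."""
    return [i for i in range(total_chunks) if i not in received]


# After a dropped connection where chunks 0-4 of 8 made it through:
assert chunks_to_resume(8, {0, 1, 2, 3, 4}) == [5, 6, 7]
```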

Conclusion

Chunked file uploads represent a significant advancement in managing the transfer of large files in today's digital world. By breaking files into smaller, more manageable parts, this method offers numerous benefits for both users and developers.

For users, chunked uploads lead to a smoother and more transparent uploading process. They can track upload progress more accurately, enjoy increased reliability during network interruptions, and easily resume interrupted uploads. This enhanced user experience builds trust and satisfaction with the platform or application.

From a developer's perspective, chunked uploads bring efficiency and scalability to file management. By handling smaller parts separately, developers can optimize resource usage, reduce the risk of timeouts or server overload, and ensure compatibility across various systems and devices. Moreover, the ability to adjust chunk sizes dynamically allows for fine-tuning upload performance according to specific requirements.

Overall, chunked file uploads offer a mutually beneficial solution, enabling users to share large files seamlessly while helping developers create robust and scalable applications. As the digital landscape advances, the widespread adoption of chunked uploads is expected to fuel innovation and efficiency in file transfer technology.
