Rate Limits

All API endpoints are protected by rate limiting to ensure fair usage and system stability. Rate limits are applied per API key and reset automatically after the time window expires.


Rate Limit Types

We use two types of rate limiting depending on the endpoint's computational cost:

1. Upload Rate Limits (Dual Protection)

Upload endpoints use dual rate limiting with both burst protection and sustained limits:

| Limit Type | Window | Requests | Purpose |
| --- | --- | --- | --- |
| Burst Protection | 10 seconds | 10 | Prevents rapid-fire abuse |
| Sustained Limit | 1 minute | 20 | Overall quota protection |

How it works:

  • You can upload up to 10 files quickly (within 10 seconds)
  • After that, you must wait for the 10-second window to reset
  • Maximum 20 uploads total per minute
  • Both limits reset independently
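The dual check can be sketched client-side as a pair of sliding windows over your own upload timestamps. A minimal sketch (the function name and structure are ours, not part of the API):

```javascript
// Sliding-window check mirroring the two limits above:
// at most 10 uploads in any 10-second span, at most 20 in any 60-second span.
function canUploadNow(timestamps, now = Date.now()) {
  const inBurstWindow = timestamps.filter((t) => now - t < 10000).length;
  const inMinuteWindow = timestamps.filter((t) => now - t < 60000).length;
  return inBurstWindow < 10 && inMinuteWindow < 20;
}
```

Because both windows slide independently, a burst refusal can clear after a few seconds while the sustained limit still has room, exactly as described above.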

Affected Endpoints:

  • POST /v1/public/documents/upload-presigned
  • POST /v1/public/documents/upload-presigned-finalize
  • POST /v1/public/documents/upload-direct
  • POST /v1/public/documents/upload-from-url

2. Search Rate Limits (Strict)

Search operations are computationally expensive and use strict rate limiting:

| Limit Type | Window | Requests |
| --- | --- | --- |
| Search Operations | 1 minute | 10 |

Affected Endpoints:

  • POST /v1/public/searches/dense
  • POST /v1/public/searches/sparse
  • POST /v1/public/searches/hybrid
  • POST /v1/public/searches/builtin-rerank

3. Read Rate Limits (Generous)

Read operations are lightweight and have generous limits:

| Limit Type | Window | Requests |
| --- | --- | --- |
| Read Operations | 1 minute | 100 |

Affected Endpoints:

  • GET /v1/public/documents/:documentId
  • GET /v1/public/documents
  • GET /v1/public/documents/upload-presigned/:uploadId
  • DELETE /v1/public/documents/:documentId
  • GET /v1/public/searches/:searchId
  • GET /v1/public/searches/:searchId/results
  • GET /v1/public/searches/groups/:searchGroupId
  • GET /v1/public/searches/groups/:searchGroupId/results
  • GET /v1/public/searches/groups

Complete Rate Limit Reference

| Endpoint | Method | Burst Limit | Minute Limit | Notes |
| --- | --- | --- | --- | --- |
| /v1/public/documents/upload-presigned | POST | 10 per 10s | 20 per min | Generate presigned URL |
| /v1/public/documents/upload-presigned/:uploadId | GET | - | 100 per min | Verify upload |
| /v1/public/documents/upload-presigned-finalize | POST | 10 per 10s | 20 per min | Finalize upload |
| /v1/public/documents/upload-direct | POST | 10 per 10s | 20 per min | Direct file upload |
| /v1/public/documents/upload-from-url | POST | 10 per 10s | 20 per min | Upload from URL |
| /v1/public/documents/:documentId | GET | - | 100 per min | Get document details |
| /v1/public/documents/:documentId | DELETE | - | 100 per min | Delete document |
| /v1/public/documents | GET | - | 100 per min | List documents |
| /v1/public/searches/dense | POST | - | 10 per min | Dense search |
| /v1/public/searches/sparse | POST | - | 10 per min | Sparse search |
| /v1/public/searches/hybrid | POST | - | 10 per min | Hybrid search |
| /v1/public/searches/builtin-rerank | POST | - | 10 per min | Builtin-rerank search |
| /v1/public/searches/:searchId | GET | - | 100 per min | Get search status |
| /v1/public/searches/:searchId/results | GET | - | 100 per min | Get search results |
| /v1/public/searches/groups/:searchGroupId | GET | - | 100 per min | Get group status |
| /v1/public/searches/groups/:searchGroupId/results | GET | - | 100 per min | Get group results |
| /v1/public/searches/groups | GET | - | 100 per min | List search groups |

Rate Limit Headers

All API responses include rate limit headers so you can track your usage:

RateLimit-Limit: 10
RateLimit-Remaining: 7
RateLimit-Reset: 1699884000

| Header | Description | Example |
| --- | --- | --- |
| RateLimit-Limit | Maximum requests allowed in the window | 10 |
| RateLimit-Remaining | Requests remaining in the current window | 7 |
| RateLimit-Reset | Unix timestamp when the limit resets | 1699884000 |

Convert the reset time to a human-readable date:

const resetTime = new Date(Number(headers["RateLimit-Reset"]) * 1000); // header value is a string
console.log("Rate limit resets at:", resetTime.toLocaleString());

Rate Limit Exceeded (HTTP 429)

When you exceed a rate limit, the API returns HTTP 429 Too Many Requests with details about which limit you hit:

Upload Rate Limit (Burst)

{
  "error": "Rate limit exceeded",
  "message": "Too many requests too quickly. Maximum 10 uploads per 10 seconds allowed.",
  "limit": 10,
  "window": "10 seconds",
  "retryAfter": "8",
  "hint": "Slow down! You're making requests too fast. Wait a few seconds between uploads."
}

Upload Rate Limit (Sustained)

{
  "error": "Rate limit exceeded",
  "message": "Too many upload requests. Maximum 20 uploads per minute allowed per API key.",
  "limit": 20,
  "window": "1 minute",
  "retryAfter": "23",
  "hint": "You've used your minute quota. Wait for the limit to reset or implement client-side queueing."
}

Search Rate Limit

{
  "error": "Rate limit exceeded",
  "message": "Too many search requests. Maximum 10 searches per minute allowed.",
  "limit": 10,
  "window": "1 minute",
  "retryAfter": "45",
  "hint": "Search operations are computationally expensive. Please reduce query frequency."
}

Read Rate Limit

{
  "error": "Rate limit exceeded",
  "message": "Too many read requests. Maximum 100 reads per minute allowed.",
  "limit": 100,
  "window": "1 minute",
  "retryAfter": "12"
}
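In every variant above, `retryAfter` is a string of seconds. A small helper (ours, not part of the API) turns a 429 body into a wait time in milliseconds; the 5-second fallback for a missing field is our own choice:

```javascript
// Convert the `retryAfter` field of a 429 body (seconds, as a string)
// into milliseconds, falling back to 5s when the field is absent.
function backoffMsFrom429(body) {
  const seconds = parseInt(body.retryAfter, 10);
  return Number.isNaN(seconds) ? 5000 : seconds * 1000;
}
```

For example, the sustained-upload body above gives `backoffMsFrom429({ retryAfter: "23" })` → 23000 ms.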

Handling Rate Limits

1. Respect the Headers

Always check rate limit headers before making requests:

async function makeRequest(url, options) {
  const response = await fetch(url, options);

  // Check remaining requests in the current window
  const remaining = parseInt(response.headers.get("RateLimit-Remaining"), 10);
  const reset = parseInt(response.headers.get("RateLimit-Reset"), 10);

  console.log(`Requests remaining: ${remaining}`);

  if (remaining === 0) {
    // RateLimit-Reset is a Unix timestamp in seconds; clamp in case it has already passed
    const waitTime = Math.max(0, reset * 1000 - Date.now());
    console.log(`Rate limit reached. Waiting ${waitTime}ms...`);
    await new Promise((resolve) => setTimeout(resolve, waitTime));
  }

  return response;
}

2. Implement Exponential Backoff

When you hit a rate limit, wait before retrying:

async function requestWithBackoff(url, options, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    const response = await fetch(url, options);

    if (response.status !== 429) {
      return response;
    }

    // Rate limited - wait before retry
    const retryAfter = parseInt(response.headers.get("Retry-After") || "5", 10);
    const waitTime = retryAfter * 1000 * Math.pow(2, i); // Exponential backoff

    console.log(
      `Rate limited. Waiting ${waitTime}ms before retry ${i + 1}/${maxRetries}...`
    );
    await new Promise((resolve) => setTimeout(resolve, waitTime));
  }

  throw new Error("Max retries reached");
}

3. Use Request Queuing

For bulk operations, implement a queue to respect rate limits:

class RateLimitedQueue {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.queue = [];
    this.processing = false;
  }

  async add(fn) {
    return new Promise((resolve, reject) => {
      this.queue.push({ fn, resolve, reject });
      this.process();
    });
  }

  async process() {
    if (this.processing || this.queue.length === 0) return;

    this.processing = true;
    const timestamps = [];

    while (this.queue.length > 0) {
      const now = Date.now();

      // Remove timestamps outside the window
      const cutoff = now - this.windowMs;
      while (timestamps.length > 0 && timestamps[0] < cutoff) {
        timestamps.shift();
      }

      // Check if we can make a request
      if (timestamps.length < this.limit) {
        const { fn, resolve, reject } = this.queue.shift();
        timestamps.push(now);

        try {
          const result = await fn();
          resolve(result);
        } catch (error) {
          reject(error);
        }
      } else {
        // Wait until oldest timestamp expires
        const waitTime = timestamps[0] + this.windowMs - now;
        await new Promise((resolve) => setTimeout(resolve, waitTime));
      }
    }

    this.processing = false;
  }
}

// Usage for uploads (20 per minute)
const uploadQueue = new RateLimitedQueue(20, 60000);

// Queue upload requests
const results = await Promise.all(
  files.map((file) => uploadQueue.add(() => uploadFile(file)))
);

4. Monitor Your Usage

Track your API usage to avoid hitting limits:

class RateLimitMonitor {
  constructor() {
    this.requests = {
      uploads: [],
      searches: [],
      reads: [],
    };
  }

  trackRequest(type) {
    const now = Date.now();
    this.requests[type].push(now);

    // Clean old entries (older than 1 minute)
    const cutoff = now - 60000;
    this.requests[type] = this.requests[type].filter((t) => t > cutoff);
  }

  getUsage(type) {
    const limits = {
      uploads: 20,
      searches: 10,
      reads: 100,
    };

    const used = this.requests[type].length;
    const limit = limits[type];
    const remaining = limit - used;

    return {
      used,
      limit,
      remaining,
      percentUsed: (used / limit) * 100,
    };
  }

  canMakeRequest(type) {
    const usage = this.getUsage(type);
    return usage.remaining > 0;
  }
}

// Usage
const monitor = new RateLimitMonitor();

async function uploadWithMonitoring(file) {
  if (!monitor.canMakeRequest("uploads")) {
    throw new Error("Upload rate limit would be exceeded");
  }

  const result = await uploadFile(file);
  monitor.trackRequest("uploads");

  const usage = monitor.getUsage("uploads");
  console.log(
    `Uploads: ${usage.used}/${usage.limit} (${usage.remaining} remaining)`
  );

  return result;
}

Rate Limit Increase Requests

Need higher limits for your use case? Contact us at support@floreal.ai with:

  1. Your API key (first 8 characters only)
  2. Current usage patterns (requests per minute/hour)
  3. Desired limits (what you need and why)
  4. Use case description (what you're building)

We review requests on a case-by-case basis and can provision higher limits for legitimate use cases.


Summary

| Operation Type | Limit | Window | Best Practice |
| --- | --- | --- | --- |
| Uploads | 10 burst, 20 sustained | 10s, 60s | Space uploads 3s apart |
| Searches | 10 | 60s | Queue searches with 6s delay |
| Reads | 100 | 60s | Cache aggressively |
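The per-request spacing in the last column is simply the window length divided by the request limit; a one-line helper (ours) makes the arithmetic explicit:

```javascript
// Even spacing between requests: window length divided by the request limit.
function spacingMs(limit, windowMs) {
  return Math.ceil(windowMs / limit);
}

spacingMs(20, 60000); // 3000 ms between uploads (sustained limit)
spacingMs(10, 60000); // 6000 ms between searches
```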

Key Takeaways:

  • ✅ Always check RateLimit-Remaining header
  • ✅ Implement exponential backoff for 429 responses
  • ✅ Use request queuing for bulk operations
  • ✅ Cache read results to reduce API calls
  • ✅ Poll status endpoints every 5-10 seconds, not faster
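The last point can be enforced in code. Below is a sketch of a polling loop that never drops below a 5-second interval; `getSearchStatus` is a hypothetical wrapper around GET /v1/public/searches/:searchId, and the "pending" status value is illustrative:

```javascript
// Never poll faster than once every 5 seconds.
function clampPollInterval(ms) {
  return Math.max(5000, ms);
}

// Poll a search until it leaves the "pending" state.
// `getSearchStatus` is assumed to resolve to a status string.
async function pollUntilDone(searchId, getSearchStatus, intervalMs = 5000) {
  for (;;) {
    const status = await getSearchStatus(searchId);
    if (status !== "pending") return status;
    await new Promise((resolve) =>
      setTimeout(resolve, clampPollInterval(intervalMs))
    );
  }
}
```

Because each poll counts against the 100-per-minute read limit, a 5-second interval leaves plenty of headroom even with several searches in flight.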