Firecrawl v2.9.0

Improvements

Browser Interaction via /interact endpoint — Scrape a page, then call /interact to take actions on it — click buttons, fill forms, navigate deeper, or extract dynamic content. Describe what you want in natural language via prompt, or write Playwright code (Node.js, Python) and Bash (agent-browser) for full control. Sessions persist across calls, with live view and interactive live view URLs for real-time browser streaming. Persistent profiles let you save and reuse browser state (cookies, localStorage) across scrapes. Available in JS, Python, Java, and Rust SDKs.
query format — Added query format to the /scrape endpoint — pass a natural-language prompt and get a direct answer back in data.answer.
audio format — Added audio format option to scrape responses, returning audio output as a field on the document.
onlyCleanContent parameter — Added onlyCleanContent parameter to the /scrape endpoint, which strips navigation, ads, cookie banners, and other non-semantic content from markdown output.
PDF parsing modes — Added PDF parsing modes (fast, auto, ocr) and a maxPages option to control extraction depth and OCR behavior.
Java and Elixir SDKs — Added official Java and Elixir SDKs with full v2 API support.
Legacy .doc file support — Added support for parsing legacy .doc files.
Wikimedia engine — Added a dedicated engine for scraping Wikipedia and Wikimedia pages with improved output quality.
contentType in scrape responses — Added contentType to scrape responses for PDFs and documents.
PDF pipeline improvements — Improved PDF pipeline with better table detection, header/footer stripping, mixed PDF handling, inline image parsing, and magic byte detection.
Branding extraction — Improved branding extraction to skip hidden DOM elements for cleaner output.
HTML-to-markdown performance — Improved HTML-to-markdown conversion performance and fixed code blocks losing content during conversion.
Concurrency queue — New concurrency queue system with reconciler and backfill for more reliable job scheduling.
Rust SDK v2 — Added v2 API namespace with agent support to the Rust SDK.
Fixed Python SDK parameters timeout, max_retries, and backoff_factor — these were previously accepted but silently ignored.
Capped job timeouts at 48 hours to prevent runaway jobs from consuming resources.
Added retry limits to prevent scrape loops.
Binary content types are now rejected early in the scrape pipeline to avoid wasted processing.
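
To make the Python SDK parameter fix above more concrete, here is a minimal sketch of what max_retries and backoff_factor typically control in an HTTP client (timeout would bound each individual request and is left out). The doubling delay formula and the helper names are assumptions for illustration, not the SDK's actual implementation.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def backoff_delays(max_retries: int, backoff_factor: float) -> List[float]:
    """Seconds to wait before each retry, doubling each time (assumed formula)."""
    return [backoff_factor * (2 ** attempt) for attempt in range(max_retries)]


def call_with_retries(
    request: Callable[[], T],
    max_retries: int = 3,
    backoff_factor: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `request`, retrying on ConnectionError up to `max_retries` times."""
    delays = backoff_delays(max_retries, backoff_factor)
    for attempt in range(max_retries + 1):
        try:
            return request()
        except ConnectionError:
            if attempt == max_retries:
                raise  # retries exhausted; surface the last error
            sleep(delays[attempt])
    raise AssertionError("unreachable")
```

If such parameters are silently ignored, a caller who asks for three retries gets none, which is the kind of behavior the fix addresses.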

Fixes

Fixed empty responses when using the o3-mini model on extract jobs.
Fixed revoked API keys remaining valid for up to 10 minutes after deletion.
Fixed a race condition in extract jobs that caused "Job not found" crashes.
Fixed time_taken in /v1/map always returning ~0.
Fixed crawl status responses so they now surface a failed status with an error message and partial data when a crawl-level failure occurs.
Fixed maxPages not being passed to the PDF extractor — previously, full PDF content was returned while only charging for the limited page count.
Fixed free request credits being incorrectly consumed and billed on agent jobs exceeding the maxCredits threshold.
Fixed dashboard displaying incorrect concurrency limits due to stale reads.
Fixed branding colors.secondary not being populated.
Fixed removeBase64Images running after deriveDiff in the transformer pipeline, causing diff issues.
Fixed GCS fetch using wrong row index for cache info lookups.
Fixed unhandled ZodError in /v1/search controller.
Resolved multiple CVEs across dependencies including handlebars, path-to-regexp, fast-xml-parser, rollup (CVE-2026-27606), undici, and others.
Hardened the Playwright service against SSRF attacks.
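
With the crawl status fix listed above, client code may want to handle a failed status that still carries partial data. The sketch below shows one way to do that; the field names status, error, and data are assumptions for illustration and should be checked against the API reference.

```python
from typing import Any, Dict, List, Tuple


def summarize_crawl_status(response: Dict[str, Any]) -> Tuple[str, List[Any], str]:
    """Return (status, pages, error) from a crawl status payload.

    Field names are assumed for illustration; they are not confirmed here.
    """
    status = response.get("status", "unknown")
    pages = response.get("data") or []  # partial data may accompany a failure
    error = response.get("error") or ""
    return status, pages, error
```

A caller could then keep the partial pages and log the error instead of discarding the whole crawl.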

API

Added GET /v2/team/activity endpoint for listing recent scrape, crawl, and extract jobs with cursor-based pagination (last 24 hours, up to 100 results per page, filterable by endpoint type).
Added regexOnFullURL parameter on crawl requests to apply includePaths/excludePaths filtering against the full URL including query parameters. Available in JS, Python, Java, and Elixir SDKs.
Added deduplicateSimilarURLs parameter on crawl requests. Available in JS, Python, Java, and Elixir SDKs.
Deprecated the extract endpoint — use the /agent endpoint instead. Existing extract methods in JS and Python SDKs are marked deprecated.
Renamed persistentSession to profile on browser/interact requests (writeMode is now saveChanges). The old parameter name remains functional but is no longer documented.
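
The effect of regexOnFullURL can be illustrated with a small filter. This is only a sketch of the idea: it assumes that include patterns are matched against the URL path by default and against the whole URL (query string included) when the flag is set. The function and argument names are hypothetical.

```python
import re
from typing import List
from urllib.parse import urlsplit


def url_allowed(url: str, include_paths: List[str], regex_on_full_url: bool = False) -> bool:
    """Return True if any include pattern matches the chosen target string."""
    # Assumed behavior: path-only matching unless the flag is enabled.
    target = url if regex_on_full_url else urlsplit(url).path
    return any(re.search(pattern, target) for pattern in include_paths)


url = "https://example.com/products?category=shoes"
path_only = url_allowed(url, [r"category=shoes"])
full_url = url_allowed(url, [r"category=shoes"], regex_on_full_url=True)
```

Under these assumptions, a pattern that targets a query parameter only matches when the full URL is considered.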

New Contributors

@misza-one made their first contribution in https://github.com/firecrawl/firecrawl/pull/2660
@madmikeross made their first contribution in https://github.com/firecrawl/firecrawl/pull/2948
@rowinsg made their first contribution in https://github.com/firecrawl/firecrawl/pull/3065
@Bortlesboat made their first contribution in https://github.com/firecrawl/firecrawl/pull/3243
@dagecko made their first contribution in https://github.com/firecrawl/firecrawl/pull/3249
@cokemine made their first contribution in https://github.com/firecrawl/firecrawl/pull/3262
@paulonasc made their first contribution in https://github.com/firecrawl/firecrawl/pull/3275

Contributors

@nickscamara
@mogery
@amplitudesxd
@abimaelmartell
@ericciarla
@rafaelsideguide
@delong3
@devhims
@Chadha93
@tomsideguide
@charlietlamb
@developersdigest
@micahstairs
@rhys-firecrawl
@firecrawl-spring
@devin-ai-integration
@misza-one
@madmikeross
@rowinsg
@Bortlesboat
@dagecko
@cokemine
@paulonasc

Full Changelog: https://github.com/firecrawl/firecrawl/compare/v2.8.0...v2.9.0

Firecrawl v2.9.0 includes browser interaction via /interact, new scrape formats, smarter PDF handling, two new SDKs, and reliability fixes.

Key Features:

Browser Interaction via /interact — Scrape a page, then call /interact to click buttons, fill forms, navigate, or extract dynamic content using natural language or Playwright/Bash code. Sessions persist across calls with live view URLs and reusable browser profiles.
Query Format — Pass natural-language prompts to /scrape and get direct answers in data.answer.
Audio Format — Request audio output from any scrape as a field on the document.
onlyCleanContent Parameter — Strip navigation, ads, and non-semantic content from markdown output.
PDF Parsing Modes — Choose fast, auto, or ocr parsing, with a maxPages option for fine-grained extraction control.
Java & Elixir SDKs — Official SDKs with full v2 API support, joining JS, Python, Go, and Rust.
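
As a rough illustration of how the new scrape options might be combined, the sketch below assembles a JSON body for the /scrape endpoint. The exact field shapes (a query entry inside formats, a parsers list with mode and maxPages) are assumptions; only the option names come from the notes above.

```python
import json
from typing import Any, Dict, Optional


def build_scrape_body(
    url: str,
    question: Optional[str] = None,
    only_clean_content: bool = False,
    pdf_mode: Optional[str] = None,
    max_pages: Optional[int] = None,
) -> Dict[str, Any]:
    """Assemble a JSON body for a scrape request (field shapes are assumed)."""
    formats: list = ["markdown"]
    if question is not None:
        formats.append({"type": "query", "prompt": question})
    body: Dict[str, Any] = {"url": url, "formats": formats}
    if only_clean_content:
        body["onlyCleanContent"] = True
    if pdf_mode is not None:
        if pdf_mode not in {"fast", "auto", "ocr"}:
            raise ValueError(f"unknown PDF mode: {pdf_mode}")
        parser: Dict[str, Any] = {"type": "pdf", "mode": pdf_mode}
        if max_pages is not None:
            parser["maxPages"] = max_pages
        body["parsers"] = [parser]
    return body


body = build_scrape_body(
    "https://example.com/report.pdf",
    question="What is the report's main conclusion?",
    only_clean_content=True,
    pdf_mode="auto",
    max_pages=10,
)
print(json.dumps(body, indent=2))
```

The answer to the prompt would then be read from data.answer in the response, as described above.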

Introducing the new /interact endpoint, which turns any scrape into a live browser session where agents can click, type, and navigate using natural language.

Key Features:

Natural Language Control — Describe what you want in plain English; the agent clicks, types, scrolls, and extracts data automatically without selectors or scripts.
Live Browser Sessions — Every session includes a live URL you can embed, share, or interact with in real time for debugging and demos.
Persistent Profiles — Log in once and pick up where you left off: named profiles carry cookies and localStorage across scrapes.
Full Playwright Control — Switch to code mode and run Playwright (Node.js or Python) or Bash for precision control.
Session Reuse — Chain multiple interact calls on the same scrape; the browser maintains state between calls, enabling complex multi-step workflows.
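
The two control styles described above (natural-language prompt versus code) could be modeled as request bodies like the following. This is a hypothetical sketch: the field names prompt, code, and language are assumptions, and the snippet only builds payloads rather than calling the endpoint.

```python
from typing import Any, Dict, Optional


def build_interact_body(
    prompt: Optional[str] = None,
    code: Optional[str] = None,
    language: str = "python",
) -> Dict[str, Any]:
    """Build a body for an interact call: exactly one of prompt or code."""
    if (prompt is None) == (code is None):
        raise ValueError("provide exactly one of prompt or code")
    if prompt is not None:
        return {"prompt": prompt}
    return {"code": code, "language": language}


# Two steps meant to be sent one after another against the same scrape,
# relying on the session keeping its state between calls.
steps = [
    build_interact_body(prompt="Click the 'Pricing' link"),
    build_interact_body(code="print(await page.title())", language="python"),
]
```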

The official Java SDK offers full support for core endpoints including scrape, search, and crawl. It works with Maven, Gradle, and Java 17+.

Key Features:

Maven & Gradle Ready — Drop into any Java project via JitPack with standard dependency management.
Java 17+ Support — Built for modern Java environments.
Core Endpoint Coverage — Scrape, search, crawl, map, and agent endpoints all supported.

The new PDF parsing engine delivers up to 3x faster parsing and significantly improved reliability. Rebuilt in Rust, it automatically adapts to any PDF, from clean text files to scanned reports and complex layouts.

Key Features:

Rust-Based Parser — High-performance engine built in Rust delivers up to 3x faster parsing, reducing latency in data ingestion and embedding workflows.
Three Parsing Modes:
- fast — text-only parsing for maximum performance.
- auto — new default; starts in fast mode and automatically falls back to OCR when needed, intelligently detecting edge cases like embedded images, graphs, multi-column layouts, and unusual text encodings.
- ocr — forces OCR parsing for fully image-based or scanned documents.
Built for Production Reliability — Extensively tested across thousands of real-world PDFs for consistent, accurate extraction.
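
The relationship between the three modes can be sketched as a small dispatcher. This is not the engine's implementation: the real auto mode is described as detecting embedded images, multi-column layouts, and unusual encodings, whereas the min_chars text-length check below is an invented stand-in, and the extractor callables are hypothetical.

```python
from typing import Callable


def parse_pdf_page(
    extract_text: Callable[[], str],
    run_ocr: Callable[[], str],
    mode: str = "auto",
    min_chars: int = 20,
) -> str:
    """Pick a parsing path per mode; `min_chars` is an invented heuristic."""
    if mode not in {"fast", "auto", "ocr"}:
        raise ValueError(f"unknown mode: {mode}")
    if mode == "ocr":
        return run_ocr()  # forced OCR, no text-layer attempt
    text = extract_text()
    if mode == "fast" or len(text.strip()) >= min_chars:
        return text  # fast path result is good enough
    return run_ocr()  # auto: fall back when the text layer looks empty
```

The design point is that auto only pays the OCR cost when the fast path yields too little, while fast and ocr never switch paths.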

Browser Sandbox gives agents a secure, fully managed browser environment for interactive web automation with no local setup, Chromium installs, or driver compatibility issues. Each session runs in an isolated, disposable sandbox that scales without infrastructure management.

Key Features:

Browser Sandbox — Launch secure, isolated browser sessions with Python, JavaScript, and bash execution. Pre-installed with agent-browser CLI and Playwright.
Multi-Language Support — Execute Python, JavaScript, or bash code remotely via API, CLI, or SDK with instant results.
agent-browser Integration — Pre-installed CLI with 40+ commands, so AI agents can write simple bash commands instead of complex Playwright code.
Live View & CDP Access — Watch sessions in real time via an embeddable stream URL or connect your own Playwright instance over WebSocket.
Session Management — Configurable TTL controls, parallel sessions (up to 20 concurrent), and automatic cleanup. 2 credits per browser minute with 5 minutes free.
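
The pricing note above (2 credits per browser minute with 5 minutes free) can be turned into a quick estimate. The sketch assumes partial minutes round up and that the free minutes are deducted from the usage total being estimated; neither detail is stated in the notes.

```python
import math

CREDITS_PER_MINUTE = 2  # from the pricing note above
FREE_MINUTES = 5        # from the pricing note above


def browser_credits(session_seconds: float) -> int:
    """Estimate credits for browser time.

    Assumptions: partial minutes round up, and the free minutes apply
    to the total passed in.
    """
    minutes = math.ceil(session_seconds / 60)
    billable = max(0, minutes - FREE_MINUTES)
    return billable * CREDITS_PER_MINUTE
```

Under these assumptions, 12 minutes of browser time would cost (12 - 5) * 2 = 14 credits.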

Significantly improved logo extraction accuracy for Branding Format v2, the endpoint for extracting brand identities from websites.

Key Features:

Significantly improved logo detection — More reliable logo extraction with fewer false positives and better handling of edge cases like logos embedded in background images.
Works with modern site builders — Branding Format now properly detects logos on sites built with Wix, Framer, and other drag-and-drop platforms that generate complex or non-semantic HTML.
Built for AI agents and developers — Captures colors, typography, spacing, and UI components in structured format to power AI agents and apps.

Firecrawl v2.8.0 is here!

Firecrawl v2.8.0 brings major improvements to agent workflows, developer tooling, and self-hosted deployments across the API and SDKs, including our new Skill.

Parallel Agents for running thousands of /agent queries simultaneously, powered by our new Spark 1 Fast model.
Firecrawl CLI with full support for scrape, search, crawl, and map commands.
Firecrawl Skill for enabling AI agents (Claude Code, Codex, OpenCode) to use Firecrawl autonomously.
Three new models powering /agent: Spark 1 Fast for instant retrieval (currently only available in Playground), Spark 1 Mini for complex research queries, and Spark 1 Pro for advanced extraction tasks.
Agent enhancements including webhooks, model selection, and new MCP Server tools.
Platform-wide performance improvements including faster search execution and optimized Redis calls.
SDK improvements including Zod v4 compatibility.

And much more. Check it out below!

New Features

Parallel Agents
Execute thousands of /agent queries in parallel with automatic failure handling and intelligent waterfall execution. Powered by Spark 1 Fast for instant retrieval, automatically upgrading to Spark 1 Mini for complex queries requiring full research.
Firecrawl CLI
New command-line interface for Firecrawl with full support for scrape, search, crawl, and map commands. Install with npm install -g firecrawl-cli.
Firecrawl Skill
Enables agents like Claude Code, Codex, and OpenCode to use Firecrawl for web scraping and data extraction, installable via npx skills add firecrawl/cli.
Spark Model Family
Three new models powering /agent: Spark 1 Fast for instant retrieval (currently available in Playground), Spark 1 Mini (default) for everyday extraction tasks at 60% lower cost, and Spark 1 Pro for complex multi-domain research requiring maximum accuracy. Spark 1 Pro achieves ~50% recall while Mini delivers ~40% recall, both significantly outperforming tools costing 4-7x more per task.
Firecrawl MCP Server Agent Tools
New firecrawl_agent and firecrawl_agent_status tools for autonomous web data gathering via MCP-enabled agents.
Agent Webhooks
Agent endpoint now supports webhooks for real-time notifications on job completion and progress.
Agent Model Selection
Agent endpoint now accepts a model parameter and includes model info in status responses.
Multi-Arch Docker Images
Self-hosted deployments now support linux/arm64 architecture in addition to amd64.
Sitemap-Only Crawl Mode
New crawl option to exclusively use sitemap URLs without following links.
ignoreCache Map Parameter
New option to bypass cached results when mapping URLs.
Custom Headers for /map
Map endpoint now supports custom request headers.
Background Image Extraction
Scraper now extracts background images from CSS styles.
Improved Error Messages
All user-facing error messages now include detailed explanations to help diagnose issues.
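
The waterfall idea behind Parallel Agents (try a fast model first, escalate when it cannot answer) can be sketched generically. The code below is an illustration of the pattern only: the models are stand-in callables, and nothing here reflects how Firecrawl schedules or escalates queries internally.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Optional

Model = Callable[[str], Optional[str]]  # returns None when it cannot answer


def waterfall(query: str, models: List[Model]) -> Optional[str]:
    """Try each model in order and return the first non-None answer."""
    for model in models:
        try:
            answer = model(query)
        except Exception:
            continue  # treat a failure like "no answer" and escalate
        if answer is not None:
            return answer
    return None


def run_parallel(
    queries: List[str], models: List[Model], workers: int = 8
) -> Dict[str, Optional[str]]:
    """Run the waterfall for many queries concurrently."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        answers = list(pool.map(lambda q: waterfall(q, models), queries))
    return dict(zip(queries, answers))
```

Each query escalates independently, so one slow or failing lookup does not block the others.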

API Improvements

Search without concurrency limits — scrapes in search now execute directly without queue overhead.
Return 400 with a clear error when requested actions aren't supported by any available engine.
Job ID now included in search metadata for easier tracking.
Metadata responses now include detected timezone.
Backfill metadata title from og:title or twitter:title when missing.
Preserve gid parameter when rewriting Google Sheets URLs.
Fixed v2 path in batch scrape status pagination.
Validate team ownership when appending to existing crawls.
Screenshots with custom viewport or quality settings now bypass cache.
Optimized Redis calls across endpoints.
Reduced excessive robots.txt fetching and parsing.
Minimum request timeout parameter now configurable.
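
The title backfill described above amounts to a simple fallback chain. The sketch below assumes metadata is a flat dictionary keyed by the tag names from the note, and that og:title takes precedence over twitter:title because it is listed first; the actual metadata keys and precedence may differ.

```python
from typing import Dict


def backfill_title(metadata: Dict[str, str]) -> Dict[str, str]:
    """Return a copy whose `title` falls back to og:title, then twitter:title."""
    result = dict(metadata)
    if not result.get("title"):
        for key in ("og:title", "twitter:title"):  # assumed precedence
            if result.get(key):
                result["title"] = result[key]
                break
    return result
```

An existing non-empty title is never overwritten, and the input dictionary is left unchanged.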

SDK Improvements

JavaScript SDK

Zod v4 Compatibility — schema conversion now works with Zod v4 with improved error detection.
Watcher Exports — Watcher and WatcherOptions now exported from the SDK entrypoint.
Agent Webhook Support — new webhook options for agent calls.
Error Retry Polling — SDK retries polling after transient errors.
Job ID in Exceptions — error exceptions now include jobId for debugging.

Python SDK

Manual pagination helpers for iterating through results.
Agent webhook support added to agent client.
Agent endpoint now accepts model selection parameter.
Metadata now includes concurrency limit information.
Fixed max_pages handling in crawl requests.
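
Manual pagination generally means following a cursor from page to page until none is returned. The generator below sketches that pattern; fetch_page and the data and next keys are hypothetical names, since the notes do not spell out the SDK's helper signatures.

```python
from typing import Any, Callable, Dict, Iterator, Optional

Page = Dict[str, Any]


def iter_results(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[Any]:
    """Yield items from every page, following the `next` cursor until absent."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page.get("data", [])
        cursor = page.get("next")
        if not cursor:
            break
```

Because it is a generator, callers can stop early without fetching the remaining pages.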

Dashboard Improvements

Dark mode is now supported.
On the usage page, you can now view credit usage broken down by day.
On the activity logs page, you can now filter by the API key that was used.
The "images" output format is now supported in the Playground.
All admins can now manage their team's subscriptions.

Quality & Performance

Skip markdown conversion checks for large HTML documents.
Export Google Docs as HTML instead of PDF for improved performance.
Improved branding format with better logo detection and error messages for PDFs and documents.
Improved lopdf metadata loading performance.
Updated html-to-markdown module with multiple bug fixes.
Increased markdown service body limit and added request ID logging.
Better Sentry filtering for cancelled jobs and engine errors.
Fixed extract race conditions and RabbitMQ poison pill handling.
Centralized Firecrawl configuration across the codebase.
Multiple security vulnerability fixes, including CVE-2025-59466 and lodash prototype pollution.

Self-Hosted Improvements

CLI custom API URL support via firecrawl --api-url http://localhost:3002 for local instances.
ARM64 Docker support via multi-arch images for Apple Silicon and ARM servers.
Fixed docker-compose database credentials out of the box.
Fixed Playwright service startup failures caused by Chromium path issues.
Updated Node.js to major version 22 instead of a pinned minor.
Added RabbitMQ health check endpoint.
Fixed PostgreSQL port exposure in docker-compose.

New Contributors

@gemyago
@loganaden
@pcgeek86
@dmlarionov

Full Changelog: https://github.com/firecrawl/firecrawl/compare/v2.7.0...v2.8.0

What's Changed

refactor(api): centralize firecrawl config by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2496
fix(config): add .catch to NUQ worker port defaults for error handling by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2505
(sdk)fix/same timeout as api now by @rafaelsideguide in https://github.com/firecrawl/firecrawl/pull/2503
(sdks)feat/added concurrency info to metadata by @rafaelsideguide in https://github.com/firecrawl/firecrawl/pull/2502
fix: make srcset URLs absolute in HTML transformation by @Chadha93 in https://github.com/firecrawl/firecrawl/pull/2515
feat(api/admin/crawl-monitor): add endpoint for monitoring crawl system by @mogery in https://github.com/firecrawl/firecrawl/pull/2518
feat(api/logRequest): associate requests with API keys by @mogery in https://github.com/firecrawl/firecrawl/pull/2519
Fix Config Load on Tests by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2506
feat(api): update model usage to gpt-4o-mini by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2520
feat(api/scrapeURL): engpicker integ by @mogery in https://github.com/firecrawl/firecrawl/pull/2523
fix(playwright-service-ts): wasn't starting up due to the lack of chromium under /tmp/.cache by @dmlarionov in https://github.com/firecrawl/firecrawl/pull/2512
added timezone to metadata response by @rafaelsideguide in https://github.com/firecrawl/firecrawl/pull/2526
Increase Go Service Write Timeout by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2489
(python-sdk)fix/max_pages by @rafaelsideguide in https://github.com/firecrawl/firecrawl/pull/2527
Use invoiced billing for certain expansion packs by @micahstairs in https://github.com/firecrawl/firecrawl/pull/2532
Fix PostgreSQL port exposure in docker-compose by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2530
(feat/partners) Allow email to be optional for partners API by @nickscamara in https://github.com/firecrawl/firecrawl/pull/2533
Advanced model for recursive schemas by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2535
feat(api): update gpt-4o usage to gpt-4.1 by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2536
fix(api): cost tracking by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2537
Update Sentry for ZDR compliance by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2529
sanitize null-byte strings and report robustInsert failures to Sentry by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2538
Dont log Feature Flog Errors to Sentry by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2540
Debug logs to Extract Updates by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2539
Update test site build by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2543
feat: increase precrawl limits by @delong3 in https://github.com/firecrawl/firecrawl/pull/2544
fix(api): engines for robots and scrape + reduced sitemap limit by @delong3 in https://github.com/firecrawl/firecrawl/pull/2545
Webhook dispatcher by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2534
(feat/partner-integrations) Rotate endpoint by @nickscamara in https://github.com/firecrawl/firecrawl/pull/2547
fix: extract race condition by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2548
fix: correct property name from 'success' to 'is_successful' by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2550
feat: update root endpoint to return JSON with documentation URL by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2552
Branding Format Improvements by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2438
Add more Sentry filtering by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2556
Revert "feat(crawl): implement URL modification handling in crawl con… by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2558
fix: export URL schema for external usage by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2559
feat: add configurable harness startup timeout by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2560
fix(api): scrapeURL/index-metrics logging by @delong3 in https://github.com/firecrawl/firecrawl/pull/2561
fix(api): excessive robots.txt fetching/parsing by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2562
feat(api): optimize redis calls by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2564
Add strictJsonSchema on Branding Format LLM Call by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2567
Fix docker-compose db credentials by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2568
Fix Deps Audit by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2569
feat(api): ab/b by @mogery in https://github.com/firecrawl/firecrawl/pull/2570
feat(api): ab/bs by @mogery in https://github.com/firecrawl/firecrawl/pull/2571
feat(api): ab/bsrch by @mogery in https://github.com/firecrawl/firecrawl/pull/2572
Update html-to-markdown module by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2573
python-sdk: Update Agent Client by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2579
fix(extract): rabbitmq + poison pill handling by @mogery in https://github.com/firecrawl/firecrawl/pull/2581
chore(api): re-route fire-1 by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2584
feat: add bypassCreditChecks team flag for infinite graceful credit checks by @devin-ai-integration[bot] in https://github.com/firecrawl/firecrawl/pull/2578
fix(api/search): saner billing logic by @mogery in https://github.com/firecrawl/firecrawl/pull/2585
feat(api): custom header support for /map by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2593
fix(api): index metrics backwards compat by @delong3 in https://github.com/firecrawl/firecrawl/pull/2598
Add redis.sadd("billed_teams", team_id) to clearACUCTeam for centralized tracking by @devin-ai-integration[bot] in https://github.com/firecrawl/firecrawl/pull/2602
Throw error when v3-beta is passed into /extract by @micahstairs in https://github.com/firecrawl/firecrawl/pull/2601
Fixes #2583 add RabbitMQ health check and update API dependencies by @pcgeek86 in https://github.com/firecrawl/firecrawl/pull/2605
js-sdk: Retry polling after errors, add jobId to error exception by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2608
Allow formats on agent schema by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2603
feat(api): extract background images by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2611
Don't log EngineError to Sentry by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2613
Ignore Cancelled Jobs on Sentry by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2614
feat(api): include jobId in search response metadata by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2618
Update html-to-markdown version by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2620
Add request_id to Markdown Logs by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2621
Fix NPM Audit by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2623
Add error details to Go Markdown Service by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2624
Add Request ID to Markdown Service Logger by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2630
Increase Markdown Service Body Limit by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2629
Multi-arch images for playwright and api with linux/arm64 support by @gemyago in https://github.com/firecrawl/firecrawl/pull/2555
fix(nuq): zombie pg clients by @mogery in https://github.com/firecrawl/firecrawl/pull/2633
Webhooks for agent by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2628
Update firecrawl/html-to-markdown by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2634
fix: add Number() coercion to prevent string concatenation in credit calculations by @devin-ai-integration[bot] in https://github.com/firecrawl/firecrawl/pull/2635
Revert "fix: add Number() coercion to prevent string concatenation in credit calculations" by @devhims in https://github.com/firecrawl/firecrawl/pull/2636
fix(api): validate team ownership when appending to an existing crawl by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2637
Add debug logging for 402 credit check failures by @devin-ai-integration[bot] in https://github.com/firecrawl/firecrawl/pull/2638
Implement num_results for searxng by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2645
Bump auth_credit_usage_chunk_38 to auth_credit_usage_chunk_39 by @devin-ai-integration[bot] in https://github.com/firecrawl/firecrawl/pull/2646
Attempt fix infinite loop by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2644
Update lopdf to use load_metadata method for better performance by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2642
fix: backfill metadata title from og:title or twitter:title when missing by @devhims in https://github.com/firecrawl/firecrawl/pull/2650
fix(security): fix audit-ci vulnerabilities and clean up allowlists by @mogery in https://github.com/firecrawl/firecrawl/pull/2657
chore(api): export gdocs as HTML instead of PDF by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2652
Skip Markdown Check for Big HTML Documents by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2622
feat(sdk): add model parameter to agent endpoint by @firecrawl-spring[bot] in https://github.com/firecrawl/firecrawl/pull/2663
fix(security): add new hono vulnerabilities to audit allowlist by @firecrawl-spring[bot] in https://github.com/firecrawl/firecrawl/pull/2664
Update Audit Workflow by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2667
fix(nuq): align RabbitMQ expiration with lock reaper to prevent oscillation by @firecrawl-spring[bot] in https://github.com/firecrawl/firecrawl/pull/2672
fix(nuq-postgres): tune checkpoint and autovacuum to reduce job prefetch stalls by @firecrawl-spring[bot] in https://github.com/firecrawl/firecrawl/pull/2673
feat(billing): skip credit checks for organization teams by @firecrawl-spring[bot] in https://github.com/firecrawl/firecrawl/pull/2674
fix(cache): skip cache for screenshots with custom viewport or quality settings by @firecrawl-spring[bot] in https://github.com/firecrawl/firecrawl/pull/2677
.github/workflows: Migrate workflows to Blacksmith runners by @blacksmith-sh[bot] in https://github.com/firecrawl/firecrawl/pull/2680
fix(ci): allowlist new vulnerabilities in npm audit by @firecrawl-spring[bot] in https://github.com/firecrawl/firecrawl/pull/2681
fix(nuq-postgres): aggressive checkpoint tuning to prevent queue stalls by @firecrawl-spring[bot] in https://github.com/firecrawl/firecrawl/pull/2679
chore(ci): tune runners by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2683
chore(ci): tune runners by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2685
fix(js-sdk): detect mistaken use of Zod schema.shape and provide helpful error by @firecrawl-spring[bot] in https://github.com/firecrawl/firecrawl/pull/2684
fix(ci): allowlist low severity npm audit vulnerabilities by @firecrawl-spring[bot] in https://github.com/firecrawl/firecrawl/pull/2682
fix(ci): add new vulnerabilities to audit allowlists by @firecrawl-spring[bot] in https://github.com/firecrawl/firecrawl/pull/2688
feat(api): add ignoreCache parameter to map endpoint by @firecrawl-spring[bot] in https://github.com/firecrawl/firecrawl/pull/2686
chore(api): update A/B test to use FIRE_ENGINE_AB_URL instead of FIRE… by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2689
chore: update audit runner from 4c to 2c by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2690
fix(scraper): preserve postprocessor markdown in transformer by @firecrawl-spring[bot] in https://github.com/firecrawl/firecrawl/pull/2694
Fix CVE-2025-59466 by @loganaden in https://github.com/firecrawl/firecrawl/pull/2695
fix(docker): use node 22 major version instead of pinned 22.22 by @firecrawl-spring[bot] in https://github.com/firecrawl/firecrawl/pull/2696
feat(search): remove NuQ queue and execute scrapes directly without concurrency limits by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2668
fix: preserve gid parameter when rewriting Google Sheets URLs by @devin-ai-integration[bot] in https://github.com/firecrawl/firecrawl/pull/2693
feat(api): fire-engine action metadata by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2699
Don't Expose Internal Errors by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2700
docs: fix typos in README by @firecrawl-spring[bot] in https://github.com/firecrawl/firecrawl/pull/2703
feat(sdk): add agent webhook support to Node.js and Python SDKs by @firecrawl-spring[bot] in https://github.com/firecrawl/firecrawl/pull/2705
Return 400 when actions are requested but no engines support then by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2704
feat: add fire engine A/B comparison by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2706
chore: remove DB webhook logic by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2707
fix: a/b test logic by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2708
feat: allow slight variance on test comparison by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2709
feat: HTML to markdown conversion in A/B comparison by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2714
feat: request timeout parameter min by @delong3 in https://github.com/firecrawl/firecrawl/pull/2710
feat: word jaccard diff by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2720
fix(api): use correct v2 path in batch scrape status next URL by @firecrawl-spring[bot] in https://github.com/firecrawl/firecrawl/pull/2722
fix(js-sdk): add Zod v4 compatibility for schema conversion by @firecrawl-spring[bot] in https://github.com/firecrawl/firecrawl/pull/2724
feat(crawl): add sitemap-only support by @firecrawl-spring[bot] in https://github.com/firecrawl/firecrawl/pull/2726
fix(deps): resolve lodash prototype pollution vulnerability by @firecrawl-spring[bot] in https://github.com/firecrawl/firecrawl/pull/2728
Add manual pagination helpers for Python SDK by @firecrawl-spring[bot] in https://github.com/firecrawl/firecrawl/pull/2727
feat(agent): include model in status responses by @firecrawl-spring[bot] in https://github.com/firecrawl/firecrawl/pull/2717
feat(errors): improve all user-facing error messages with detailed explanations by @firecrawl-spring[bot] in https://github.com/firecrawl/firecrawl/pull/2697
chore(codeowners): add abimaelmartell as owner for branding files by @firecrawl-spring[bot] in https://github.com/firecrawl/firecrawl/pull/2735
fix: handle 415 Unsupported Media Type without retrying by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2737
fix(branding): use spread operator instead of Array.from for Set conversion by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2746
fix(js-sdk): export Watcher and WatcherOptions from SDK entrypoint by @firecrawl-spring[bot] in https://github.com/firecrawl/firecrawl/pull/2754
fix(audit): Fix pnpm audit issues by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2757
fix(audit): Upgrade eslint to fix GHSA-p5wg-g6qr-c7cg vulnerability in ingestion-ui by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2765
feat(api): add enhanced proxy option as alias for stealth by @firecrawl-spring[bot] in https://github.com/firecrawl/firecrawl/pull/2759
feat(api): only load more pages if we already have one by @tomsideguide in https://github.com/firecrawl/firecrawl/pull/2767
fix(api): update notification email from address by @firecrawl-spring[bot] in https://github.com/firecrawl/firecrawl/pull/2769
conc boost for agent interop by @rafaelsideguide in https://github.com/firecrawl/firecrawl/pull/2770
chore(api): Remove jest-junit test reporting by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2768
chore(api): Add logging to flaky map redirect test by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2771
fix(branding): Improve Logo Detection by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2749
feat(api): promote jobs in concurrency queue backfill by @tomsideguide in https://github.com/firecrawl/firecrawl/pull/2773
fix(security): allowlist fast-xml-parser vulnerability GHSA-37qj-frw5-hhjh by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2775
fix(branding): improve error messages for PDFs and documents by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2776
chore: remove firecrawl_jobs cleanup from ZDR cleaner by @firecrawl-spring[bot] in https://github.com/firecrawl/firecrawl/pull/2777
fix(billing): bump to update_tally_8_team by @firecrawl-spring[bot] in https://github.com/firecrawl/firecrawl/pull/2779
fix(api): add 'enhanced' to proxy enum in OpenAPI specs by @firecrawl-spring[bot] in https://github.com/firecrawl/firecrawl/pull/2780
chore: bump auth_credit_usage_chunk_39 to auth_credit_usage_chunk_40 by @firecrawl-spring[bot] in https://github.com/firecrawl/firecrawl/pull/2782
feat(rust-sdk): add v2 API namespace with agent support by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2778
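
The "word jaccard diff" entry above (PR #2720) does not describe its implementation here. As a rough illustration of what a word-level Jaccard comparison measures, the sketch below scores two texts by the overlap of their word sets. The function name, the tokenization rule, and the sample strings are assumptions made for this example and are not taken from the Firecrawl codebase.

```python
import re

def word_jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the sets of lowercase words in a and b."""
    words_a = set(re.findall(r"\w+", a.lower()))
    words_b = set(re.findall(r"\w+", b.lower()))
    if not words_a and not words_b:
        return 1.0  # assumption: two empty texts count as identical
    return len(words_a & words_b) / len(words_a | words_b)

old = "Firecrawl scrapes web pages"
new = "Firecrawl crawls web pages"
similarity = word_jaccard(old, new)  # 3 shared words out of 5 distinct words
print(f"similarity={similarity:.2f}, changed={similarity < 1.0}")
```

A score of 1.0 means both texts use the same set of words, and lower scores indicate more changed words. Because sets ignore order and repetition, this measure cannot see reordered or duplicated words.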

Firecrawl v2.8.0 brings major improvements to agent workflows, developer tooling, and self-hosted deployments across the API and SDKs.

Key Features:

Parallel Agents — Execute thousands of /agent queries simultaneously with automatic failure handling and intelligent waterfall execution. Powered by Spark 1 Fast for instant retrieval, automatically upgrading to Spark 1 Mini for complex queries.
Firecrawl Skill — Enables agents to use Firecrawl for web scraping and data extraction.
Firecrawl CLI — Command-line interface with full scrape, search, crawl, and map support.
Spark Model Family — Three new models: Spark 1 Fast for instant retrieval, Spark 1 Mini for complex research queries, and Spark 1 Pro for advanced extraction tasks.
Agent Enhancements — Webhook support, model selection, and new MCP Server tools for autonomous web data gathering.

Parallel Agents bring parallel processing to /agent, letting you batch hundreds or thousands of queries simultaneously. Work that took hours as sequential queries now completes in minutes, with automatic failure handling.

Key Features:

Parallel Batch Processing — Run thousands of /agent queries simultaneously to enrich companies, research competitors, or build datasets at scale.
Intelligent Waterfall — Tries instant retrieval first, then automatically upgrades specific cells to full agent research (Spark 1 Mini) only when needed.
Real-Time Spreadsheet Interface — Work in familiar CSV format with instant visual feedback as cells populate in real time.
Zero Configuration — Define your data schema, write one prompt, and hit run, with no workflow building required.
Predictable Pricing — 10 credits per cell with Spark 1 Fast.
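
The waterfall described in the list above can be pictured as "try the cheap tier first, escalate only on a miss". The sketch below is a local illustration of that control flow, not Firecrawl's implementation: `fill_cell`, `fast_lookup`, and `full_research` are hypothetical names, and the two tiers are stand-in functions rather than calls to the Spark models.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Optional

def fill_cell(query: str,
              fast: Callable[[str], Optional[str]],
              full: Callable[[str], str]) -> tuple[str, str]:
    """Return (answer, tier). Try the fast tier first; escalate only on a miss."""
    answer = fast(query)
    if answer is not None:
        return answer, "fast"
    return full(query), "full"

# Hypothetical stand-ins for the two tiers.
known_answers = {"Acme HQ city": "Springfield"}
fast_lookup = known_answers.get          # returns None when nothing is found

def full_research(query: str) -> str:
    return f"researched:{query}"

queries = ["Acme HQ city", "Acme CEO"]

# Cells are independent, so they can be filled in parallel.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(lambda q: fill_cell(q, fast_lookup, full_research), queries))

for query, (answer, tier) in zip(queries, results):
    print(f"{query!r} -> {answer!r} via {tier} tier")
```

Only the second query escalates, which is the point of the design: the more expensive tier runs for the cells that need it and nowhere else.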

Introducing the Firecrawl Skill and CLI, a new way for AI agents to reliably access real-time web data. With a single install, agents like Claude Code, Antigravity, and OpenCode can access Firecrawl endpoints including scrape, search, crawl, and map.

Key Features:

One-Command Install — Install the skill with a single command to teach agents how to authenticate and use all of Firecrawl's endpoints.
Real-Time Web Data at Runtime — Agents can pull fresh, full-page content from docs, product pages, pricing, and articles exactly when needed.
Context-Efficient for Agents — Uses a file-based approach for context management and bash methods for efficient search and retrieval.
Works Across Complex & Dynamic Sites — Powered by Firecrawl's custom browser stack for reliable extraction from large, JavaScript-heavy sites.
Proven, Best-in-Class Coverage — Backed by benchmark results showing >80% coverage across real-world evaluations.

Firecrawl v2.7.0 is here!

ZDR Search support for enterprise customers.
Improved Branding Format with better detection.
Partner Integrations API now in closed beta.
Faster and more accurate screenshots.
Self-hosted improvements

And a lot more enhancements. Check them out below!

New Features

Improved Branding Extract
Better logo and color detection for more accurate brand extraction results.
NOQ Scrape System (Experimental)
New scrape pipeline with improved stability and integrated concurrency checks.
Enhanced Redirect Handling
URLs now resolve before mapping, with safer redirect-chain detection and new abort timeouts.
Enterprise Search Parameters
New enterprise-level options available for the /search endpoint.
Integration-Based User Creation
Users can now be automatically created when coming from referring integrations.
minAge Scrape Parameter
Lets you require a minimum cached age before a page is re-scraped.
Extract Billing Credits
Extract jobs now use the same credit billing system as other endpoints.
Self-Host: Configurable Crawl Concurrency
Self-hosted deployments can now set custom concurrency limits.
Sentry Enhancements
Added Vercel AI integration, configurable sampling rates, and improved exception filtering.
UUIDv7 IDs
All new resources use lexicographically sortable UUIDv7.
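
To show why UUIDv7 values sort lexicographically, here is a minimal sketch that builds a UUIDv7-shaped value by hand, following the RFC 9562 bit layout (48-bit Unix millisecond timestamp, version nibble, variant bits, random remainder). The helper name `uuid7_sketch` is an assumption for this example, and this is not the generator Firecrawl uses.

```python
import os
import uuid

def uuid7_sketch(unix_ms: int) -> uuid.UUID:
    """Build a UUIDv7-shaped value: 48-bit millisecond timestamp, then random bits."""
    rand = int.from_bytes(os.urandom(10), "big")         # 80 random bits
    value = ((unix_ms & ((1 << 48) - 1)) << 80) | rand   # timestamp in the top 48 bits
    value = (value & ~(0xF << 76)) | (0x7 << 76)         # set the version nibble to 7
    value = (value & ~(0x3 << 62)) | (0x2 << 62)         # set the variant bits to binary 10
    return uuid.UUID(int=value)

earlier = uuid7_sketch(1_700_000_000_000)
later = uuid7_sketch(1_700_000_000_001)

# Sorting the string forms orders the IDs by creation time.
print(sorted([str(later), str(earlier)]) == [str(earlier), str(later)])
```

Because the timestamp occupies the leading hex digits, plain string or byte comparison orders IDs by creation time. In this sketch, IDs created within the same millisecond fall back to random order.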

API Improvements

DNS Resolution Errors Now Return 200 for more consistent failure handling.
Improved URL Mapping Logic including sitemap maxAge fixes, recursive sitemap support, Vue/Angular router normalization, and skipping subdomain logic for IP addresses.
Partial Results for Multi-Source Search instead of failing all sources.
Concurrency Metadata Added to scrape job responses.
Enhanced Metrics including total wait time, LLM usage, and format details.
Batch Scrape Upgrades
- Added missing /v2/batch/scrape/:jobId/errors endpoint
- Fixed pagination off-by-one bug
More Robust Error Handling for PDF/document engines, pydantic parsing, Zod validation, URL validation, and billing edge cases.
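
The "partial results" behavior for multi-source search can be sketched as collecting per-source failures instead of aborting the whole request. Everything below is illustrative: the function name, the response shape, and the `web` and `news` stand-ins are assumptions, not the actual /search implementation or its response format.

```python
from typing import Callable

def search_sources(query: str,
                   sources: dict[str, Callable[[str], list[str]]]) -> dict:
    """Query every source; keep results from those that succeed and record failures."""
    results: dict[str, list[str]] = {}
    errors: dict[str, str] = {}
    for name, search in sources.items():
        try:
            results[name] = search(query)
        except Exception as exc:  # one failing source should not sink the rest
            errors[name] = str(exc)
    return {"success": bool(results), "results": results, "errors": errors}

# Hypothetical sources: one succeeds, one fails.
def web(query: str) -> list[str]:
    return [f"https://example.com/?q={query}"]

def news(query: str) -> list[str]:
    raise TimeoutError("news source timed out")

response = search_sources("firecrawl", {"web": web, "news": news})
print(response)
```

In this sketch the request still counts as successful when at least one source returns, and the caller can inspect `errors` to see which sources were skipped.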

SDK Improvements

JavaScript SDK

Returns job ID from synchronous methods.
Improved WebSocket document event handling.
Fixed types and Deno WebSocket support, and added support for ignoreQueryParameter.
Version bump with internal cleanup.

Python SDK

Added extra metadata fields.
Improved batch validation handling.

Quality & Performance

Reduced log file size and improved tmp file cleanup.
Updated Express version and patched vulnerable packages.
Disabled markdown conversion for sitemap scrapes for improved performance.
Better precrawl logging and formatting.
Skip URL rewriting for published Google Docs.
Prevent empty cookie headers during webhook callbacks.

Self-Hosted Improvements

Disabled concurrency limit enforcement for self-hosted mode.
PostgreSQL credentials now configurable via environment variables.
Docker-compose build instructions fixed.

👥 New Contributors

@omahs
@davidkhala
@DraPraks
@devhims

Full Changelog: https://github.com/firecrawl/firecrawl/compare/v2.6.0...v2.7.0

What's Changed

(feat/dns) DNS Resolution errors should be a 200 by @nickscamara in https://github.com/firecrawl/firecrawl/pull/2402
Improve Logo and Color Detection on Branding Extract by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2362
(js-sdk) fix: ws 'document' event implementation by @rafaelsideguide in https://github.com/firecrawl/firecrawl/pull/2415
(js-sdk): Return job ID from synchronous methods by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2414
(js-sdk): Fix types by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2416
(js-sdk): Bump Version by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2417
fix(api): disable gcs logging without db auth by @delong3 in https://github.com/firecrawl/firecrawl/pull/2418
feat: noq scrape system by @delong3 in https://github.com/firecrawl/firecrawl/pull/2419
Muv2 exp add more logs by @tomkosm in https://github.com/firecrawl/firecrawl/pull/2421
feat(api): noq concurrency check integration by @delong3 in https://github.com/firecrawl/firecrawl/pull/2424
Fix concurrency backfill bug by @micahstairs in https://github.com/firecrawl/firecrawl/pull/2425
feat: redirect to docs when hitting main api endpoint by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2426
feat(api): total wait time in request metrics by @delong3 in https://github.com/firecrawl/firecrawl/pull/2428
(fix/search) rm legacy external search apis by @nickscamara in https://github.com/firecrawl/firecrawl/pull/2420
fix(api): various vulnerable packages (2025/11/21) by @mogery in https://github.com/firecrawl/firecrawl/pull/2431
fix(api): pdf + document engines not respecting skipTlsVerification flag and error handling for uncidi by @delong3 in https://github.com/firecrawl/firecrawl/pull/2435
fix(api): /map returning less urls with sitemap include by @delong3 in https://github.com/firecrawl/firecrawl/pull/2440
feat: resolve redirects before mapping urls by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2439
fix(api): tally system rework by @mogery in https://github.com/firecrawl/firecrawl/pull/2430
fix(api): update URL handling of resolved redirects by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2442
fix(api): opaque fire engine delete + poll interval by @delong3 in https://github.com/firecrawl/firecrawl/pull/2443
feat(api): add abort timeout for resolveRedirects by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2444
(python-sdk) feat: added extra fields to metadata by @rafaelsideguide in https://github.com/firecrawl/firecrawl/pull/2441
fix(api): handle case with no billed teams in tallyBilling function by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2446
fix: Add support for ignoreQueryParameter in map SDKs by @Chadha93 in https://github.com/firecrawl/firecrawl/pull/2429
fix(api): vue + angular router url normalization by @delong3 in https://github.com/firecrawl/firecrawl/pull/2447
feat(api): usedLlm + formats in request metrics by @delong3 in https://github.com/firecrawl/firecrawl/pull/2448
feat(api): switch to uuidv7 by @mogery in https://github.com/firecrawl/firecrawl/pull/2449
Add Sentry Settings by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2451
Filter Sentry Exceptions by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2453
Add minAge parameter to scrape (ENG-4073) by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2452
Fix typos by @omahs in https://github.com/firecrawl/firecrawl/pull/2457
fix docker-compose service build instructions by @davidkhala in https://github.com/firecrawl/firecrawl/pull/2406
Cleanup tmp files from downloadFile by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2455
Optimize Logs File Size by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2456
Update express version by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2465
Annotate test failures on CI by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2462
fix(api/precrawl): precrawl logging + format + skip index by @delong3 in https://github.com/firecrawl/firecrawl/pull/2466
fix(go-html-to-md): request body max 60MB by @delong3 in https://github.com/firecrawl/firecrawl/pull/2467
fix(api): dns + crawl denial errors by @delong3 in https://github.com/firecrawl/firecrawl/pull/2469
Disable markdown conversion for sitemap scrapes by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2461
Add missing /v2/batch/scrape/:jobId/errors endpoint by @devin-ai-integration[bot] in https://github.com/firecrawl/firecrawl/pull/2471
fix: improve pydantic parsing error handling | ENG-4070 by @Chadha93 in https://github.com/firecrawl/firecrawl/pull/2450
fix: Make PostgreSQL credentials configurable via environment variables by @DraPraks in https://github.com/firecrawl/firecrawl/pull/2388
feat(api): create users via referring integrations by @mogery in https://github.com/firecrawl/firecrawl/pull/2463
feat: muv2 exp apikey env by @tomkosm in https://github.com/firecrawl/firecrawl/pull/2472
(feat/search) Enterprise params by @nickscamara in https://github.com/firecrawl/firecrawl/pull/2412
Validate UUID from URL in Requests by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2392
Disable Concurrency Limit on Self Hosted by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2475
(js sdk)fix/ws deno by @rafaelsideguide in https://github.com/firecrawl/firecrawl/pull/2476
fix(api): sitemap max age for map requests by @delong3 in https://github.com/firecrawl/firecrawl/pull/2479
fix(api): sitemap max age for recursive sitemaps by @delong3 in https://github.com/firecrawl/firecrawl/pull/2480
feat: new app database shape by @mogery in https://github.com/firecrawl/firecrawl/pull/2445
chore(api): disable x-powered-by by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2483
Skip subdomain logic for IP addresses by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2477
Attempt Fix Search Tests by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2478
fix(api): don't bill where stealth proxy was unsupported by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2484
feat(extract): port to billing credits by @mogery in https://github.com/firecrawl/firecrawl/pull/2482
feat: [self-host] - add support to configure concurrency for crawl by @Chadha93 in https://github.com/firecrawl/firecrawl/pull/2193
Update AI SDK to Latest Version by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2369
Fix Zod Error Handling on V0 by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2488
fix: off-by-one bug in batch scrape pagination by @devhims in https://github.com/firecrawl/firecrawl/pull/2492
fix: make auto-recharge email show actual credit amount instead of hardcoded 1000 by @devin-ai-integration[bot] in https://github.com/firecrawl/firecrawl/pull/2485
fix: allow partial results when searching multiple sources by @devin-ai-integration[bot] in https://github.com/firecrawl/firecrawl/pull/2490
gitignore test results xml by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2493
Fix latest advisory issues by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2494
feat(api): add sentry vercel ai integration by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2473
feat(api/sentry): make sampling rates configurable via environment va… by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2495
Python sdk fix/batch validate limit by @rafaelsideguide in https://github.com/firecrawl/firecrawl/pull/2399
chore(sentry): set default TRACE_SAMPLE_RATE to 0 by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2498
chore(api): disable vercel input/output tracing by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2499
feat(api): concurrency limit info in scrape metadata by @delong3 in https://github.com/firecrawl/firecrawl/pull/2497
engpicker POC by @mogery in https://github.com/firecrawl/firecrawl/pull/2501
fix: skip URL rewriting for published Google Docs by @devhims in https://github.com/firecrawl/firecrawl/pull/2500
fix: prevent empty cookie header in webhook callbacks by @devhims in https://github.com/firecrawl/firecrawl/pull/2504

New Contributors

@omahs made their first contribution in https://github.com/firecrawl/firecrawl/pull/2457
@davidkhala made their first contribution in https://github.com/firecrawl/firecrawl/pull/2406
@DraPraks made their first contribution in https://github.com/firecrawl/firecrawl/pull/2388
@devhims made their first contribution in https://github.com/firecrawl/firecrawl/pull/2492


Major release with enterprise features and platform improvements.

Key Features:

ZDR Search Support — Enterprise customers can now search with Zero Data Retention enabled end-to-end.
Partner Integrations API — Available in closed beta for native integrations in partner products.
Improved Branding Format — Better detection and support across all platforms.
Faster Screenshots — Enhanced viewport and full page screenshots with improved speed and accuracy.
Self-hosted Improvements — Significant enhancements for deployments and infrastructure.
Performance Enhancements — Platform-wide improvements for better user experience.

Highlights

Unified Billing Model - Credits and tokens merged into a single system. Extract now uses credits (15 tokens = 1 credit), and existing tokens work everywhere.
Full Release of Branding Format - Full support across Playground, MCP, JS and Python SDKs.
Change Tracking - Faster and more reliable detection of web page content updates.
Reliability and Speed Improvements - All endpoints significantly faster with improved reliability.
Instant Credit Purchases - Buy credit packs directly from the dashboard without waiting for auto-recharge.
Improved Markdown Parsing - Enhanced markdown conversion and main content extraction accuracy.
Core Stability Fixes - Fixed change-tracking issues, PDF timeouts, and improved error handling.
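
As a quick worked example of the 15 tokens = 1 credit conversion mentioned above, the sketch below converts a token balance into credits. The release notes do not say how a remainder is handled, so rounding up is an assumption here, as are the constant and function names.

```python
TOKENS_PER_CREDIT = 15  # conversion rate stated in the release notes

def tokens_to_credits(tokens: int) -> int:
    """Convert tokens to credits, rounding up (the rounding rule is an assumption)."""
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    return (tokens + TOKENS_PER_CREDIT - 1) // TOKENS_PER_CREDIT

for tokens in (0, 150, 151, 3000):
    print(f"{tokens} tokens -> {tokens_to_credits(tokens)} credits")
```

With this rule, 150 tokens map to exactly 10 credits, while 151 tokens map to 11 because the partial credit is rounded up.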

What's Changed

fix(mu): Bug fix on v2 exp by @tomkosm in https://github.com/firecrawl/firecrawl/pull/2345
Allow index use with waitFor (ENG-3481) by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2346
Fix autoCharge return, add top level guard by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2341
fix: import MAX_MAP_LIMIT from types.ts to resolve 1000 URL cap by @prashu0705 in https://github.com/firecrawl/firecrawl/pull/2333
chore: improve llm extract logging by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2348
fix: error truncation by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2349
feat(go-html-to-md): enhance markdown conversion with robust PRE and … by @rafaelsideguide in https://github.com/firecrawl/firecrawl/pull/2321
feat(billing): merge credits and tokens by @mogery in https://github.com/firecrawl/firecrawl/pull/2352
chore: update geoip database by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2354
Implement Branding Format by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2326
Filter non-HTTP(S) protocols with separate error message by @devin-ai-integration[bot] in https://github.com/firecrawl/firecrawl/pull/2357
Add branding format support to JS and Python SDKs by @devin-ai-integration[bot] in https://github.com/firecrawl/firecrawl/pull/2360
Fix: Handle invalid favicon URLs gracefully in metadata extraction by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2361
feat: allow disabling webhook delivery by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2367
feat: add engine forcing by domain pattern by @devin-ai-integration[bot] in https://github.com/firecrawl/firecrawl/pull/2371
revert nuq commits by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2376
CI: Remove npm audit from server tests by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2385
ci: Fix dependency audit by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2386
(fix/ctracking) Fix change tracking issues by @nickscamara in https://github.com/firecrawl/firecrawl/pull/2391
update: Adds support for recursive schema for python-sdk with model selection by @Chadha93 in https://github.com/firecrawl/firecrawl/pull/2266
fix: image search field mapping in Python SDK by @naaa760 in https://github.com/firecrawl/firecrawl/pull/2244
fix(api/scrape): document + pdf scrape loop by @delong3 in https://github.com/firecrawl/firecrawl/pull/2396

New Contributors

@prashu0705 made their first contribution in https://github.com/firecrawl/firecrawl/pull/2333
@naaa760 made their first contribution in https://github.com/firecrawl/firecrawl/pull/2244

Full Changelog: https://github.com/firecrawl/firecrawl/compare/v2.5.0...v2.6.0

v2.6.0 available now

Major release with unified billing, enhanced features, and significant reliability improvements.

Key Features:

Unified Billing Model — Credits and tokens merged into a single system. Extract now uses credits (15 tokens = 1 credit), and existing tokens work everywhere.
Enhanced Branding Format — Full support across Playground, MCP, JS and Python SDKs.
Reliability and Speed Improvements — All endpoints significantly faster with improved reliability.
Instant Credit Purchases — Buy credit packs directly from the dashboard without waiting for auto-recharge.
Improved Markdown Parsing — Enhanced markdown conversion and main content extraction accuracy.
Change Tracking — Faster and more reliable detection of web page content updates.
Core Stability Fixes — Fixed core stability issues, PDF timeouts, and improved error handling.

v2.5.0 - The World's Best Web Data API

We now have the highest-quality and most comprehensive web data API available, powered by our new semantic index and custom browser stack.

See the benchmarks below:

New Features

Implemented scraping for .xlsx (Excel) files.
Introduced new crawl architecture and NUQ concurrency tracking system.
Per-owner/group concurrency limiting + dynamic concurrency calculation.
Added group backlog handling and improved group operations.
Updated /search pricing.
Added team flag to skip country check.
Always populate NUQ metrics for improved observability.
New test-site app for improved CI testing.
Extract metadata from document head for richer output.

Enhancements & Improvements

Improved blocklist loading and unsupported site error messages.
Updated x402-express version.
Improved includePaths handling for subdomains.
Updated self-hosted search to use DuckDuckGo.
JS & Python SDKs no longer require API key for self-hosted deployments.
Python SDK timeout handling improvements.
Rust client now uses tracing instead of print.
Reduced noise in auto-recharge Slack notifications.

Fixes

Ensured crawl robots.txt warnings surface reliably.
Resolved concurrency deadlocks and duplicate job handling.
Fixed search country defaults and pricing logic bugs.
Fixed port conflicts in harness environments.
Fixed viewport dimension support and screenshot behavior in Playwright.
Resolved CI test flakiness (playwright cache, prod tests).

👋 New Contributors

@delong3
@c4nc
@codetheweb

Full diff: https://github.com/firecrawl/firecrawl/compare/v2.4.0...v2.5.0

What's Changed

More verbose blocklist loading errors by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2277
Update x402-express Version by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2279
Revise unsupported site error message by @micahstairs in https://github.com/firecrawl/firecrawl/pull/2286
feat: index precrawl by @delong3 in https://github.com/firecrawl/firecrawl/pull/2289
fix: ensure includePaths apply to subdomains when allowSubdomains is enabled by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2278
Fix search country parameter to default to undefined when location is set by @devin-ai-integration[bot] in https://github.com/firecrawl/firecrawl/pull/2283
Fix Port Conflict in Harness by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2285
js-sdk: require API key only for cloud API (not self-hosted) by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2237
feat: Implement Scraping Excel xlsx files by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2284
feat(nuq): concurrency tracking by @mogery in https://github.com/firecrawl/firecrawl/pull/2291
fix(crawl): surface robots.txt warning reliably by @ftonato in https://github.com/firecrawl/firecrawl/pull/2287
feat(nuq): add source for max_concurrency by @mogery in https://github.com/firecrawl/firecrawl/pull/2293
feat(nuq/concurrency-tracking): fix deadlock by @mogery in https://github.com/firecrawl/firecrawl/pull/2295
Replace self-hosted Google with DDG search (ENG-3499) by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2225
python-sdk: Fix timeout handling across api calls by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2288
python-sdk: Don't require API Key when running Self Hosted by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2290
Add team flag to skip country check by @devin-ai-integration[bot] in https://github.com/firecrawl/firecrawl/pull/2300
Update /search endpoint pricing to 2 credits per 10 search results by @devin-ai-integration[bot] in https://github.com/firecrawl/firecrawl/pull/2299
Fix search pricing bug by @devin-ai-integration[bot] in https://github.com/firecrawl/firecrawl/pull/2301
feat(nuq): per-owner-per-group concurrency limiting by @mogery in https://github.com/firecrawl/firecrawl/pull/2302
update: handle circular refs as well in recursive schema by @Chadha93 in https://github.com/firecrawl/firecrawl/pull/2298
feat(nuq): dynamically calculate current concurrency by @mogery in https://github.com/firecrawl/firecrawl/pull/2305
feat(nuq): group_id, job backlogs, and group add operations by @mogery in https://github.com/firecrawl/firecrawl/pull/2309
feat(ci): new test-site app + updated jest tests by @delong3 in https://github.com/firecrawl/firecrawl/pull/2312
feat: new crawl architecture by @mogery in https://github.com/firecrawl/firecrawl/pull/2320
Moved index for backlog query after the table creation by @c4nc in https://github.com/firecrawl/firecrawl/pull/2323
fix(ci): playwright cache + prod tests by @delong3 in https://github.com/firecrawl/firecrawl/pull/2314
Improve slack notifications for scale auto-recharges by @micahstairs in https://github.com/firecrawl/firecrawl/pull/2325
Make auto-recharge notifications less noisy by @micahstairs in https://github.com/firecrawl/firecrawl/pull/2327
fix: viewport dimension support for Playwright engine screenshots by @ftonato in https://github.com/firecrawl/firecrawl/pull/2329
feat: always populate nuq metrics by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2328
fix: scrape viewport test by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2330
Revert "Merge pull request #2329 from firecrawl/devin/ENG-3639-175924… by @micahstairs in https://github.com/firecrawl/firecrawl/pull/2332
fix(nuq): per-instance listen channel ID by @mogery in https://github.com/firecrawl/firecrawl/pull/2336
fix(auto_charge): add a cooldown to the new recharge route by @mogery in https://github.com/firecrawl/firecrawl/pull/2338
chore: update last scrape rpc by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2339
Rust client: use tracing instead of print by @codetheweb in https://github.com/firecrawl/firecrawl/pull/2324
Extract metadata from document head (ENG-3822) by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2342
fix(nuq,concurrency-limit): handle if there are duplicate jobs in the concurrency queue by @mogery in https://github.com/firecrawl/firecrawl/pull/2343
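
PR #2299 in the list above sets /search pricing at 2 credits per 10 search results. The sketch below estimates the cost of a search under that rate. Billing a partial block of 10 as a full block is an assumption, since the changelog does not state the rounding rule, and the function name is hypothetical.

```python
CREDITS_PER_BLOCK = 2   # from the PR title: 2 credits ...
RESULTS_PER_BLOCK = 10  # ... per 10 search results

def search_credits(num_results: int) -> int:
    """Estimate credits for a search, charging each started block of 10 results."""
    if num_results < 0:
        raise ValueError("num_results must be non-negative")
    blocks = -(-num_results // RESULTS_PER_BLOCK)  # ceiling division (assumed rounding)
    return blocks * CREDITS_PER_BLOCK

for n in (5, 10, 25):
    print(f"{n} results -> {search_credits(n)} credits")
```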

New Contributors

@delong3 made their first contribution in https://github.com/firecrawl/firecrawl/pull/2289
@c4nc made their first contribution in https://github.com/firecrawl/firecrawl/pull/2323
@codetheweb made their first contribution in https://github.com/firecrawl/firecrawl/pull/2324

Full Changelog: https://github.com/firecrawl/firecrawl/compare/v2.4.0...v2.5.0

v2.5.0 - The World's Best Web Data API

Major release delivering the highest quality and most comprehensive web data API with two major infrastructure improvements: a new Semantic Index and a completely custom browser stack.

Key Features:

Semantic Index — New infrastructure improvement for better understanding and extraction of web content.
Custom Browser Stack — Completely redesigned browser infrastructure for improved reliability and performance.
Benchmark Results — Represents a significant leap forward in web data extraction quality and comprehensiveness.
Open-Sourced Benchmarks — Released scrape-evals, a reproducible framework for testing web scraping engines on 1,000 real URLs.

New Features

New PDF Search Category - You can now search for PDFs only via the v2/search endpoint by specifying the .pdf category.
Gemini 2.5 Flash CLI Image Editor — Create and edit images directly in the CLI using Firecrawl + Gemini 2.5 Flash integration (#2172)
x402 Search Endpoint (/v2/x402) — Added a next-gen search API with improved accuracy and speed (#2218)
RabbitMQ Event System — Firecrawl jobs now support event-based communication and prefetching from Postgres (#2230, #2233)
Improved Crawl Status API — More accurate and real-time crawl status reporting using the new crawl_status_2 RPC (#2239)
Low-Results & Robots.txt Warnings — Users now receive clear feedback when crawls are limited by robots.txt or yield few results (#2248)
Enhanced Tracing (OpenTelemetry) — Much-improved distributed tracing for better observability across services (#2219)
Metrics & Analytics — Added request-level metrics for both Scrape and Search endpoints (#2216)
Self-Hosted Webhook Support — Webhooks can now be delivered to private IP addresses for self-hosted environments (#2232)

Improvements

Reduced Docker Image Size — Playwright service image size reduced by 1 GB by only installing Chromium (#2210)
Python SDK Enhancements — Added "cancelled" job status handling and poll interval fixes (#2240, #2265)
Faster Node SDK Timeouts — Axios timeouts now propagate correctly, improving reliability under heavy loads (#2235)
Improved Crawl Parameter Previews — Enhanced prompts and validation for crawl parameter previews (#2220)
Zod Schema Validation — Stricter API parameter validation with rejection of extra fields (#2058)
Better Redis Job Handling — Fixed edge cases in getDoneJobsOrderedUntil for more stable Redis retrieval (#2258)
Markdown & YouTube Fixes — Fixed YouTube cache and empty markdown summary bugs (#2226, #2261)
Updated Docs & Metadata — README updates and new metadata fields added to the JS SDK (#2250, #2254)
Improved API Port Configuration — The API now respects environment-defined ports (#2209)

Fixes

Fixed recursive $ref schema validation edge cases (#2238)
Fixed enum arrays being incorrectly converted to objects (#2224)
Fixed harness timeouts and self-hosted docker-compose.yaml issues (#2242, #2252)

New Contributors

@Chadha93 (#2155)
@MAVRICK-1 (#2172)
@bernie43 (#2210)
@abimaelmartell (#2209)
@th3w1zard1 (#2252)

🔗 Full Changelog: v2.3.0 → v2.4.0

What's Changed

fix: add missing poll_interval param in watcher by @Chadha93 in https://github.com/firecrawl/firecrawl/pull/2155
feat: Add Firecrawl + Gemini 2.5 Flash Image CLI Editor by @MAVRICK-1 in https://github.com/firecrawl/firecrawl/pull/2172
Add environment variable to disable blocklist by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2197
Fix ARM builds by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2198
fix(v1/search): if f-e search is available, only use that by @mogery in https://github.com/firecrawl/firecrawl/pull/2199
Upgrade html-to-markdown dependency (ENG-3563) by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2195
feat(map): add crawler and scrape options to job logging by @ftonato in https://github.com/firecrawl/firecrawl/pull/2203
refactor: integrate facilitator in payment middleware by @ftonato in https://github.com/firecrawl/firecrawl/pull/2213
(feat/metrics) Scrape and Search Request Metrics by @nickscamara in https://github.com/firecrawl/firecrawl/pull/2216
(feat/big-query) Big Query by @nickscamara in https://github.com/firecrawl/firecrawl/pull/2217
feat(api): add x402 search endpoint to /v2 by @ftonato in https://github.com/firecrawl/firecrawl/pull/2218
feat(api/otel): much improved tracing by @mogery in https://github.com/firecrawl/firecrawl/pull/2219
fix: Add Zod validation to reject additionalProperties in schema parameters by @devin-ai-integration[bot] in https://github.com/firecrawl/firecrawl/pull/2058
Reduce playwright-service image size by 1 GB by installing only Chromium by @bernie43 in https://github.com/firecrawl/firecrawl/pull/2210
fix: enum arrays being converted to objects by @Chadha93 in https://github.com/firecrawl/firecrawl/pull/2224
feat(nuq): RabbitMQ support for job finish events and waiting by @mogery in https://github.com/firecrawl/firecrawl/pull/2230
fix: Use port from env.PORT for API by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2209
feat(nuq/rabbitmq): add prefetching jobs from psql to rabbitmq by @mogery in https://github.com/firecrawl/firecrawl/pull/2233
fix: skip summary generation when markdown is empty by @devin-ai-integration[bot] in https://github.com/firecrawl/firecrawl/pull/2226
Propagate timeout to Axios in Node SDK (ENG-3474) by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2235
feat(api/crawl-status): use crawl_status_2 RPC by @mogery in https://github.com/firecrawl/firecrawl/pull/2239
Allow self-hosted webhook delivery to private IP addresses by @abimaelmartell in https://github.com/firecrawl/firecrawl/pull/2232
Update harness timeout by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2242
python-sdk: include "cancelled" in CrawlJob.status and exit wait loop on cancel (fixes #2190) by @Jeelislive in https://github.com/firecrawl/firecrawl/pull/2240
feat(api/ci): test with RabbitMQ on prod by @mogery in https://github.com/firecrawl/firecrawl/pull/2241
(fix/crawl-params) Enhance crawl param preview prompt further by @nickscamara in https://github.com/firecrawl/firecrawl/pull/2220
build(deps): bump actions/checkout from 3 to 5 by @dependabot[bot] in https://github.com/firecrawl/firecrawl/pull/2115
fix: harness by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2249
Fix a self-hosted docker-compose.yaml bug caused by a recent firecrawl change by @th3w1zard1 in https://github.com/firecrawl/firecrawl/pull/2252
fix: handle $ref for recursive schema validation by @Chadha93 in https://github.com/firecrawl/firecrawl/pull/2238
Add missing metadata fields to JS SDK (ENG-3439) by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2250
Update README.md by @nickscamara in https://github.com/firecrawl/firecrawl/pull/2254
fix: handle edge case in getDoneJobsOrderedUntil function for Redis job retrieval by @ftonato in https://github.com/firecrawl/firecrawl/pull/2258
Fix YouTube cache markdown bug by @devin-ai-integration[bot] in https://github.com/firecrawl/firecrawl/pull/2261
feat(api): add warnings for low results and robots.txt restrictions in map and crawl controllers by @ftonato in https://github.com/firecrawl/firecrawl/pull/2248
Test new mu alternative by @tomkosm in https://github.com/firecrawl/firecrawl/pull/2263
chore(python-sdk): Bump version to 4.3.7 for poll_interval fix by @devin-ai-integration[bot] in https://github.com/firecrawl/firecrawl/pull/2265
Feat/test new mu alt by @tomkosm in https://github.com/firecrawl/firecrawl/pull/2267
(feat/search-index) Search Index by @nickscamara in https://github.com/firecrawl/firecrawl/pull/2268
Feat/test new mu alt by @tomkosm in https://github.com/firecrawl/firecrawl/pull/2270
(feat/search-index) Separate service by @nickscamara in https://github.com/firecrawl/firecrawl/pull/2271
fix: additional queue_scrape for nuq schema by @Chadha93 in https://github.com/firecrawl/firecrawl/pull/2272
(feat/search) Pdf search category by @nickscamara in https://github.com/firecrawl/firecrawl/pull/2276
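
The Python SDK changes above (#2155, #2240) touch the crawl wait loop: a poll interval is passed through, and "cancelled" now ends the wait. The sketch below illustrates that kind of loop in general terms. The function wait_for_job, its parameters, and every status name other than "cancelled" are assumptions for illustration, not the SDK's actual API.

```python
import time

# "cancelled" comes from the release notes; the other names are assumed.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_job(get_status, poll_interval=2.0, timeout=None, sleep=time.sleep):
    """Poll `get_status()` until it returns a terminal status.

    Hypothetical helper: `get_status` is any callable returning a status string.
    """
    start = time.monotonic()
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            # Cancelled jobs end the loop just like completed or failed ones.
            return status
        if timeout is not None and time.monotonic() - start >= timeout:
            raise TimeoutError(f"job still '{status}' after {timeout} seconds")
        sleep(poll_interval)
```

Treating "cancelled" as terminal means a cancelled job returns immediately rather than being polled until the timeout. The sleep parameter is injectable only so the loop can be exercised without real delays.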

New Contributors

@Chadha93 made their first contribution in https://github.com/firecrawl/firecrawl/pull/2155
@MAVRICK-1 made their first contribution in https://github.com/firecrawl/firecrawl/pull/2172
@bernie43 made their first contribution in https://github.com/firecrawl/firecrawl/pull/2210
@abimaelmartell made their first contribution in https://github.com/firecrawl/firecrawl/pull/2209
@th3w1zard1 made their first contribution in https://github.com/firecrawl/firecrawl/pull/2252

Full Changelog: https://github.com/firecrawl/firecrawl/compare/v2.3.0...v2.4.0

Major release featuring an open-source scrape-evals benchmark that tests 13 web scraping engines on 1,000 URLs, improved full-page extraction with an enhanced browser stack, a semantic index for faster retrieval of fresh or previously indexed data, 5x cheaper search with auto-recharge credit packs, a smarter concurrency and crawl architecture for improved throughput and reliability, and Excel (.xlsx) scraping support for spreadsheets and CSV files.

New Features

YouTube Support: You can now get YouTube transcripts
Enterprise Auto-Recharge: Added enterprise support for auto-recharge
.odt and .rtf: Added support for parsing .odt and .rtf files
Docx Parsing: 50x faster docx parsing
K8s Deployment: Added NuQ worker deployment example
Self-Hosting: Many improvements for self-hosted users
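
YouTube support applies to YouTube watch pages (#2157). As a rough illustration of what recognizing such a page involves, the sketch below extracts the video ID from a watch URL using only the Python standard library. The function name youtube_video_id and the accepted host list are assumptions; the release notes do not describe how Firecrawl detects these pages.

```python
from urllib.parse import parse_qs, urlparse

# Assumed set of hosts that serve /watch pages.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def youtube_video_id(url):
    """Return the `v` query parameter of a YouTube watch URL, or None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host not in YOUTUBE_HOSTS or parsed.path != "/watch":
        return None
    values = parse_qs(parsed.query).get("v")
    return values[0] if values else None
```

A caller could use the returned ID to decide whether to take a transcript path or fall back to a regular page scrape.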

Improvements & Fixes

Stability: Fixed timeout race condition, infinite scrape loop, and location query bug
Tooling: Replaced ts-prune with knip, updated pnpm with minimumReleaseAge
Docs: Added Rust to CONTRIBUTING and fixed typos
Security: Fixed pkgvuln issue

What's Changed

Update blocklist by @micahstairs in https://github.com/firecrawl/firecrawl/pull/2150
docs: fix typo and punctuation in CONTRIBUTING.md by @jarrensj in https://github.com/firecrawl/firecrawl/pull/2149
Fix timeout error message race condition for ENG-3372 by @devin-ai-integration[bot] in https://github.com/firecrawl/firecrawl/pull/2144
Add exceptions to blocklist by @micahstairs in https://github.com/firecrawl/firecrawl/pull/2156
fix: pkgvuln by @mogery in https://github.com/firecrawl/firecrawl/pull/2158
Replace ts prune with knip (ENG-3540) by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2148
feat(auto-recharge): enterprise by @mogery in https://github.com/firecrawl/firecrawl/pull/2127
feat(scrapeURL/index): index metrics by @mogery in https://github.com/firecrawl/firecrawl/pull/2160
Update pnpm and add minimumReleaseAge (ENG-3560) by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2162
feat(api/scrapeURL): add special support for YouTube watch pages by @mogery in https://github.com/firecrawl/firecrawl/pull/2157
fix(scrapeURL/index): locations array querying bug by @mogery in https://github.com/firecrawl/firecrawl/pull/2164
Fix infinite loop when scraping a forbidden webpage (ENG-3339) by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2147
Add Rust to CONTRIBUTING by @oalsing in https://github.com/firecrawl/firecrawl/pull/2180
feat(scrapeURL/summary): use gpt-5-mini by @mogery in https://github.com/firecrawl/firecrawl/pull/2174
Custom Rust document parser (ENG-3489) by @amplitudesxd in https://github.com/firecrawl/firecrawl/pull/2159
feat: add NuQ worker deployment to Kubernetes examples by @devin-ai-integration[bot] in https://github.com/firecrawl/firecrawl/pull/2163
feat(api): move blocklist to DB by @mogery in https://github.com/firecrawl/firecrawl/pull/2186

New Contributors

@jarrensj made their first contribution in https://github.com/firecrawl/firecrawl/pull/2149
@oalsing made their first contribution in https://github.com/firecrawl/firecrawl/pull/2180

Full Changelog: https://github.com/firecrawl/firecrawl/compare/v2.2.0...v2.3.0