Firecrawl v2.1.0 is here!
✨ New Features
- Search Categories: Filter search results by specific categories using the categories parameter:
  - github: Search within GitHub repositories, code, issues, and documentation
  - research: Search academic and research websites (arXiv, Nature, IEEE, PubMed, etc.)
  - More coming soon
- Image Extraction: Added image extraction support to the v2 scrape endpoint.
- Data Attribute Scraping: Now supports extraction of data-* attributes.
- Hash-Based Routing: Crawl endpoints now handle hash-based routes.
- Improved Google Drive Scraping: Added ability to scrape TXT, PDF, and Sheets from Google Drive.
- PDF Enhancements: Extracts PDF titles and shows them in metadata.
- API Enhancements:
  - Map endpoint supports up to 100k results.
- Helm Chart: Initial Helm chart added for Firecrawl deployment.
- Security: Improved protection against XFF spoofing.
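
As a rough illustration of the Search Categories feature above, the sketch below builds (but does not send) a search request restricted to one category. Only the categories parameter and the github and research values come from these notes; the endpoint URL, the query and limit fields, the bearer-token header, and the helper name build_search_request are assumptions made for illustration and should be checked against the official API reference.

```python
import json
import urllib.request

# Categories named in these release notes; more may be added later.
KNOWN_CATEGORIES = {"github", "research"}

# Assumed endpoint URL; confirm against the official API reference.
SEARCH_URL = "https://api.firecrawl.dev/v2/search"


def build_search_request(query, categories, api_key, limit=5):
    """Build (but do not send) a search request filtered by category.

    Hypothetical helper: the body fields other than "categories" are assumed.
    """
    unknown = set(categories) - KNOWN_CATEGORIES
    if unknown:
        raise ValueError(f"unsupported categories: {sorted(unknown)}")
    # Assumed body shape: "query", "limit", and a "categories" list of strings.
    body = {"query": query, "limit": limit, "categories": list(categories)}
    return urllib.request.Request(
        SEARCH_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_search_request("web crawler rate limiting", ["github"], "fc-YOUR-API-KEY")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the resulting object to urllib.request.urlopen would send it. Validating category names locally is a design choice of this sketch, not documented API behavior.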
🛠 Fixes
- Fixed UTF-8 encoding in Google search scraper.
- Restored crawl status in preview mode.
- Fixed missing methods in Python SDK.
- Corrected JSON response handling for v2 search with scrapeOptions.formats.
- Fixed field population for credits_billed in v0 scrape.
- Improved document field overlay in v2 search.
👥 New Contributors
- @kelter-antunes
- @vishkrish200
- @ieedan
🔗 Full Changelog
Full Changelog: https://github.com/firecrawl/firecrawl/compare/v2.0.1...v2.1.0