Crawlee
Releases: 1 | Avg: 0/wk | Versions: v3.17.0
All notable changes to this project will be documented in this file. See Conventional Commits for commit guidelines.
Browser.pages() correctly in PuppeteerPlugin (#3439) (c3a4b3b)
useIncognitoPages (#3433) (db2bb68)
maxRequestsPerCrawl (#3531) (b23319b)
ignoreProxyCertificate option for the internal proxy-chain instance (#3418) (02eec66), closes #3369
maxCrawlDepth warning is logged only once (#3337) (9d01334), closes #3336
BasicCrawler.stop() calls correctly (#3324) (9c0580b), closes #3257
RequestQueueV1 (#3341) (89309bc)
@crawlee/stagehand package for AI-powered browser automation (#3331) (a89cb5a), closes #3064
handleCloudflareChallenge more configurable (#3247) (629daf8), closes #3127
discoverValidSitemaps utility (#3339) (29f52ed)
_timeoutAndRetry (#3206) (9c1cf6d), closes /github.com/apify/crawlee/pull/3188#discussion_r2410256271
AdaptivePlaywrightCrawler (#3188) (9569d19)
launchOptions with useIncognitoPages (#3181) (84a4b70), closes /github.com/apify/crawlee/issues/3173#issuecomment-3346728227 #3173 #3173
systemInfoV2 by default (#3208) (617a343)
ImpitHttpClient respects the internal Request timeout (#3103) (a35376d)
proxyUrls list can contain null (#3142) (dc39cc2), closes #3136
exportData calls on empty datasets (#3115) (298f170), closes #2734
maxCrawlDepth with a custom enqueueLinks transformRequestFunction (#3159) (e2ecb74)
collectAllKeys option for BasicCrawler.exportData (#3129) (2ddfc9c), closes #3007
TandemRequestProvider for combined RequestList and RequestQueue usage (#2914) (4ca450f), closes #2499

Note: Version bump only for package @crawlee/root
pre|postLaunchHooks prematurely (#3062) (681660e)
exclude option in enqueueLinksByClickingElements (#3058) (013eb02)
HttpCrawler (#3060) (b5fcd79), closes /github.com/apify/crawlee/blob/f68d2a95d67cc6230122dc1a5226c57ca23d0ae7/packages/browser-crawler/src/internals/browser-crawler.ts#L481-L486 #3029
maxCrawlDepth crawler option (#3045) (0090df9), closes #2633
onSkippedRequest for AdaptivePlaywrightCrawler.enqueueLinks (#3043) (fc23d34), closes #3026 #3039
limit checking (#3038) (2774124), closes #3037
Sitemap.tryCommonNames (#3015) (64a090f), closes #2884
addRequests methods (#3013) (a4ab748), closes #2980
AdaptivePlaywrightCrawler (#2987) (76431ba), closes #2899
KVS.listKeys() prefix and collection parameters (#3001) (5c4726d), closes #2974
ImpitHttpClient (#2991) (120f0a7)
PlaywrightGotoOptions won't result in unknown when playwright is not installed (#2995) (93eba38), closes #2994
body from iframe elements (#2986) (c36166e), closes #2979
MinimumSpeedStream and ByteCounterStream helpers (#2970) (921c4ee)
systemInfoV2 in snapshotter (#2961) (4100eab), closes #2958
KVS.setRecord calls (#2962) (d31d90e)
_createPageForBrowser in browser pool (#2950) (27ba74b), closes #2789
@apilink to @link on build (#2949) (abe1dee), closes #2717
autoscaledPoolOptions.isTaskReadyFunction option (#2948) (fe2d206), closes #2922
BrowserCrawler (#2908) (3107e55), closes #2851
RobotsFile to RobotsTxtFile (#2913) (3160f71), closes #2910
406 as other 4xx status codes in HttpCrawler (#2907) (b0e6f6d), closes #2892
context.body (#2838) (32d6d0e), closes #2401
camoufox template correctly (#2864) (a9d008c), closes #2863
handleCloudflareChallenge helper (#2865) (9a1725f)
impit streaming (#2833) (af2fe23), closes #2756
CrawlerRunOptions before passing them to addRequests (#2803) (02a598c), closes #2802
BasicCrawler tidy-up on CriticalError (#2817) (53331e8), closes #2807
impit-based HttpClient implementation (#2787) (61d7ffa)
BasicCrawler.stop() (#2792) (af2966f), closes #2777
.trim() urls from pretty-printed sitemap.xml files (#2709) (802a6fe), closes #2698
fingerprintGeneratorOptions types (#2705) (fcb098d), closes #2703
forefront request fetching in RQv2 (#2689) (03951bd), closes #2669
prolong- and deleteRequestLock forefront option (#2690) (cba8da3), closes #2681 #2689 #2669
.isFinished() before RequestList reads (#2695) (6fa170f)
UInt8Array in KVS.setValue() (#2682) (8ef0e60)
errorHandler for session errors (#2683) (7d72bcb), closes #2678
username and password (#2696) (0f0fcc5)
ignoreHTTPSErrors to acceptInsecureCerts to support v23 (#2684) (f3927e6)
forefront option in MemoryStorage's RequestQueue (#2681) (b0527f9), closes #2669
SitemapRequestList.teardown() doesn't break persistState calls (#2673) (fb2c5cd), closes /github.com/apify/crawlee/blob/f3eb99d9fa9a7aa0ec1dcb9773e666a9ac14fb76/packages/core/src/storages/sitemap_request_list.ts#L446 #2672
FACEBOOK_REGEX to match older style page URLs (#2650) (a005e69), closes #2216
inProgress cache, rely solely on locked states (#2601) (57fcb08)
globs & regexps for SitemapRequestList (#2631) (b5fd3a9)
iframe expansion to parseWithCheerio in browsers (#2542) (328d085), closes #2507
ignoreIframes opt-out from the Cheerio iframe expansion (#2562) (474a8dc)