fix: suppress Crawlee post-run ENOENT unhandledRejection in fs-com.ts

After PlaywrightCrawler.run() resolves, Crawlee's internal task loop
schedules one final _isTaskReadyFunction call that tries to read a
request queue .json file already cleaned up during processing. The
resulting ENOENT surfaces as an unhandledRejection, which terminates
the process with exit code 1 and aborts Phase 2 before prices are
written to the database.

Add a targeted unhandledRejection handler in the require.main block
that swallows ENOENT errors from request_queues paths (a benign Crawlee
cleanup race). Every other rejection is still logged and ends the
process with exit code 1.
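
For illustration only: the ENOENT check described above could also be
factored into a small predicate that prefers the error's `code` and `path`
fields (which Node.js sets on file system errors) over pure message
matching. This is a minimal sketch, not the code in this commit;
`isBenignQueueEnoent`, `FsLikeError`, and the sample path in the usage
note are hypothetical names.

```typescript
// Hypothetical shape for Node.js file system errors; only the fields
// this sketch reads are declared.
interface FsLikeError extends Error {
  code?: string;
  path?: string;
}

// Returns true when a rejection reason looks like an ENOENT raised for a
// file under a request_queues directory. Anything else is treated as a
// real error by the caller.
function isBenignQueueEnoent(reason: unknown): boolean {
  if (reason instanceof Error) {
    const err = reason as FsLikeError;
    // Prefer the structured code, fall back to the message text.
    const isEnoent = err.code === "ENOENT" || err.message.includes("ENOENT");
    // Look for the queue directory in the path field and in the message.
    const location = `${err.path ?? ""} ${err.message}`;
    return isEnoent && location.includes("request_queues");
  }
  // Non-Error reasons (strings, objects) can only be matched as text.
  const text = String(reason);
  return text.includes("ENOENT") && text.includes("request_queues");
}
```

A handler could then call `isBenignQueueEnoent(reason)` and return early
when it is true. With an error whose message mentions a path such as
`storage/request_queues/default/abc.json` (made up here), the predicate
matches. An ENOENT for an unrelated file does not.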
Rene Fichtmueller 2026-04-18 02:51:00 +02:00
parent 6be2c131d3
commit e552e08015


@@ -810,6 +810,23 @@ export async function scrapeFs(): Promise<void> {
}
if (require.main === module) {
  // Crawlee's FileSystemStorage emits spurious unhandledRejection errors after
  // crawler.run() resolves: the internal task loop schedules one final
  // _isTaskReadyFunction call which tries to read a request .json file that
  // Crawlee already cleaned up during normal processing. This ENOENT is benign
  // (crawling is done), but Node's default handling of unhandled rejections
  // would terminate the process with a non-zero exit code and abort Phase 2.
  // We swallow it here.
  process.on("unhandledRejection", (reason) => {
    const msg = reason instanceof Error ? reason.message : String(reason);
    if (msg.includes("ENOENT") && msg.includes("request_queues")) {
      // Benign Crawlee post-run cleanup race: ignore
      return;
    }
    // All other unhandled rejections are real errors
    console.error("Unhandled rejection:", reason);
    process.exit(1);
  });
  scrapeFs()
    .then(() => pool.end())
    .catch((err) => {