Blog Project

Spider parser

#astro #nextjs #react #tailwind

Spider is a web article parser that transforms cluttered web pages into clean, readable content.

Parsing Strategies

Spider supports multiple parsing strategies optimized for different types of websites:

  • auto (default): Automatically selects the best strategy based on the page's domain
  • googlebot: Mimics Google's crawler for general content
  • facebook: Uses Facebook's external hit user agent
  • archive: Optimized for archived or paywalled content
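
To make the idea concrete, here is a minimal sketch of how an "auto" strategy might resolve to one of the others by domain. It is an illustration only, not Spider's actual code: the names `DOMAIN_STRATEGIES`, `resolveStrategy`, and `buildHeaders`, the example domains, and the fallback to googlebot are all assumptions. The two user-agent strings are the ones publicly documented for Google's and Facebook's crawlers; what the archive strategy sends is not described here, so the sketch leaves it without a header.

```javascript
// Publicly documented crawler user agents. Whether Spider uses these exact
// strings is an assumption.
const USER_AGENTS = new Map([
  ["googlebot", "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"],
  ["facebook", "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"],
]);

// Hypothetical domain-to-strategy table consulted by "auto".
const DOMAIN_STRATEGIES = new Map([
  ["example-news.com", "facebook"],
  ["example-paywall.com", "archive"],
]);

// Resolve "auto" to a concrete strategy; explicit choices pass through unchanged.
function resolveStrategy(url, strategy = "auto") {
  if (strategy !== "auto") return strategy;
  const host = new URL(url).hostname.replace(/^www\./, "");
  // Assumed fallback: treat unknown domains as general content.
  return DOMAIN_STRATEGIES.get(host) ?? "googlebot";
}

// Build request headers for a strategy; strategies without a known
// user agent (such as "archive" in this sketch) get no extra header.
function buildHeaders(strategy) {
  const userAgent = USER_AGENTS.get(strategy);
  return userAgent ? { "User-Agent": userAgent } : {};
}
```

With this table, `resolveStrategy("https://www.example-news.com/story")` resolves to the facebook strategy, while a domain that is not listed falls back to googlebot.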

Custom Parsers

You can extend Spider with custom parsers for specific domains by modifying the functions/node-fetch/node-fetch.mjs file.
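
The layout of that file is not shown here, so the following is only a sketch of what a per-domain parser registry could look like. `customParsers`, `registerParser`, `findParser`, and the shape of the returned object are hypothetical names chosen for illustration; the real file may be organized quite differently.

```javascript
// Hypothetical registry mapping a bare hostname to a parser function.
const customParsers = new Map();

// Register a parser for one domain. A parser takes raw HTML and returns
// whatever fields it can extract.
function registerParser(domain, parse) {
  customParsers.set(domain, parse);
}

// Look up the parser for a URL, ignoring a leading "www.".
// Returns null when no custom parser is registered for the host.
function findParser(url) {
  const host = new URL(url).hostname.replace(/^www\./, "");
  return customParsers.get(host) ?? null;
}

// Example parser: pulls the page title with a simple regular expression.
// A real parser would more likely use a proper HTML parser.
registerParser("example.com", (html) => {
  const match = html.match(/<title>([^<]*)<\/title>/i);
  return { title: match ? match[1].trim() : null };
});
```

A caller would then try `findParser(url)` first and fall back to the generic parsing path when it returns `null`.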

Acknowledgments

Rest of the article under construction