Would be nice if docs had a comparison between traditional scraping (e.g. using ... | Hacker News


nextworddev on May 7, 2024 | on: ScrapeGraphAI: Web scraping using LLM and direct g...

Would be nice if the docs had a comparison between traditional scraping (e.g. using headless browsers, BeautifulSoup, etc.) and this approach. Exactly how is AI used?

geuis on May 7, 2024

A lot of the larger LLMs have been trained on millions of pages of HTML. They can understand raw HTML structure and extract content from it. I've been having some success with this using Mixtral 8x7B.
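
To make the contrast concrete, here is a rough sketch of both styles on a toy page. It is not how ScrapeGraphAI works internally; the sample HTML, the field names, and the `complete` callable (a stand-in for whatever model client you use, e.g. one wrapping Mixtral 8x7B) are all assumptions for illustration. The rule-based version uses only Python's standard-library `html.parser`, and a canned `fake_complete` replaces the model so the example runs offline.

```python
import json
from html.parser import HTMLParser

# Toy page, made up for this example.
SAMPLE_HTML = """
<html><body>
  <div class="product">
    <h2 class="title">Blue Widget</h2>
    <span class="price">$9.99</span>
  </div>
</body></html>
"""


class TitlePriceParser(HTMLParser):
    """Traditional approach: hand-written rules tied to the page's markup."""

    def __init__(self):
        super().__init__()
        self._field = None  # field whose text we are waiting for
        self.record = {}

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if css_class in ("title", "price"):
            self._field = css_class

    def handle_data(self, data):
        text = data.strip()
        if self._field and text:
            self.record[self._field] = text
            self._field = None


def scrape_with_rules(html):
    """Extract fields with selectors that break if the markup changes."""
    parser = TitlePriceParser()
    parser.feed(html)
    return parser.record


def build_extraction_prompt(html, fields):
    """LLM approach: describe the fields you want and hand over the raw HTML."""
    return (
        "Extract the following fields from the HTML below and reply with "
        "a single JSON object and nothing else.\n"
        f"Fields: {', '.join(fields)}\n\nHTML:\n{html}"
    )


def scrape_with_llm(html, fields, complete):
    """`complete` is a hypothetical callable: prompt string in, reply string out."""
    reply = complete(build_extraction_prompt(html, fields))
    return json.loads(reply)  # raises ValueError if the model did not return JSON


def fake_complete(prompt):
    """Canned reply standing in for a real model call (assumption, not a real API)."""
    return '{"title": "Blue Widget", "price": "$9.99"}'


if __name__ == "__main__":
    print(scrape_with_rules(SAMPLE_HTML))
    print(scrape_with_llm(SAMPLE_HTML, ["title", "price"], fake_complete))
```

The trade-off, roughly: the rule-based parser is cheap and deterministic but tied to specific class names, while the prompt-based version only names the fields it wants, at the cost of a model call and the need to validate whatever comes back.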
