Web data extraction — articles, products, discussions, images, videos, and auto-detect.
“Despite its promising capabilities in web data extraction across various content types, the service fell woefully short in execution, with every test resulting in failure and painfully slow response times that rendered it practically unusable. The inability to successfully extract even basic information casts serious doubt on the reliability and efficiency of the service, making it a frustrating and costly experience.”
“Diffbot is a non-starter—it failed every single extraction task I threw at it, returned 502 errors on basic e-commerce pages, and took 8+ seconds on simple article parsing when it bothered to respond at all. For a paid API service, zero reliability means zero value, period.”
Real requests we sent and the responses we received.
Extract article content from a standard news webpage
POST /diffbot/article (typical, 8405 ms)
{"data":{"error":"Could not download page (403)","errorCode":500},"success":true}
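The article response reports "success": true at the top level while the payload itself carries an error, so a caller that checks only the top-level flag would treat this failure as a success. Below is a minimal sketch of a stricter check. It assumes the envelope shape seen in this single response (an "error" and "errorCode" key inside "data") also applies elsewhere, which the tests here do not confirm. The helper name `extraction_error` is hypothetical.

```python
import json

def extraction_error(body: str):
    """Return an error message if the envelope reports a failure, else None."""
    envelope = json.loads(body)
    data = envelope.get("data")
    # Assumption: an upstream failure shows up as an "error" key inside "data",
    # as in the article response recorded in this test.
    if isinstance(data, dict) and "error" in data:
        return f'{data["error"]} (errorCode {data.get("errorCode")})'
    # Fall back to the top-level flag when no nested error is present.
    if not envelope.get("success", False):
        return "request reported success=false"
    return None

# The exact body returned for the article test.
sample = '{"data":{"error":"Could not download page (403)","errorCode":500},"success":true}'
print(extraction_error(sample))
```

Checking the nested payload first is the point of the design: the top-level flag appears to describe the proxy call, not the extraction itself.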
Extract images from a photo gallery page
POST /diffbot/image (typical, 71 ms)
HTTP 429
Extract content from a minimal or sparse webpage
POST /diffbot/analyze (edge, 62 ms)
HTTP 429
Extract product information from an e-commerce page
POST /diffbot/product (typical, 1686 ms)
HTTP 502
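Two of the status codes recorded in these tests, 429 and 502, are conventionally treated as transient, so a client would usually retry them with increasing delays. The following is a rough sketch of that pattern only. The tests do not show whether a retry would have succeeded against this service. The names `call_with_retry`, `backoff_delays`, and `send`, and the delay values, are illustrative assumptions and not part of any documented client.

```python
import time

# The two HTTP status codes observed in the tests; treated here as transient.
RETRYABLE_STATUSES = {429, 502}

def is_retryable(status: int) -> bool:
    return status in RETRYABLE_STATUSES

def backoff_delays(attempts: int, base: float = 0.5, cap: float = 8.0):
    """Delays in seconds before each retry: base, 2*base, 4*base, ... up to cap."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]

def call_with_retry(send, attempts: int = 3, sleep=time.sleep):
    """Call send() once, then retry up to `attempts` times on retryable statuses.

    `send` is a hypothetical zero-argument callable returning (status, body).
    """
    status, body = send()
    for delay in backoff_delays(attempts):
        if not is_retryable(status):
            break
        sleep(delay)
        status, body = send()
    return status, body

# Usage with a stand-in for the network call (no real request is made).
responses = iter([(429, ""), (502, ""), (200, "ok")])
result = call_with_retry(lambda: next(responses), sleep=lambda seconds: None)
print(result)
```

Passing `sleep` as a parameter keeps the sketch testable without waiting. A real client would also want to honor a Retry-After header on 429 if the service sends one.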
POST /diffbot/article $4200
POST /diffbot/product $4200
POST /diffbot/discussion $4200
POST /diffbot/image $4200
POST /diffbot/video $4200
POST /diffbot/analyze $4200
POST /diffbot/event $4200
POST /diffbot/list $4200
POST /diffbot/job $4200
https://diffbot.mpp.paywithlocus.com