My thoughts on internet, media, politics, technology and innovation
Saturday, 9 October 2004
RSS Penetration Among Online Publishers
"Only 7% of the sources Topix.net crawls have XML feeds. I'd estimate that only a few hundred of the top 3,000 newspapers we crawl have RSS support. The rest we obtain with a news crawler, which is good at finding articles on news sites while leaving behind the ads and navigation sidebars. It's low maintenance, so we don't have to change anything every time a site redesigns its HTML."
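
For readers wondering how a figure like that 7% could be estimated, here is a minimal Python sketch. It is not Topix.net's method, which the quote does not describe. It assumes a site advertises its feed through the common feed-autodiscovery convention, a `<link rel="alternate">` tag with an RSS or Atom MIME type in the page's HTML. The names `FeedLinkFinder`, `find_feeds` and `feed_share` are made up for illustration.

```python
from html.parser import HTMLParser

# MIME types conventionally used to advertise a feed (assumption: the site
# follows the autodiscovery convention at all; many sites in 2004 did not).
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collects href values of <link rel="alternate"> tags that point to feeds."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        # Attribute values can be None (e.g. <link rel>), so normalise to "".
        attr = {name: (value or "") for name, value in attrs}
        rel_tokens = attr.get("rel", "").lower().split()
        mime = attr.get("type", "").strip().lower()
        if "alternate" in rel_tokens and mime in FEED_TYPES and attr.get("href"):
            self.feeds.append(attr["href"])


def find_feeds(html):
    """Return the feed URLs advertised in one HTML document."""
    finder = FeedLinkFinder()
    finder.feed(html)
    return finder.feeds


def feed_share(pages):
    """Fraction of the given HTML documents that advertise at least one feed."""
    if not pages:
        return 0.0
    with_feed = sum(1 for html in pages if find_feeds(html))
    return with_feed / len(pages)
```

Calling `feed_share` on the home pages of a list of sources gives a rough lower bound on feed support, since a site can offer a feed without advertising it in its HTML.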