Dumping my scripts

Pushing four scripts that I wrote:

  1. Scraper for NewsClick -- can both scrape the RSS feeds and crawl through an arbitrarily large number of articles
  2. Scraper for The Hindu -- can both scrape the RSS feeds and crawl through an arbitrarily large number of articles -- input handling isn't very streamlined yet
  3. Scraper for ToI -- can scrape the RSS feed only
  4. A solrSearch script that searches through the current Solr core
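
For readers unfamiliar with the RSS side of the scrapers, here is a minimal sketch of pulling article entries out of an RSS 2.0 feed. It is not the code in this merge request: it assumes Python (the description does not state the language), uses only the standard library, and the names `parse_rss_items` and `SAMPLE_FEED` are hypothetical. The sample feed is inline so the sketch runs without network access.

```python
import xml.etree.ElementTree as ET


def parse_rss_items(rss_xml):
    """Return a list of dicts (title, link, published) for each <item> in an RSS 2.0 feed."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            # findtext returns None when the child element is missing
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "published": (item.findtext("pubDate") or "").strip(),
        })
    return items


# Hypothetical feed content, standing in for a real NewsClick / The Hindu / ToI feed.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>First article</title>
      <link>https://example.com/a1</link>
      <pubDate>Mon, 01 Jan 2024 00:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Second article</title>
      <link>https://example.com/a2</link>
    </item>
  </channel>
</rss>"""

if __name__ == "__main__":
    for entry in parse_rss_items(SAMPLE_FEED):
        print(entry["title"], entry["link"])
```

A crawler would typically follow each `link` to fetch the full article; that step is left out here because it depends on each site's page layout.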

NOTE: All Solr code assumes that everything is written to a Solr core named "Articles".
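
To illustrate what that assumption means in practice, here is a small sketch of querying the "Articles" core through Solr's standard `/select` handler. It is not the solrSearch script itself: the base URL (`http://localhost:8983/solr`, Solr's usual local default), the `title` field, and the function names are assumptions made for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

SOLR_BASE = "http://localhost:8983/solr"  # assumption: Solr running locally on its default port
CORE_NAME = "Articles"                    # core name taken from the note above


def build_select_url(query, rows=10, base=SOLR_BASE, core=CORE_NAME):
    """Build a URL for the /select handler of the given core."""
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{base}/{core}/select?{params}"


def search(query, rows=10):
    """Run the query against Solr and return the matching documents (needs a running Solr)."""
    with urlopen(build_select_url(query, rows)) as resp:
        return json.load(resp)["response"]["docs"]


if __name__ == "__main__":
    # "title" is a hypothetical field name; use whatever the core's schema defines.
    print(build_select_url("title:election", rows=5))
```

If the core is ever renamed, only `CORE_NAME` would need to change in a layout like this.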
