Dumping my scripts
Pushing four scripts that I wrote:
- Scraper for NewsClick -- can scrape RSS feeds and can also crawl through an arbitrarily large number of articles
- Scraper for theHindu -- can scrape RSS feeds and can also crawl through an arbitrarily large number of articles -- input handling isn't very streamlined yet
- Scraper for ToI -- can scrape the RSS feed
- A solrSearch script that searches through the current Solr core
NOTE: All Solr code assumes that everything is written to a Solr core named "Articles".
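
For anyone pointing these scripts at their own setup, here is a rough sketch of how a query URL against that core would look. This is not taken from the solrSearch script; the host and port (Solr's default local address), the helper name `build_select_url`, and the `title` field are all assumptions for illustration.

```python
from urllib.parse import urlencode

# Assumption: Solr is running locally on its default port.
SOLR_BASE = "http://localhost:8983/solr"
# Core name that all the Solr code in this push expects.
CORE = "Articles"


def build_select_url(query, rows=10):
    """Build a URL for Solr's standard /select handler on the Articles core.

    Hypothetical helper: `query` is a Solr query string and `rows`
    limits how many documents are returned.
    """
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{SOLR_BASE}/{CORE}/select?{params}"


if __name__ == "__main__":
    # "title" is an assumed field name; use whatever the core's schema defines.
    print(build_select_url("title:election", rows=5))
```

If the core is named something else, changing `CORE` here is not enough; the scrapers and the search script would need the same change.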