Project - BLOOD LINKS issues
https://code.swecha.org/healthcare/project---blood-links/-/issues
2019-11-02T05:28:31Z

https://code.swecha.org/healthcare/project---blood-links/-/issues/41
Data Cleaning and Pattern Matching/Record Finding
2019-11-02T05:28:31Z
Ganesh Katrapati
Skill Required : RegExp, Pandas
Input : Text/CSV Files
Output : Unified CSV format; clean bad data and handle missing data; write RegExp patterns for the various fields.
Data Gathering

https://code.swecha.org/healthcare/project---blood-links/-/issues/40
Review Existing Data
2019-11-02T05:26:09Z
Ganesh Katrapati
Skill : Shell Basics and CSV operations
Input : Existing CSVs in the repo.
Output : Check for duplicates and run a quality check (Is it clean? Does it have all the fields?)
Data Gathering

https://code.swecha.org/healthcare/project---blood-links/-/issues/39
HTML Scraper (Generic)
2019-11-02T05:24:36Z
Ganesh Katrapati
Skills Req : BeautifulSoup, Adv Python
Input : HTML from the Crawler
Output : Generic code for scraping HTML tables and writing them out as text.
Data Gathering

https://code.swecha.org/healthcare/project---blood-links/-/issues/38
Run Crawler for multiple URLs
2019-11-02T05:22:49Z
Ganesh Katrapati
Skill : Basic Python
Input : List of URLs, Python Crawler Script
Output : Make a script using the crawler script that runs for the list of URLs.
+ Do QA
Data Gathering

https://code.swecha.org/healthcare/project---blood-links/-/issues/37
QA for URL Crawler
2019-11-02T05:21:04Z
Ganesh Katrapati
Skill : Basic Python
Input : Python script for URL Crawling
Output : Code Review and Testing.
Data Gathering

https://code.swecha.org/healthcare/project---blood-links/-/issues/36
Website Crawling into HTML
2019-11-02T05:18:31Z
Ganesh Katrapati
Skill Required - Basic Python
Input : URLs
Output : HTML given the URL.
Time Req : 45 mins.
Data Gathering

https://code.swecha.org/healthcare/project---blood-links/-/issues/35
Identification of Donor Websites
2019-11-02T05:16:22Z
Ganesh Katrapati
Skills : Web & Search Engines
Output : List of blood bank URLs in Hyderabad which have data. Specify the kind of data they have and where to find it at each URL.
2019-11-03
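
For the "Website Crawling into HTML" and "Run Crawler for multiple URLs" issues above, the following is a minimal sketch of what such a script might look like, not the project's actual crawler. The names `fetch_html` and `crawl_all` are hypothetical, and the sketch assumes the crawler only needs to download each page with the Python standard library and keep going when one URL fails.

```python
import urllib.request
from typing import Callable, Dict, Iterable, Optional


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Download one page and return its HTML as text (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def crawl_all(
    urls: Iterable[str],
    fetch: Callable[[str], str] = fetch_html,
) -> Dict[str, Optional[str]]:
    """Run the fetcher over every URL; a failed URL maps to None."""
    results: Dict[str, Optional[str]] = {}
    for raw_url in urls:
        url = raw_url.strip()
        if not url:
            continue  # skip blank lines, e.g. from a URL list file
        try:
            results[url] = fetch(url)
        except (OSError, ValueError) as exc:
            # urlopen raises URLError (an OSError) on network problems
            # and ValueError on malformed URLs; record and continue.
            print(f"failed: {url} ({exc})")
            results[url] = None
    return results
```

Passing the fetcher in as a parameter is one way to support the QA step: a reviewer can call `crawl_all` with a stub function instead of hitting real websites.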