Add README.md

6491f7de · besendorf · 988ec605 · 6491f7de
Commit 6491f7de authored 4 years ago by besendorf
--- a/README.md
+++ b/README.md
+# Android Security Scraper
+This is a set of Python scripts that scrape websites to gather security properties of Android smartphones. It is part of my bachelor thesis; a link to the thesis will be added once it is finished.
+# Websites and scraped attributes
+## [GSMArena](https://www.gsmarena.com/)
+GSMArena is a consumer-oriented website that lists hardware specifications of phones.
+The following attributes are scraped:
+* phone model
+* chipset
+* operating system version (at launch and updated)
+* presence of a fingerprint scanner
+Because this website rate-limits and blocks clients that issue too many requests, it was scraped through a trial account at scraperapi.com.
+## [Android Enterprise Solutions Directory](https://androidenterprisepartners.withgoogle.com/)
+The Android Enterprise Solutions Directory is a platform by Google that provides information about Android smartphones relevant to enterprises.
+The following attributes are scraped:
+* phone model
+* operating system version (at launch and updated)
+* presence of a fingerprint scanner
+* ioXt certification
+* Common Criteria certification
+## [Common Criteria Portal](https://www.commoncriteriaportal.org/products/)
+The Common Criteria Portal lists all products that are Common Criteria certified, together with their associated reports. The scraper downloads a .csv file provided by the website, filters it for mobile devices, and then downloads all reports linked there into a directory. Those reports can then be searched for smartphone models using [pdfgrep](https://pdfgrep.org/).
+# Usage
+## Dependencies
+* Python
+* Scrapy
+* pdfgrep (for the Common Criteria Portal)
+## GSMArena
+```
+cd gsmarena
+scrapy crawl attributes -o output.csv
+```
+## Android Enterprise Solutions Directory
+```
+cd android_enterprise
+scrapy crawl attributes -o output.csv
+```
+## Common Criteria Portal
+```
+cd common_criteria_scraper
+python cc_portal_scraper.py
+pdfgrep -riH "pixel 3" pdf
+```
+Replace "pixel 3" with the phone model you are looking for.
\ No newline at end of file
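
The README describes the Common Criteria Portal step as: download a .csv file, filter it for mobile devices, then download the linked reports. The filtering part could look roughly like the sketch below. This is not the code in `cc_portal_scraper.py`, which is not shown in this commit. The column names (`Category`, `Certification Report URL`, `Security Target URL`), the category label `Mobility`, and the function name `mobile_report_urls` are all assumptions for illustration and should be checked against the header row of the CSV actually downloaded.

```python
import csv
import io

# Assumed column names and category label (not confirmed by the README);
# compare them with the header row of the downloaded CSV before relying on them.
CATEGORY_COLUMN = "Category"
MOBILE_CATEGORY = "Mobility"
URL_COLUMNS = ("Certification Report URL", "Security Target URL")


def mobile_report_urls(csv_text):
    """Return the report URLs of all rows in the mobile-device category.

    URLs are returned in file order, with duplicates removed.
    """
    urls = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Skip every product that is not in the mobile-device category.
        if (row.get(CATEGORY_COLUMN) or "").strip() != MOBILE_CATEGORY:
            continue
        for column in URL_COLUMNS:
            url = (row.get(column) or "").strip()
            if url and url not in urls:
                urls.append(url)
    return urls


if __name__ == "__main__":
    # Tiny made-up sample standing in for the real CSV.
    sample = (
        "Category,Name,Certification Report URL,Security Target URL\n"
        "Mobility,Example Phone,https://example.org/a-cr.pdf,https://example.org/a-st.pdf\n"
        "Databases,Example DB,https://example.org/b-cr.pdf,\n"
    )
    for url in mobile_report_urls(sample):
        print(url)
```

The function only selects URLs and does no network access, so it can be tried on a small hand-written sample first. Downloading each returned URL into the `pdf` directory would be a separate step before running `pdfgrep`.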