Commit Graph

332 Commits

Author SHA1 Message Date
Ricky Smith d4dbc94267 Move to multi-stage docker build
Also add additional packages to the build container for hosts that don't know anything about python, like Docker Desktop on OSX/Windows.
2019-01-31 16:11:12 -05:00
Christopher Kent Hoadley 6b31cd87e3 Merge pull request #156 from TheYahya/hoadlck-tests-coverage
Add More Tests For Sites
2019-01-30 18:41:54 -06:00
Christopher K. Hoadley 7fb6d26cc7 Previous code was failing the flake8 tests because the random module was not imported. 2019-01-30 18:37:03 -06:00
Christopher K. Hoadley 83aed9aeee Add test methods for Error Message detection method as well. Add Dribbble to tests. 2019-01-30 18:24:28 -06:00
Christopher K. Hoadley 6fc5c131db Convert Designspiration to use the Status Code detection method. The site gives a clean 404 error. Add to tests. 2019-01-30 18:15:42 -06:00
Christopher K. Hoadley 26ef2e1b9b Convert Codementor to use the Status Code detection method. The site gives a clean 404 error. Add to tests. 2019-01-30 18:10:06 -06:00
Christopher K. Hoadley 110b93a757 Convert Codecademy to use the Status Code detection method. The site gives a clean 404 error. Add to tests. 2019-01-30 18:07:29 -06:00
Christopher K. Hoadley 223d9716cb Convert BuzzFeed to use the Status Code detection method. The site gives a clean 404 error. Add to tests. 2019-01-30 18:04:24 -06:00
Christopher K. Hoadley 08ac008828 Convert Behance to use the Status Code detection method. The site gives a clean 404 error. Add to tests. 2019-01-30 17:58:14 -06:00
Christopher K. Hoadley 65e3820608 Convert Bandcamp to use the Status Code detection method. The site gives a clean 404 error. Add to tests. 2019-01-30 17:54:22 -06:00
Christopher K. Hoadley c76b4524da Convert BLIP.fm to use the Status Code detection method. The site gives a clean 404 error. Add to tests. 2019-01-30 17:51:46 -06:00
Christopher K. Hoadley 8a82d883c6 Convert AngelList to use the Status Code detection method. The site gives a clean 404 error. Add to tests. 2019-01-30 17:49:04 -06:00
Christopher K. Hoadley 89787b1509 Add test methods for HTTP Status detection method as well. 2019-01-30 17:45:36 -06:00
Christopher K. Hoadley bd941c8034 Convert Academia.edu to use the Status Code detection method. The site gives a clean 404 error. 2019-01-30 17:44:09 -06:00
Christopher K. Hoadley f609320d3c Convert Canva to the more robust Response URL detection method. Add to tests to ensure that it is covered. 2019-01-30 17:25:05 -06:00
Yahya SayadArbabi 916fdd0603 Merge branch 'BlucyBlue-master' 2019-01-29 13:46:44 +03:30
Yahya SayadArbabi f69be05803 Rebase & bump version 2019-01-29 10:11:18 +03:30
BlucyBlue 465f4c85c3 Typo in printout when reading proxies from file. 2019-01-29 10:10:31 +03:30
BlucyBlue 9f523365f7 Finally importing load_proxies module. 2019-01-29 10:10:31 +03:30
BlucyBlue 8587d1a835 If the ProxyError gets raised in the 'get_response' function, the request will be tried with another proxy selected from the 'proxy_list' global var. New parameter 'retry_no' is the number of retries that will be made before throwing a final ProxyError. 2019-01-29 10:10:31 +03:30
BlucyBlue 6bf8358342 Set new parameter 'retry_no' of the 'get_response' function to 3 (can be changed). This will be used if retrying a ProxyError. 2019-01-29 10:10:31 +03:30
BlucyBlue 855f154d9b If the 'proxy_list' is not empty, we select a random member and pass it as the proxy to the session. If the list is empty, the proxy parameter will be set to arg.proxy, which defaults to None if the user did not pass an individual proxy as well. 2019-01-29 10:10:31 +03:30
BlucyBlue 2accdcafea If the user selected --check_proxies option along with --proxy_list option, proxies loaded from the .csv file are checked using the check_proxies function from the load_proxies module. Proxies which pass the test are stored in the proxy_list global var. 2019-01-29 10:10:06 +03:30
BlucyBlue 6cc4e22898 If the user selected --proxy_list option, we attempt to read proxies from the csv, and store the list in global var proxy_list. 2019-01-29 10:10:06 +03:30
BlucyBlue bd683022b3 Exception is raised if both a single proxy and the proxy_list are used. As needed, this can be changed to merging the single proxy with the proxy list, but seems a bit unnecessary at this time. 2019-01-29 10:10:06 +03:30
BlucyBlue dc32d473e0 Exception will now be raised if either a single proxy or proxy_list options are used along with Tor. 2019-01-29 10:10:06 +03:30
BlucyBlue c5e06b068e Added two new arguments, '--proxy_list'/'-pl' and '--check_proxies'/'-cp', for users to activate options of reading proxies from a document (at this time, only .csv is supported), and check their anonymity before using them. 2019-01-29 10:10:06 +03:30
BlucyBlue 166d224423 First change to 'sherlock.py' for use of load_proxies module. Global variable proxy_list is created, and by default set to an empty list. This variable will store proxies from a proxy list (if this option is used), and will enable different threads to access proxies at the same time. 2019-01-29 10:09:33 +03:30
BlucyBlue 65a040dbbb Function 'check_proxy_list' which checks anonymity of each proxy contained in a list of named tuples. Proxies are checked by using the 'check_proxy' function. 2019-01-29 10:09:33 +03:30
BlucyBlue 901074ea4e Function 'check_proxy', which checks anonymity of a single proxy by analyzing return headers received from a request using the proxy in question. 2019-01-29 10:09:33 +03:30
BlucyBlue a63bdb3152 Created new file 'load_proxies.py' to store functions for reading proxies from files, and checking proxy anonymity. Created the function 'load_proxies_from_csv' which reads proxies from a .csv file to a list of named tuples. 2019-01-29 10:09:33 +03:30
Yahya SayadArbabi 263b8b3b90 Merge branch 'aditisrinivas97-master' 2019-01-29 10:03:07 +03:30
Yahya SayadArbabi 67108071e5 bump version 2019-01-29 10:02:54 +03:30
aditisrinivas97 619d9ab6bc Fix issue with site name and url 2019-01-27 16:45:01 +05:30
Yahya SayadArbabi 011df7af55 bump version 2019-01-27 13:57:03 +03:30
Yahya SayadArbabi fd63e1093f Merge branch 'avinashshenoy97/patch-1' 2019-01-27 13:56:28 +03:30
Avinash Shenoy 3db3f4558b Parallelized updating alexa ranking 2019-01-27 15:20:45 +05:30
Avinash Shenoy 1442f333c2 Parallelised updating Alexa.com ranking of sites
Script now fetches Alexa ranks for sites concurrently on separate threads. Cuts down the time to sync ranks from approximately **5 minutes** to about **18 seconds**.
2019-01-27 15:01:55 +05:30
Yahya SayadArbabi 269df6d549 Merge pull request #151 from ptalmeida/master
Fix readme and instrallpackages.sh typo
2019-01-26 16:15:05 +03:30
ptalmeida 8ee50e6717 Fix typo
necessery -> necessary
2019-01-26 11:20:36 +00:00
ptalmeida 85d7be3e77 Actually bring README.md up to date 2019-01-26 11:14:05 +00:00
Yahya SayadArbabi d6b7c0ac55 Merge branch 'ptalmeida-Add-sorgin-by-alexa-rank-functionality' 2019-01-26 14:23:51 +03:30
ptalmeida 8b681158bc small corrections to rank sort 2019-01-25 17:36:38 +00:00
ptalmeida 78ade00dee Update outdated README.md 2019-01-25 15:10:03 +00:00
ptalmeida 5d972a3138 add --rank -r option to sherlock 2019-01-25 15:05:38 +00:00
ptalmeida 55d43b0ee6 Update requirements.txt 2019-01-25 12:50:50 +00:00
ptalmeida db0cf7c289 Update requirements.txt 2019-01-25 12:46:05 +00:00
ptalmeida 826af1ec19 remove unused import 2019-01-25 12:45:55 +00:00
Yahya SayadArbabi 2408bb520e Merge branch 'UltraWelfare/optional_output' 2019-01-25 02:27:02 +03:30
George Tsomlektsis 0e6b8d0dca Added optional parameters for outputting files and folders. 2019-01-24 21:59:06 +02:00
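
Note on commits 8587d1a835, 6bf8358342 and 855f154d9b: the proxy selection and retry behavior those messages describe can be illustrated with a minimal sketch. This is not the code from those commits. The names 'get_response', 'proxy_list' and 'retry_no' come from the commit messages; the 'fetch' callable, the 'single_proxy' parameter and the local 'ProxyError' class are hypothetical stand-ins so the example runs without a network or third-party packages.

```python
import random


class ProxyError(Exception):
    """Hypothetical stand-in for the proxy error raised by the HTTP library."""


def get_response(fetch, url, proxy_list, single_proxy=None, retry_no=3):
    """Call fetch(url, proxy), retrying with a newly picked proxy on ProxyError.

    retry_no is the number of retries made after the first failed attempt.
    Once the retries are used up, the last ProxyError is re-raised.
    """
    retries_used = 0
    while True:
        # Pick a random proxy when a list was loaded; otherwise fall back to
        # the single proxy, which may be None (no proxy at all).
        proxy = random.choice(proxy_list) if proxy_list else single_proxy
        try:
            return fetch(url, proxy)
        except ProxyError:
            if retries_used >= retry_no:
                raise
            retries_used += 1
```

With the default 'retry_no' of 3, a request that keeps failing is attempted four times in total before the error reaches the caller. Because the proxy is drawn at random on every attempt, the same proxy may be picked again; a real implementation might prefer to exclude proxies that have already failed.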