23 Commits

Author SHA1 Message Date
Paul Pfeister 4e2a4f6b66 Merge pull request #2919 from quan-nguyen-2110/fix-cracked-forum-false-positive
Fix Cracked Forum false positives
2026-05-04 23:28:52 -04:00
Paul Pfeister 2b985b57ad Merge pull request #2921 from quan-nguyen-2110/fix-akniga-false-negative
Fix akniga false negatives
2026-05-04 23:28:14 -04:00
Paul Pfeister ed0865363f Merge pull request #2929 from mohamedsolaiman/fix/false-positives
fix: resolve false positives for ArtStation, GeeksforGeeks, and LushStories
2026-05-04 23:23:43 -04:00
Paul Pfeister 43a354b235 Merge pull request #2853 from salmanrajz/fix/unicode-decode-error-special-chars
fix: handle UnicodeDecodeError on usernames with special characters
2026-05-04 23:12:52 -04:00
Paul Pfeister aa5c3b0010 Merge pull request #2930 from mohamedsolaiman/feature/new-sites
feat: add Carrd, SpaceHey, and Substack as supported sites
2026-05-04 23:07:07 -04:00
Siddharth Dushantha 2df7c61be8 Merge pull request #2939 from sherlock-project/fix-vuln
Fix command injection vuln
2026-05-02 09:46:59 +02:00
Siddharth Dushantha 61aae782ee version bump 2026-05-02 09:42:36 +02:00
Siddharth Dushantha 6eaec5cccd Fix command injection vuln 2026-05-02 09:27:28 +02:00
Mohamed Solaiman dca64e35d3 feat: add Carrd, SpaceHey, and Substack as supported sites
- Carrd: Simple website builder with profiles at {username}.carrd.co.
  Uses status_code detection (404 for non-existing profiles).

- SpaceHey: Retro social network inspired by MySpace.
  Uses message detection ("Not Found (Error 404) | SpaceHey" title
  for non-existing profiles).

- Substack: Newsletter/publishing platform with profiles at
  {username}.substack.com. Uses status_code detection (404 for
  non-existing publications).
2026-04-28 17:03:23 +00:00
Mohamed Solaiman 2e2248a8a6 fix: resolve false positives for ArtStation, GeeksforGeeks, and LushStories
- ArtStation: Add urlProbe using the JSON API endpoint
  (https://www.artstation.com/users/{}.json) which returns proper
  404 for non-existing users, instead of the main page which
  returns 200 for both existing and non-existing profiles.
  Closes #2714

- GeeksforGeeks: Switch from status_code to message detection.
  Both existing and non-existing profiles return HTTP 200, but
  non-existing profiles have "false" in the page title.
  Closes #2782

- LushStories: Switch from status_code to response_url detection.
  Non-existing profiles redirect (302) to /login while existing
  profiles return 200. Closes #2371
2026-04-28 17:01:37 +00:00
QuanNguyen a9960ff9a4 Fix akniga false negatives
Made-with: Cursor
2026-04-26 16:00:27 +02:00
QuanNguyen d731f715bf Fix Cracked Forum false positives
Made-with: Cursor
2026-04-26 15:44:27 +02:00
Siddharth Dushantha 271608fb22 Merge pull request #2898 from sherlock-project/improvements
Make Minor Improvements
2026-04-12 17:54:11 +02:00
Siddharth Dushantha eb79980c33 Remove unused line of code 2026-04-12 17:48:42 +02:00
Siddharth Dushantha e2a225697f Fix missing punctuation 2026-04-12 17:38:01 +02:00
Siddharth Dushantha 173ae5b824 Update usage 2026-04-12 17:35:14 +02:00
Siddharth Dushantha dcb935337c Remove --no-txt
It was removed a long time ago but the argument still exists.
2026-04-12 17:32:35 +02:00
Siddharth Dushantha ed883ad7c8 fix copy paste error 2026-04-12 16:55:54 +02:00
Siddharth Dushantha a68ea46fb4 Removed unnecessary returns 2026-04-12 16:54:42 +02:00
Siddharth Dushantha ed73b175d7 Use data.sherlockproject.xyz
I've created data.sherlockproject.xyz so that it will be easier for
people to use Sherlock's data in other projects if needed.
2026-04-12 16:49:37 +02:00
Siddharth Dushantha a192cb4bfe Merge pull request #2897 from sherlock-project/clean-up
Minor clean up
2026-04-12 13:54:08 +02:00
salmanrajz 32fde9bfc6 fix: update NSFW tests to use sites not in exclusions list
Pornhub was added to the remote false_positive_exclusions.txt, causing
test_remove_nsfw and test_nsfw_explicit_selection to fail since the
site gets filtered out before the test runs. Replaced with Xvideos and
Erome which are NSFW-flagged but not excluded.
2026-03-31 20:11:55 +04:00
salmanrajz 4656d95702 fix: handle UnicodeDecodeError on usernames with special characters
Fixes #2730. Usernames containing non-ASCII characters (e.g. 'Émile')
can trigger a UnicodeDecodeError inside the requests library during
redirect handling. This exception is not a subclass of
requests.exceptions.RequestException, so it escaped all existing
except blocks in get_response() and crashed the program.

Added a catch for UnicodeError (parent of both UnicodeDecodeError and
UnicodeEncodeError) so these sites are gracefully skipped instead of
crashing the entire scan.

Added regression tests in tests/test_unicode.py.
2026-03-31 19:57:54 +04:00
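The commit messages above refer to three detection modes (`status_code`, `message`, `response_url`) plus the `urlProbe` and `regexCheck` fields. The following is a simplified sketch of how such entries could be evaluated. It is not Sherlock's implementation: `FakeResponse`, `probe_url`, and `is_claimed` are hypothetical names, the site entries are trimmed copies of the `data.json` changes in this comparison, and the `response_url` branch assumes the final URL after redirects is compared against `errorUrl`.

```python
import re
from dataclasses import dataclass


@dataclass
class FakeResponse:
    """Hypothetical stand-in for an HTTP response object."""
    status_code: int
    text: str = ""
    url: str = ""


# Trimmed copies of entries touched in this comparison.
SITES = {
    "ArtStation": {
        "errorType": "status_code",
        "url": "https://www.artstation.com/{}",
        "urlProbe": "https://www.artstation.com/users/{}.json",
    },
    "Carrd": {
        "errorType": "status_code",
        "regexCheck": "^[a-zA-Z0-9_-]{3,50}$",
        "url": "https://{}.carrd.co/",
    },
    "GeeksforGeeks": {
        "errorType": "message",
        "errorMsg": "false | GeeksforGeeks Profile",
        "url": "https://auth.geeksforgeeks.org/user/{}",
    },
    "LushStories": {
        "errorType": "response_url",
        "errorUrl": "https://www.lushstories.com/login",
        "url": "https://www.lushstories.com/profile/{}",
    },
}


def probe_url(site, username):
    """Return the URL to request, or None if the username fails regexCheck."""
    pattern = site.get("regexCheck")
    if pattern and re.search(pattern, username) is None:
        return None
    # Prefer urlProbe (e.g. a JSON API endpoint) over the public profile URL.
    return site.get("urlProbe", site["url"]).format(username)


def is_claimed(site, response):
    """Decide whether the response indicates an existing profile (assumed rules)."""
    error_type = site["errorType"]
    if error_type == "status_code":
        return 200 <= response.status_code < 300
    if error_type == "message":
        return site["errorMsg"] not in response.text
    if error_type == "response_url":
        return not response.url.startswith(site["errorUrl"])
    raise ValueError(f"Unknown errorType: {error_type}")


print(probe_url(SITES["ArtStation"], "Blue"))
print(probe_url(SITES["Carrd"], "ab"))
```

The short username `ab` is rejected before any request is made because the Carrd pattern requires at least three characters, while the ArtStation probe goes to the JSON endpoint rather than the profile page.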
9 changed files with 108 additions and 64 deletions
@@ -20,6 +20,7 @@ jobs:
  # Checkout the base branch but fetch all history to avoid a second fetch call
  ref: ${{ github.base_ref }}
  fetch-depth: 0
+ persist-credentials: false
  - name: Set up Python
  uses: actions/setup-python@v6
@@ -90,11 +91,11 @@ jobs:
  # --- The rest of the steps below are unchanged ---
  - name: Validate modified targets
  if: steps.discover-modified.outputs.changed_targets != ''
  continue-on-error: true
  env:
  CHANGED_TARGETS: ${{ steps.discover-modified.outputs.changed_targets }}
  run: |
  poetry run pytest -q --tb no -rA -m validate_targets -n 20 \
- --chunked-sites "${{ steps.discover-modified.outputs.changed_targets }}" \
+ --chunked-sites "$CHANGED_TARGETS" \
  --junitxml=validation_results.xml
  - name: Prepare validation summary
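The workflow change above swaps an inline `${{ ... }}` expression for an environment variable. A `${{ ... }}` expression is substituted into the script text before the shell parses it, whereas an environment variable is expanded by the shell as data. The sketch below reproduces that difference locally with `sh`. The helper names and the payload are hypothetical, and it assumes a POSIX `sh` is on the path.

```python
import os
import subprocess


def run_interpolated(value):
    """Unsafe pattern: the value is pasted into the script text itself."""
    script = f'printf "%s" "{value}"'
    done = subprocess.run(["sh", "-c", script], capture_output=True, text=True)
    return done.stdout


def run_via_env(value):
    """Safer pattern: the script is fixed; the value arrives as an env var."""
    script = 'printf "%s" "$CHANGED_TARGETS"'
    env = {**os.environ, "CHANGED_TARGETS": value}
    done = subprocess.run(
        ["sh", "-c", script], capture_output=True, text=True, env=env
    )
    return done.stdout


# Hypothetical attacker-controlled value that closes the quote and adds a command.
payload = 'x"; echo INJECTED; : "'

interpolated = run_interpolated(payload)  # the echo command actually runs
passed_as_data = run_via_env(payload)     # the payload is printed verbatim
print(repr(interpolated))
print(repr(passed_as_data))
```

In the first call the shell sees three commands and executes the injected `echo`. In the second, the quotes inside the value are never parsed as shell syntax.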
+15 -25
@@ -33,7 +33,7 @@
Community-maintained packages are available for Debian (>= 13), Ubuntu (>= 22.10), Homebrew, Kali, and BlackArch. These packages are not directly supported or maintained by the Sherlock Project.
- See all alternative installation methods [here](https://sherlockproject.xyz/installation)
+ See all alternative installation methods [here](https://sherlockproject.xyz/installation).
## General usage
@@ -51,51 +51,41 @@ Accounts found will be stored in an individual text file with the corresponding
```console
$ sherlock --help
- usage: sherlock [-h] [--version] [--verbose] [--folderoutput FOLDEROUTPUT]
- [--output OUTPUT] [--tor] [--unique-tor] [--csv] [--xlsx]
- [--site SITE_NAME] [--proxy PROXY_URL] [--json JSON_FILE]
- [--timeout TIMEOUT] [--print-all] [--print-found] [--no-color]
- [--browse] [--local] [--nsfw]
+ usage: sherlock [-h] [--version] [--verbose] [--folderoutput FOLDEROUTPUT] [--output OUTPUT] [--csv] [--xlsx] [--site SITE_NAME] [--proxy PROXY_URL] [--dump-response]
+ [--json JSON_FILE] [--timeout TIMEOUT] [--print-all] [--print-found] [--no-color] [--browse] [--local] [--nsfw] [--txt] [--ignore-exclusions]
  USERNAMES [USERNAMES ...]
- Sherlock: Find Usernames Across Social Networks (Version 0.14.3)
+ Sherlock: Find Usernames Across Social Networks (Version 0.16.0)
  positional arguments:
- USERNAMES One or more usernames to check with social networks.
- Check similar usernames using {?} (replace to '_', '-', '.').
+ USERNAMES One or more usernames to check with social networks. Check similar usernames using {?} (replace to '_', '-', '.').
- optional arguments:
+ options:
  -h, --help show this help message and exit
  --version Display version information and dependencies.
  --verbose, -v, -d, --debug
  Display extra debugging information and metrics.
  --folderoutput FOLDEROUTPUT, -fo FOLDEROUTPUT
- If using multiple usernames, the output of the results will be
- saved to this folder.
+ If using multiple usernames, the output of the results will be saved to this folder.
  --output OUTPUT, -o OUTPUT
- If using single username, the output of the result will be saved
- to this file.
- --tor, -t Make requests over Tor; increases runtime; requires Tor to be
- installed and in system path.
- --unique-tor, -u Make requests over Tor with new Tor circuit after each request;
- increases runtime; requires Tor to be installed and in system
- path.
+ If using single username, the output of the result will be saved to this file.
  --csv Create Comma-Separated Values (CSV) File.
- --xlsx Create the standard file for the modern Microsoft Excel
- spreadsheet (xlsx).
- --site SITE_NAME Limit analysis to just the listed sites. Add multiple options to
- specify more than one site.
+ --xlsx Create the standard file for the modern Microsoft Excel spreadsheet (xlsx).
+ --site SITE_NAME Limit analysis to just the listed sites. Add multiple options to specify more than one site.
  --proxy PROXY_URL, -p PROXY_URL
  Make requests over a proxy. e.g. socks5://127.0.0.1:1080
+ --dump-response Dump the HTTP response to stdout for targeted debugging.
  --json JSON_FILE, -j JSON_FILE
- Load data from a JSON file or an online, valid, JSON file.
+ Load data from a JSON file or an online, valid, JSON file. Upstream PR numbers also accepted.
  --timeout TIMEOUT Time (in seconds) to wait for response to requests (Default: 60)
  --print-all Output sites where the username was not found.
- --print-found Output sites where the username was found.
+ --print-found Output sites where the username was found (also if exported as file).
  --no-color Don't color terminal output
  --browse, -b Browse to all results on default browser.
  --local, -l Force the use of the local data.json file.
  --nsfw Include checking of NSFW sites from default list.
+ --txt Enable creation of a txt file
+ --ignore-exclusions Ignore upstream exclusions (may return more false positives)
```
## Credits
+1 -1
@@ -8,7 +8,7 @@ source = "init"
[tool.poetry]
name = "sherlock-project"
- version = "0.16.0"
+ version = "0.16.1"
description = "Hunt down social media accounts by username across social networks"
license = "MIT"
authors = [
+2 -9
@@ -37,7 +37,6 @@ class QueryNotify:
self.result = result
# return
def start(self, message=None):
"""Notify Start.
@@ -56,7 +55,6 @@ class QueryNotify:
Nothing.
"""
# return
def update(self, result):
"""Notify Update.
@@ -75,7 +73,6 @@ class QueryNotify:
self.result = result
# return
def finish(self, message=None):
"""Notify Finish.
@@ -94,7 +91,6 @@ class QueryNotify:
Nothing.
"""
# return
def __str__(self):
"""Convert Object To String.
@@ -137,7 +133,6 @@ class QueryNotifyPrint(QueryNotify):
self.print_all = print_all
self.browse = browse
return
def start(self, message):
"""Notify Start.
@@ -163,7 +158,6 @@ class QueryNotifyPrint(QueryNotify):
# An empty line between first line and the result(more clear output)
print('\r')
return
def countResults(self):
"""This function counts the number of results. Every time the function is called,
@@ -238,7 +232,7 @@ class QueryNotifyPrint(QueryNotify):
Fore.WHITE + "]" +
Fore.GREEN + f" {self.result.site_name}:" +
Fore.YELLOW + f" {msg}")
elif result.status == QueryStatus.WAF:
if self.print_all:
print(Style.BRIGHT + Fore.WHITE + "[" +
@@ -254,10 +248,9 @@ class QueryNotifyPrint(QueryNotify):
f"Unknown Query Status '{result.status}' for site '{self.result.site_name}'"
)
return
def finish(self, message="The processing has been finished."):
- """Notify Start.
+ """Notify Finish.
Will print the last line to the standard output.
Keyword Arguments:
self -- This object.
+32 -7
@@ -159,6 +159,7 @@
"errorType": "status_code",
"url": "https://www.artstation.com/{}",
"urlMain": "https://www.artstation.com/",
"urlProbe": "https://www.artstation.com/users/{}.json",
"username_claimed": "Blue"
},
"Asciinema": {
@@ -404,6 +405,13 @@
"urlMain": "https://carbonmade.com/",
"username_claimed": "jenny"
},
"Carrd": {
"errorType": "status_code",
"regexCheck": "^[a-zA-Z0-9_-]{3,50}$",
"url": "https://{}.carrd.co/",
"urlMain": "https://carrd.co/",
"username_claimed": "blue"
},
"Career.habr": {
"errorMsg": "<h1>\u041e\u0448\u0438\u0431\u043a\u0430 404</h1>",
"errorType": "message",
@@ -602,10 +610,9 @@
"username_claimed": "blue"
},
"Cracked Forum": {
- "errorMsg": "The member you specified is either invalid or doesn't exist",
- "errorType": "message",
- "url": "https://cracked.sh/{}",
- "urlMain": "https://cracked.sh/",
+ "errorType": "status_code",
+ "url": "https://cracked.ax/{}",
+ "urlMain": "https://cracked.ax/",
"username_claimed": "Blue"
},
"Credly": {
@@ -952,7 +959,8 @@
"username_claimed": "blue"
},
"GeeksforGeeks": {
- "errorType": "status_code",
+ "errorMsg": "false | GeeksforGeeks Profile",
+ "errorType": "message",
"url": "https://auth.geeksforgeeks.org/user/{}",
"urlMain": "https://www.geeksforgeeks.org/",
"username_claimed": "adam"
@@ -1526,7 +1534,8 @@
"username_claimed": "lottiefiles"
},
"LushStories": {
- "errorType": "status_code",
+ "errorType": "response_url",
+ "errorUrl": "https://www.lushstories.com/login",
"isNSFW": true,
"url": "https://www.lushstories.com/profile/{}",
"urlMain": "https://www.lushstories.com/",
@@ -2279,6 +2288,13 @@
"urlMain": "https://sourceforge.net/",
"username_claimed": "blue"
},
"SpaceHey": {
"errorType": "message",
"errorMsg": "Not Found (Error 404) | SpaceHey",
"url": "https://spacehey.com/{}",
"urlMain": "https://spacehey.com/",
"username_claimed": "blue"
},
"SoylentNews": {
"errorMsg": "The user you requested does not exist, no matter how much you wish this might be the case.",
"errorType": "message",
@@ -2376,6 +2392,13 @@
"urlMain": "https://www.strava.com/",
"username_claimed": "blue"
},
"Substack": {
"errorType": "status_code",
"regexCheck": "^[a-zA-Z0-9][a-zA-Z0-9_-]{1,60}$",
"url": "https://{}.substack.com/",
"urlMain": "https://substack.com/",
"username_claimed": "green"
},
"SublimeForum": {
"errorType": "status_code",
"url": "https://forum.sublimetext.com/u/{}",
@@ -2827,8 +2850,10 @@
},
"akniga": {
"errorType": "status_code",
+ "errorCode": 404,
+ "request_method": "GET",
"url": "https://akniga.org/profile/{}",
- "urlMain": "https://akniga.org/profile/blue/",
+ "urlMain": "https://akniga.org/",
"username_claimed": "blue"
},
"authorSTREAM": {
+3 -10
@@ -136,6 +136,9 @@ def get_response(request_future, error_type, social_network):
except requests.exceptions.RequestException as err:
error_context = "Unknown Error"
exception_text = str(err)
+ except UnicodeError as err:
+ error_context = "Encoding Error"
+ exception_text = str(err)
return response, error_context, exception_text
@@ -675,16 +678,6 @@ def main():
help="Include checking of NSFW sites from default list.",
)
- # TODO deprecated in favor of --txt, retained for workflow compatibility, to be removed
- # in future release
- parser.add_argument(
- "--no-txt",
- action="store_true",
- dest="no_txt",
- default=False,
- help="Disable creation of a txt file - WILL BE DEPRECATED",
- )
parser.add_argument(
"--txt",
action="store_true",
+1 -6
@@ -8,7 +8,7 @@ import requests
import secrets
- MANIFEST_URL = "https://raw.githubusercontent.com/sherlock-project/sherlock/master/sherlock_project/resources/data.json"
+ MANIFEST_URL = "https://data.sherlockproject.xyz"
EXCLUSIONS_URL = "https://raw.githubusercontent.com/sherlock-project/sherlock/refs/heads/exclusions/false_positive_exclusions.txt"
class SiteInformation:
@@ -121,11 +121,6 @@ class SitesInformation:
# users from creating issue about false positives which has already been fixed or having outdated data
data_file_path = MANIFEST_URL
- # Ensure that specified data file has correct extension.
- if not data_file_path.lower().endswith(".json"):
- raise FileNotFoundError(f"Incorrect JSON file extension for data file '{data_file_path}'.")
- # if "http://" == data_file_path[:7].lower() or "https://" == data_file_path[:8].lower():
if data_file_path.lower().startswith("http"):
# Reference is to a URL.
try:
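One plausible reading of why the extension check goes away together with the `MANIFEST_URL` change (the commit message does not state this): the new URL has no `.json` suffix, so the old check would reject it. A minimal sketch, where `has_json_extension` is a hypothetical helper mirroring the removed condition:

```python
OLD_MANIFEST_URL = (
    "https://raw.githubusercontent.com/sherlock-project/sherlock/"
    "master/sherlock_project/resources/data.json"
)
NEW_MANIFEST_URL = "https://data.sherlockproject.xyz"


def has_json_extension(path):
    """Mirror of the removed check: does the path end in .json?"""
    return path.lower().endswith(".json")


old_passes = has_json_extension(OLD_MANIFEST_URL)
new_passes = has_json_extension(NEW_MANIFEST_URL)
print(old_passes, new_passes)
```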
+47
@@ -0,0 +1,47 @@
"""Tests for handling usernames with special/unicode characters."""
from concurrent.futures import Future
from sherlock_project.sherlock import get_response
def _make_future_with_exception(exc):
"""Create a Future that raises the given exception."""
future = Future()
future.set_exception(exc)
return future
def test_get_response_handles_unicode_decode_error():
"""Regression test for issue #2730.
Usernames with special characters (e.g. 'Émile') can trigger a
UnicodeDecodeError inside the requests library during redirect
handling. This must not crash the program.
"""
future = _make_future_with_exception(
UnicodeDecodeError("utf-8", b"\xe9", 0, 1, "invalid continuation byte")
)
response, error_context, exception_text = get_response(
request_future=future,
error_type=["status_code"],
social_network="TestSite",
)
assert response is None
assert error_context == "Encoding Error"
assert "utf-8" in exception_text
def test_get_response_handles_unicode_encode_error():
"""UnicodeEncodeError should also be caught (subclass of UnicodeError)."""
future = _make_future_with_exception(
UnicodeEncodeError("ascii", "É", 0, 1, "ordinal not in range(128)")
)
response, error_context, exception_text = get_response(
request_future=future,
error_type=["status_code"],
social_network="TestSite",
)
assert response is None
assert error_context == "Encoding Error"
assert "ascii" in exception_text
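The tests above depend on where `UnicodeError` sits in the exception hierarchy. The sketch below shows why a handler for request errors alone lets it escape. It is not the real `get_response()`: `resolve` is a hypothetical helper, and `OSError` stands in for `requests.exceptions.RequestException` (which derives from `IOError`) so that only the standard library is needed.

```python
from concurrent.futures import Future


def resolve(future):
    """Hypothetical stand-in for get_response(): unwrap a future safely."""
    response = None
    error_context = "General Unknown Error"
    exception_text = None
    try:
        response = future.result()
        error_context = None
    except OSError as err:
        # Stand-in for requests.exceptions.RequestException.
        error_context = "Unknown Error"
        exception_text = str(err)
    except UnicodeError as err:
        # UnicodeDecodeError and UnicodeEncodeError derive from ValueError,
        # so the OSError branch above never matches them.
        error_context = "Encoding Error"
        exception_text = str(err)
    return response, error_context, exception_text


failing = Future()
failing.set_exception(
    UnicodeDecodeError("utf-8", b"\xe9", 0, 1, "invalid continuation byte")
)
result = resolve(failing)
print(result[1])
```

Without the second `except` branch, `future.result()` would re-raise the `UnicodeDecodeError` into the caller.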
+3 -3
@@ -4,7 +4,7 @@ from sherlock_interactives import Interactives
from sherlock_interactives import InteractivesSubprocessError
def test_remove_nsfw(sites_obj):
- nsfw_target: str = 'Pornhub'
+ nsfw_target: str = 'Xvideos'
assert nsfw_target in {site.name: site.information for site in sites_obj}
sites_obj.remove_nsfw_sites()
assert nsfw_target not in {site.name: site.information for site in sites_obj}
@@ -12,8 +12,8 @@ def test_remove_nsfw(sites_obj):
# Parametrized sites should *not* include Motherless, which is acting as the control
@pytest.mark.parametrize('nsfwsites', [
- ['Pornhub'],
- ['Pornhub', 'Xvideos'],
+ ['Xvideos'],
+ ['Xvideos', 'Erome'],
])
def test_nsfw_explicit_selection(sites_obj, nsfwsites):
for site in nsfwsites: