feat: v1.8.1 — decoy detection + script_id in error logs + disable_padding flag

Three small, ship-able-now changes from the past day's issue triage:

1. Client-side detection of the v1.8.0 bad-auth decoy HTML
   (#404 w0l4i, #310 sina-b4hrm)

   When mhrv-rs receives the decoy HTML body that v1.8.0's Code.gs/
   CodeFull.gs/tunnel-node return on a bad AUTH_KEY, the client now
   string-matches the body's distinctive "The script completed but
   did not return anything" sentinel and emits an explicit ERROR
   line. That line names an AUTH_KEY mismatch as the likely cause
   and walks the user through "redeploy as new version" and the
   DIAGNOSTIC_MODE escape hatch. Previously the only signal was the
   cryptic "WARN batch failed: bad response: no json in batch
   response: <!DOCTYPE...".

   Saves users hours of debugging. Reported pattern hits everyone
   who edits Code.gs's AUTH_KEY without redeploying as a new version
   (Apps Script doesn't auto-pick-up that change).

2. script_id in every batch-failure log (#404 w0l4i)

   Previously WARN batch-failed lines didn't say which deployment
   failed. In multi-deployment setups (5–10 deployments where
   some have stale AUTH_KEY), users couldn't identify the culprit
   without the per-deployment curl probe loop.

   All four failure paths in tunnel_client::fire_batch (timeout,
   bad response, decoy detection, missing-response-in-batch) now
   include the short script_id prefix, for example `batch failed
   (script AKfycbz4): ...`. Combined with item 1 above, this is the
   first reliable diagnostic for the "1 of 8 deployments has bad
   AUTH_KEY" pattern.

3. New disable_padding config flag (#391 EBRAHIM-AM)

   Default false (padding active = stronger DPI defense). For
   users on heavily-throttled ISPs where v1.8.0's ~25% bandwidth
   overhead from random padding compounds with the throttle and
   pushes borderline-working batches into timeouts, setting
   `"disable_padding": true` in config.json recovers headroom at
   the cost of losing length-distribution DPI defense.

   Don't flip on speculatively — only enable if you've measured
   actual throughput improvement on your specific ISP path. For
   users where Apps Script outbound flows freely, padding is free
   defense.
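
A minimal sketch of the trade-off in item 3, assuming a simplified payload builder. `build_payload`, the fixed `pad_len`, and the hand-rolled JSON are hypothetical stand-ins, not the project's `add_random_pad` (whose body is not part of this commit). The sketch only illustrates how the flag gates the `_pad` field and how the size overhead could be measured.

```rust
/// Hypothetical stand-in for the real payload builder. The real client
/// serializes with serde_json and adds a random-length `_pad` field; here
/// the pad length is passed in so the result is deterministic.
/// Assumes `body` needs no JSON escaping.
fn build_payload(body: &str, disable_padding: bool, pad_len: usize) -> String {
    let mut out = format!("{{\"d\":\"{}\"", body);
    if !disable_padding {
        // Padding changes only the length of the request, not its meaning.
        out.push_str(&format!(",\"_pad\":\"{}\"", "x".repeat(pad_len)));
    }
    out.push('}');
    out
}

fn main() {
    let body = "a".repeat(400);
    let padded = build_payload(&body, false, 100);
    let bare = build_payload(&body, true, 100);
    // Relative size cost of padding for this one (made-up) payload.
    let overhead = (padded.len() - bare.len()) as f64 / bare.len() as f64 * 100.0;
    println!(
        "padded={} bytes, bare={} bytes, overhead={:.1}%",
        padded.len(),
        bare.len(),
        overhead
    );
}
```

The printed overhead depends entirely on the body and pad sizes chosen here, so it says nothing about the real ~25% figure. Measuring throughput on the actual ISP path remains the deciding step.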

Tested:
- cargo build --release --bin mhrv-rs: clean
- cargo build --release --bin mhrv-rs-ui --features ui: clean
- cargo test --release --lib: 154 passed
- UI FormState round-trips disable_padding through save/load

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
therealaleph
2026-04-28 13:24:54 +03:00
parent 0d54c5c6fb
commit ce3030f6b3
8 changed files with 86 additions and 10 deletions
Generated
+1 -1
@@ -2222,7 +2222,7 @@ dependencies = [
 [[package]]
 name = "mhrv-rs"
-version = "1.8.0"
+version = "1.8.1"
 dependencies = [
  "base64 0.22.1",
  "bytes",
+1 -1
@@ -1,6 +1,6 @@
 [package]
 name = "mhrv-rs"
-version = "1.8.0"
+version = "1.8.1"
 edition = "2021"
 description = "Rust port of MasterHttpRelayVPN -- DPI bypass via Google Apps Script relay with domain fronting"
 license = "MIT"
+2 -2
@@ -14,8 +14,8 @@ android {
         applicationId = "com.therealaleph.mhrv"
         minSdk = 24 // Android 7.0 — covers 99%+ of live devices.
         targetSdk = 34
-        versionCode = 157
-        versionName = "1.8.0"
+        versionCode = 158
+        versionName = "1.8.1"
         // Ship all four mainstream Android ABIs:
         // - arm64-v8a — 95%+ of real-world Android phones since 2019
+8
@@ -0,0 +1,8 @@
<!-- see docs/changelog/v1.1.0.md for the file format: Persian, then `---`, then English. -->
• تشخیص خطای decoy v1.8.0 در سمت کلاینت — پیغام واضح به‌جای cryptic ([#404](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/issues/404)، [#310](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/issues/310)): قبلاً وقتی deployment auth fail می‌گرفت + decoy HTML برمی‌گردوند، client پیغام `WARN batch failed: bad response: no json in batch response: <!DOCTYPE html>...` می‌داد. کاربر باید خودش متن decoy رو می‌شناخت تا تشخیص بده. حالا client decoy رو با string-match تشخیص می‌ده + پیغام explicit می‌ده: "got the v1.8.0 bad-auth decoy — your AUTH_KEY in mhrv-rs config does NOT match the AUTH_KEY in this deployment's Code.gs. Either fix the mismatch + redeploy as a NEW VERSION, or set DIAGNOSTIC_MODE=true at the top of Code.gs + redeploy to see the explicit JSON `unauthorized` error during setup." — کاربر مستقیم می‌فهمه چی بکنه + ساعت‌ها debug ذخیره می‌شه
• اضافه شدن `script_id` به همه log‌های batch-failure ([#404](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/issues/404)): قبلاً log `WARN batch failed: ...` نام deployment که fail کرد رو نشون نمی‌داد. در multi-deployment scenarios (5-10 deployment که برخی AUTH_KEY اشتباه داره)، کاربر نمی‌تونست بدون سختی deployment معیوب رو identify کنه. حالا همه پیغام‌های failure (timeout، bad response، decoy، missing-response-in-batch) شامل short prefix script_id هستند: `batch failed (script AKfycbz4): ...`. این + flag تشخیص decoy، اولین diagnostic از سرنوشت توزیع کاربری به طور reliable
• Flag config جدید `disable_padding: true` ([#391](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/issues/391)): پیش‌فرض `false` (padding فعال = DPI defense). برای کاربران روی ISP‌های heavily-throttled که هزینه padding ~۲۵٪ bandwidth با throttle compounds + batchهای borderline-working رو into timeout می‌اندازه، گذاشتن `"disable_padding": true` در config.json در ازای محافظت length-distribution DPI، headroom برمی‌گردونه. توصیه نیست speculatively فعال بشه — فقط بعد از measurement throughput improvement.
---
• Client-side decoy detection: clear hint instead of cryptic error ([#404](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/issues/404), [#310](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/issues/310)): previously when a deployment had a stale/wrong AUTH_KEY, the deployment returned the v1.8.0 bad-auth decoy HTML, and the client logged `WARN batch failed: bad response: no json in batch response: <!DOCTYPE html>...`, leaving the user to recognize the decoy body string and infer the cause. Now the client string-matches the decoy and emits an explicit error: "got the v1.8.0 bad-auth decoy — your AUTH_KEY in mhrv-rs config does NOT match the AUTH_KEY in this deployment's Code.gs. Either fix the mismatch + redeploy as a NEW VERSION (Apps Script doesn't auto-pick-up AUTH_KEY edits without an explicit redeploy), or set DIAGNOSTIC_MODE=true at the top of Code.gs + redeploy to see the explicit JSON `unauthorized` error during setup." Saves users hours of staring at "no json in batch response" trying to figure out what's wrong.
• Add `script_id` to every batch-failure log line ([#404](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/issues/404)): previously `WARN batch failed: ...` didn't identify which deployment failed. In multi-deployment setups (5-10 deployments where one or two have a stale AUTH_KEY), users couldn't identify the culprit without the per-deployment curl probe loop. Every failure log line now includes the short script_id prefix: `batch failed (script AKfycbz4): ...`, applied to all four failure paths (timeout, bad response, decoy, missing-response-in-batch). Together with the decoy detection above, the first reliable diagnostic for the multi-deployment-with-one-bad-AUTH_KEY user pattern.
• New `disable_padding: true` config flag ([#391](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/issues/391)): default `false` (padding active, full DPI defense). For users on heavily-throttled ISPs where the v1.8.0 random-padding cost (+~25% bandwidth per batch) compounds with the throttle to push borderline-working batches into timeouts, setting `"disable_padding": true` in `config.json` recovers headroom in exchange for losing length-distribution DPI defense. Don't flip on speculatively — for users where Apps Script outbound is uncongested, padding is free defense. Only enable if you've measured throughput improvement after the flip on your specific ISP path.
+9
@@ -243,6 +243,10 @@ struct FormState {
     /// drop the user's setting. Not currently exposed as a UI control;
     /// users edit `block_quic` directly in `config.json` (Issue #213).
     block_quic: bool,
+    /// Round-tripped from config.json. Not exposed as a UI control —
+    /// users edit `disable_padding` directly when needed (Issue #391).
+    /// Default false (padding active).
+    disable_padding: bool,
 }
 #[derive(Clone, Debug)]
@@ -326,6 +330,7 @@ fn load_form() -> (FormState, Option<String>) {
             youtube_via_relay: c.youtube_via_relay,
             passthrough_hosts: c.passthrough_hosts.clone(),
             block_quic: c.block_quic,
+            disable_padding: c.disable_padding,
         }
     } else {
         FormState {
@@ -354,6 +359,7 @@ fn load_form() -> (FormState, Option<String>) {
             youtube_via_relay: false,
             passthrough_hosts: Vec::new(),
             block_quic: false,
+            disable_padding: false,
         }
     };
     (form, load_err)
@@ -500,6 +506,9 @@ impl FormState {
             // control yet). Round-trip through the file so save
             // doesn't drop a user-set true.
             block_quic: self.block_quic,
+            // Issue #391: disable_padding is config-only for now.
+            // Round-trip preserves the user's choice.
+            disable_padding: self.disable_padding,
         })
     }
 }
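
The round-trip in this file matters because the UI rebuilds the config from `FormState` on save, so a field missing from the form would fall back to its default and silently drop a user-set `true`. A rough sketch with trimmed-down, hypothetical structs follows; `listen_port` and both function bodies are illustrative assumptions, not the project's code.

```rust
/// Hypothetical, trimmed-down config; the real struct has many more fields.
#[derive(Clone, Debug, PartialEq)]
struct Config {
    listen_port: u16,
    disable_padding: bool,
}

/// Hypothetical form state. `disable_padding` has no UI control, but it is
/// still carried here so that saving does not reset it.
#[derive(Clone, Debug)]
struct FormState {
    listen_port: u16,
    disable_padding: bool,
}

/// Load: copy every config field into the form, including config-only ones.
fn load_form(c: &Config) -> FormState {
    FormState {
        listen_port: c.listen_port,
        disable_padding: c.disable_padding,
    }
}

/// Save: rebuild the config from the form.
fn to_config(f: &FormState) -> Config {
    Config {
        listen_port: f.listen_port,
        disable_padding: f.disable_padding,
    }
}

fn main() {
    let original = Config { listen_port: 8080, disable_padding: true };
    let saved = to_config(&load_form(&original));
    // The user's hand-edited flag survives a load/save cycle.
    assert_eq!(saved, original);
    println!("round-trip preserved: {:?}", saved);
}
```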
+15
@@ -190,6 +190,21 @@ pub struct Config {
     /// failure modes later. Issue #213.
     #[serde(default)]
     pub block_quic: bool,
+    /// When true, suppress the random `_pad` field that v1.8.0+ adds
+    /// to outbound Apps Script requests for DPI evasion. Default off
+    /// (padding active). Some users on heavily-throttled ISPs find
+    /// the +25% bandwidth cost from padding compounds with the
+    /// throttle to push borderline-working batches into timeouts;
+    /// turning padding off recovers a bit of headroom at the cost of
+    /// length-distribution defense against DPI fingerprinting. Issue
+    /// #391 (EBRAHIM-AM).
+    ///
+    /// Don't flip this on speculatively — for users where Apps Script
+    /// outbound is uncongested, padding is free DPI defense. Only
+    /// turn off if you've measured throughput improvement after the
+    /// flip on your specific ISP path.
+    #[serde(default)]
+    pub disable_padding: bool,
 }
 fn default_fetch_ips_from_api() -> bool { false }
+14 -3
@@ -131,6 +131,10 @@ pub struct DomainFronter {
     today_calls: AtomicU64,
     today_bytes: AtomicU64,
     today_key: std::sync::Mutex<String>,
+    /// Suppress the random `_pad` field that v1.8.0+ adds to outbound
+    /// payloads. Mirrors `Config::disable_padding` (#391). Default false
+    /// (padding active = stronger DPI defense at +25% bandwidth cost).
+    disable_padding: bool,
 }
 /// Aggregated stats for one remote host.
@@ -289,6 +293,7 @@ impl DomainFronter {
             today_calls: AtomicU64::new(0),
             today_bytes: AtomicU64::new(0),
             today_key: std::sync::Mutex::new(current_pt_day_key()),
+            disable_padding: config.disable_padding,
         })
     }
@@ -1160,7 +1165,9 @@ impl DomainFronter {
         // discards.
         let mut v = serde_json::to_value(&req)?;
         if let Value::Object(map) = &mut v {
-            add_random_pad(map);
+            if !self.disable_padding {
+                add_random_pad(map);
+            }
         }
         Ok(serde_json::to_vec(&v)?)
     }
@@ -1290,7 +1297,9 @@ impl DomainFronter {
         if let Some(d) = data {
             map.insert("d".into(), Value::String(d));
         }
-        add_random_pad(&mut map);
+        if !self.disable_padding {
+            add_random_pad(&mut map);
+        }
         Ok(serde_json::to_vec(&Value::Object(map))?)
     }
@@ -1318,7 +1327,9 @@ impl DomainFronter {
         map.insert("k".into(), Value::String(self.auth_key.clone()));
         map.insert("t".into(), Value::String("batch".into()));
         map.insert("ops".into(), serde_json::to_value(ops)?);
-        add_random_pad(&mut map);
+        if !self.disable_padding {
+            add_random_pad(&mut map);
+        }
         let payload = serde_json::to_vec(&Value::Object(map))?;
         let path = format!("/macros/s/{}/exec", script_id);
+36 -3
@@ -857,11 +857,15 @@ async fn fire_batch(
             })
             .sum();
             f.record_today(response_bytes);
+            let sid_short = &script_id[..script_id.len().min(8)];
             for (idx, reply) in data_replies {
                 if let Some(resp) = batch_resp.r.get(idx) {
                     let _ = reply.send(Ok((resp.clone(), script_id.clone())));
                 } else {
-                    let _ = reply.send(Err("missing response in batch".into()));
+                    let _ = reply.send(Err(format!(
+                        "missing response in batch from script {}",
+                        sid_short
+                    )));
                 }
             }
         }
@@ -876,7 +880,30 @@ async fn fire_batch(
                 f.record_timeout_strike(&script_id);
             }
             let err_msg = format!("{}", e);
-            tracing::warn!("batch failed: {}", err_msg);
+            let sid_short = &script_id[..script_id.len().min(8)];
+            // Detect the v1.8.0 bad-auth decoy HTML body. The relay layer
+            // wraps any non-JSON response in `BadResponse("no json in
+            // batch response: <body prefix>")`. The decoy body string
+            // `"The script completed but did not return anything"` is
+            // distinctive — Apps Script's stock pages never include it,
+            // and our own `Code.gs` only returns it when AUTH_KEY check
+            // fails. Surfacing this as an actionable hint saves users
+            // (and #404 / #310 sina-b4hrm class issues) hours of
+            // staring at "no json in batch response".
+            if err_msg.contains("The script completed but did not return anything") {
+                tracing::error!(
+                    "batch failed (script {}): got the v1.8.0 bad-auth decoy — \
+                    your AUTH_KEY in mhrv-rs config does NOT match the AUTH_KEY \
+                    in this deployment's Code.gs. Either fix the mismatch + \
+                    redeploy as a NEW VERSION (Apps Script doesn't auto-pick-up \
+                    AUTH_KEY edits without an explicit redeploy), or set \
+                    DIAGNOSTIC_MODE=true at the top of Code.gs + redeploy to \
+                    see the explicit JSON `unauthorized` error during setup.",
+                    sid_short
+                );
+            } else {
+                tracing::warn!("batch failed (script {}): {}", sid_short, err_msg);
+            }
             for (_, reply) in data_replies {
                 let _ = reply.send(Err(err_msg.clone()));
             }
@@ -886,7 +913,13 @@ async fn fire_batch(
             // stronger signal than a per-read timeout — count it the same
             // way so a truly-stuck deployment exits round-robin fast.
             f.record_timeout_strike(&script_id);
-            tracing::warn!("batch timed out after {:?} ({} ops)", BATCH_TIMEOUT, n_ops);
+            let sid_short = &script_id[..script_id.len().min(8)];
+            tracing::warn!(
+                "batch timed out after {:?} (script {}, {} ops)",
+                BATCH_TIMEOUT,
+                sid_short,
+                n_ops
+            );
             for (_, reply) in data_replies {
                 let _ = reply.send(Err("batch timed out".into()));
             }