feat: v1.8.1 — decoy detection + script_id in error logs + disable_padding flag

Three small, ship-able-now changes from the past day's issue triage:

1. Client-side detection of the v1.8.0 bad-auth decoy HTML (#404 w0l4i, #310 sina-b4hrm)

   When mhrv-rs gets back the decoy HTML body that v1.8.0's Code.gs/CodeFull.gs/tunnel-node return on bad AUTH_KEY, the client now string-matches the body's distinctive "The script completed but did not return anything" sentinel and emits an explicit ERROR line naming AUTH_KEY mismatch as the likely cause, walking the user through "redeploy as new version" and the DIAGNOSTIC_MODE escape hatch, instead of the previous cryptic "WARN batch failed: bad response: no json in batch response: <!DOCTYPE...". Saves users hours of debugging. The reported pattern hits everyone who edits Code.gs's AUTH_KEY without redeploying as a new version (Apps Script doesn't auto-pick-up that change).

2. script_id in every batch-failure log (#404 w0l4i)

   Previously WARN batch-failed lines didn't say which deployment failed. In multi-deployment setups (5–10 deployments where some have stale AUTH_KEY), users couldn't identify the culprit without the per-deployment curl probe loop. All four failure paths in tunnel_client::fire_batch (timeout, bad response, decoy detection, missing-response-in-batch) now include the script_id short prefix: `batch failed (script AKfycbz4): ...`. Combined with #1 above, this is the first reliable diagnostic for the "1 of 8 deployments has bad AUTH_KEY" pattern.

3. New disable_padding config flag (#391 EBRAHIM-AM)

   Default false (padding active = stronger DPI defense). For users on heavily-throttled ISPs where v1.8.0's ~25% bandwidth overhead from random padding compounds with the throttle and pushes borderline-working batches into timeouts, setting `"disable_padding": true` in config.json recovers headroom at the cost of losing length-distribution DPI defense. Don't flip on speculatively: only enable if you've measured actual throughput improvement on your specific ISP path. For users where Apps Script outbound flows freely, padding is free defense.

Tested:
- cargo build --release --bin mhrv-rs: clean
- cargo build --release --bin mhrv-rs-ui --features ui: clean
- cargo test --release --lib: 154 passed
- UI FormState round-trips disable_padding through save/load

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
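The first two changes in the message above can be illustrated with a small standalone sketch. It is not the code from this commit: the `BatchFailure` enum and the `short_script_id`, `classify`, and `describe_failure` helpers are hypothetical names introduced here, and the real client logs through `tracing` rather than returning strings. Only the sentinel text and the 8-character prefix rule are taken from the diff.

```rust
/// Sentinel text of the v1.8.0 bad-auth decoy body, as quoted in this commit.
const DECOY_SENTINEL: &str = "The script completed but did not return anything";

/// Hypothetical classification used only in this sketch.
#[derive(Debug, PartialEq)]
enum BatchFailure {
    /// The error text carries the decoy sentinel, so an AUTH_KEY mismatch is likely.
    LikelyAuthKeyMismatch,
    /// Any other failure (timeout, malformed response, and so on).
    Other,
}

/// Return at most the first 8 bytes of the script id.
/// Falls back to the full id if byte 8 is not a UTF-8 character boundary.
fn short_script_id(script_id: &str) -> &str {
    let end = script_id.len().min(8);
    script_id.get(..end).unwrap_or(script_id)
}

/// Classify a failure by substring match on the error text.
fn classify(err_msg: &str) -> BatchFailure {
    if err_msg.contains(DECOY_SENTINEL) {
        BatchFailure::LikelyAuthKeyMismatch
    } else {
        BatchFailure::Other
    }
}

/// Build a log-style line that always names the failing deployment.
fn describe_failure(script_id: &str, err_msg: &str) -> String {
    let sid = short_script_id(script_id);
    match classify(err_msg) {
        BatchFailure::LikelyAuthKeyMismatch => format!(
            "ERROR batch failed (script {}): bad-auth decoy received, check AUTH_KEY",
            sid
        ),
        BatchFailure::Other => format!("WARN batch failed (script {}): {}", sid, err_msg),
    }
}

fn main() {
    // The script id below is a made-up placeholder, not a real deployment.
    let decoy = "bad response: no json in batch response: <!DOCTYPE html> \
                 The script completed but did not return anything";
    println!("{}", describe_failure("AKfycbz4EXAMPLEONLY", decoy));
    println!("{}", describe_failure("AKfycbz4EXAMPLEONLY", "connection reset"));
}
```

Matching on a substring of the error text keeps the check independent of how the relay layer wraps the response body, at the cost of depending on the exact decoy wording.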
2026-05-17 21:24:48 +03:00 · 2026-04-28 13:24:54 +03:00
parent 0d54c5c6fb
commit ce3030f6b3
8 changed files with 86 additions and 10 deletions
@@ -2222,7 +2222,7 @@ dependencies = [
 [[package]]
 name = "mhrv-rs"
-version = "1.8.0"
+version = "1.8.1"
 dependencies = [
 "base64 0.22.1",
 "bytes",
@@ -1,6 +1,6 @@
 [package]
 name = "mhrv-rs"
-version = "1.8.0"
+version = "1.8.1"
 edition = "2021"
 description = "Rust port of MasterHttpRelayVPN -- DPI bypass via Google Apps Script relay with domain fronting"
 license = "MIT"
@@ -14,8 +14,8 @@ android {
        applicationId = "com.therealaleph.mhrv"
        minSdk = 24 // Android 7.0 — covers 99%+ of live devices.
        targetSdk = 34
-        versionCode = 157
+        versionCode = 158
-        versionName = "1.8.0"
+        versionName = "1.8.1"
        // Ship all four mainstream Android ABIs:
        //   - arm64-v8a      — 95%+ of real-world Android phones since 2019
@@ -0,0 +1,8 @@
 <!-- see docs/changelog/v1.1.0.md for the file format: Persian, then `---`, then English. -->
 • Client-side detection of the v1.8.0 decoy error, with a clear message instead of a cryptic one ([#404](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/issues/404), [#310](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/issues/310)): previously, when a deployment failed auth and returned the decoy HTML, the client logged `WARN batch failed: bad response: no json in batch response: <!DOCTYPE html>...`. The user had to recognize the decoy text on their own to work out the cause. Now the client detects the decoy by string match and emits an explicit message: "got the v1.8.0 bad-auth decoy — your AUTH_KEY in mhrv-rs config does NOT match the AUTH_KEY in this deployment's Code.gs. Either fix the mismatch + redeploy as a NEW VERSION, or set DIAGNOSTIC_MODE=true at the top of Code.gs + redeploy to see the explicit JSON `unauthorized` error during setup." The user learns directly what to do, and hours of debugging are saved.
 • `script_id` added to all batch-failure logs ([#404](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/issues/404)): previously the `WARN batch failed: ...` log did not name the deployment that failed. In multi-deployment scenarios (5-10 deployments where some have a wrong AUTH_KEY), the user could not easily identify the faulty deployment. Now all failure messages (timeout, bad response, decoy, missing-response-in-batch) include the short script_id prefix: `batch failed (script AKfycbz4): ...`. Together with decoy detection, this is the first reliable diagnostic for this multi-deployment pattern.
 • New config flag `disable_padding: true` ([#391](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/issues/391)): default `false` (padding active = DPI defense). For users on heavily-throttled ISPs where the ~25% bandwidth cost of padding compounds with the throttle and pushes borderline-working batches into timeouts, setting `"disable_padding": true` in config.json recovers headroom in exchange for losing length-distribution DPI protection. Enabling it speculatively is not recommended; do so only after measuring a throughput improvement.
 ---
 • Client-side decoy detection: clear hint instead of cryptic error ([#404](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/issues/404), [#310](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/issues/310)): previously, when a deployment had a stale/wrong AUTH_KEY, it returned the v1.8.0 bad-auth decoy HTML to mhrv-rs, and the client logged `WARN batch failed: bad response: no json in batch response: <!DOCTYPE html>...`, leaving the user to recognize the decoy body string and infer the cause. Now the client string-matches the decoy and emits an explicit error: "got the v1.8.0 bad-auth decoy — your AUTH_KEY in mhrv-rs config does NOT match the AUTH_KEY in this deployment's Code.gs. Either fix the mismatch + redeploy as a NEW VERSION (Apps Script doesn't auto-pick-up AUTH_KEY edits without an explicit redeploy), or set DIAGNOSTIC_MODE=true at the top of Code.gs + redeploy to see the explicit JSON `unauthorized` error during setup." Saves users hours of staring at "no json in batch response" trying to figure out what's wrong.
 • Add `script_id` to every batch-failure log line ([#404](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/issues/404)): previously `WARN batch failed: ...` didn't identify which deployment failed. In multi-deployment setups (5-10 deployments where one or two have a stale AUTH_KEY), users couldn't identify the culprit without the per-deployment curl probe loop. Every failure log line now includes the short script_id prefix: `batch failed (script AKfycbz4): ...`, applied to all four failure paths (timeout, bad response, decoy, missing-response-in-batch). Together with the decoy detection above, this is the first reliable diagnostic for the multi-deployment-with-one-bad-AUTH_KEY user pattern.
 • New `disable_padding: true` config flag ([#391](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/issues/391)): default `false` (padding active, full DPI defense). For users on heavily-throttled ISPs where the v1.8.0 random-padding cost (+~25% bandwidth per batch) compounds with the throttle to push borderline-working batches into timeouts, setting `"disable_padding": true` in `config.json` recovers headroom in exchange for losing length-distribution DPI defense. Don't flip on speculatively — for users where Apps Script outbound is uncongested, padding is free defense. Only enable if you've measured throughput improvement after the flip on your specific ISP path.
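The trade-off behind `disable_padding` can be sketched in a few lines. This is a simplified stand-in, not the project's payload builder: `build_batch_fields` and `approx_size` are hypothetical helpers, a plain `BTreeMap` replaces the JSON object, and the pad is passed in by the caller instead of being generated randomly. The `_pad`, `k`, and `t` field names follow the diff.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the payload builder: collects the batch fields
/// and adds `_pad` only when padding has not been disabled.
fn build_batch_fields(
    auth_key: &str,
    disable_padding: bool,
    pad: &str,
) -> BTreeMap<String, String> {
    let mut map = BTreeMap::new();
    map.insert("k".to_string(), auth_key.to_string());
    map.insert("t".to_string(), "batch".to_string());
    if !disable_padding {
        // Padding varies the payload length; skipping it saves bandwidth
        // but makes lengths easier to fingerprint.
        map.insert("_pad".to_string(), pad.to_string());
    }
    map
}

/// Rough payload size: the sum of key and value lengths, ignoring JSON syntax.
fn approx_size(fields: &BTreeMap<String, String>) -> usize {
    fields.iter().map(|(k, v)| k.len() + v.len()).sum()
}

fn main() {
    // A fixed 64-byte pad stands in for the random pad used by the client.
    let pad = "x".repeat(64);
    let padded = build_batch_fields("secret", false, &pad);
    let bare = build_batch_fields("secret", true, &pad);
    println!(
        "padded: {} bytes, unpadded: {} bytes",
        approx_size(&padded),
        approx_size(&bare)
    );
}
```

Comparing the two sizes on real traffic, rather than in a toy like this, is the kind of measurement the changelog asks for before turning the flag on.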
@@ -243,6 +243,10 @@ struct FormState {
    /// drop the user's setting. Not currently exposed as a UI control;
    /// users edit `block_quic` directly in `config.json` (Issue #213).
    block_quic: bool,
    /// Round-tripped from config.json. Not exposed as a UI control —
    /// users edit `disable_padding` directly when needed (Issue #391).
    /// Default false (padding active).
    disable_padding: bool,
 }
 #[derive(Clone, Debug)]
@@ -326,6 +330,7 @@ fn load_form() -> (FormState, Option<String>) {
            youtube_via_relay: c.youtube_via_relay,
            passthrough_hosts: c.passthrough_hosts.clone(),
            block_quic: c.block_quic,
            disable_padding: c.disable_padding,
        }
    } else {
        FormState {
@@ -354,6 +359,7 @@ fn load_form() -> (FormState, Option<String>) {
            youtube_via_relay: false,
            passthrough_hosts: Vec::new(),
            block_quic: false,
            disable_padding: false,
        }
    };
    (form, load_err)
@@ -500,6 +506,9 @@ impl FormState {
            // control yet). Round-trip through the file so save
            // doesn't drop a user-set true.
            block_quic: self.block_quic,
            // Issue #391: disable_padding is config-only for now.
            // Round-trip preserves the user's choice.
            disable_padding: self.disable_padding,
        })
    }
 }
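Why the UI carries a field it never displays can be shown with a reduced model of the load/save path. The two structs below keep only two fields each, and `to_config` is a hypothetical method name chosen for this sketch; the real `FormState` and `Config` have many more fields and the real save path is not shown in this diff.

```rust
/// Reduced stand-in for the on-disk config (only two of its fields).
#[derive(Clone, Debug, PartialEq)]
struct Config {
    block_quic: bool,
    disable_padding: bool,
}

/// Reduced stand-in for the UI form state.
#[derive(Clone, Debug)]
struct FormState {
    block_quic: bool,
    disable_padding: bool,
}

/// Load the form from an existing config, or fall back to defaults.
fn load_form(existing: Option<&Config>) -> FormState {
    match existing {
        Some(c) => FormState {
            block_quic: c.block_quic,
            disable_padding: c.disable_padding,
        },
        None => FormState {
            block_quic: false,
            disable_padding: false,
        },
    }
}

impl FormState {
    /// Hypothetical save step: copy every field back, including the ones
    /// without a UI control, so a hand-edited `true` survives a save.
    fn to_config(&self) -> Config {
        Config {
            block_quic: self.block_quic,
            disable_padding: self.disable_padding,
        }
    }
}

fn main() {
    let on_disk = Config { block_quic: false, disable_padding: true };
    let saved = load_form(Some(&on_disk)).to_config();
    println!("round trip preserved the flag: {}", saved == on_disk);
}
```

If the form dropped the field, every save from the UI would silently reset `disable_padding` to its default.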
@@ -190,6 +190,21 @@ pub struct Config {
    /// failure modes later. Issue #213.
    #[serde(default)]
    pub block_quic: bool,
    /// When true, suppress the random `_pad` field that v1.8.0+ adds
    /// to outbound Apps Script requests for DPI evasion. Default off
    /// (padding active). Some users on heavily-throttled ISPs find
    /// the +25% bandwidth cost from padding compounds with the
    /// throttle to push borderline-working batches into timeouts;
    /// turning padding off recovers a bit of headroom at the cost of
    /// length-distribution defense against DPI fingerprinting. Issue
    /// #391 (EBRAHIM-AM).
    ///
    /// Don't flip this on speculatively — for users where Apps Script
    /// outbound is uncongested, padding is free DPI defense. Only
    /// turn off if you've measured throughput improvement after the
    /// flip on your specific ISP path.
    #[serde(default)]
    pub disable_padding: bool,
 }
 fn default_fetch_ips_from_api() -> bool { false }
@@ -131,6 +131,10 @@ pub struct DomainFronter {
    today_calls: AtomicU64,
    today_bytes: AtomicU64,
    today_key: std::sync::Mutex<String>,
    /// Suppress the random `_pad` field that v1.8.0+ adds to outbound
    /// payloads. Mirrors `Config::disable_padding` (#391). Default false
    /// (padding active = stronger DPI defense at +25% bandwidth cost).
    disable_padding: bool,
 }
 /// Aggregated stats for one remote host.
@@ -289,6 +293,7 @@ impl DomainFronter {
            today_calls: AtomicU64::new(0),
            today_bytes: AtomicU64::new(0),
            today_key: std::sync::Mutex::new(current_pt_day_key()),
            disable_padding: config.disable_padding,
        })
    }
@@ -1160,7 +1165,9 @@ impl DomainFronter {
        // discards.
        let mut v = serde_json::to_value(&req)?;
        if let Value::Object(map) = &mut v {
-            add_random_pad(map);
+            if !self.disable_padding {
+                add_random_pad(map);
+            }
        }
        Ok(serde_json::to_vec(&v)?)
    }
@@ -1290,7 +1297,9 @@ impl DomainFronter {
        if let Some(d) = data {
            map.insert("d".into(), Value::String(d));
        }
-        add_random_pad(&mut map);
+        if !self.disable_padding {
+            add_random_pad(&mut map);
+        }
        Ok(serde_json::to_vec(&Value::Object(map))?)
    }
@@ -1318,7 +1327,9 @@ impl DomainFronter {
        map.insert("k".into(), Value::String(self.auth_key.clone()));
        map.insert("t".into(), Value::String("batch".into()));
        map.insert("ops".into(), serde_json::to_value(ops)?);
-        add_random_pad(&mut map);
+        if !self.disable_padding {
+            add_random_pad(&mut map);
+        }
        let payload = serde_json::to_vec(&Value::Object(map))?;
        let path = format!("/macros/s/{}/exec", script_id);
@@ -857,11 +857,15 @@ async fn fire_batch(
                    })
                    .sum();
                f.record_today(response_bytes);
+                let sid_short = &script_id[..script_id.len().min(8)];
                for (idx, reply) in data_replies {
                    if let Some(resp) = batch_resp.r.get(idx) {
                        let _ = reply.send(Ok((resp.clone(), script_id.clone())));
                    } else {
-                        let _ = reply.send(Err("missing response in batch".into()));
+                        let _ = reply.send(Err(format!(
+                            "missing response in batch from script {}",
+                            sid_short
+                        )));
                    }
                }
            }
@@ -876,7 +880,30 @@ async fn fire_batch(
                    f.record_timeout_strike(&script_id);
                }
                let err_msg = format!("{}", e);
-                tracing::warn!("batch failed: {}", err_msg);
+                let sid_short = &script_id[..script_id.len().min(8)];
+                // Detect the v1.8.0 bad-auth decoy HTML body. The relay layer
+                // wraps any non-JSON response in `BadResponse("no json in
+                // batch response: <body prefix>")`. The decoy body string
+                // `"The script completed but did not return anything"` is
+                // distinctive — Apps Script's stock pages never include it,
+                // and our own `Code.gs` only returns it when AUTH_KEY check
+                // fails. Surfacing this as an actionable hint saves users
+                // (and #404 / #310 sina-b4hrm class issues) hours of
+                // staring at "no json in batch response".
+                if err_msg.contains("The script completed but did not return anything") {
+                    tracing::error!(
+                        "batch failed (script {}): got the v1.8.0 bad-auth decoy — \
+                         your AUTH_KEY in mhrv-rs config does NOT match the AUTH_KEY \
+                         in this deployment's Code.gs. Either fix the mismatch + \
+                         redeploy as a NEW VERSION (Apps Script doesn't auto-pick-up \
+                         AUTH_KEY edits without an explicit redeploy), or set \
+                         DIAGNOSTIC_MODE=true at the top of Code.gs + redeploy to \
+                         see the explicit JSON `unauthorized` error during setup.",
+                        sid_short
+                    );
+                } else {
+                    tracing::warn!("batch failed (script {}): {}", sid_short, err_msg);
+                }
                for (_, reply) in data_replies {
                    let _ = reply.send(Err(err_msg.clone()));
                }
@@ -886,7 +913,13 @@ async fn fire_batch(
                // stronger signal than a per-read timeout — count it the same
                // way so a truly-stuck deployment exits round-robin fast.
                f.record_timeout_strike(&script_id);
-                tracing::warn!("batch timed out after {:?} ({} ops)", BATCH_TIMEOUT, n_ops);
+                let sid_short = &script_id[..script_id.len().min(8)];
+                tracing::warn!(
+                    "batch timed out after {:?} (script {}, {} ops)",
+                    BATCH_TIMEOUT,
+                    sid_short,
+                    n_ops
+                );
                for (_, reply) in data_replies {
                    let _ = reply.send(Err("batch timed out".into()));
                }