mirror of
https://github.com/therealaleph/MasterHttpRelayVPN-RUST.git
synced 2026-05-18 06:34:41 +03:00
fix: v1.9.8 — Android disconnect crash + UI test-button gate for non-apps_script modes
Android (#666 from @ilok67 with full root cause):
- MainActivity.onStop was sending ACTION_STOP via startService() AND immediately calling stopService() on the same service. ACTION_STOP runs teardown() on a background thread that stopSelf()s at the end; the redundant stopService() triggered onDestroy() in parallel, racing the lifecycle and crashing on every Disconnect tap. Removed the stopService() — ACTION_STOP alone is sufficient for both the live-service and the zombie-after-process-death cases. The tornDown AtomicBoolean already guards against double-teardown of native state but couldn't protect against OS-level stopSelf vs stopService race.

UI (#665 from @cmptrnb):
- Test Relay button was showing red "test result: fail" status when used in full or direct mode. The underlying test_cmd::run deliberately refuses in those modes because probing Apps Script directly while the data plane goes via tunnel-node would give a misleading result, but the refuse path was getting translated to generic "test failed". UI now checks mode before running and shows a mode-specific explainer for full/direct (point users at https://whatismyipaddress.com in the browser via the proxy as the right way to verify).

Includes already-merged PR #674 from @yyoyoian-pixel: drop client coalesce_step + tunnel-node straggler settle_step from 40 ms → 10 ms, raise tunnel-node settle max from 500 ms → 1000 ms. Asymmetric tuning: fast-fire when nothing else is queued, but adaptive coalesce on bursts. Backwards compatible — existing configs with explicit `coalesce_step_ms: 40` keep old behavior.

Tests: 179 lib + 33 tunnel-node green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
+1
-1
@@ -2222,7 +2222,7 @@ dependencies = [
 
 [[package]]
 name = "mhrv-rs"
-version = "1.9.7"
+version = "1.9.8"
 dependencies = [
  "base64 0.22.1",
  "bytes",
+1
-1
@@ -1,6 +1,6 @@
 [package]
 name = "mhrv-rs"
-version = "1.9.7"
+version = "1.9.8"
 edition = "2021"
 description = "Rust port of MasterHttpRelayVPN -- DPI bypass via Google Apps Script relay with domain fronting"
 license = "MIT"
@@ -173,30 +173,36 @@ class MainActivity : AppCompatActivity() {
     }
 },
 onStop = {
-    // Three-step teardown. Each step is defensive against a
-    // different failure mode we've actually hit in testing:
+    // Single-step graceful teardown. ACTION_STOP delivered via
+    // startService() reaches MhrvVpnService.onStartCommand,
+    // which spawns the `mhrv-teardown` background thread that
+    // tears down tun2proxy + the Rust runtime and then calls
+    // stopSelf() at the end of teardown. Service stops on its
+    // own — we don't need (and must not) follow up with
+    // stopService().
     //
-    // 1. ACTION_STOP — graceful path. The service receives it,
-    // runs its teardown (stops tun2proxy, closes the TUN
-    // fd, shuts down the Rust runtime) and stopSelf()'s.
-    // This is what we want 99% of the time.
+    // History (#666 from @ilok67): we used to call stopService()
+    // immediately after startService(stopAction), as belt-and-
+    // suspenders against a "force-closed then reopened zombie"
+    // case. That second call was firing onDestroy() while the
+    // mhrv-teardown thread was still running, racing two threads
+    // through the lifecycle and crashing on tap-to-disconnect.
+    // The teardown thread's idempotency guard (tornDown
+    // AtomicBoolean) protects against double-teardown of native
+    // state, but it can't protect against OS-level lifecycle
+    // races on stopSelf vs stopService. ACTION_STOP alone is
+    // enough for both the live-service and zombie cases —
+    // startService creates a fresh service in the new process
+    // for zombies, runs teardown (no-op on already-clean state)
+    // and stops it.
     //
-    // 2. stopService() — covers the "force-closed then
-    // reopened" zombie case. Android may auto-restart our
-    // START_STICKY service in a fresh process after the
-    // user swipes us away from Recents, and the user's
-    // next Stop tap needs to actually unbind even if our
-    // in-memory TUN fd reference is gone. stopService is
-    // idempotent so it's safe to follow the graceful path.
-    //
-    // 3. We do NOT touch the VpnService permission — that's
-    // the OS-wide VPN grant and the user approved it
-    // deliberately. Revoking it would force a re-prompt
-    // on next Start, which is worse UX.
+    // We do NOT touch the VpnService permission — that's the
+    // OS-wide VPN grant and the user approved it deliberately.
+    // Revoking it would force a re-prompt on next Start, which
+    // is worse UX.
     val stopAction = Intent(this, MhrvVpnService::class.java)
         .setAction(MhrvVpnService.ACTION_STOP)
     startService(stopAction)
-    stopService(Intent(this, MhrvVpnService::class.java))
 },
 onInstallCaConfirmed = {
     // The flow is (1) export cert, (2) copy it to Downloads so
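The `tornDown` idempotency guard described in the comment above can be sketched in Rust (the project's main language) as a compare-and-swap on an atomic flag. This is only a minimal sketch under stated assumptions, not the service's actual Kotlin code: `Tunnel`, `teardown_runs`, and `race_two_teardowns` are hypothetical names introduced for illustration.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for the service's native state.
struct Tunnel {
    torn_down: AtomicBool,
    teardown_runs: AtomicUsize,
}

impl Tunnel {
    fn new() -> Self {
        Tunnel {
            torn_down: AtomicBool::new(false),
            teardown_runs: AtomicUsize::new(0),
        }
    }

    /// Returns true only for the single caller that performed the teardown.
    fn teardown(&self) -> bool {
        // compare_exchange flips false -> true exactly once; every later
        // (or concurrent) caller sees Err and returns without doing work.
        if self
            .torn_down
            .compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
            .is_err()
        {
            return false;
        }
        // Real code would stop the tunnel and close file descriptors here.
        self.teardown_runs.fetch_add(1, Ordering::Relaxed);
        true
    }
}

/// Races two threads into teardown() and reports how many times it ran.
fn race_two_teardowns() -> usize {
    let tunnel = Arc::new(Tunnel::new());
    let handles: Vec<_> = (0..2)
        .map(|_| {
            let t = Arc::clone(&tunnel);
            thread::spawn(move || t.teardown())
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    tunnel.teardown_runs.load(Ordering::Relaxed)
}
```

Whichever thread wins, the teardown body runs once. The guard only covers code inside `teardown()`, though; a lifecycle callback fired by the OS in parallel is outside its reach, which is consistent with the commit's choice to remove the second stop call rather than add more guarding.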
@@ -0,0 +1,14 @@
+<!-- see docs/changelog/v1.1.0.md for the file format: Persian, then `---`, then English. -->
+• Fix v1.9.7 Android: crash on tapping Disconnect ([#666](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/issues/666) from @ilok67 with full root cause + fix): `MainActivity.onStop` was also calling `stopService()` immediately after `startService(ACTION_STOP)`. ACTION_STOP inside `MhrvVpnService` creates a background thread named `mhrv-teardown` that runs `teardown()` (closing tun2proxy, the TUN fd, and the runtime) and calls `stopSelf()` at the end. But `stopService()` immediately triggered `onDestroy()` on the same service: two threads go through the lifecycle at the same time, and the OS kills the service process before teardown finishes. Crash after tapping Disconnect, reproducible in about 99% of tests. `stopService()` is now removed; `ACTION_STOP` alone is sufficient (for both the live service and the zombie case). The `tornDown` AtomicBoolean idempotency guard already existed but did not protect against the OS-level lifecycle race. Thanks to @ilok67 for the excellent triage.
+• Fix v1.9.7 UI: the Test Relay button showed a red "test result: fail" in `full` (and `direct`) mode ([#665](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/issues/665) from @cmptrnb). `mhrv-rs test` is wired only for apps_script mode; in `full` mode it deliberately refuses, because probing Apps Script directly while the data plane goes through tunnel-node is misleading. But the UI translated the refuse message as a test failure, and users thought the proxy was broken. Now the UI checks the mode before running the test and shows an explainer message for unsupported modes instead of a red fail:
+> Test Relay is wired only for apps_script mode. In full mode the data plane is the tunnel-node — to verify it end-to-end, start the proxy and load https://whatismyipaddress.com in your browser via 127.0.0.1:8085. The IP shown should be your tunnel-node's VPS IP.
+
+- Tune adaptive batch coalesce (PR [#674](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/pull/674) from @yyoyoian-pixel): from 40 ms → **10 ms** for the client coalesce step and the tunnel-node straggler settle step; tunnel-node settle max from 500 ms → **1000 ms**. The logic is asymmetric: when no other op is pending, fast-fire (10 ms is enough to catch ops that arrive in the same event-loop tick, such as 6 parallel browser connections); but when both sides have data (uploads, bursty page loads), the adaptive reset still batches up to the 1 s cap. In short: "don't wait when there is nothing to wait for; batch at full strength when there is." Backwards compatible: users with `coalesce_step_ms: 40` in config.json keep the old behavior.
+• Tests: 179 lib + 33 tunnel-node tests all pass.
+---
+• Fix Android crash on tap-Disconnect from v1.9.7 ([#666](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/issues/666) by @ilok67 with full root cause + fix): `MainActivity.onStop` was calling `stopService()` immediately after `startService(ACTION_STOP)`. ACTION_STOP inside `MhrvVpnService` spawns the `mhrv-teardown` background thread that runs `teardown()` (stops tun2proxy, closes TUN fd, shuts down the Rust runtime) and then calls `stopSelf()` at the end. But `stopService()` immediately triggered `onDestroy()` on the same service — two threads racing through the lifecycle, and the OS would kill the process before teardown finished. Crash on every Disconnect tap, ~99% reproducible. Removed the `stopService()` call — `ACTION_STOP` alone is sufficient for both the live-service and the zombie-after-process-death cases. The existing `tornDown` AtomicBoolean idempotency guard protects against double-teardown of native state, but it can't protect against OS-level lifecycle races on stopSelf vs stopService. Thanks @ilok67 for the precise triage.
+• Fix UI showing "test result: fail" red status for `full` (and `direct`) modes from v1.9.7 ([#665](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/issues/665) by @cmptrnb). `mhrv-rs test` is wired only for the apps_script relay path — it deliberately refuses in `full` mode because probing Apps Script directly while the actual data plane goes via tunnel-node would give a misleading green result. But the refuse path was getting translated by the UI as a generic "test failed" with red status, scaring users into thinking their proxy was broken. Now the UI checks mode before running the test and shows a friendly explainer for `full`/`direct`:
+> Test Relay is wired only for apps_script mode. In full mode the data plane is the tunnel-node — to verify it end-to-end, start the proxy and load https://whatismyipaddress.com in your browser via 127.0.0.1:8085. The IP shown should be your tunnel-node's VPS IP.
+
+• Tune adaptive batch coalesce (PR [#674](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/pull/674) from @yyoyoian-pixel): client coalesce step + tunnel-node straggler settle step from 40 ms → **10 ms**, tunnel-node settle max from 500 ms → **1000 ms**. The asymmetric design — small step, generous max — picks up "fire-and-forget when nothing else is queued" without giving up batching on bursts. The 10 ms still catches ops that arrive in the same event-loop tick (e.g. a browser opening 6 parallel connections on page load), so we don't degenerate into single-op batches; but on a download where the client is just waiting for the next chunk, the per-batch dead-air shrinks by ~30 ms. Backwards-compatible: existing configs with explicit `coalesce_step_ms: 40` keep the old behaviour.
+• Tests: 179 lib + 33 tunnel-node tests all passing.
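The "small step, generous max" tuning in the changelog entry can be illustrated with a short Rust sketch. It is a guess at the general shape of such a coalescer, not the project's code: only `coalesce_step_ms` is named in the changelog, while `CoalesceConfig`, `settle_max_ms`, and `collect_batch` are hypothetical, and a standard-library channel stands in for whatever queue the real client uses.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::{Duration, Instant};

/// Hypothetical config shape. An explicit value overrides the new default,
/// which is how an existing `coalesce_step_ms: 40` would keep old behaviour.
struct CoalesceConfig {
    coalesce_step_ms: Option<u64>,
    settle_max_ms: Option<u64>, // assumed field name
}

impl CoalesceConfig {
    fn step(&self) -> Duration {
        Duration::from_millis(self.coalesce_step_ms.unwrap_or(10))
    }
    fn max(&self) -> Duration {
        Duration::from_millis(self.settle_max_ms.unwrap_or(1000))
    }
}

/// Blocks for the first op, then keeps collecting while further ops arrive
/// within `step` of each other, never exceeding `max` in total.
fn collect_batch<T>(rx: &Receiver<T>, cfg: &CoalesceConfig) -> Vec<T> {
    let mut batch = Vec::new();
    match rx.recv() {
        Ok(op) => batch.push(op),
        Err(_) => return batch, // all senders dropped, nothing to batch
    }
    let deadline = Instant::now() + cfg.max();
    loop {
        let remaining = deadline.saturating_duration_since(Instant::now());
        if remaining.is_zero() {
            break; // hit the overall cap
        }
        // Each arrival restarts the short step window; the deadline caps it.
        match rx.recv_timeout(cfg.step().min(remaining)) {
            Ok(op) => batch.push(op),
            Err(RecvTimeoutError::Timeout) => break, // quiet: fire now
            Err(RecvTimeoutError::Disconnected) => break,
        }
    }
    batch
}
```

With nothing else queued, the batch fires after a single 10 ms wait; during a burst, every new arrival extends the window until the cap is reached. Modelling the settings as `Option` fields is one simple way to keep explicit user values while changing the defaults.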
@@ -2171,6 +2171,41 @@ fn background_thread(shared: Arc<Shared>, rx: Receiver<Cmd>) {
 
 Ok(Cmd::Test(cfg)) => {
     let shared2 = shared.clone();
+    // Short-circuit modes where `test_cmd::run` deliberately
+    // refuses (full mode, direct mode). Those return false
+    // even when the proxy is healthy, which surfaced as
+    // "Test failed" + alarming red status — see #665. Show
+    // a friendly notice instead and skip the test path.
+    let mode_kind = cfg.mode_kind().ok();
+    let mode_explainer = match mode_kind {
+        Some(mhrv_rs::config::Mode::Full) => Some(
+            "Test Relay is wired only for apps_script mode. \
+             In full mode the data plane is the tunnel-node — \
+             to verify it end-to-end, start the proxy and load \
+             https://whatismyipaddress.com in your browser \
+             via 127.0.0.1:8085. The IP shown should be your \
+             tunnel-node's VPS IP. Tracking a real Full-mode \
+             test in #160."
+        ),
+        Some(mhrv_rs::config::Mode::Direct) => Some(
+            "Test Relay is wired only for apps_script mode. \
+             In direct mode there is no Apps Script relay — \
+             every request goes through the SNI-rewrite tunnel \
+             straight to Google's edge. Verify by loading \
+             https://www.google.com via the proxy."
+        ),
+        _ => None,
+    };
+    if let Some(msg) = mode_explainer {
+        {
+            let mut st = shared.state.lock().unwrap();
+            st.last_test_ok = None;
+            st.last_test_msg = msg.into();
+            st.last_test_msg_at = Some(Instant::now());
+        }
+        push_log(&shared, &format!("[ui] test skipped: {}", msg));
+        continue;
+    }
     push_log(&shared, "[ui] running test...");
     rt.spawn(async move {
         let ok = test_cmd::run(&cfg).await;
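The gate in this hunk relies on a three-state result: `Some(true)` and `Some(false)` for a test that actually ran, and `None` for a test that was skipped. The following standalone sketch isolates that idea so it can be exercised without the UI. `Mode`, `TestStatus`, `mode_explainer`, and `run_test_gated` here are simplified, hypothetical stand-ins, and the messages are shortened; the real types live in `mhrv_rs::config` and the UI's shared state.

```rust
/// Hypothetical mirror of the three modes named in the commit.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Mode {
    AppsScript,
    Full,
    Direct,
}

/// Simplified stand-in for the UI's test fields.
#[derive(Debug)]
struct TestStatus {
    /// Some(true) = passed, Some(false) = failed, None = not run.
    last_test_ok: Option<bool>,
    last_test_msg: String,
}

/// Returns an explainer for modes where the relay test does not apply.
fn mode_explainer(mode: Mode) -> Option<&'static str> {
    match mode {
        Mode::Full => Some("Test Relay is wired only for apps_script mode; verify full mode through the tunnel-node."),
        Mode::Direct => Some("Test Relay is wired only for apps_script mode; direct mode has no Apps Script relay."),
        Mode::AppsScript => None,
    }
}

/// Runs `probe` only when the mode supports it; otherwise records a skip.
fn run_test_gated(mode: Mode, probe: impl FnOnce() -> bool) -> TestStatus {
    if let Some(msg) = mode_explainer(mode) {
        // Skipped: leave the result as None so the UI shows no red status.
        return TestStatus {
            last_test_ok: None,
            last_test_msg: msg.to_string(),
        };
    }
    let ok = probe();
    TestStatus {
        last_test_ok: Some(ok),
        last_test_msg: if ok { "test passed" } else { "test failed" }.to_string(),
    }
}
```

Checking the mode before calling the probe keeps "not applicable" distinct from "failed", so a renderer can reserve the red state for `Some(false)` only.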