fix: v1.9.8 — Android disconnect crash + UI test-button gate for non-apps_script modes

Android (#666 from @ilok67 with full root cause):
- The onStop callback in MainActivity sent ACTION_STOP via startService() and then immediately called stopService() on the same service. ACTION_STOP runs teardown() on a background thread that calls stopSelf() when it finishes; the redundant stopService() triggered onDestroy() in parallel, racing the lifecycle and crashing on every Disconnect tap. Removed the stopService() call: ACTION_STOP alone is sufficient for both the live-service and the zombie-after-process-death cases. The tornDown AtomicBoolean already guards against double teardown of native state, but it cannot protect against the OS-level race between stopSelf() and stopService().
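
For reference, a minimal sketch of the kind of idempotency guard the tornDown flag provides, written as plain JVM Kotlin with no Android dependencies. `TeardownGuard`, `teardownOnce` and `releaseNative` are hypothetical names for illustration, not the service's real API. It shows only that concurrent callers release native state once; it does not model the stopSelf()/stopService() lifecycle race, which is why that race had to be fixed by removing the second call.

```kotlin
import java.util.concurrent.atomic.AtomicBoolean
import java.util.concurrent.atomic.AtomicInteger
import kotlin.concurrent.thread

// Hypothetical stand-in for the service's teardown path (assumption, not project code).
class TeardownGuard(private val releaseNative: () -> Unit) {
    private val tornDown = AtomicBoolean(false)

    // Returns true only for the single caller that actually performed the teardown.
    fun teardownOnce(): Boolean {
        // compareAndSet flips false -> true atomically, so exactly one caller wins.
        if (!tornDown.compareAndSet(false, true)) return false
        releaseNative()
        return true
    }
}

fun main() {
    val releases = AtomicInteger(0)
    val guard = TeardownGuard { releases.incrementAndGet() }
    // Simulate several threads reaching teardown at the same time.
    val workers = List(8) { thread { guard.teardownOnce() } }
    workers.forEach { it.join() }
    println("native releases: ${releases.get()}") // prints "native releases: 1"
}
```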

UI (#665 from @cmptrnb):
- The Test Relay button showed a red "test result: fail" status when used in full or direct mode. The underlying test_cmd::run deliberately refuses to run in those modes, because probing Apps Script directly while the data plane goes via tunnel-node would give a misleading result, but the refusal was being reported as a generic "test failed". The UI now checks the mode before running and shows a mode-specific explainer for full/direct, pointing users to https://whatismyipaddress.com in a browser via the proxy as the right way to verify.
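
A rough sketch of the gate described above, in Kotlin for consistency with the diff in this commit; the real UI code may look quite different. `Mode`, `TestOutcome`, `runTestRelay` and the explainer wording are all hypothetical. The point is that the mode check happens before the probe, so a refusal is never reported as a failure.

```kotlin
// Hypothetical mode type; the names mirror the modes mentioned in this commit message.
enum class Mode { APPS_SCRIPT, FULL, DIRECT }

sealed interface TestOutcome {
    data class Ran(val passed: Boolean) : TestOutcome
    data class Skipped(val explainer: String) : TestOutcome
}

// Checks the mode first; the probe only runs in apps_script mode.
fun runTestRelay(mode: Mode, probe: () -> Boolean): TestOutcome = when (mode) {
    Mode.APPS_SCRIPT -> TestOutcome.Ran(probe())
    Mode.FULL, Mode.DIRECT -> TestOutcome.Skipped(
        // Illustrative wording only, not the string shipped in the UI.
        "Test Relay does not apply in ${mode.name.lowercase()} mode. " +
            "Open https://whatismyipaddress.com in a browser through the proxy to verify."
    )
}

fun main() {
    // The probe must never be invoked in full mode.
    println(runTestRelay(Mode.FULL) { error("probe must not run in full mode") })
    println(runTestRelay(Mode.APPS_SCRIPT) { true })
}
```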

Includes already-merged PR #674 from @yyoyoian-pixel: lower the client coalesce_step and the tunnel-node straggler settle_step from 40 ms to 10 ms, and raise the tunnel-node settle max from 500 ms to 1000 ms. Asymmetric tuning: fast-fire when nothing else is queued, but adaptive coalesce on bursts. Backwards compatible: existing configs with an explicit `coalesce_step_ms: 40` keep the old behavior.
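
To illustrate how a small step and a larger max can interact, here is a toy model in Kotlin. It is an assumption about the general shape of step-based coalescing, not the algorithm from PR #674: `flushDelayMs` is a hypothetical helper, and the rule (each arrival inside the current window extends the deadline by one step, capped at the max) is chosen for illustration only.

```kotlin
// Toy model (assumption): arrivalsMs are offsets in ms after the first queued item,
// which is taken to arrive at time 0. Returns when the batch would be flushed.
fun flushDelayMs(arrivalsMs: List<Long>, stepMs: Long, maxMs: Long): Long {
    var deadline = stepMs
    for (t in arrivalsMs.sorted()) {
        if (t > deadline) break              // a quiet step elapsed: flush at the deadline
        deadline = minOf(t + stepMs, maxMs)  // data arrived in the window: extend by one step
    }
    return minOf(deadline, maxMs)
}

fun main() {
    // Nothing else queued: flush after a single step.
    println(flushDelayMs(emptyList(), stepMs = 10, maxMs = 1000))                // prints 10
    // Short burst: each arrival pushes the deadline out by one step.
    println(flushDelayMs(listOf(5L, 12L, 20L), stepMs = 10, maxMs = 1000))       // prints 30
    // Sustained traffic: the wait is capped at the max.
    println(flushDelayMs((5L..2000L step 5L).toList(), stepMs = 10, maxMs = 1000)) // prints 1000
}
```

Under this model an idle flush with a 40 ms step would wait 40 ms instead of 10 ms, which is the fast-fire side of the tuning.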

Tests: 179 lib + 33 tunnel-node green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
therealaleph
2026-05-03 15:57:53 +03:00
parent 994dd0b23c
commit 677ec26bee
5 changed files with 76 additions and 21 deletions
@@ -173,30 +173,36 @@ class MainActivity : AppCompatActivity() {
  }
  },
  onStop = {
- // Three-step teardown. Each step is defensive against a
- // different failure mode we've actually hit in testing:
+ // Single-step graceful teardown. ACTION_STOP delivered via
+ // startService() reaches MhrvVpnService.onStartCommand,
+ // which spawns the `mhrv-teardown` background thread that
+ // tears down tun2proxy + the Rust runtime and then calls
+ // stopSelf() at the end of teardown. Service stops on its
+ // own — we don't need (and must not) follow up with
+ // stopService().
  //
- // 1. ACTION_STOP — graceful path. The service receives it,
- // runs its teardown (stops tun2proxy, closes the TUN
- // fd, shuts down the Rust runtime) and stopSelf()'s.
- // This is what we want 99% of the time.
+ // History (#666 from @ilok67): we used to call stopService()
+ // immediately after startService(stopAction), as belt-and-
+ // suspenders against a "force-closed then reopened zombie"
+ // case. That second call was firing onDestroy() while the
+ // mhrv-teardown thread was still running, racing two threads
+ // through the lifecycle and crashing on tap-to-disconnect.
+ // The teardown thread's idempotency guard (tornDown
+ // AtomicBoolean) protects against double-teardown of native
+ // state, but it can't protect against OS-level lifecycle
+ // races on stopSelf vs stopService. ACTION_STOP alone is
+ // enough for both the live-service and zombie cases —
+ // startService creates a fresh service in the new process
+ // for zombies, runs teardown (no-op on already-clean state)
+ // and stops it.
  //
- // 2. stopService() — covers the "force-closed then
- // reopened" zombie case. Android may auto-restart our
- // START_STICKY service in a fresh process after the
- // user swipes us away from Recents, and the user's
- // next Stop tap needs to actually unbind even if our
- // in-memory TUN fd reference is gone. stopService is
- // idempotent so it's safe to follow the graceful path.
- //
- // 3. We do NOT touch the VpnService permission — that's
- // the OS-wide VPN grant and the user approved it
- // deliberately. Revoking it would force a re-prompt
- // on next Start, which is worse UX.
+ // We do NOT touch the VpnService permission — that's the
+ // OS-wide VPN grant and the user approved it deliberately.
+ // Revoking it would force a re-prompt on next Start, which
+ // is worse UX.
  val stopAction = Intent(this, MhrvVpnService::class.java)
  .setAction(MhrvVpnService.ACTION_STOP)
  startService(stopAction)
- stopService(Intent(this, MhrvVpnService::class.java))
  },
  onInstallCaConfirmed = {
  // The flow is (1) export cert, (2) copy it to Downloads so