Files
MasterHttpRelayVPN-RUST/docs/changelog/v1.9.18.md
T
therealaleph 3e599f29e8 chore(release): v1.9.18 — perf: zero-copy mux + base64 off mux thread (#881)
Bumps Cargo.toml v1.9.17 → v1.9.18 and ships the changelog for the
zero-copy mux refactor merged in 54552bb. No user-visible behavior
change; perf-focused release.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 05:56:41 +03:00

19 lines
3.7 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
<!-- see docs/changelog/v1.1.0.md for the file format: Persian, then `---`, then English. -->
• Performance refactor of full-tunnel mux hot path ([#881](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/pull/881) by @dazzling-no-more) — zero-copy reads via `Bytes`/`BytesMut` و base64 encoding از روی single mux thread برداشته شد. هیچ wire-protocol change نداره — فقط internal data flow. (1) `tunnel_loop` و SOCKS5 UDP receive loop دیگه per-iteration `Vec::to_vec()` copy ندارن. `MuxMsg::{ConnectData,Data,UdpOpen,UdpData}` حالا `Bytes` (Arc-backed) carry می‌کنن به جای `Vec<u8>`/`Arc<Vec<u8>>`. TCP path threshold-based: ≥32 KB → `BytesMut::split().freeze()` (saves 64 KB memcpy on hot downloads); <32 KB → `Bytes::copy_from_slice` + `buf.clear()` (payload-sized retention). UDP path: fixed `Vec<u8>` recv buffer + size-guarded copy. (2) base64 encoding (تا ~3 MB per batch) از mux thread رفت به spawned task تو `fire_batch` بعد از per-deployment semaphore — single mux task دیگه serialize نمی‌شه. (3) Code quality: `BatchAccum::push_or_fire` (۴ match arm به ۱ کلپس)، `should_fire()` predicate با `saturating_add`، `encode_pending()` free function. ۲۰۰ → **۲۰۸ lib test** (+۸ regression: encode_pending × ۴، should_fire × ۳، batch_accum_reindexes_after_flush). API change: `TunnelMux::udp_open`/`udp_data` حالا `impl Into<Bytes>` می‌گیرن — existing callers با Vec<u8>/Bytes/BytesMut بدون تغییر کار می‌کنن.
---
• Performance refactor of the full-tunnel mux hot data path ([#881](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/pull/881) by @dazzling-no-more). No wire-protocol changes — internal data flow only.
**1. Zero-copy reads via `Bytes`/`BytesMut`.** `tunnel_loop` and the SOCKS5 UDP receive loop drop per-iteration `Vec::to_vec()` copies. `MuxMsg::{ConnectData,Data,UdpOpen,UdpData}` now carry `Bytes` (Arc-backed internally) instead of `Vec<u8>`/`Arc<Vec<u8>>`; the `Arc::try_unwrap` dance for `pending_client_data` is gone. TCP path is threshold-based to avoid memory regressions:
- **n ≥ 32 KB**: `BytesMut::split().freeze()` — saves the 64 KB memcpy on hot downloads.
- **n < 32 KB**: `Bytes::copy_from_slice` + `buf.clear()` — payload-sized retention. Without this split, `bytes` 1.x's whole-allocation refcount would pin a full 64 KB per queued tiny read under semaphore stall (worst case ~96 MB on a backpressured tunnel).
UDP path: fixed `Vec<u8>` recv buffer + `Bytes::copy_from_slice` after the 9 KB `MAX_UDP_PAYLOAD_BYTES` guard. `parse_socks5_udp_packet` split into `_offsets` + `&[u8]` wrapper so callers stay on the reusable buffer.
**2. Base64 encoding moved off the single mux thread.** New internal `PendingOp { data: Option<Bytes>, encode_empty: bool }` flows through `mux_loop` with raw bytes. Actual `B64.encode(...)` runs in `fire_batch`'s spawned task, after the per-deployment semaphore. Up to ~3 MB of encoding per batch (50 ops × 64 KB) no longer serializes the single mux task.
**3. Code quality (drive-bys).** `BatchAccum::push_or_fire` collapses 4× ~25-line match arms into ~10 lines each. `should_fire(pending_len, payload_bytes, op_bytes)` predicate extracted with `saturating_add`. `encode_pending(p) -> BatchOp` extracted as a free function for direct test coverage.
**Public API change**: `TunnelMux::udp_open` and `udp_data` now take `data: impl Into<Bytes>` instead of `Vec<u8>` — existing in-tree callers passing `Vec<u8>`, `&'static [u8]`, `Bytes`, or `BytesMut` all keep compiling.
200 → **208 lib tests** (+8 regression: `encode_pending_*` × 4, `should_fire_*` × 3, `batch_accum_reindexes_after_flush`).