Performance Notes
Sources:
- Documentation/source/PERFORMANCE_FIXES_SUMMARY.md
- Documentation/source/BUFFER_SIZE_ANALYSIS.md
- Documentation/source/GETVALUES_PERFORMANCE_ANALYSIS.md
- Documentation/source/GRACEFUL_SHUTDOWN_FIX.md
Critical Fix: TCP_NODELAY on Protocol Client
Problem
The ProtocolServer/ProtocolClient RPC implementation was experiencing sporadic 100–200 ms delays caused by Nagle's algorithm batching small TCP packets.
Root Cause
| Side | Socket.NoDelay | Status |
|---|---|---|
| Server (ProtocolSession) | true | ✅ already set |
| Client (ProtocolClient) | (not set) | ❌ missing (root cause) |
Fix Applied
In ProtocolClient.OnConnected():
Socket.NoDelay = true; // CRITICAL — disables Nagle's algorithm
Socket.ReceiveBufferSize = 32768; // 32 KB
Socket.SendBufferSize = 32768; // 32 KB
Socket.ReceiveTimeout = 5000; // 5 s
Socket.SendTimeout = 5000; // 5 s
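The same settings were applied in ProtocolSession.OnConnected(). NoDelay was already set there, so only the buffer sizes and timeouts are new on the server side.
The connection loop now waits with Thread.Sleep(10) instead of Thread.Yield() (a tight spin), which reduces CPU usage while a connection is being established.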
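The settings listed above can be gathered into one helper and read back to confirm they were accepted. This is a minimal sketch rather than the project's code: `ApplyLowLatencyOptions` is a hypothetical name, and the socket is an unconnected one created only for the check.

```csharp
using System;
using System.Net.Sockets;

// Hypothetical helper (not part of the codebase): applies the options listed above.
void ApplyLowLatencyOptions(Socket socket)
{
    socket.NoDelay = true;              // disable Nagle's algorithm
    socket.ReceiveBufferSize = 32768;   // 32 KB
    socket.SendBufferSize = 32768;      // 32 KB
    socket.ReceiveTimeout = 5000;       // 5 s, affects synchronous Receive calls only
    socket.SendTimeout = 5000;          // 5 s, affects synchronous Send calls only
}

// Socket options can be set before connecting, so no server is needed here.
using var probe = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
ApplyLowLatencyOptions(probe);

Console.WriteLine($"NoDelay={probe.NoDelay}, ReceiveTimeout={probe.ReceiveTimeout} ms");
// The OS may adjust the requested buffer size, so print what it actually reports.
Console.WriteLine($"SO_RCVBUF={probe.ReceiveBufferSize}, SO_SNDBUF={probe.SendBufferSize}");
```

Reading the values back is useful because the operating system is free to adjust the requested buffer sizes, and a missing NoDelay is easy to overlook on one side of the connection.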
TCP Buffer Sizing
Application-Level Buffers (ProtocolReceiver)
| Constant | Value | Purpose |
|---|---|---|
| DefaultQueueCapacity | 1 024 chunks | In-memory receive queue depth |
| InitialFrameCapacity | 8 192 bytes (8 KB) | Initial frame buffer (grows via ArrayPool) |
| MaxPacketBytes | 4 194 304 bytes (4 MB) | Safety guard against malformed packets |
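To illustrate how a frame buffer can start at InitialFrameCapacity, grow through ArrayPool, and still respect MaxPacketBytes, here is a minimal sketch. `EnsureCapacity` and the doubling strategy are assumptions made for illustration; the actual growth policy in ProtocolReceiver is not described in the source.

```csharp
using System;
using System.Buffers;

const int InitialFrameCapacity = 8192;      // 8 KB, from the table above
const int MaxPacketBytes = 4 * 1024 * 1024; // 4 MB, from the table above

// Hypothetical helper: returns a pooled buffer with room for `required` bytes,
// keeping the first `used` bytes and handing the old buffer back to the pool.
byte[] EnsureCapacity(byte[] buffer, int used, int required)
{
    if (required > MaxPacketBytes)
        throw new InvalidOperationException($"Frame of {required} bytes exceeds MaxPacketBytes.");
    if (required <= buffer.Length)
        return buffer;

    // Assumed policy: double until large enough, capped at the safety guard.
    int newSize = buffer.Length;
    while (newSize < required) newSize *= 2;
    newSize = Math.Min(newSize, MaxPacketBytes);

    byte[] bigger = ArrayPool<byte>.Shared.Rent(newSize);
    Buffer.BlockCopy(buffer, 0, bigger, 0, used);
    ArrayPool<byte>.Shared.Return(buffer);
    return bigger;
}

byte[] frame = ArrayPool<byte>.Shared.Rent(InitialFrameCapacity);
frame[0] = 0x2A;                                   // pretend one byte was received
frame = EnsureCapacity(frame, used: 1, required: 20000);
Console.WriteLine($"capacity={frame.Length}, first byte kept={frame[0] == 0x2A}");
ArrayPool<byte>.Shared.Return(frame);
```

`ArrayPool<byte>.Shared.Rent` may return an array larger than requested, so the code checks `buffer.Length` and not the size it asked for.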
OS-Level Socket Buffers
| Platform | Default SO_RCVBUF | Default SO_SNDBUF |
|---|---|---|
| Linux / Raspberry Pi | ~87 380 bytes (auto-tuned) | ~16 384 bytes |
| Windows | ~65 536 bytes | ~65 536 bytes |
Client and server now explicitly set both the receive and send buffers to 32 KB, which comfortably covers typical on-wire message sizes of 250–2 100 bytes.
Packet Wire Format
[SOF(1)] + [MAGIC(4)] + [GUID(36)] + [Base64Data] + [EOF(1)]
= 42 bytes fixed overhead + Base64-encoded payload
Base64 inflates binary payload by ~33%:
TotalPacketSize = 42 + ceil(PayloadSize / 3) × 4
Typical sizes:
| Call type | Serialized | On wire |
|---|---|---|
| Simple (1–2 params) | 136–356 bytes | 250–550 bytes |
| Medium (5–10 params) | 500–1 500 bytes | 750–2 100 bytes |
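The size formula above can be checked with a few lines of code. This sketch assumes padded Base64 and one byte per encoded character; `TotalPacketSize` is a hypothetical helper name, not a function from the codebase.

```csharp
using System;

const int FixedOverheadBytes = 1 + 4 + 36 + 1; // SOF + MAGIC + GUID + EOF = 42

// Padded Base64 emits 4 characters for every 3 input bytes, rounding up.
int TotalPacketSize(int payloadBytes)
{
    int base64Chars = (payloadBytes + 2) / 3 * 4;
    return FixedOverheadBytes + base64Chars;
}

foreach (int payload in new[] { 136, 356, 500, 1500 })
{
    // Cross-check the closed form against the encoder itself.
    int encoded = Convert.ToBase64String(new byte[payload]).Length;
    Console.WriteLine($"{payload} -> {TotalPacketSize(payload)} bytes on wire (Base64 part: {encoded})");
}
```

For the serialized sizes in the table this gives 226, 518, 710 and 2 042 bytes, so the on-wire figures in the table read as rounded-up estimates, not exact values.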
GetValues() Performance Call Stack
The full round-trip for a ResourceValuesProxy.GetValues(string[] names) call:
Client:
1. CreateMethodCall()
2. Serialize → Base64 encode → send TCP packet
Network:
3. TCP/IP transmission
Server:
4. ProtocolSession receives packet
5. Deserialize ReflectionCall
6. Invoke GetValues() via compiled delegate (reflection is cached)
7. Serialize response → Base64 → send TCP packet
Client:
8. Receive packet, deserialize ReflectionProperties
9. Return result to caller
Instrumented Warning Thresholds
| Location | Warning threshold | Info threshold |
|---|---|---|
| ProtocolClient.SendReceiveAsync (total) | >10 ms | >5 ms |
| ReflectionStub.SendReceive (RPC overhead) | >20 ms | >10 ms |
| ProtocolSession (server deserialization) | >5 ms | none |
| ProtocolSession (server request processing) | >10 ms | none |
| ProtocolSession (server total) | >20 ms | >10 ms |
| ReflectionStub.FunctionInvocation (method call) | >10 ms | >5 ms |
If you see consistent warnings above these thresholds, check:
- Socket.NoDelay = true on both sides (see above)
- Network round-trip latency (ping <device>)
- Server-side GetValues() implementation for lock contention
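The warning and info thresholds from the table can be applied to a measured duration as in the following sketch. `Classify` is a hypothetical helper and the sleep is a stand-in for a real call; the project's actual logging code is not shown in the source.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper: maps an elapsed time to a log level using a threshold pair.
// Pass null for infoMs where the table lists no info threshold.
string Classify(double elapsedMs, double warnMs, double? infoMs)
{
    if (elapsedMs > warnMs) return "Warning";
    if (infoMs.HasValue && elapsedMs > infoMs.Value) return "Info";
    return "None";
}

// Time a stand-in operation and classify it with the ProtocolClient.SendReceiveAsync
// thresholds from the table (warning >10 ms, info >5 ms).
var sw = Stopwatch.StartNew();
Thread.Sleep(1);   // placeholder for the real call
sw.Stop();
double elapsed = sw.Elapsed.TotalMilliseconds;
Console.WriteLine($"{elapsed:F2} ms -> {Classify(elapsed, 10, 5)}");
```

Checking the warning threshold first keeps a single slow call from being logged at both levels.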
WebApi GetValues Performance
The HTTP WebApi (/api/resources/values/get) batches multiple resource reads in a single HTTP round-trip. Prefer batch reads over individual reads when fetching multiple values:
// ✗ N round-trips
foreach (var name in names)
values[name] = await client.Resources.GetValueAsync(name);
// ✓ 1 round-trip
var values = await client.Resources.GetValuesAsync(names);