Strategies¶
VelocityFL ships six aggregation strategies. All six are implemented in Rust and exposed as frozen Python dataclasses in velocity.strategy. Pick one based on your threat model and client heterogeneity.
Decision guide¶
| If you have… | Reach for |
|---|---|
| IID clients, no adversary, you want a baseline | FedAvg |
| Heterogeneous clients, drifting local updates | FedProx(mu=…) |
| Untrusted clients, possible Byzantine updates, <½ of clients compromised | FedMedian |
Untrusted clients, up to k compromised per coordinate, want averaged survivors |
TrimmedMean(k=…) |
Untrusted clients, up to f compromised, want a single winner |
Krum(f=…) |
Untrusted clients, up to f compromised, want to average m survivors |
MultiKrum(f=…, m=…) |
All six are value objects: compare with ==, safe to hash, safe to share between threads.
from velocity import FedAvg, FedProx, FedMedian, TrimmedMean, Krum, MultiKrum
FedAvg() == FedAvg() # True
FedProx(mu=0.01) != FedProx(mu=0.1) # True
FedAvg¶
Weighted average by local sample count — the McMahan et al. (2017) baseline.
Use when clients are IID and trusted. Fast, stable, easy to reason about.
from velocity import VelocityServer, FedAvg
server = VelocityServer(model_id=..., dataset=..., strategy=FedAvg())
FedProx¶
FedAvg-style aggregation with a proximal term that penalizes local updates for drifting too far from the global model. From Li et al. (2020).
Use when clients are heterogeneous — differing data distributions, compute budgets, or local epoch counts. The proximal term μ dampens client drift.
| Field | Default | Effect |
|---|---|---|
mu |
0.01 |
Higher → more conservative updates, slower convergence, better stability on non-IID data. |
from velocity import VelocityServer, FedProx
server = VelocityServer(model_id=..., dataset=..., strategy=FedProx(mu=0.05))
FedMedian¶
Coordinate-wise median of client updates. From Yin et al. (2018).
Use when you cannot trust every client. Robust against up to ⌊K/2⌋ Byzantine updates — a poisoned client cannot shift the median, only extend its tail.
from velocity import VelocityServer, FedMedian
server = VelocityServer(model_id=..., dataset=..., strategy=FedMedian())
Pair with attack simulation —
FedMedianis the natural companion to the attacks catalog. Run the same experiment withFedAvgandFedMedian, then compareglobal_losstrajectories to see resilience in action.
TrimmedMean¶
Coordinate-wise mean after dropping the k smallest and k largest values. From Yin et al. (2018, arXiv:1803.01498).
Use when you want a Byzantine-robust aggregator that is cheaper than FedMedian for small k and dimension-independent. Tunes the trim budget directly: k=0 is the uniform mean, k=⌊(n-1)/2⌋ reduces to the (odd-n) median.
| Field | Constraint | Effect |
|---|---|---|
k |
0 ≤ k, 2k < n |
Per-coordinate Byzantine tolerance. Raises if the round has fewer than 2k + 1 clients. |
from velocity import VelocityServer, TrimmedMean
server = VelocityServer(model_id=..., dataset=..., strategy=TrimmedMean(k=1))
# Tolerates 1 Byzantine client per coordinate; needs at least 3 clients per round.
Why uniform, not sample-weighted — same reason as
MultiKrum: a Byzantine client can lie aboutnum_samplesto pull the mean in a weighted average.TrimmedMeandiscardsnum_samplesdeliberately; pinned bytest_trimmed_mean_k0_is_uniform_mean.Per-coordinate robustness, not per-client. Different coordinates trim different client subsets — there is no single "selected" client list.
RoundSummary.selected_client_idstherefore returns every participating client (same convention asFedMedian).
Krum¶
Select the single client whose update looks most like its n − f − 2 nearest neighbours, by squared Euclidean distance. From Blanchard et al. (2017).
score(i) = Σ_{j ∈ N_i} ‖w_i - w_j‖² where N_i = the (n-f-2) closest clients
w_{t+1} = w_{argmin_i score(i)}
Use when you assume up to f of the n clients are Byzantine and you want a provably-bounded winner rather than a blended average.
| Field | Constraint | Effect |
|---|---|---|
f |
n ≥ 2f + 3 |
Byzantine-tolerance bound. Raises if the round has fewer than 2f + 3 clients. |
from velocity import VelocityServer, Krum
server = VelocityServer(model_id=..., dataset=..., strategy=Krum(f=2))
# Needs at least 2*2 + 3 = 7 clients per round.
The round summary exposes the winner's index so you can audit selections:
Breakdown point — Krum provably converges when strictly fewer than
n − 2f − 2of thenclients are Byzantine. Falling below that threshold (too many attackers, or too few honest clients) silently degrades robustness; keepn ≫ 2f + 3in practice.
MultiKrum¶
Run the Krum scoring, then return the uniform (not sample-weighted) mean of the m lowest-scoring updates. From El Mhamdi et al. (2018).
Use when you want Krum's outlier suppression and the variance reduction of averaging. Tunes between the two extremes: m=1 collapses to Krum, m=n−f is Multi-Krum's default.
| Field | Default | Constraint |
|---|---|---|
f |
— | n ≥ 2f + 3. |
m |
n − f |
Must satisfy 1 ≤ m ≤ n − f. None uses the default. |
from velocity import VelocityServer, MultiKrum
server = VelocityServer(model_id=..., dataset=..., strategy=MultiKrum(f=2, m=5))
Why uniform, not sample-weighted — Byzantine clients can lie about
num_samplesto amplify their pull in a weighted average. Multi-Krum deliberately discards that signal; this is pinned bytest_multi_krum_m_equals_n_minus_f_is_uniform_meanandstrategy::tests::multikrum_uniform_weighting_ignores_sample_counts.
CLI shorthand¶
The CLI accepts Name for parameter-free strategies and Name:key=value[,key=value] for the parameterised ones:
velocity run --strategy FedAvg --model-id demo/m --dataset demo/d
velocity run --strategy FedProx:mu=0.05 --model-id demo/m --dataset demo/d
velocity run --strategy TrimmedMean:k=1 --model-id demo/m --dataset demo/d --min-clients 3
velocity run --strategy Krum:f=2 --model-id demo/m --dataset demo/d --min-clients 7
velocity run --strategy MultiKrum:f=2,m=5 --model-id demo/m --dataset demo/d --min-clients 7
velocity sweep --strategies FedAvg,Krum:f=1 --rounds 5
Sweep TOML files accept either the string form or a dict form — see Sweep spec.
Adding your own¶
Strategies are defined in vfl-core/src/strategy.rs. To add a new one:
- Add a variant to the
Strategyenum and implement the aggregation kernel in Rust. Return anAggregation { weights, selected_client_ids }. - Expose a PyO3 constructor in
vfl-core/src/lib.rs(e.g.Strategy::trimmed_mean(trim_ratio)). - Add a frozen dataclass to
python/velocity/strategy.pyand include it inALL_STRATEGIES. - Wire the isinstance dispatch in
VelocityServer._map_strategy. - Extend
parse_strategyif the new dataclass has non-trivial coercion needs. - Add tests in
tests/and a section to this page.
See Architecture for the full layer map.