Byzantine-FL Attack Arena Leaderboard

Final-round test accuracy of each aggregation strategy under the FLPoison canonical attack set, on real MNIST (mean over seeds). Higher is more robust. Data lineage: out/attack_arena/aggregated.csv (regenerate via scripts/dump_attack_arena.py).

Worst-case defender ranking

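If you must pick one strategy without knowing the attack, use this order. Strategies are ranked by their weakest result across all attacks. Krum, MultiKrum, and Bulyan each hit the same minimum under five of the six attacks; the "Weakest under" column lists only one of them.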

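| Rank | Strategy | Worst-case accuracy | Weakest under |
|------|-----------|---------------------|--------------------------|
| 1 | Bulyan | 95.7% | Gaussian (Krum-paper) |
| 2 | MultiKrum | 93.8% | Gaussian (Krum-paper) |
| 3 | Krum | 90.9% | Gaussian (Krum-paper) |
| 4 | FedAvg | 9.8% | Gaussian (Krum-paper) |
| 5 | ArKrum | 9.6% | Fang-Krum (Fang 2020) |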

Final accuracy by attack

| Strategy | Gaussian (Krum-paper) | IPM (Fall of Empires) | Label flip (Tolpegin 2020) | Sign flip (Damaskinos 2018) | ALIE (Baruch 2019) | Fang-Krum (Fang 2020) |
|-----------|-------|-------|-------|-------|-------|-------|
| FedAvg | 9.8% | 89.3% | 93.3% | 86.2% | 96.5% | 13.4% |
| Krum | 90.9% | 90.9% | 90.9% | 90.9% | 96.3% | 90.9% |
| MultiKrum | 93.8% | 93.8% | 93.8% | 93.8% | 95.7% | 93.8% |
| Bulyan | 95.7% | 95.7% | 95.7% | 95.7% | 96.0% | 95.7% |
| ArKrum | 96.2% | 95.8% | 96.2% | 94.5% | 96.4% | 9.6% |
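
The worst-case ranking can be re-derived from this table by taking each strategy's minimum across attacks and sorting in descending order. The sketch below is illustrative only: the numbers are hand-copied from the table, and the names `ACCURACY`, `ATTACKS`, and `worst_case_ranking` are hypothetical. It is not the project's own code and does not read out/attack_arena/aggregated.csv, whose schema is not shown here.

```python
# Final accuracy (%) copied by hand from the "Final accuracy by attack" table.
# Column order matches the table; attack names are shortened for readability.
ATTACKS = ["Gaussian", "IPM", "Label flip", "Sign flip", "ALIE", "Fang-Krum"]
ACCURACY = {
    "FedAvg":    [9.8, 89.3, 93.3, 86.2, 96.5, 13.4],
    "Krum":      [90.9, 90.9, 90.9, 90.9, 96.3, 90.9],
    "MultiKrum": [93.8, 93.8, 93.8, 93.8, 95.7, 93.8],
    "Bulyan":    [95.7, 95.7, 95.7, 95.7, 96.0, 95.7],
    "ArKrum":    [96.2, 95.8, 96.2, 94.5, 96.4, 9.6],
}


def worst_case_ranking(accuracy, attacks):
    """Return (strategy, worst accuracy, attacks achieving it), best first."""
    rows = []
    for strategy, scores in accuracy.items():
        worst = min(scores)
        # Keep every attack that ties for the minimum, not just the first.
        weakest = [a for a, s in zip(attacks, scores) if s == worst]
        rows.append((strategy, worst, weakest))
    # Higher worst-case accuracy means a more robust strategy.
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


ranking = worst_case_ranking(ACCURACY, ATTACKS)
for rank, (strategy, worst, weakest) in enumerate(ranking, start=1):
    print(f"{rank} {strategy} {worst:.1f}% weakest under: {', '.join(weakest)}")
```

Unlike the ranking table, this sketch reports every attack that ties for a strategy's minimum, which makes it visible that Krum, MultiKrum, and Bulyan reach their worst case under five of the six attacks.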