Gateway overhead was measured against a mock LLM server with instant
responses, so the numbers reflect pure proxy cost.

Latency: P50 / P99 in milliseconds. Lower is better.

| RPS | direct | crabllm | bifrost | litellm |
| --- | --- | --- | --- | --- |
| 100 | 0.38 / 0.63 | 1.00 / 1.31 | 1.10 / 1.64 | 5.35 / 10.79 |
| 500 | 0.28 / 0.42 | 0.66 / 1.07 | 0.36 / 0.91 | 168.79 / 223.69 |
| 1000 | 0.15 / 0.31 | 0.44 / 0.83 | 0.27 / 0.46 | 172.00 / 201.55 |
| 2000 | 0.17 / 0.33 | 0.29 / 0.88 | 0.29 / 0.53 | 169.99 / 194.34 |
| 5000 | 0.13 / 0.33 | 0.26 / 0.57 | 0.26 / 0.48 | 159.86 / 492.82 |

| RPS | direct | crabllm | bifrost | litellm |
| --- | --- | --- | --- | --- |
| 100 | 0.45 / 0.62 | 43.53 / 48.14 | 1.51 / 2.20 | 670.25 / 3357.70 |
| 500 | 0.34 / 0.54 | 42.90 / 47.14 | 0.51 / 0.93 | 659.97 / 3569.92 |
| 1000 | 0.22 / 0.42 | 44.18 / 48.30 | 0.45 / 0.98 | 645.59 / 2797.66 |
| 2000 | 44.04 / 48.23 | 44.25 / 48.52 | 44.18 / 48.64 | 596.90 / 2678.08 |
| 5000 | 44.04 / 48.23 | 44.24 / 48.50 | 44.20 / 48.66 | 571.96 / 2563.73 |

| RPS | direct | crabllm | bifrost | litellm |
| --- | --- | --- | --- | --- |
| 100 | 0.39 / 0.47 | 1.18 / 1.48 | 1.15 / 1.70 | 7.09 / 10.72 |
| 500 | 0.30 / 0.42 | 0.78 / 1.15 | 0.43 / 1.03 | 356.71 / 414.36 |
| 1000 | 0.17 / 0.27 | 0.51 / 0.91 | 0.38 / 0.85 | 332.53 / 6516.44 |
| 2000 | 0.18 / 0.32 | 0.36 / 1.08 | 0.39 / 0.94 | 317.53 / 365.68 |
| 5000 | 0.14 / 0.32 | 0.34 / 0.64 | 0.39 / 1.57 | 305.91 / 8778.06 |

| Gateway | Peak RSS |
| --- | --- |
| direct | 15.3 MB |
| crabllm | 34.9 MB |
| bifrost | 171.7 MB |
| litellm | 541.8 MB |
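
Because the latency tables list the `direct` baseline next to each gateway, the added cost of a gateway can be approximated as its P50 minus the `direct` P50 at the same RPS. The sketch below does this for the first latency table only. The names `P50_MS` and `overhead_ms` are made up for illustration and are not part of any benchmark tooling. Subtracting percentiles is only an estimate, since the difference of two medians is not necessarily the median of the per-request differences.

```python
# P50 latencies in milliseconds, copied from the first latency table above.
RPS = [100, 500, 1000, 2000, 5000]
P50_MS = {
    "direct":  [0.38, 0.28, 0.15, 0.17, 0.13],
    "crabllm": [1.00, 0.66, 0.44, 0.29, 0.26],
    "bifrost": [1.10, 0.36, 0.27, 0.29, 0.26],
    "litellm": [5.35, 168.79, 172.00, 169.99, 159.86],
}


def overhead_ms(gateway: str) -> list[float]:
    """Approximate added P50 latency: gateway P50 minus direct P50 per RPS level."""
    baseline = P50_MS["direct"]
    return [round(g - d, 2) for g, d in zip(P50_MS[gateway], baseline)]


if __name__ == "__main__":
    # Print the estimated overhead per RPS level for each gateway.
    for name in ("crabllm", "bifrost", "litellm"):
        print(name, dict(zip(RPS, overhead_ms(name))))
```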