From the team
Parasail Blog
Product updates, engineering deep dives, and thought leadership from the Parasail team.
Making an EAGLE fly: How We Got 2.6x Faster LLM Inference (Without Cheating)
We trained a custom EAGLE-3 speculative decoding head for OLMo-3.1-32B-Think and got 2.6x faster inference.
Engineering
Making Cold Start Latencies go Brrrr: A Multi-pronged Approach (Part 1)
We walk through how we combined fastsafetensors, O_DIRECT, and io_uring to get fast cold-starts and fast warm-starts on the same stack.
Engineering