In this talk we will explore strategies for delivering scalable, low-latency, real-time speech recognition solutions. To do this, we will look at the entire end-to-end pipeline: efficiently capturing a quality signal from an audio device, propagating it over the network, feeding it into various models for inference, and processing their outputs. Along the way we will discuss design decisions related to stability, latency, and throughput.
Even the most accurate model can't deliver a winning product if your inference engine doesn't perform well. By the end of this talk, you'll be equipped with new methods to ensure your own product can handle the workloads of your wildest dreams.
Emil Loer is the lead architect of Hi Auto, a market leader in conversational AI for restaurant drive-thrus, where he designs large-scale distributed systems using state-of-the-art speech recognition and NLP technologies. Emil is an active figure in both the Python and Rust communities and has a passion for anything involving real-time computing and audio processing. Vim and a good cup of coffee are the secrets to his success.