This week at the 2020 International Society for Music Information Retrieval Conference, Spotify open-sourced Klio, an ecosystem that lets data scientists process audio files (or any binary files) easily and at scale. It was built to run Spotify’s large-scale audio intelligence systems, and the company’s engineers and audio scientists use it to develop and deploy next-generation audio algorithms.
Klio, which is built on Apache Beam, enables organizations to create media processing systems in which production systems and research teams share tooling and infrastructure. The platform’s architecture encourages reusable jobs and shared outputs, ostensibly lowering maintenance and recomputation costs. Klio also supports continuous, event-driven processing of rapidly growing catalogs of media content, giving engineers a framework for turning processing jobs into production services and giving organizations a way to process new content as it is ingested.
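To make the idea of reusable jobs and shared outputs concrete, here is a minimal plain-Python sketch of event-driven processing that skips work whose output already exists. It does not use Klio or Apache Beam and is not Klio's API; the names `Job`, `handle`, and `on_ingest`, the in-memory dictionary standing in for shared storage, and the toy transforms are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass
class Job:
    """Hypothetical processing job: one input in, one output out."""
    name: str
    transform: Callable[[bytes], bytes]
    # In-memory stand-in for shared output storage (an assumption for this sketch).
    outputs: Dict[str, bytes] = field(default_factory=dict)
    runs: int = 0  # number of times the transform actually executed

    def handle(self, item_id: str, payload: bytes) -> bytes:
        # Reuse a previously stored output instead of recomputing it.
        if item_id in self.outputs:
            return self.outputs[item_id]
        self.runs += 1
        result = self.transform(payload)
        self.outputs[item_id] = result
        return result


def on_ingest(events: Iterable[Tuple[str, bytes]], jobs: List[Job]) -> Dict[str, bytes]:
    """Pass each (item_id, payload) event through the chain of jobs in order."""
    results: Dict[str, bytes] = {}
    for item_id, payload in events:
        data = payload
        for job in jobs:
            data = job.handle(item_id, data)
        results[item_id] = data
    return results


# Toy transforms standing in for real audio processing.
downsample = Job("downsample", lambda b: b[::2])
fingerprint = Job("fingerprint", lambda b: bytes([sum(b) % 256]))

events = [
    ("track-1", b"\x01\x02\x03\x04"),
    ("track-2", b"\x05\x06"),
    ("track-1", b"\x01\x02\x03\x04"),  # repeat event: outputs are reused
]
results = on_ingest(events, [downsample, fingerprint])
```

In this sketch the third event triggers no new computation, because both jobs already hold an output for `track-1`. That is the sense in which shared outputs can lower recomputation costs; a real deployment would replace the dictionary with durable storage and the event list with a message queue.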