Orian Sharoni

The Sound of AI – Challenges & Solutions in Speech & Audio

Founder & Machine Learning at Up·AI

Bio

Orian is the cofounder of Up·AI, a consultancy of two specialists focusing on hands-on speech and audio machine learning projects. Up·AI’s best-known collaborator is Waves Audio Ltd., with whom they are working on dubbing-related pro-audio algorithms. Additionally, Up·AI is part of the HarmonAI community, working on generative audio tools as part of Stability AI.

Orian specializes in the intersection of academic research and industry end-to-end ML development. Orian’s passion for machine learning is rooted in a Master’s degree in Computational Cognition, where Orian researched and mathematically modeled human behavior.

Abstract

Audio representation methods are becoming more varied and richer by the day. The possibilities are endless, and so are the challenges they can solve. Text-based or textless, DSP feature selection or raw WAV? How bounded are we by available datasets, and do we have to use our own ears to test the results for naturalness?

The methods for representing audio have evolved over time, from Mel spectrograms, gammatone filters and wavelet transforms to the latest state-of-the-art transformer models (such as wav2vec 2.0 and HuBERT). Each method has its pros and cons. In this roundtable discussion, we will review a few of these methods. We will then turn to the application side and see how audio representations are used in different tasks.

Discussion of real-life applications will be encouraged, so that we can learn about the challenges of moving from the research domain to practical usage. We will follow up with a discussion of different methods for computing audio similarities and for audio generation.

Planned Agenda

8:45 Reception
9:30 Opening words by WiDS TLV ambassador Nitzan Gado and by Lily Ben Ami, CEO of the Michal Sela Forum
9:50 Prof. Bracha Shapira – Data Challenges in Recommender Systems Research: Insights from Bundle Recommendation
10:20 Juan Liu – Accounting Automation: Making Accounting Easier So That People Can Forget About It
10:50 Break
11:00 Lightning talks
12:20 Lunch & poster session
13:20 Roundtable session & poster session
14:05 Roundtable closure
14:20 Break
14:30 Merav Mofaz – “Every Breath You Take and Every Move You Make…I'll Be Watching You:” The Sensitive Side of Smartwatches
14:50 Reut Yaniv – Ad Serving in the Online Geo Space Along Routes
15:10 Rachel Wities – It’s Not Just the Doctor’s Handwriting: Challenges and Opportunities in Healthcare NLP
15:30 Closing remarks
15:40 End