clumma's blurblog

I'm only posting this because LeCun is one of the researchers most enthusiastic about coming up with new AI architectures to build human-level AI. Not sure this is the best timing for fundraising with all the bubble talk getting louder, but oh well.

Excited to see what comes out of this!

submitted by /u/Tobio-Star to r/newAIParadigms
[link] [comments]

Read the whole story

clumma

3 hours ago

reply

Berkeley, CA

[R] WavJEPA: Semantic learning unlocks robust audio foundation models for raw waveforms by /u/ComprehensiveTop3297
Wednesday November 19th, 2025 at 12:54 PM

upvoted by clumma

https://preview.redd.it/7u5do1x19uzf1.png?width=1103&format=png&auto=webp&s=bfc314716f4e33593b16e6e131870dae62d7577a

Hey All,

We have just released our new pre-print on WavJEPA. WavJEPA is an audio foundation model that operates on raw waveforms (time-domain). Our results show that WavJEPA excels at general audio representation tasks with a fraction of the compute and training data.

In short, WavJEPA leverages JEPA-like semantic token prediction tasks in the latent space. This makes WavJEPA stand out from other models such as Wav2Vec2.0, HuBERT, and WavLM, which use speech-level token prediction tasks.
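For readers new to JEPA-style training, here is a rough sketch of what "prediction in the latent space" usually means: a predictor guesses the target encoder's latent vectors at masked positions, and the loss is computed on those latents instead of on raw samples or discrete speech tokens. This is a generic illustration, not the WavJEPA code. The function names, the squared-error loss, and the moving-average target update are assumptions borrowed from common JEPA setups, so please check the paper for what WavJEPA actually does.

```python
from typing import List, Sequence

Vector = List[float]


def masked_latent_loss(
    predicted: Sequence[Vector],
    target: Sequence[Vector],
    masked_positions: Sequence[int],
) -> float:
    """Mean squared error between predicted and target latents,
    averaged over the masked time steps only (assumed loss)."""
    if not masked_positions:
        raise ValueError("need at least one masked position")
    total = 0.0
    count = 0
    for t in masked_positions:
        for p, q in zip(predicted[t], target[t]):
            total += (p - q) ** 2
            count += 1
    return total / count


def ema_update(
    target_params: Sequence[float],
    context_params: Sequence[float],
    decay: float,
) -> List[float]:
    """Move target-encoder parameters toward the context encoder
    with an exponential moving average (assumed, common in JEPA setups)."""
    return [decay * t + (1.0 - decay) * c
            for t, c in zip(target_params, context_params)]


# Toy latents: 3 time steps, 2 dimensions; only step 1 is masked.
predicted = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
target = [[0.0, 0.0], [1.0, 3.0], [5.0, 5.0]]
loss = masked_latent_loss(predicted, target, masked_positions=[1])
```

Only the masked step contributes to the loss, so the mismatch at step 2 is ignored in this toy example.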

In our results, we saw that WavJEPA was extremely data efficient. It exceeded the downstream performance of other models while requiring orders of magnitude less compute.

https://preview.redd.it/7uxj7wgz9uzf1.png?width=1084&format=png&auto=webp&s=6d05cf829a65bfaec5871dfe0487e4d11c80b132

We were also very interested in models with good robustness to noise and reverberation. Therefore, we benchmarked state-of-the-art time-domain audio models using Nat-HEAR (Naturalistic HEAR Benchmark with added reverb + noise). The differences between HEAR and Nat-HEAR indicated that WavJEPA was very robust compared to the other models, possibly thanks to its semantically rich tokens.
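To make "added reverb + noise" concrete, here is a minimal sketch of one common way to degrade a clean clip: convolve it with a room impulse response, then mix in noise at a chosen signal-to-noise ratio. This post does not describe how Nat-HEAR is actually built, so treat the function names, the toy signals, and the SNR-based mixing as assumptions for illustration only.

```python
import math
from typing import List, Sequence


def convolve(signal: Sequence[float], impulse_response: Sequence[float]) -> List[float]:
    """Full discrete convolution, used here to simulate reverberation."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def mean_power(x: Sequence[float]) -> float:
    """Average squared amplitude of a signal."""
    return sum(v * v for v in x) / len(x)


def mix_at_snr(clean: Sequence[float], noise: Sequence[float], snr_db: float) -> List[float]:
    """Scale the noise so that clean power / noise power matches snr_db, then add it."""
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    noise_power = mean_power(noise)
    if noise_power == 0.0:
        raise ValueError("noise must not be silent")
    gain = math.sqrt(mean_power(clean) / (noise_power * 10 ** (snr_db / 10)))
    return [c + gain * n for c, n in zip(clean, noise)]


# Toy example: a short clip, a two-tap "room", and noise mixed in at 0 dB SNR.
clean = [1.0, -1.0, 1.0, -1.0]
noise = [0.5, 0.5, -0.5, -0.5]
reverberant = convolve(clean, [1.0, 0.5])
noisy = mix_at_snr(clean, noise, snr_db=0.0)
```

Comparing a model's score on the clean clips against its score on clips degraded this way gives a simple robustness gap, which is the kind of HEAR versus Nat-HEAR difference described above.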

Furthermore, in this paper we proposed WavJEPA-Nat. WavJEPA-Nat is trained with naturalistic scenes (reverb + noise + spatial), and is optimized for learning robust representations. We showed that WavJEPA-Nat is more robust than WavJEPA on naturalistic scenes, and performs better on dry scenes.

As we are an academic institution, we did not have huge amounts of compute available. We tried to make the best of it, and with some clever tricks we managed to create a training methodology that is extremely fast and efficient. For more depth, please refer to our paper and the code:

Paper: https://arxiv.org/abs/2509.23238
Code: https://github.com/labhamlet/wavjepa

To use the WavJEPA models, please see our Hugging Face model page:

https://huggingface.co/labhamlet/wavjepa-base

Looking forward to your thoughts on the paper!

submitted by /u/ComprehensiveTop3297 to r/MachineLearning
[link] [comments]


Starlink announces 8M active customers (and 8M+ direct-to-cell users) by /u/NikStalwart
Wednesday November 19th, 2025 at 12:54 PM

upvoted by clumma

submitted by /u/NikStalwart to r/SpaceXLounge
[link] [comments]


AbbVie Ends 11-Year Relationship With Calico by /u/towngrizzlytown Wednesday November 19th, 2025 at 12:57 PM

Los Alamos National Laboratory and Valar Atomics Announce Project NOVA Criticality Milestone by /u/clumma Wednesday November 19th, 2025 at 12:56 PM

Bremsstrahlung constraints on proton-Boron 11 inertial fusion - looks not very promising for ICF by /u/steven9973 Wednesday November 19th, 2025 at 12:55 PM

Yann LeCun, long-time advocate for new AI architectures, is launching a startup focused on "World Models" by /u/Tobio-Star Wednesday November 19th, 2025 at 12:55 PM

[R] WavJEPA: Semantic learning unlocks robust audio foundation models for raw waveforms by /u/ComprehensiveTop3297 Wednesday November 19th, 2025 at 12:54 PM

Starlink announces 8M active customers (and 8M+ direct-to-cell users) by /u/NikStalwart Wednesday November 19th, 2025 at 12:54 PM
