¹Zhejiang University, ²Peking University, ³Speech & Audio Team, ByteDance AI Lab
*Equal Contribution
Text description | Make-An-Audio | DiffSound | Ground Truth |
a cat meowing and young female speaking | |||
a group of sheep are baaing | |||
a horse galloping | |||
Engine noise with other engines passing by | |||
a chainsaw cutting as wood cracks and creaks | |||
drums and music playing with a man speaking | |||
piano and violin plays | |||
fireworks pop and explode | |||
a speedboat running as wind blows into a microphone | |||
thunder as rain falling down | |||
Train passing followed by short honk | |||
Water flowing down a river |
Prompt | Original audio | Personalized audio |
a baby crying | ||
vehicle running and horn ringing | ||
a man whistling | ||
dog barks |
Inpainted Audio | Corrupted Audio | Ground Truth Audio |