Can machines rap?

Over the past few months I created Deep-ESP Voces, a Telegram community for discussing speech synthesis in Spanish. In that community we have just built the first rap synthesizer in Spanish.

After several tests with voice recordings from different sources, we obtained the best results with a dataset built from the voice of Ritchse, a music producer and member of the community. He recorded himself rapping 215 paragraphs with a high-quality microphone, in a room with good acoustics and isolated from outside noise.

To create the rapbot, we used this code as a starting point. It is based on the model described in this paper: Efficiently trainable text-to-speech system based on deep convolutional networks with guided attention, by Tachibana, H., Uenoyama, K., & Aihara, S. (2018, April).

The bot converts written text into rapped speech. It is open source, and you can try it online from this link.

You can listen to a demo of the rapper voice here. The demo song was produced by Ritchse.