At the inaugural Peabody hackathon, Hacking Harmony, I built my first attempt at choral music synthesis. The concept was simple: autotune Google text-to-speech. Using Python, I built a music parsing engine which would read in a (well-formed) MusicXML file and boil it down to a recipe for the synthesis step: which word to sing, at what pitch, and for how long.
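As a rough illustration (the recipe format here is a guess, not the engine's actual output), the parsing step amounts to walking the MusicXML note list and pulling out each note's lyric, pitch, and duration:

```python
import xml.etree.ElementTree as ET

def parse_musicxml(path):
    """Pull (word, pitch, duration) entries out of a simple MusicXML file.

    A bare-bones sketch: assumes a single part with no chords or ties,
    which is far less than real MusicXML allows.
    """
    recipe = []
    for note in ET.parse(path).getroot().iter("note"):
        pitch = note.find("pitch")
        lyric = note.find("lyric/text")
        duration = note.findtext("duration")
        if pitch is None or lyric is None or duration is None:
            continue  # skip rests and notes without their own lyric
        step = pitch.findtext("step")
        alter = pitch.findtext("alter", default="0")
        octave = pitch.findtext("octave")
        accidental = {"1": "#", "-1": "b"}.get(alter, "")
        recipe.append({
            "word": lyric.text,
            "pitch": f"{step}{accidental}{octave}",  # e.g. "C#4"
            "duration": int(duration),               # in MusicXML divisions
        })
    return recipe
```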
At this point, Matlab takes over and splices all of the words together according to the recipe, while also autotuning each word. This part was done in Matlab because it has better tools for quickly writing an autotuner: pitch detection, pitch shifting, and generally very robust facilities for working with audio data.
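The autotuning itself is the standard detect-then-shift move: estimate the word's pitch, measure how far it is from the target note, and shift by that interval. A rough Python sketch of the same idea, using librosa in place of the Matlab tooling:

```python
import numpy as np
import librosa

def autotune_word(audio, sr, target_hz):
    """Shift a spoken word to a target pitch (detect, then shift).

    A sketch of the approach only; the original used Matlab's audio tools.
    """
    # Estimate the fundamental frequency frame by frame.
    f0, voiced_flag, _ = librosa.pyin(
        audio,
        fmin=librosa.note_to_hz("C2"),
        fmax=librosa.note_to_hz("C6"),
        sr=sr,
    )
    voiced = f0[voiced_flag & ~np.isnan(f0)]
    if voiced.size == 0:
        return audio  # nothing detectably pitched; leave the word alone

    # Treat the median detected pitch as "the" pitch of the word.
    current_hz = np.median(voiced)

    # Shift the whole word by the interval between detected and target pitch.
    n_steps = 12 * np.log2(target_hz / current_hz)
    return librosa.effects.pitch_shift(audio, sr=sr, n_steps=n_steps)
```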
The output of this whole process sounds about how you'd expect autotuned text-to-speech to sound—my favorite reaction was that it sounded like a choir of demon chipmunks.
As usual, accurate pitch detection was the biggest problem I had to deal with (runner-up being phoneme segmentation). I'm always surprised by how difficult such a simple-sounding problem turns out to be—in this case, though, I think the out-of-tune-ness is quite in line with the quality of the rest of the result.
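For a sense of why it's hard: the obvious approach, picking the biggest autocorrelation peak, looks like a five-line problem, but on real speech it cheerfully locks onto a harmonic or subharmonic and lands an octave off. A naive sketch:

```python
import numpy as np

def naive_pitch(frame, sr, fmin=80.0, fmax=500.0):
    """Guess a frame's pitch from its strongest autocorrelation lag.

    Fine on a clean, steady tone; on real speech it often picks a harmonic
    or subharmonic instead, which is a big part of why pitch detection
    is harder than it sounds.
    """
    frame = frame - np.mean(frame)
    # Autocorrelation for non-negative lags only.
    corr = np.correlate(frame, frame, mode="full")[len(frame) - 1:]

    # Restrict to lags that correspond to plausible vocal pitches.
    lo, hi = int(sr / fmax), int(sr / fmin)
    lag = lo + np.argmax(corr[lo:hi])
    return sr / lag
```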