hey, been a long time! that last list was outdated and i was mostly grasping at straws with some obscure singing synthesizers without stable releases. so today i'm here to give updated information on synths that are currently publicly accessible. here is a quick list of singing synths that i'm going to tell you how to get. (some do cost money.)
an asterisk indicates that the synthesis language is one unsupported by VOCALOID, but may have been attempted in UTAU. otherwise, it's exclusive to the singing synthesizer(s) listed.
Japanese Singing: CeVIO Creative Studio, Alter/Ego, Sinsy, RenoidPlayer, Wonder Horn, AquesTone, Virtual Singer.
English Singing: Alter/Ego, Sinsy, CANTOR, ChipSpeech, Virtual Singer.
Chinese Singing: NIAOniao, MUTA, Sharpkey, Sinsy, Virtual Singer.
Korean Singing: VOCALINA Studio.
*French Singing: Alter/Ego, Virtual Singer.
*German Singing: CANTOR, Virtual Singer.
Spanish Singing: Virtual Singer.
*Italian Singing: Virtual Singer.
*Latin Singing: Virtual Singer.
*Finnish Singing: Virtual Singer.
*Occitan Singing: Virtual Singer.
CeVIO Creative Studio
CeVIO Creative Studio is a Japanese Singing and Speech Synthesis Workshop. "CeVIO" is actually pronounced "Cheh-vee-oh", not "Seh-vee-oh". You can visit the home page
here; the Studio is free to download. Most vocals cost money, approximately $40 to $70 per voice. There are links to the shop on the advertisements to the right.
I recommend you buy vocals on the Vector PC Shop due to the hassle Rakuten tends to give about foreign credit cards. Don't be intimidated by the Japanese; it's quite easy to navigate. What's unique about this Singing Synthesizer is that it is also capable of Speech Synthesis, and the initially introduced characters featured at least 3 different talking voices each. The face of this Singing Synthesizer is known as
Satou Sasara, a brown-haired girl you'll be seeing a lot of on the home page. Some vocals are exclusive to singing, and some are exclusive to talking. Satou Sasara can do both, but she does not come free, and both capabilities must be purchased. Other vocals introduced with the program's release were
Suzuki Tsudumi, a close female friend of Sasara's, and a young man named
Takahashi, who was introduced as Sasara's senior and whom she sees as more of an older-brother figure than a friend. Tsudumi and Takahashi are only capable of talking.
Additional Vocals include a group of 6 digital vocalists called the Color Voice Series you can read about
here, and purchase
here. They are only capable of singing, but are noted for their unique sounds. Following them, a powerful vocal called
ONE was released for CeVIO. She is considered to be the highest-quality CeVIO vocal, rivaling popular VOCALOID and UTAU voices alike. ONE is produced by 1st Place, the company that also produced IA, a popular VOCALOID. ONE can sing
and speak, like Sasara. IA additionally has a talking voice exclusively for CeVIO. You can buy ONE at
this link.
A project restoring the voice of the late singer Haruo Minami was released to the public under the character name
HAL-O-ROID. He is the only CeVIO voice being given out for
no charge, and is only capable of singing.
This program has a much more manual touch to it than most other singing synthesizers. The fine-detail customization of pitch, phoneme length, phoneme input, multiple character entry, and many other features makes it different from VOCALOID and UTAU, where editing these features can be much more time-consuming without experience.
EDIT 1/25/2018: You need a Japanese locale to use the Speech Engine so when speaking, the engine can accurately track real time movement.
Alter/Ego
Alter/Ego is a free singing synthesizer by Plogue, available as a VST plug-in, as a standalone program, and in other formats.
You do need a DAW, such as FL Studio, to use the VST version. Plogue is much more famous for making ChipSpeech (I'll get to that later). Alter/Ego itself is a refreshing program that takes a bit of configuration with parameters to make vocals sound fluent, but successful end results can be very rewarding. Alter/Ego boasts a wide variety of
vocal types and language capabilities. Many vocals offer Japanese as a secondary language, but the majority sing in English. A vocal that stands out is
ALYS, by Voxwave, one of the first notable Singing Synthesizer voices able to sing in
French and
Japanese. A few other vocals at the moment include
Bones, a male bilingual (ENG/JP) singer,
Marie Ork, a vocally flexible and bilingual (ENG/"JP") singer that can perform heavy metal screams in addition to regular singing, and
LEORA, a bilingual (FR/ENG) female vocal under development (by Voxwave) who features the first ever "Power" voice libraries for the languages she sings, and a "Crossfade" feature that allows smooth transitions between the voice libraries when singing. Currently, only her French voice libraries are released, and they must be paid for, like ALYS.
Tuning and other features are able to be edited within the host program and in the VST, but this singing synthesizer is also manual in terms of editing and requires some getting used to, as it is very different from most Singing Synthesizers.
Daisy, the previous bilingual (ENG/JP) default voice, was originally intended to be part of ChipSpeech, as she was introduced as Dandy 704's love interest. His voice was sampled from the first computerized voice to sing "Daisy Bell". Daisy's name was a tribute to the song title and the woman mentioned in the song. She was introduced as a time-space anomaly, her Alter/Ego art showing her exiting a time machine in pursuit of Dandy 704. Eventually, Daisy stopped being distributed officially; the reason given is that she was a time-space anomaly who vanished back into her own time.
Additionally, I had previously stated that
NATA, an alto bilingual (ENG/"JP") female vocal, was no longer being distributed, which is incorrect. I was misinformed as to how she could be obtained. She is being sold
here for roughly $58 and tax (46.99 euros + VAT). NATA unfortunately doesn't have very many sample usages. She is an unofficial vocal made by Vocallective. In this case, Plogue will not help customers who buy NATA since she is not supported software.
Another unofficial vocal named
Vera is also under development by Vocallective. She only has one JP demo. Like NATA, she is unsupported software, and she has not yet been released to the public.
EDIT 1/25/2018: If you use the standalone program or formats other than VST, you will need a MIDI keyboard.
Sinsy
Sinsy('s Web Demo) is a free browser-based singing synthesizer focused primarily on sounding human without much editing. It supports 3 languages, but only select vocals have the capability to sing in another language.
Xiang-Ling, one of the first vocals produced, is the sole
multilingual vocal (JP/CHI/ENG). To use Sinsy, you must upload a (vocal/lyrical) MusicXML file to the website, choose a voice, then hit the button to the right. That will generate a WAV file you can download. An easy way to make a MusicXML file is to take a MIDI and export it as one out of Cadencii, MuseScore, CeVIO, or any other way you prefer. Sinsy will not respond to Control Parameters from VOCALOID or CeVIO. Sinsy's gimmick is to sound as human as possible with as little editing as possible, which is the goal of HMM synthesis. The best way to tune is to manually add dynamics and the like in MuseScore, Finale NotePad, or a similar program. Sinsy's latest update supports this feature, and also Phonetic Input. If you run Linux and are keen on C, you can actually make your own Sinsy voice using the
source code and the
official tutorials.
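If you'd rather not go through a notation program at all, here is a rough sketch of building a tiny vocal MusicXML file by hand with Python's standard library. The element names come from the MusicXML format itself, but the melody, the lyric syllables, the function names, and the output file name are all made up for illustration, and I haven't confirmed that Sinsy's web demo accepts a file this bare-bones, so treat it as a starting point only.

```python
import xml.etree.ElementTree as ET

# Hypothetical melody: (pitch step, octave, lyric syllable). Every note is a quarter note.
NOTES = [("C", 4, "do"), ("D", 4, "re"), ("E", 4, "mi"), ("F", 4, "fa")]


def build_score(notes, divisions=1):
    """Build a one-measure, one-part MusicXML tree with one lyric per note."""
    score = ET.Element("score-partwise", version="3.1")

    # The part list declares the single vocal part.
    part_list = ET.SubElement(score, "part-list")
    score_part = ET.SubElement(part_list, "score-part", id="P1")
    ET.SubElement(score_part, "part-name").text = "Vocal"

    part = ET.SubElement(score, "part", id="P1")
    measure = ET.SubElement(part, "measure", number="1")

    # Measure attributes: divisions per quarter note, key, time signature, clef.
    attrs = ET.SubElement(measure, "attributes")
    ET.SubElement(attrs, "divisions").text = str(divisions)
    key = ET.SubElement(attrs, "key")
    ET.SubElement(key, "fifths").text = "0"
    time = ET.SubElement(attrs, "time")
    ET.SubElement(time, "beats").text = "4"
    ET.SubElement(time, "beat-type").text = "4"
    clef = ET.SubElement(attrs, "clef")
    ET.SubElement(clef, "sign").text = "G"
    ET.SubElement(clef, "line").text = "2"

    for step, octave, syllable in notes:
        note = ET.SubElement(measure, "note")
        pitch = ET.SubElement(note, "pitch")
        ET.SubElement(pitch, "step").text = step
        ET.SubElement(pitch, "octave").text = str(octave)
        # A quarter note lasts exactly `divisions` units.
        ET.SubElement(note, "duration").text = str(divisions)
        ET.SubElement(note, "type").text = "quarter"
        lyric = ET.SubElement(note, "lyric")
        ET.SubElement(lyric, "syllabic").text = "single"
        ET.SubElement(lyric, "text").text = syllable

    return score


def to_xml_string(score):
    """Serialize the tree to text with an XML declaration in front."""
    body = ET.tostring(score, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


def save(score, path):
    """Write the score to disk; the path is whatever you choose."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(to_xml_string(score))
```

Calling `save(build_score(NOTES), "melody.xml")` would write the file (the name is just an example), which you could then try uploading. For anything longer than a few notes, exporting from MuseScore or a similar program is still the easier route.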
RenoidPlayer
An unrelated free browser-based singing synthesizer is
RenoidPlayer, which you can read about
here. It is exclusively Japanese, and not difficult to use. Drop a .ust/.vsq/.vsqx/.ccs file into the piano roll and it will appear. It also does not react to Control Parameters from other synthesizers, and dynamics don't appear to work either. There is a
tutorial on how to make your own voice library for it, but you will need
Renoise, a DAW (specifically the 3.0 version in the archive). The reason is that the DAW makes a special Soundfont file that is associated with the synthesizer. The tutorial becomes difficult to follow once the video tutorials end, so don't try it just for fun. The downsides to this synth are that editing has to be done externally and that, occasionally, the overlap used to transition between notes is too great and the voice becomes unintelligible.
Wonder Horn Studio
A Japanese singing synthesizer created by NTT-AT in 2004. It used a synthesis method similar to Sinsy's, and the vocals could be quite realistic for the time period but were known to occasionally clip. The program seems to have been discontinued as of 3/31/2017. I managed to save
one demo featuring two vocals out of many. If you're curious, some usages are still deep within the reserves of NicoNicoDouga, search "ワンダーホルン". The website it was hosted on was
utabara.com, but the website seems to have been deleted with the discontinuation of the product.
The discontinuation seems to have been due to the expense of the program and customer reluctance to buy it. On NicoNicoDouga, it's said that its popularity remained low, with the tag containing fewer than 100,000 plays. It was said to have featured adult voices and one child voice. There were options to make a choir, and various pitch and vibrato editing additions, seemingly like an early CeVIO. The downside was that, allegedly, editing a MIDI had to be done outside of the synthesizer. Editing note lengths was not possible within the synthesizer if a MIDI was being imported.
EDIT 5/23/2018: After some digging I found that it may still be available within the Japanese software MUSIC PRO V5 as a built-in plugin under the name "Sound Jauman2". NTT seems to have cut all ties with the name "Wonder Horn".
AquesTone
AquesTone is a free VSTi singing synthesis plug-in that sings exclusively in Japanese, created by A-QUEST. It features a male and female voice. Possibly one of the more famous singing synthesizers among VOCALOID fans, due to its partner product,
AquesTalk, a speech synthesizer, being typically used in memes, Let's Player voice-overs, etc. AquesTone itself is very simple and easy to use, focusing primarily on uploading a text file of lyrics onto a blank MIDI through the UI. Tuning is limited to the few parameters in the UI, though depending on the host program, further tuning may be possible in the host.
Because it is a VSTi, you will need a DAW, such as FL Studio. AquesTone also has an upgrade to
AquesTone2 that features a different vocal and more parameters. AquesTone2 expires May 2018, so please hurry if you're interested!
CANTOR
You can learn about its history
here. VOCALOID's initial rival, this program utilizes additive morphing synthesis sounds that emulate human speech in order to generate singing, rather than using human-recorded samples. It was developed by VirSyn, but has remained inactive since its 2.10 update. Its languages are
English and
German, and there are 50 vocals in the Full Edition (some variations of a single vocal), but the user can create a new voice simply by playing with settings. It has a simple layout and is user-friendly, but just as VOCALOID needs a lot of editing, so does CANTOR. Consonant timing was known to be one of the less-friendly elements.
If you are determined, you can buy CANTOR 2.10 for about $370~$400 (299 Euro) depending on shipping,
here.
EDIT 4/25/2018:
In order to use CANTOR 2.10's demo AND full version you need to buy an "eLicenser" which contains copy protection software. An eLicenser is a physical USB Drive that must be plugged in to the USB slot. CANTOR will refuse to run if the eLicenser is not plugged in.
Virtual Singer
A Product of Myriad, maker of MelodyAssistant and HarmonyAssistant,
Virtual Singer comes with multilingual capabilities. It does not use the piano roll format like the others; instead, it uses a sheet music format. Vocals must be fine-tuned with musical parameters. It is known for relatively impressive results, and even in 2018, despite the program's age, it still sees use from its community. It isn't very expensive (about $40 will be spent acquiring a host program and Virtual Singer), and you can make your own voice for it as well. Bundled with the purchase is the official tutorial to make a Latin voice.
ChipSpeech
ChipSpeech is Plogue's other vocal synthesizer, though it wasn't directly made for singing, so calling it a singing synthesizer isn't exactly right. It comes as a VST plug-in, in other formats, and also as a standalone program.
You will need a DAW to use the VST edition. The vocals in this vocal synthesizer are often restored vocals from early recordings of human voices and discontinued text-to-speech engines, from way before VOCALOID existed. ChipSpeech is a more recent creation, sampling those older vocals. ChipSpeech can be used to create speech, singing, and ambiguous sound effects. Singing requires a MIDI and the functionality is similar to Alter/Ego: very manual editing. ChipSpeech does cost money, and currently (1/17/2017), its 11 different vocals are
$95 plus $5.46 tax. The vocal that impresses many people and is often seen as the icon of ChipSpeech is
Lady Parsec.
EDIT 1/25/2018: If you use the standalone program or formats other than VST, you will need a MIDI keyboard.
NIAOniao
NIAOniao is a free Chinese Mandarin singing synthesizer, much like UTAU in that a user can create their own vocal for it. Initially, it wasn't considered to be impressive, but after several updates it became quite popular in the Chinese community. It can be downloaded
here, and comes with a default voice named
Yu Niaoniao. Many people have created vocals for this program, and it is known to be language flexible in terms of limited English, Japanese, Cantonese, and Korean possibilities. You can look at downloadable voices
here. The Wiki also has some tutorials on using the program to create your own character vocal, and materials needed are
here.
MUTA
MUTA is a free Chinese Mandarin singing synthesizer, much like CeVIO Creative Studio in terms of layout, though the fine amplitude timing is known to cause playback errors. There also appears to be a future plan to include talking, as there is an unusable speech option. MUTA accepts Pinyin input (if you're unaware, Pinyin is to Chinese what Romaji is to Japanese), so for those who don't know Mandarin, a simple Pinyin reference is all they need. MUTA features the vocals of a character named
Yan Xi, a charismatic and young-sounding female vocal you can't read about because the home page is deleted. You can look her up, though; her official demo is still on
YouTube. You can download MUTA
here.
EDIT 1/25/2018: The speech feature is usable to some; the system requirements are unknown but suspected to be a Chinese or Japanese locale. The author planned to have others be able to import their voices into the program as well, but MUTA appears to no longer be under development.
Sharpkey Studio
Sharpkey is a lesser known but higher quality Chinese Mandarin Free Singing Synthesizer. It has many desirable traits from several other Singing Synthesizers and is claimed to be very user friendly to those experienced with VOCALOID, UTAU, and CeVIO. Reactions were very positive in the West, but since not many know Mandarin, it is often not used. The program features a mature and powerful sounding female vocal named
Huan Xiao Yi. You can download Sharpkey Studio and Huan Xiao Yi
here.
The program works much like other synthesizers, which keeps it comfortable for users. It has a variety of Parameters like VOCALOID and UTAU, plus some relating more to Mandarin, such as Tone and Cross Tone.
EDIT 1/25/2018: The homepage is
here, which you can check for updates! Sharpkey also has another vocal that I missed, named
Kiana. The author eventually plans to have others be able to import their voices into the program as well.
EDIT 5/23/2018: The producers have begun to organize popular Mandarin UTAU/NIAOniao voices to officially import into the software. Such characters include
Yong Qi and a few others.
VOCALINA Studio
VOCALINA Studio is a Free Korean Singing Synthesizer and DAW. It was originally released about the same time as the first Korean VOCALOID, SeeU, and there was a brief rivalry between products. The VOCALINA character that was introduced was a mature female vocal named
Vora (Choi Bora). Vora had her own webcomic about being a pop-star in disguise, and has remained a popular character in the Korean Community. As the program developed more, a second vocal was added,
Khylin. Khylin was much higher quality than Vora, and due to new engine configurations, Vora could not be included in the newest version of VOCALINA Studio (2.3.0 and onwards), so Khylin was able to rise in popularity. Khylin normally costs money and is bought through a paid subscription lasting anywhere from 30 days to 1 year; once that time period runs out, you have to renew your subscription. She cannot be purchased outside of Korea because a Korean banking account/card is needed.
In order to use VOCALINA Studio, you have to register for an account on
Vocalina, then download the Studio, which is estimated to be about 2 GB, so downloading may take a while. Once installed, the Studio will ask you to sign in, so
you do need an internet connection.
Additionally, a Korean Locale is needed, so a program like
Locale Emulator is a good choice to install. The Studio itself is very manual, vocal editing-wise, and relies on direct text input as opposed to text-to-phonetic input. The engine is smart enough to make pronunciation smooth when possible, such as "seong-eol" being written as such, but being sung as "seon-geol". Some features do glitch and vowel transitions are seldom smooth, so learning the tricks and quirks of the program can be slippery, but producing good results is very rewarding.
EDIT: VOCALINA will be terminated in October 2018.
thanks for tuning in! if you'd like, please answer the poll to the right.
Honorable Mentions -
- Macne Nana and Macne Petit were upgraded to the VOCALOID4 and VOCALOID Neo software, effectively crushing the MACLOID series as a Mac-exclusive VOCALOID spin-off genre. The fanbase hopes the other members of the MACLOID series also get upgraded, since they can no longer be purchased. Macne Nana and Petit were given English singing capabilities; previously, they were Japanese exclusive.
- SaltCase is a simple Japanese indie singing synthesizer that was developed by sota for Mac OS X, though development ceased in 2012. In 2013, sota released an April Fools joke that suggested an upgrade, showcasing a vocal that only sang the notes "Nwi" and "Gyo" in a deep, distorted voice. The homepage can be visited here. The mascot character doesn't appear to have a name, but additional voices can be imported. It is an inferior singing synthesizer to the Mac-exclusive UTAU-Synth.
- Unity-chan was a Japanese Singing Synthesizer ("Vocaloid") developed for Unity3D. She was later upgraded to VOCALOID4.
- Voctro Labs has several projects involving Singing Synthesizers. One project was their own new singing synthesizer testing a new synthesis method in Spanish and English, another called Revivos is being used to sample deceased singers, and a current project called Voiceful is a Speech and Singing Synthesizer that can only be used under a private license, but a demo is available.
- Image-Line recently created a "Vocal Resynthesis" tool called Harmor that allows the instrumentation of a human voice in FL Studio. It doesn't appear to have limits on language. Here's a sample, if you don't want to watch the tutorial.
- Realivox has an English Singing Synthesizer named BLUE.
- Kanru Hua, developer of Moresampler, SHIRO, etc., is creating a multilingual Singing Synthesizer currently called "Synthesizer V".
- Emvoice is developing a vocal synthesizer called SOHO that can speak and sing. Development demonstrations are available here.
- NTT-AT and HOYA have teamed up to make a Speech and Singing Synthesizer called VoiceText. So far only one female vocal named Hikari is able to sing, but there is an additional cast of characters (scroll down) who can speak and show multiple different emotions when speaking. The focus of this project is to develop showing different emotions in speech, and to find as many uses for speech synthesis as possible. They seem to have a story with their cast of characters. Additionally, there is a business version of VoiceText for those who need speech synthesis, and voices can be created under private licenses.
more to be added at a later date, if possible! thanks again, and i hope you decide to use some of them!