Unaware's Guide To UTAU Part 1 (RE-UPLOAD)

DISCLAIMER: This guide was originally written nearly three years ago and is slightly outdated by now. Among other things, it predated the CV-VC recording method becoming more popular for Japanese, and the release of a plugin that made it easier to use. Also, the footnotes re-direct to the original page this was on.

NOTE/DISCLAIMER: This guide does not attempt to be 100% thorough in any way. I skipped some of the more advanced aspects of UTAU and tried to gear it towards people who are totally unfamiliar with the program, so if something is missing, that is probably why. Anyway, this part contains the introduction and the basics of installing UTAU.


One of the hardest sounds to synthesize is the human voice. For this reason, up until recently its main application was speech synthesis, which, as anyone who has played around with Microsoft Sam knows, can sound very stiff and robotic. One of the earliest applications of singing synthesis was an optional feature of the NEC PC 6001 mkII computer, which came out in 1983.

About a decade and a half later, a singing synthesizer called Flinger was created. Based on the Festival speech synthesis system, this program lacked a GUI (graphical user interface), which is how most of us use programs. This limitation meant that not many people used it. The first successful singing synthesizer was Vocaloid, which was released in early 2004 and developed in Spain at the Pompeu Fabra University, with financing by Yamaha. The first voices were Leon and Lola, a pair meant for soul music. The first Japanese voices were KAITO and MEIKO. However, the software did not take off until the introduction of Vocaloid 2 and Hatsune Miku. Miku became a “virtual pop idol” of sorts, thanks to the contributions of many independent, or “doujin,” music producers. There are even concerts that star Miku. While Miku is by far the most popular Vocaloid, there are in fact over 40 Vocaloids; some are quite popular in their own right, while others are not purchased or used that often.

UTAU came out of the Japanese internet phenomenon known as “Jinriki Vocaloid,” which involves taking a singing voice, getting the tones as .wav files, and re-arranging them manually. Ameya, a programmer, made a tool originally intended to make this easier. One of the earliest and most popular UTAU is Kasane Teto, who was originally made as an April Fools’ joke by VIP, a group on 2ch. Other UTAU by the group, known as “Vippaloids,” include Yokune Ruko, Namine Ritsu, Sukone Tei, and Ooka Miko. At first, the UTAU program was clearly inferior to Vocaloid, but after many improvements it is now roughly on par. Many voices exist because a regular individual can create an UTAU voice using nothing more than a microphone, a recording application such as Audacity, and, of course, the program itself. As a result, voice bank quality can vary from very low to very high, depending on the microphone, pronunciation, and consistency of the recordings.

So, why should you, dear reader, record an UTAU when there are nearly 3,000 voices out there already? For one, there is always room for more good UTAU voices! UTAU users and producers very much appreciate having a variety of voices to choose from. Secondly, design options are wide open, even with the number of UTAU now; there are many design ideas that have not been touched upon. However, this is all really optional; UTAU do not necessarily have to have a design, and some, in fact, don’t. Thirdly, it’s fun, whether you’re simply voicing for someone, doing all the work of configuring and designing it yourself, or getting involved in the character aspects. There are many degrees of involvement in UTAU, really.

Before proceeding in this guide, I want to make a quick note. If you are planning on both using the program and voicing your own UTAU, read the whole guide. If you simply want to work with the program, you can go ahead and skip the parts concerning pronunciation and voicing. If you’re simply going to voice an UTAU for someone else, you can skip any parts about working in the program.


This guide was primarily made for people who are going to be solely voicing an UTAU, but learning the basics of using the program itself wasn’t something that I could easily ignore, speaking as someone who was once a beginner at UTAU who made all the initial newbie mistakes. In addition, I figured that these aspects would be useful to someone.

The first thing to take into consideration is system requirements. UTAU is primarily a Windows application; however, a Mac version, UTAU-Synth, also exists, and there are several differences between UTAU and UTAU-Synth. There is no Linux version at the moment, and Wine compatibility is limited. If you do have Windows, the very first step, before you do anything else, is to switch your system locale to Japanese. This is extremely important, because you need it to use any UTAU that is encoded in hiragana. Most Japanese UTAU and a decent number of overseas UTAU are encoded this way. If you have Windows 7, this is as easy as going into Control Panel > Region and Language > Administrative and changing the language for non-Unicode programs to Japanese. If you have Windows XP, this is more difficult: you need the original operating system installation disk or XP Service Pack 3 to install East Asian languages before you can change your locale. Once you’ve done this, installing the Japanese keyboard from the same menu in Control Panel is recommended. After restarting your computer, the next thing to do, of course, is to install UTAU. You can find the program at http://utau-synth.com/ (see note 1).
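To see why the locale matters, here is a small Python sketch of what happens to a hiragana name when the wrong code page is used. It assumes the voice bank's text was saved as Shift-JIS on a Japanese Windows system, which is a guess about a typical voice bank and not something UTAU itself reports; the function name and the sample file name are made up for illustration.

```python
# Sketch: a hiragana name stored as Shift-JIS bytes, read back with
# two different Windows code pages. Assumes Shift-JIS source text.

def as_seen_by(name: str, code_page: str) -> str:
    """Encode `name` as Shift-JIS, then decode the bytes with `code_page`."""
    raw = name.encode("shift_jis")
    # errors="replace" avoids a crash on bytes the code page cannot map.
    return raw.decode(code_page, errors="replace")

sample = "か.wav"  # hypothetical sample file name

japanese_view = as_seen_by(sample, "cp932")   # Japanese system locale
western_view = as_seen_by(sample, "cp1252")   # Western European locale

print(japanese_view)  # か.wav
print(western_view)   # ‚©.wav
```

With the Japanese code page the name comes back intact; with the Western one it turns into unrelated symbols, which is roughly what a hiragana-encoded UTAU looks like to the program when the locale has not been changed.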


After installation, it’s recommended to install the English patch (see note 2); without it, it’s likely that you’ll go nowhere with the program. To use it, extract the .rar file, take the folder labeled “res,” and put it in the UTAU folder in whatever directory you installed the program in. Finally, you’ll need some USTs (see note 3) and UTAU to use. Links to recommended ones are located in the Additional Materials section.
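If you would rather script the copy step than drag the folder over by hand, something like the following Python sketch would do it. It assumes you have already extracted the .rar yourself; the function name and the example paths in the comment are hypothetical, so substitute wherever your patch and your UTAU install actually live.

```python
import shutil
from pathlib import Path

def install_res_folder(extracted_patch: Path, utau_dir: Path) -> Path:
    """Copy the patch's "res" folder into the UTAU install folder."""
    source = extracted_patch / "res"
    if not source.is_dir():
        raise FileNotFoundError(f'no "res" folder found in {extracted_patch}')
    if not utau_dir.is_dir():
        raise FileNotFoundError(f"UTAU folder not found: {utau_dir}")
    target = utau_dir / "res"
    # dirs_exist_ok merges into an existing "res" folder (Python 3.8+),
    # overwriting files that have the same name.
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target

# Example call with made-up paths:
# install_res_folder(Path(r"C:\Downloads\english_patch"),
#                    Path(r"C:\Program Files\UTAU"))
```

The two checks at the top are there so that a wrong path fails loudly instead of quietly creating a stray “res” folder somewhere else.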

1. Note: be sure to download the .exe, not the .zip file.

2. Link is in the “Additional Materials” section.

3. UTAU Sequence Text; the file type that UTAU song files are saved in.


Werde, Bill. “Could I Get That Song in Elvis, Please?” New York Times. New York Times. November 23, 2003. Accessed October 17, 2012. <http://www.nytimes.com/2003/11/23/arts/music/23WERD.html>

“PC-6001 mkII.” Retro1000BiT. Retro1000BiT 2012. Accessed October 17, 2012. <http://www.1000bit.it/scheda.asp?id=769>

“Flinger.” Flinger. Oregon Health and Science University. Accessed October 17, 2012. <http://www.cslu.ogi.edu/tts/flinger/>

Wikipedia contributors. “Vocaloid.” Wikipedia, The Free Encyclopedia. Wikipedia, The Free Encyclopedia, October 14, 2012. Accessed October 17, 2012. <http://en.wikipedia.org/wiki/Vocaloid>

Wikipedia contributors. “Utau.” Wikipedia, The Free Encyclopedia. Wikipedia, The Free Encyclopedia, October 13, 2012. Accessed October 17, 2012. <http://en.wikipedia.org/wiki/UTAU>

