Among the numerous subtleties of the Japanese language that cause learners difficulty, intonation is not normally mentioned. When you're struggling with the unfamiliar vocabulary, agonizing writing system, and perplexing grammar, pronunciation takes a back seat. After all, Japanese doesn't have too many sounds, and most of them are at least vaguely familiar to speakers of European languages. And intonation? Well, you're commonly told that you shouldn't put the stress on any one syllable, so it's best to ignore it until later.
The problem, though, is that if you learn an incorrect intonational pattern from the start, you'll find the habit hard to break when you finally get around to it. Japanese, like English and unlike Mandarin, Cantonese, or Thai, doesn't require correct intonation to communicate information. Nonetheless, intonation is important to the language. Correctly pitching your speech makes you easier to understand. It also helps you pick up the clues that let you make sense of a conversation in the blazing-fast word streams of native speakers, which are very unlike the careful enunciation of the classroom. And, unlike the fiendishly inconsistent phenomenon of stress in English, intonation in Japanese follows fairly consistent rules.
Pitch accent and morae
Yes, Virginia, 'n' counts as its own syllable
First of all, what am I talking about when I say 'intonation'? Intonation is the way your voice rises or falls as you speak, like a musical scale going from low to high or vice versa. In standard English, intonation occurs mostly across an entire sentence. For example, declarative utterances tend to fall in tone across the entire sentence (I haven't bathed in a week), wh- questions tend to stay at a high tone across the sentence and fall sharply at the end (Why are you telling me this?), and questions that echo a statement or assumption known to both speakers tend to stay at a low tone across the sentence and rise sharply at the end (Is that not a good way to introduce myself?). Meanwhile, there are mini dips and peaks depending on emotional nuance and stress. None of this is communicated in writing, though it's essential to competent speech. Emoticons frequently act as a quick-and-easy written compensation for this deficit.
Japanese intonation also applies across the entire sentence. In fact, it's not much different from English. Most sentences have a generally falling intonation. But there's more to it than just that. Individual Japanese words have their own set intonations. Every word in the Japanese language is divided into discrete units, each of which is called a mora. A mora is similar to a syllable, except that it can only contain one (1) consonant and one (1) vowel. No more. There are no consonant clusters in Japanese like 'sk' 'pr' 'bl' 'rchn' or any number of examples one can think of in English. There are also no diphthongs. Each mora is one consonant prefacing one vowel, or a vowel by itself without a paired consonant. Thus phonetic Japanese writing does not have symbols for individual consonants or vowels, but for morae. One symbol for 'ka,' another for 'ki,' another for 'ke,' another for 'ko,' another for 'ku.' 'A,' 'i,' 'u,' etc. each have their own symbol, but 'k,' 't,' 'h,' etc. do not have individual symbols. The only exception to all this is the sound 'n,' which is considered its own mora, as it acts more like a vowel than a consonant depending on where it is in the word.
When people say that 'Japanese words have no stress,' they are partially correct. In English, we're accustomed to pronouncing certain syllables louder and longer than others, but the Japanese generally pronounce every mora with the same length and volume. Morae can be taken as 'beats' of a sort, so that the phrase boku wa zubon wo haiteinai yo is (officially) pronounced bo-ku-wa-zu-bo-n-o-ha-i-te-i-na-i-yo, with every mora (including the 'n' and the vowels in isolation) pronounced as if you were following a metronome. A metronome turned to prestissimo.
You'll notice that I didn't put spaces between the morae when I wrote them out. That's because, in spoken Japanese, phrases are pronounced as if you were saying one very long word, similar to Spanish. So how exactly do you tell where one word ends and another begins? Enter pitch accent. Japanese words each have a pitch accent (not a stress accent as in English) that follows one of three patterns.
The three patterns of pitch accent
How they expect you to figure this stuff out just from listening, I'll never know
If the word has its pitch accent on the first mora, then this mora will be of a higher tone than all the subsequent morae of the word.
So, for example, in the word shizukana, the first mora shi is pronounced at a high pitch and the subsequent three morae zukana are pronounced at a lower pitch relative to the first mora. I say 'relative' because what constitutes high and low pitch depends on the place of the word in the sentence, the speaker's vocal patterns, and various other factors. Pitch is not fixed as in tonal languages; rather, it's a matter of comparison between morae. So, if a mora in all uppercase letters represents a mora at high pitch, and one in all lowercase letters represents a mora at low pitch, a word with first mora pitch accent will have a tonal pattern of H-l, H-l-l, H-l-l-l, and so on. Thus shizukana is pronounced SHI-zu-ka-na.
The second variation of pitch accent is when the pitch accent is placed on any mora other than the first. This becomes more complicated, as the placement of the pitch accent doesn't just affect the mora that's accented. This is why Japanese is considered to be a pitch-accent language, halfway between tonal languages and stressed languages. It doesn't have fixed tones, but it doesn't have fixed stresses either. So, to begin with, the first mora of a word will be of low pitch. All subsequent morae up to and including the pitch accented mora will then be high. All morae following the pitch accented mora will be low again.
Confused? Here are some examples. The word mochiron has its pitch accent on the chi mora. Thus, mo is of low pitch, chi is of high pitch, and ro and n are both low pitched again. The diagram for mochiron would be mo-CHI-ro-n. The word kanshasai has its pitch accent on the sha mora, so the morae ka, sa, and i are low, but the morae n and sha are high, yielding a diagram of ka-N-SHA-sa-i. And the word nanjigoro has its pitch accent on the go mora, meaning that its schematic is the relatively drawn out na-N-JI-GO-ro. Remember that n is its own mora, so that it can actually take a high pitch even though the na before it is at a low pitch. How this is accomplished will be explained in a moment.
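If it helps to see the first two patterns as a procedure, here is a minimal Python sketch. The function name accented_pattern and the convention of counting the accented mora from 1 are my own inventions for illustration, and the morae are typed in by hand rather than detected automatically.

```python
def accented_pattern(morae, accent):
    """Render a word whose pitch accent falls on the 1-based mora `accent`.

    High morae are upper-cased and low morae lower-cased, as in the text.
    """
    if not 1 <= accent <= len(morae):
        raise ValueError("accent must point at one of the word's morae")
    rendered = []
    for position, mora in enumerate(morae, start=1):
        if accent == 1:
            # First-mora accent: only the first mora is high.
            high = position == 1
        else:
            # Otherwise: low first mora, high up to and including the accent.
            high = 1 < position <= accent
        rendered.append(mora.upper() if high else mora.lower())
    return "-".join(rendered)


print(accented_pattern(["shi", "zu", "ka", "na"], 1))      # SHI-zu-ka-na
print(accented_pattern(["mo", "chi", "ro", "n"], 2))       # mo-CHI-ro-n
print(accented_pattern(["ka", "n", "sha", "sa", "i"], 3))  # ka-N-SHA-sa-i
print(accented_pattern(["na", "n", "ji", "go", "ro"], 4))  # na-N-JI-GO-ro
```

Note that the sketch only reproduces the two rules as stated here. It knows nothing about where the accent of a real word actually falls, so you still have to supply that yourself.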
The last pattern is a default of sorts. If there is no pitch accent on any mora of a word, then its first mora begins low, and all subsequent morae, including grammatical particles that are not part of the word itself, take a high pitch. So, for example, the word yotei has no pitch accent, so its intonational pattern is yo-TE-I. If it's the direct object of a sentence, yielding the word plus grammatical particle yoteiwo, the intonational pattern is simply yo-TE-I-WO, with the high intonation spreading into the particle. If you're expressing the idea of "regarding the plan," the word plus grammatical particle is yoteinitsuite and the intonational pattern remains yo-TE-I-NI-TSU-I-TE.
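The default pattern is simple enough to sketch in a few lines of Python. The name unaccented_pattern is hypothetical, and I'm assuming the word and its particles arrive already split into morae.

```python
def unaccented_pattern(word_morae, particle_morae=()):
    """Render a word with no pitch accent, plus any attached particle morae."""
    morae = list(word_morae) + list(particle_morae)
    # The first mora is low; everything after it, particles included, is high.
    return "-".join(
        mora.lower() if index == 0 else mora.upper()
        for index, mora in enumerate(morae)
    )


yotei = ["yo", "te", "i"]
print(unaccented_pattern(yotei))                            # yo-TE-I
print(unaccented_pattern(yotei, ["wo"]))                    # yo-TE-I-WO
print(unaccented_pattern(yotei, ["ni", "tsu", "i", "te"]))  # yo-TE-I-NI-TSU-I-TE
```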
How to say 'now' instead of 'living room'
Now we arrive at the juicy stuff. Japanese is not a tonal language, so it is not the case that one word can have six widely divergent meanings depending on its tone, as in Vietnamese, for example (the word ma, depending on tone, can mean 'ghost,' 'cheek,' 'mother,' 'tomb,' or 'rice seedling,' among other things). Unlike English, however, Japanese does have an extremely small set of words that can take up to three different meanings depending on the placement of the pitch accent. The difficulty of distinguishing between them is magnified by the similarity of non-first-mora pitch accent and zero pitch accent: when the accent sits on a word's final mora, you cannot tell the two apart unless a grammatical particle is present.
The canonical example of this homonym phenomenon is the word hashi. Hashi can mean 'bridge,' 'chopsticks,' or 'edge.' If it has its pitch accent on the first mora (ha), yielding an intonation pattern of HA-shi, it means 'chopsticks.' If it has its pitch accent on the second mora (shi), yielding an intonation pattern of ha-SHI, it means 'bridge.' If it has no pitch accent, it takes the default intonational pattern of ha-SHI and means 'edge.'
The words meaning 'bridge' and 'edge' thus sound exactly the same in isolation. They become distinguishable, however, as soon as they take grammatical particles. Recall that a word with non-first-mora pitch accent will distribute low intonations to all morae following the pitch accented mora. Since the pitch accent of hashi meaning 'bridge' is on the shi mora, any adjoining grammatical particle will automatically take a low intonation. So "as for the bridge" is ha-SHI-wa, "on the bridge" is ha-SHI-ni, "up to the bridge" is ha-SHI-ma-de, etc. In contrast, the zero pitch accent hashi meaning 'edge' distributes high intonations to its grammatical particles, yielding ha-SHI-WA, ha-SHI-NI, ha-SHI-MA-DE, etc.
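To see all three readings of hashi side by side, here is a rough Python sketch. The pitch_pattern function and its convention of 0 for "no accent" are assumptions of mine, not standard notation. I'm also assuming that a particle after a first-mora accent stays low, which follows from the rule that everything after the accented mora is low.

```python
def pitch_pattern(word_morae, accent, particle_morae=()):
    """accent: 0 for no pitch accent, otherwise the 1-based accented mora."""
    morae = list(word_morae) + list(particle_morae)
    rendered = []
    for position, mora in enumerate(morae, start=1):
        if accent == 0:
            high = position > 1            # default: low start, then high throughout
        elif accent == 1:
            high = position == 1           # first-mora accent: high start, then low
        else:
            high = 1 < position <= accent  # high only up to the accented mora
        rendered.append(mora.upper() if high else mora.lower())
    return "-".join(rendered)


HASHI = ["ha", "shi"]
ACCENTS = {"chopsticks": 1, "bridge": 2, "edge": 0}

for meaning, accent in ACCENTS.items():
    alone = pitch_pattern(HASHI, accent)
    with_wa = pitch_pattern(HASHI, accent, ["wa"])
    print(meaning, alone, with_wa)
```

Run as written, 'bridge' and 'edge' print the same pattern in isolation and only part ways once wa is attached, which is exactly the trap described above.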
The number of words that can take all three variations of pitch accent is extremely limited. Words that can take two of the variations (usually zero and first mora pitch accent) are somewhat less rare, but still make up a small portion of the vocabulary. The majority of Japanese vocabulary consists of compounds built from semi-meaningful units of one or two morae each, and the complex factors that govern how those units combine phonologically leave little room for pitch accent variation.
You didn't think it was that easy, did you?
In all the examples I've given you, you'll notice that the pitch accent fell on easily recognizable morae. But Japanese allows for both doubled vowels and doubled consonants, which count as two morae. So, for example, the word yappari is not made up of three morae, but four: ya-p-pa-ri. Likewise, the word tanoshii isn't made up of three morae, but four: ta-no-shi-i. So what happens when a pitch accent falls on one of these doubled entities?
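Counting morae in romanized words is mechanical enough to sketch in Python. What follows is a simplified splitter of my own devising (split_morae is a hypothetical name), and it assumes Hepburn-style romaji written without spaces. It treats 'n' as its own mora whenever no vowel or 'y' follows it, so an ambiguous spelling needs an apostrophe after the 'n' to force that reading.

```python
VOWELS = "aiueo"


def split_morae(romaji):
    """Split Hepburn-style romaji (no spaces) into a list of morae."""
    text = romaji.lower()
    morae = []
    i = 0
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch == "'":
            i += 1                  # an apostrophe only marks a preceding moraic n
        elif ch in VOWELS:
            morae.append(ch)        # a bare vowel is a mora by itself
            i += 1
        elif ch == "n" and (nxt == "" or nxt not in VOWELS + "y"):
            morae.append("n")       # moraic n: no vowel or y follows it
            i += 1
        elif nxt == ch or (ch == "t" and text[i + 1:i + 3] == "ch"):
            morae.append(ch)        # first half of a doubled consonant
            i += 1
        else:
            j = i
            while j < len(text) and text[j] not in VOWELS:
                j += 1              # gather the consonant letters up to the vowel
            if j == len(text):
                raise ValueError(f"dangling consonant in {romaji!r}")
            morae.append(text[i:j + 1])
            i = j + 1
    return morae


print(split_morae("yappari"))    # ['ya', 'p', 'pa', 'ri']
print(split_morae("tanoshii"))   # ['ta', 'no', 'shi', 'i']
print(split_morae("kanshasai"))  # ['ka', 'n', 'sha', 'sa', 'i']
```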
Well, it at least simplifies matters a bit that a doubled consonant cannot carry a pitch accent. You don't have to worry about the 'p' of yappari somehow being of higher pitch than the 'pa.' Doubled vowels, however, can carry pitch accent, and in this case the differentiation you make is no longer one of stepping up or down, but of contour. So, in the example of tanoshii above, the pitch accent falls on shi. In rapid speech, there is not a discrete step down from shi to i. Instead, the pitch falls uniformly across shii from high to low, making a contour.
This, incidentally, is how you accomplish pitch changes between vowels and moraic 'n.' Since 'n' actually nasalizes the vowel before it as it's being pronounced, a contour is stretched over the nasalized vowel, either gliding from low to high or high to low.
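One way to picture these contours is to merge each doubled vowel or vowel-plus-'n' pair into a single unit and label how the pitch moves across it. The Python sketch below does just that. The name describe_contours and the labels 'rising' and 'falling' are my own, and the H and L levels for each mora are supplied by hand.

```python
def describe_contours(morae, levels):
    """Merge long vowels and vowel+n pairs, then label each unit's pitch movement.

    `levels` holds one "H" or "L" per mora, in order.
    """
    units = []  # each entry: [text, first_level, last_level]
    for mora, level in zip(morae, levels):
        prev = units[-1] if units else None
        doubles_vowel = (
            prev is not None and len(mora) == 1
            and mora in "aiueo" and prev[0][-1] == mora
        )
        moraic_n = prev is not None and mora == "n" and prev[0][-1] in "aiueo"
        if doubles_vowel or moraic_n:
            prev[0] += mora   # stretch the previous unit over this mora
            prev[2] = level   # and remember where its pitch ends up
        else:
            units.append([mora, level, level])
    names = {"HH": "high", "LL": "low", "HL": "falling", "LH": "rising"}
    return [(text, names[first + last]) for text, first, last in units]


print(describe_contours(["ta", "no", "shi", "i"], "LHHL"))
# [('ta', 'low'), ('no', 'high'), ('shii', 'falling')]
print(describe_contours(["ka", "n", "sha", "sa", "i"], "LHHLL"))
# [('kan', 'rising'), ('sha', 'high'), ('sa', 'low'), ('i', 'low')]
```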
Some rules of thumb
Very few exceptions. Rejoice.
Pitch accent on individual words in Japanese is not something you can determine by context, meaning, or any other immediately apparent factor of the word. It has everything to do with complicated patterns of etymology. Pitch accent on compound nouns is somewhat more predictable, but not by much. You'll still be left memorizing where the pitch accent falls until you gain an ear for what 'sounds right' and what doesn't.
With verbs and i-adjectives, however, you're in luck. Politely conjugated verbs and all i-adjectives have regular patterns of pitch accent. Where the pitch accent falls on a verb depends on its conjugation, but it holds consistently across any given conjugation. So, for example, all verbs in affirmative, polite conjugation, imperfect and perfect (ending in -masu or -mashita) have their pitch accent on ma. Thus hajimemasu ("I begin something") has a pattern of ha-JI-ME-MA-su and hanashimashita ("I spoke") has a pattern of ha-NA-SHI-MA-shi-ta.
Likewise, since i-adjectives always end in two vowels, the second of which is i (thus the name of this grammatical category), pitch accent always falls on the first of those two vowels. Thus tanoshii has a pattern of ta-NO-SHI-i, atatakai has a pattern of a-TA-TA-KA-i, and so on and so forth. This, again, changes when you conjugate the adjective for perfect tense or negative aspect, but holds consistent across conjugations.
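These two rules of thumb can be written down as a pair of small Python helpers. This is only a sketch of the rules as I've stated them: the function names are made up, the morae are entered by hand, and only the -masu, -mashita, and plain i-adjective forms mentioned above are covered.

```python
def pitch_pattern(morae, accent):
    """Upper-case the high morae; `accent` is the 1-based accented mora."""
    if accent == 1:
        high = {1}
    else:
        high = set(range(2, accent + 1))
    return "-".join(
        mora.upper() if position in high else mora
        for position, mora in enumerate(morae, start=1)
    )


def polite_verb_accent(morae):
    """Accent position for -masu and -mashita forms: the 'ma' of the ending."""
    if morae[-2:] == ["ma", "su"]:
        return len(morae) - 1
    if morae[-3:] == ["ma", "shi", "ta"]:
        return len(morae) - 2
    raise ValueError("expected a form ending in -masu or -mashita")


def i_adjective_accent(morae):
    """Accent position for a plain i-adjective: the mora just before the final i."""
    if len(morae) < 2 or morae[-1] != "i":
        raise ValueError("expected an adjective ending in a separate i mora")
    return len(morae) - 1


hajimemasu = ["ha", "ji", "me", "ma", "su"]
tanoshii = ["ta", "no", "shi", "i"]
print(pitch_pattern(hajimemasu, polite_verb_accent(hajimemasu)))  # ha-JI-ME-MA-su
print(pitch_pattern(tanoshii, i_adjective_accent(tanoshii)))      # ta-NO-SHI-i
```

Looking for the ending rather than for the first ma in the word matters, since a verb stem can contain a ma of its own.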
This is the whole breakdown of the Japanese pitch accent system at the word level. Unfortunately, just knowing how to correctly pitch a single word with its grammatical particles can't tell you everything you need to know about Japanese tonal variations, as noun phrases in total also vary their pitches according to complex rules of association with surrounding elements. Knowing how to correctly pronounce Japanese words in isolation, however, gives you a much firmer grasp on the basics of the language, and will help you far along the way of training your ear to parse out the correct tonal patterns of a fluent speaker.
Noto, Hiroyoshi. Communicating in Japanese. Ann Arbor: XanEdu OriginalWorks, 2003.
Lectures on the topic of Japanese linguistics from Prof. Mieko Kawai and Prof. Misa Miyachi, Center for East Asian Studies, University of Chicago.