
Sam's Japanese Journey: Day 17 — Finding My Voice

The Microphone Button

I have been staring at the microphone button for seventeen days.

It sits there in the JIVX interface, small and unassuming, next to the text input field. Every day I type my answers. Every day the microphone watches me type, saying nothing, judging nothing, just waiting. Like a side quest marker I have been walking past because I am not leveled up enough yet.

Today I tapped it.

I do not know what made today different. Maybe it was the momentum from the breakthrough on Day 13, the steady accuracy climb since then, the growing confidence that I can actually construct sentences correctly. Maybe it was that Kenji -- my Japanese-speaking coworker who has been giving me increasingly curious looks for a week -- finally asked me at lunch today: "What are you studying? I keep hearing you mutter Japanese words."

"Japanese," I said.

He looked at me for a long moment. Then he nodded, not the dismissive nod from the first week, but something closer to... respect? Or at least genuine curiosity. "How long?"

"Seventeen days."

Another nod. He went back to his lunch. But the fact that he asked -- that someone noticed, that the words I have been whispering to my cat and my fridge and my commute were loud enough for another human to register -- made me want to be louder.

So I tapped the microphone.

Speaking Into the Void

The first sentence was school vocabulary: "I like math class." A straightforward ga sentence. Math class [close-up -- this is the specific thing being spotlighted] is liked. The camera metaphor mapped cleanly: suugaku no jugyou ga suki desu.

I said it out loud. Into my phone. In my apartment. Where the walls are thin and my roommate was definitely home.

My voice came out about an octave higher than normal, which I am going to blame on the microphone and not on the fact that I was terrified. "Suugaku no jugyou ga suki desu."

The app thought about it. A brief pause that lasted approximately seventeen years. And then: correct.

N5 · school

I like math class.

Neutral

数学(すうがく)の授業(じゅぎょう)が好(す)きです。

Casual

数学(すうがく)の授業(じゅぎょう)が好(す)きだ。

Vocabulary
数学: mathematics
授業: class, lesson
好き: to like
Grammar
の: possessive/modifier particle

I said a complete Japanese sentence out loud and a computer understood me. Not Mochi (who meows at anything). Not the fridge (which responds to nothing). An actual speech recognition system parsed my pronunciation, matched it against the expected answer, and said yes. That is you. That is what you said. That is correct Japanese.

The の particle connecting suugaku and jugyou -- math's class, math class -- is doing quiet work here. It links two nouns the way apostrophe-s does in English but with more flexibility. Math の class が liked. Every particle is a tiny machine doing its specific job, and I can hear them now when I speak. Not just see them when I type.

The Milestone

Second sentence. And this is the one that nearly broke me, in the best possible way.

"I can read hiragana."

Four words in English. A simple statement of ability. But for me, sitting in my apartment on Day 17 of learning Japanese, this sentence was not an exercise. It was a fact.

I can read hiragana. Not all of it fluently, not without occasionally mixing up す and む, but I can look at hiragana characters and know what they say. Two and a half weeks ago, they were just shapes. Pretty, mysterious shapes that might as well have been decorative wallpaper. Now they are sounds. They are words. They are mine.

I tapped the microphone. I took a breath. And I said it.

"Hiragana ga yomemasu."

N5 · school

I can read hiragana.

Neutral

ひらがなが読(よ)めます。

Casual

ひらがなが読(よ)める。

Vocabulary
ひらがな: hiragana
読める: can read
Grammar
〜える/られる: potential form (can do)

Correct.

I stared at the screen for a long time. The potential form -- yomeru, can read, from yomu, to read -- expresses ability. And here I was, using that form to describe an ability I actually have. Hiragana ga yomemasu is not a hypothetical exercise. It is a true statement about my life that did not exist three weeks ago.

Were there tears? I am going to say no because I am a 28-year-old QA tester who does not cry about reading systems. But my eyes were very... humid. Allergies. March is allergy season. Moving on.

The ga here is perfect too. Hiragana ga yomemasu. Close-up on hiragana -- it is the specific thing I am highlighting my ability to read. Not katakana (not yet), not kanji (definitely not yet). Hiragana. Zoom in. This one. I can read this one.

The Practice Behind the Practice

Third sentence brought me back to earth: "I practice writing kanji." Aspirational, since my kanji recognition is still in its infancy. But the sentence structure was a welcome challenge.

My voice attempt: "Kanji wo kaku renshuu wo shimasu." Two wo particles in one sentence -- one marking kanji as the object of writing, one marking practice as the object of doing. I stumbled on the second wo and had to repeat myself. The app accepted it on the second try.

N5 · school

I practice writing kanji.

Neutral

漢字(かんじ)を書(か)く練習(れんしゅう)をします。

Casual

漢字(かんじ)を書(か)く練習(れんしゅう)をする。

Vocabulary
漢字: Chinese characters
書く: to write
練習: practice
練習する: to practice
Grammar
を: object particle

This sentence has a structure I had not seen before: verb-modifying-noun. Kaku renshuu -- writing practice. The verb kaku (to write) modifies the noun renshuu (practice), creating a compound concept. In English we would say "writing practice" or "the practice of writing." In Japanese, you just put the verb before the noun. It is more compact. More efficient. Like refactoring a function to remove unnecessary wrapper code.

Speaking it out loud was harder than typing it. When I type, I can pause, delete, rethink. When I speak, the words have to come out in real time, in order, with the right particles in the right places. It is the difference between writing code in an IDE with autocomplete and writing it on a whiteboard during an interview. Same language, completely different pressure.

The Voice Changes Everything

I did all three sentences by voice today. Three sentences spoken aloud into a microphone, processed by a machine, graded by AI. My accuracy is at 75%, the highest it has been since the early optimistic days of Week 1 when I was too naive to know what I did not know.

But the number is not what matters. What matters is that I spoke. For seventeen days I have been a Japanese reader, a Japanese typer, a Japanese thinker-in-my-head. Today I became a Japanese speaker. A bad one. A slow one. One who pronounces "renshuu" like they are sneezing. But a speaker.

Mochi heard the whole session from the doorway. Three full sentences of Japanese, spoken at normal volume, into a phone. She meowed after each one. I am going to count that as peer review.

Kenji asked what I am studying. The answer is: everything. But today, specifically, I am studying the sound of my own voice saying things I could not say three weeks ago.

Hiragana ga yomemasu. I can read hiragana. I said it and I meant it and the machine confirmed it.

That is a good day.

Day 17 Stats

Sentences: 51
Accuracy: 75%
Streak: 17

Key Takeaway

Voice input transforms language learning from recognition to production. Typing lets you pause and edit -- speaking forces real-time recall of vocabulary, particles, and structure all at once. The first time you speak a sentence and get it right, you will know the difference between knowing Japanese and using it.