Shenmue 4 leak audio review trailer

If you need to make an ambitious game with lots of assets and voice acting, it's a great way to handle those 'extra' details. I certainly wouldn't want the AI to dictate the gameplay or level design, however. I am open to it as a way to help keep costs down while developers focus on the parts that matter. I've always been in favor of cutting voice acting, etc. and this is a great way to do that without taking it away.
I know it sucks to get jerked around emotionally by something like this, but I think some people filtering through this might be a little too deep in the feels. I personally don't think it's all that deep.

I've been a little hesitant to actually say this, but if this is as fake as I feel it is, it's absolutely made by someone who's been sitting at the very bottom of the Shenmue iceberg for a very long time. There's no way around that.
I feel like it's probably more of an experiment, though. Yeah, maybe an experiment at the expense of Shenmue fans, but if I wanted to see how well I could do creating my own Japanese dialogue for like a fan Shenmue game or something, I might try something like this. (Assuming I didn't actually know much Japanese.)

Yes, it might indeed be a weird experiment. That's a nice theory. My post might also have come off as more aggressive than I intended. I am not mad and also not disappointed.

Was just sharing my theories about the matter.

It is fun to speculate about such stuff.

The AI theory has some merit, especially with the word "AI" coming up so often.

Maybe an AI even created it, for some experimental purpose.
I'm really torn on this. It could be real, but it certainly could be fake. Consider that there is a guy on YouTube who made an AI play the first Tomb Raider. The AI can look around, recognize textures, pick out ones it has never seen before, and even compare them to ones it has already seen.

The AI plays through the levels by itself. The guy even programmed Lara's AI to react VERBALLY to what she is seeing or doing in the game and in her environment. While programming the verbal reactions, he fed the AI Lara's character traits so she would react naturally.

It's so interesting!! I'd recommend watching these videos. Then you can rethink whether this audio leak is real or not.

The first video is about how he made this. It's amazing.

I'm kind of familiar with the RVC-Project, which is the most straightforward approach for AI-assisted voice cloning. I'd say it's in the realm of what AI can do for sure, but it would take a tremendous amount of work to be that convincing, especially if no real speech was used. Errors of language would indicate a text-to-speech approach. I've done a Ryo model before with pretty clean training data and it was never that good.

Also, first post, hi! Don't mind me.
There was that recent GTA Online mod where you could choose one of three AI partners to go around patrolling the city as a cop. Their ability to chat with the player was okay... responses were a little slow for realtime interaction, but it was definitely parsing what the player said and coming up with in-character responses that weren't scripted. You could tell that they were sort of trying to conceal the processing time to parse the player's input and come up with a response, usually saying something like "Well... [long pause]... [response]."

However, they were not emoting very well... not even nearly as well as the stiff acting of minor characters in the English language versions of the first couple of Shenmue games. And definitely not as well as the acting in the audio file that this thread concerns. ElevenLabs has some new features where you can ascribe some vocal traits to passages, like "raspy voice" or things like that, which will help with expressing different emotions and settings, but those were rolled out very recently.
Ugh, I hate this. I've been mulling over ways to reverse engineer this audio for most of the week, assuming it was completely fake. I still can't believe it's just a standard audio recording of a preview trailer, though.

I've seen a Japanese Vtuber trying to make a short novel with ChatGPT that had some interesting results. I was sort of thinking the dialogue might have been scripted with ChatGPT because of that. I hadn't actually used ChatGPT myself before today, but it feels like there are a lot of hurdles in getting even the level of conversation in the leaked audio. It's not quite as simple as I thought it might be. It could be possible to really hand hold it through the process and get something a little more detailed, but I don't really intend to spend that much time on it.
This isn't all that interesting, but just for fun, here's how the conversation went:
As expected, though, it got wildly confused about writing a lot of the names in Japanese (wrong character for "Yu"), and doesn't know who Mei Mei is. I was thinking Mei Mei was the character mentioned by Kenji Miyawaki in the Kickstarter interview, but I was just rewatching it now, and he definitely says her name is Min Min, not Mei Mei. Now that I think about it, I'm pretty sure the same person who joined the Discord and said Chankakai is Zhangjiajie also said that Mei Mei is Ming Ming (which seems to be how Min Min was pronounced in the English translation of the Miyawaki interview).

The other theory I had was it's just a script run through DeepL, and read from that output by AI generated text to speech voices. Splice in some scream and grunting noises from the games, combat and UI sfx from Shenmue III, layer in some audio tracks for background audio, noise, etc, and a stochastic noise track over everything to mask some of the hitches in the dialogue. Bingo, bang-o, bongo. Not a small amount of effort by any means, and getting good enough voices would be the most laborious part.
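For what it's worth, the layering step in that theory is the easy part. Below is a rough sketch of it in plain Python, not a claim about how the clip was actually made. The function names (`mix_tracks`, `add_noise_floor`, `clip`), the gains, the noise level, and the sine-wave stand-ins for dialogue and background audio are all made up for illustration. Real tracks would come from decoded audio files.

```python
import math
import random


def mix_tracks(tracks, gains):
    """Sum mono tracks sample by sample; shorter tracks are padded with silence."""
    length = max(len(track) for track in tracks)
    mixed = [0.0] * length
    for track, gain in zip(tracks, gains):
        for i, sample in enumerate(track):
            mixed[i] += gain * sample
    return mixed


def add_noise_floor(samples, level, seed=0):
    """Add uniform random noise in [-level, level] to every sample."""
    rng = random.Random(seed)
    return [s + rng.uniform(-level, level) for s in samples]


def clip(samples, limit=1.0):
    """Clamp samples to [-limit, limit] so the summed mix cannot overflow."""
    return [max(-limit, min(limit, s)) for s in samples]


# Placeholder signals (assumptions): one second of "dialogue", two of "ambience".
RATE = 8000
dialogue = [0.5 * math.sin(2 * math.pi * 220 * n / RATE) for n in range(RATE)]
ambience = [0.2 * math.sin(2 * math.pi * 60 * n / RATE) for n in range(2 * RATE)]

layered = mix_tracks([dialogue, ambience], gains=[1.0, 0.6])
trailer = clip(add_noise_floor(layered, level=0.02))
```

The noise pass runs over the whole mix, which is the point of the idea: a constant noise floor makes small discontinuities between spliced lines harder to pick out by ear.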

There's too much wrong with this thing for me to just believe it straight up, but it still feels like an outsized effort for its apparent purpose. However, if I were building something as a proof of concept for VA recreation, I could see most of that effort being assumed by some larger project. (Can't help feeling like I'm somehow impeding some sort of background fan effort.)

Personally, I've sort of felt like Shenmue IV has been in active development for a while now, regardless of this, anyway. I think I'm going to step away from this for now. I don't like it.
I mean if it IS real, we will see soon enough.

There's still TGS and TGA, and I'm certain Sony will have another event *after* Spider-Man comes out; we don't have much of an idea of their 2024 slate beyond FF7 Rebirth.
The sad part is that, if it's real but is a pitch to potential investors, we don't have any guarantee the pitch was greenlighted.

So, there's no guarantee the game is in development or we are getting a proper reveal any time.
@JCJ I was also thinking last night this could be part of the investment pitch we knew about all the way back in 2020. That would explain the 'proper' voice acting?
That said, if any investors did bite, surely YsNet would have something to show around now. Hopefully.
I await TGS with anticipation like the rest of you and hope for a positive outcome
The most interesting question (if this is real) is who is the brave publisher? You need balls to release Shenmue 4. :D

SEGA or some new publisher (with money) can get the most out of it. 110 were suitable, but we found out that they had no money. SEGA however can do it. Even just for a good image and boosting sales for the first games. SEGA would benefit the most in any case.
This is why we need to continue to make noise.

Without us, Shenmue slowly dies and fades into obscurity.
The *idea* that this recording was from that potential "suitor" meeting back in 2020 only adds further mystique, imo.

If it was from that, why would someone randomly wait until 2023 to suddenly create an account, fire it off on here and Twitter and then vanish? That's a lot of effort and secrecy for something from years ago.

The only plausible explanations I can think of are that it is more recent than that and something *else*, OR it was posted to ease the minds of fans, because when it occurred years ago it was greenlit by whoever it was (110?).

I also think Yu probably learned from the experiences on Kickstarter and from Deep Silver's involvement to keep a tight lid on things until it's ready to show.

You don't want unpolished WiP trailers and stuff. He'll want it to be ready for a good trailer if and when it shows up.
I'm rather late to the show (have been away for a few days) but have to say that I took it to be genuine at first, but I am growing less convinced as each listen reveals further things that sound "off" in some way.

While one or two being present might be able to be overlooked, their frequency makes for an uncomfortable listening experience overall when listening closely. Several have already been highlighted by previous posters, but just to pick out a few in particular that stood out:

* Pronunciation issues: in particular Lan Di (which can be heard in multiple places as "Lan Ti" rather than "Lan Tei") and Tentei (where Ryo says "Ten-tee?"). In another place Shenhua says "何者だ" but pronounces it as "nan-mono da" rather than "nani-mono da", which is a small difference but gives it a non-natural flow.
* Inappropriate word choice / politeness level: we hear Shenhua replying to the old man using the word お前 (o-mae). This word means "you" but is rarely used by females, and would be especially rude to use toward an elder. She also uses the word "俺たち" for "we", again a masculine way of speaking that suits someone like Ren but is very un-Shenhua-like!

Although these may sound to be small issues on their own, they would surely have been noticed and corrected before the trailer was ever shown. I tend to agree with those who suggest this was put together using AI tools or similar, by someone who is not a native Japanese speaker (but is very familiar with the lore).
Thanks Switch. Does Yu-san sound "off"? It could still be a genuine trailer that isn't polished, but if Yu sounds like AI then it's 100% fake.
Just a musing:

It may not be real, but sometimes I wonder if over the years we've become so used to the rumor mill or the void of silence that we've just become giant cynics, lol.
The audio would fit perfectly in an episode of Fringe. It makes people want to touch some grass after hearing it.

Now seriously, I'm recalling YS talking about problems with the delay between input and response. If he feels a specific mechanic required by his vision needs it, he will keep the project in the oven until the tech reaches that point (could be 1-2 years).
About his voice in the trailer... maybe it's distorted by the recording, but it also seems like it was made from old videos of him when he was younger. I know that I know nothing.
I listened to Air Twister's presentation and his voice sounds pretty close, but also slightly different. Of course in one case you are on stage, it's noisy, music is loud, etc., and in the other case they are possibly in some room and someone is recording with a phone/recorder on. It's hard to say, but let's hear from someone who knows more. To be honest, I have no experience with how well AI voices can imitate a human.
It sounds close to his voice, I think. But there's something else that seems strange: his word choice at the end, which @Rydeen pointed out earlier, where he says what sounds like "yakujo suru", which doesn't make sense. A similar-sounding word like "sakujo suru" would fit (which would mean that he might remove the part about Tentei).