[RELEASE] Shenmue Audio Restoration Project - Shenmue English & Japanese Dub - Shenmue II English

Shenmue Audio Restoration Project


What is it?

The Shenmue Audio Restoration Project aims to restore the voices to their full quality, as their quality on PC is substandard at best, leaving much to be desired from this port.


What do you mean "restore the voices?"

When D3T ported this game, they converted all of the audio from the Yamaha AICA ADPCM Stream format, to the xWMA format. For this use, xWMA is wholly unsuitable, and completely destroys any semblance of audio quality. Beyond that, while the xWMA encoder they used (more on how I know what specific encoder they used in a minute) encodes audio at 48kbps by default, D3T opted to encode (most of) the voice files in the game at 20kbps. In the case of Shenmue, this is especially bad because the audio quality for the voices was already pretty poor (understandable for the time, but still poor). The aim of this project is to restore the original voice files as 1:1 as possible.


So if xWMA files are bad, and they're what the game uses; how are you doing this?

Well, it just so happens that the game also accepts .WAV files if you name them identically to what D3T did. Oh yeah, they also changed the filenames from something descriptive like 01AYUA008 (yes, that's actually descriptive) to generic filenames like file_1, file_2, file_3 etc. This wouldn't be an issue, except not everything lines up 1:1. The first file (alphabetically) in the Dreamcast archive isn't necessarily the first file in the PC archive.

That seems like a problem; how did you find the correct filenames?

Well this is where things get interesting. So, as I said, D3T changed the filenames to generic ones like file_1, but what's really interesting are the file extensions. The files in the PC release's archives end in .wav.xma. Considering the Shenmue Translation Pack converts the .str files from the Dreamcast release into .wav, I took a wild guess that they used the Shenmue Translation Pack in making this port. Noticing that extension is how I figured out how to load .wav files into the game.

It took me a few more days before I really figured out what was up. At first, I thought I was going to have to listen to each file, transcribe it, do the same for the other release, and then compare the two by hand. In desperation to find a way to make this easier, I decided to try and encode some xWMA files myself, from my .WAV sources, and see if I could use filesizes to compare them. In my experimentation I noticed I had two files that weren't just close in size, they were identical, so of course, I checked their MD5 hashes. It would be absolutely absurd, but super useful if I were able to generate identical files... Which is exactly what I wound up doing, the MD5 hashes matched. I checked it again and again and again. It matched.

From there, everything just sort of spiraled, and I got my good friend DaviDokuro to whip up some scripts to compare my XMAs against theirs, and rename .wav files wherever there was a match. It's one of the dumbest ideas I've ever had; and it worked perfectly. Many revisions of the script later, and here we are.
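
For anyone who wants to see the idea in code, here is a minimal Python sketch of the hash matching described above. It is not DaviDokuro's actual script. The directory layout, the function names, and the choice to save each WAV under the PC file's exact name are all assumptions made for illustration.

```python
import hashlib
import shutil
from pathlib import Path


def md5_of(path: Path) -> str:
    """Return the MD5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def match_by_hash(reencoded_dir: Path, pc_dir: Path) -> dict[str, str]:
    """Map each Dreamcast file stem to the PC file whose bytes are identical.

    reencoded_dir holds our own encodes, still carrying the descriptive
    Dreamcast names; pc_dir holds the files extracted from the PC archives.
    """
    # Index the PC files by hash (assumes no two PC files share content).
    pc_by_hash = {
        md5_of(p): p.name for p in sorted(pc_dir.iterdir()) if p.is_file()
    }
    matches = {}
    for encoded in sorted(reencoded_dir.iterdir()):
        if not encoded.is_file():
            continue
        pc_name = pc_by_hash.get(md5_of(encoded))
        if pc_name is not None:
            matches[encoded.stem] = pc_name
    return matches


def copy_renamed_wavs(matches: dict[str, str], wav_dir: Path, out_dir: Path) -> None:
    """Copy each matched source WAV to out_dir under the PC file's name."""
    out_dir.mkdir(parents=True, exist_ok=True)
    for stem, pc_name in matches.items():
        source = wav_dir / f"{stem}.wav"
        if source.exists():
            shutil.copyfile(source, out_dir / pc_name)
```

Running match_by_hash once per bitrate pass and merging the resulting dictionaries would cover files encoded at different bitrates; anything left unmatched after that has to be tracked down by hand.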

Wait so... Where are we?

Well, we're in a few places.

First off, we're on MEGA and Nexus Mods. Download the mod from the link below, unzip the scene folder inside to your mods folder (Forklift mod loader required), and you're good to go.

We're also on Github. Remember how I said DaviDokuro whipped up a few scripts for me? Well, we put them on Github, so anyone can go through the process I did: extracting WAVs, extracting XWMs, and converting WAVs to XWMs (twice, once at 20kbps and once at 48kbps, in order to match 99.9% of filenames; there are something like 10-15 files throughout the whole game that still don't match up because they're encoded at 32kbps or were otherwise altered). This script is called pcdcall and it can be found on his Github, which I will link at the bottom of this post.

There's one more thing on that Github as well. We realized that the whole matching filenames thing only has to be done once. As long as we log it, we never have to do it again... And we can write a script to just rename the .WAV files; then you just have to unpack the PC archives, replace the .xwm files with our fake .xwm files, and re-pack the archives. Unfortunately, you would still need to know the hashes of the PC afs files... luckily those are also all documented on the Github.
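
As a rough illustration of the "log it once, rename forever" idea, here is a hedged Python sketch. The CSV column names (dc_name, pc_name) and the function names are hypothetical; the real CSVs on the Github may be laid out differently, so treat this as a sketch of the approach rather than a drop-in tool.

```python
import csv
import shutil
from pathlib import Path


def load_rename_log(csv_path: Path) -> dict[str, str]:
    """Read a mapping of Dreamcast file stem -> PC file name from a CSV log.

    Assumes a header row with the hypothetical columns dc_name and pc_name.
    """
    with csv_path.open(newline="", encoding="utf-8") as handle:
        return {row["dc_name"]: row["pc_name"] for row in csv.DictReader(handle)}


def apply_rename_log(log: dict[str, str], wav_dir: Path, out_dir: Path) -> list[str]:
    """Copy each logged WAV to out_dir under its PC name.

    Returns the Dreamcast stems that had no WAV in wav_dir, so gaps are
    reported instead of silently skipped.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    missing = []
    for dc_name, pc_name in log.items():
        source = wav_dir / f"{dc_name}.wav"
        if source.exists():
            shutil.copyfile(source, out_dir / pc_name)
        else:
            missing.append(dc_name)
    return missing
```

The files in out_dir would then stand in for the originals when the PC archives are re-packed.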

So what's next?

Well, Shenmue is now done, and it's time to get to work on Shenmue II. I think I have most of the groundwork laid out for it; but it's going to be a bit more involved of a process (there are some things I just can't script, unfortunately), and there are some things that I haven't been able to figure out yet; but that should be solved soon, since I'll be changing the order in which I wanted to do things, and am now doing English first for Shenmue II, since it'll be easiest for me to figure out problems with.

Shenmue II Japanese is next. I don't know yet if the game uses the actual Japanese dub, or the European Japanese dub... so I guess I'll do both. Not sure which I'm going to do first.

Alright, I downloaded your mod and I think I found an issue

Well, reach out to me either here, on reddit ( /u/Jianmin_Tao ) or on the Nexus. I can't guarantee any support (and I will not be supporting the scripts on Github, half because that's DaviDokuro's job, and half because they're adequate for my usage, and I have not experienced any bugs), because tracking down individual files in this is a nightmare; but I'll take a look into it.

I also want to note that there are a few (specifically two) known issues that I'm not entirely sure how I'm going to handle going forward, so keep in mind that these may be fixed in an update, and they also may not.

- I did not convert the BGM01/BGM02/BGM03 archives; these contain things such as the intro/ending cutscene's BGM, the item collection sound(s), and the notebook music. They also may or may not be present in the .CSV files on the Github (I legitimately just don't remember). I did not convert these because they're all music or SFX, and XWMA isn't too horrible a choice (I still wouldn't recommend it though) for this use case, so I can stomach the audio. Straight music files in the AFS archives also seem to be a bit wonky. I think I know how I can counteract this, but it'll take some playing around, so I'm going to finish the dubs, then work on those.

- Some voice clips were legitimately edited before being inserted into the game; I did not re-do those edits, so when those lines get triggered, something funky may happen; I don't know, I've done almost no testing whatsoever on this pack, all I know is that everything should work. These files ARE noted in the CSVs on the Github, so you don't have to worry about them unless you're doing the whole conversion start-to-finish as I have, which nobody should ever ever ever ever have to do again.

- This isn't necessarily a bug, but just something to watch out for if you're doing all the conversions start-to-finish. It seems the version of xwmaencode you use matters. I do not know which version it is that I have that works; I just know the one DaviDokuro got the other day didn't work.

Instructions:
1) Install the Forklift Mod Loader
2) Move the Scene folder from the zip you downloaded into your Mods folder
3) Play the game without your ears wanting to murder you

Special Thanks

Raymonf - For doing so much work on capturing/reversing hashes. I couldn't have even begun this project without Wulinshu and your unpacking tools.

SHENTRAD Team - I cannot overstate just how instrumental your translation pack has been to the rereleases of these games.

DaviDokuro - Lead Programmer on this project, and just a damn good friend.

Lemonhaze - Dude does too much for us. He's been absolutely TEARING APART the executables, and he also wrote me a script to automate AHX conversion. Dude's been too awesome, and Shenmue II (English) being done now instead of sometime next year is directly because of him.

Bluemue - For being my guinea pig, as well as a repository of knowledge on the original releases, you really helped a lot dude.

I'd also like to thank anyone that's been following along with this project. It's been a hell of a ride and I can't believe that we've gotten this far in under a month.

Downloads and Links

Nexus Mods: https://www.nexusmods.com/shenmue/mods/23/
UPDATE: Shenmue II English is now available

Shenmue Audio Restoration Project Toolset: http://github.com/davidokuro/SARP-Toolset/

Youtube: https://www.youtube.com/channel/UCmuMJLASoCAvOSojt1gjZKQ
 
Are you going to make a thread about this for the Steam community? I think they'd appreciate knowing about it.

Also, could the future mod tools you mention theoretically be used to do a Shenmue fan dub in any language?
 
Shit, I knew I forgot to make a thread SOMEWHERE; and those future mod tools can DEFINITELY be used for that, although you wouldn't even really need the TOOLS part of it, just the documentation part.
 
Awesome you work very quick! Looking forward to installing this once I start my playthrough.
 
Just been trying to play the first Shenmue in Japanese and the sound quality is even worse than in English, not just in overall quality (which is worse, high pitched parts literally distort) but most of the verbal lines also end with a pop/click noise...
 
First post has been updated to accommodate the new release

Yeah, the pop/click thing seems to be related to the game itself, not so much the voice files. Either way, Japanese is my next target after beating this game.
 
Just wanted to throw a quick note up here, and let it be known. I -WILL- be fixing Shenmue II

Awesome! Does this fix include the music? As I understand, you've been transcoding the Dreamcast audio files at higher bitrates than d3t did, but as the music was produced originally on the Dreamcast hardware, I'm not sure how that would work? Unless you retranscode the Xbox audio files, I guess.
 
No, I won't be fixing the music. You have a sort of correct idea of what's happening. Essentially, I'm re-doing about half of what D3T did to the audio. In order to do this though, I also have to re-create their whole audio workflow. So I'm doing all the work they did, in order to do half the work they did. It's a little bit odd to explain without showing demonstrations; but essentially I had to track down the exact tools (down to version numbers in some cases) they used to make this port.

That being said, while my previous bitching about not understanding WHY they used the formats they did still stands, I just want to stress harder than ever before: this port is FUCKING INCREDIBLE, even with all the bugs. It's essentially the same thing as Freeablo; or perhaps even more akin to URDE (Metroid Prime to PC port that's currently being made by fans by meticulously reverse engineering Metroid Prime and re-implementing it line by line). I'm growing more and more certain that they had absolutely NO assets, and very very little, if any, source. They were basically handed copies of the games, and were told "have fun". This port was absolutely a labor of love, and it only exists because the team at D3T wanted it to exist.

Thank you, D3T, from the bottom of my heart.

EDIT: I also forgot to clarify, the voices and the music are two separate things. The music was generated on the system; the voices are just streaming audio.
 
Would it be possible for someone else to fix the music? Or is it not as simple as swapping the files? How is the music currently implemented in these ports?
 
To the best of my knowledge, they're legitimately emulating the Dreamcast's sound chip. It's the only bit of emulation in here (besides space harrier, afaik; but I'm really mostly taking guesses and going on hearsay), so it wouldn't be as simple as drag and drop. We kind of lucked out with voices in that the game loads at least one other type of file. The only thing really holding back Shenmue II right now from my perspective (again, that's of doing strictly voices), is that to my knowledge, Forklift does not work with Shenmue II; and there would be A LOT of hashes to invalidate.
 
Is the music in Shenmue II emulated as well? If so, is it emulating the Dreamcast version's audio or the Xbox version's audio? I think I saw people say that the music glitches in II sound like the 360-emulated version of the Xbox port.
 
I'm honestly not sure. I'd like to say yes, but I don't want to put out false information. I couldn't tell you which version it's using, because again, I'm only looking at voices really. I've got Japanese DC voices up next for Shenmue 1, and then Japanese (Dreamcast) for Shenmue II. After that, I'm going to look into tracking down an Xbox and a copy of Shenmue II so I can get started on those voice files. That's really about all I can talk about; beyond voices, I'm a bit of a lost puppy.
 
Thanks for the replies; will you be using EU or JP Dreamcast voices for II? I guess the only real difference is Yuan.
 
I'm not entirely sure. When the game first came out, I played an imported EU copy, and then US Xbox, so I'm probably going to use EU JP (which I'd guess is what they used, but I don't know for sure). If someone wants to document which files differ (just run diffs between EU and JP, I guess; of the files I've taken a look at, which is next to none, a large chunk appeared to be the same, bit for bit), I'd be more than happy to release a "choose your own" version for Shenmue II. Right now, I'm a little more concerned with just getting a hold of the Shenmue II English files. It's not that it's particularly hard, but it's money I don't exactly have, or want to spend; I've already got two dead Xboxes, so why do I need a third that's only really going to be used once (....okay, I might also play JSRF on it, but that'd really be it, two games)? Honestly, I don't even really need to have it myself; I just need someone to dump the disc and send me the relevant files (although I am holding onto full-disc dumps, for posterity, for preservation, and for other projects that may arise that need the original files), and I can take care of the rest.
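
If anyone does want to document the EU/JP differences, a byte-for-byte comparison of two extracted voice folders could look something like the Python sketch below. It assumes both releases have been extracted to plain directories and that matching files share the same name in each; neither assumption has been checked against the actual discs, and the function name is made up for illustration.

```python
import filecmp
from pathlib import Path


def compare_releases(eu_dir: Path, jp_dir: Path) -> dict[str, list[str]]:
    """Sort file names into identical, different, only_eu, and only_jp."""
    eu_names = {p.name for p in eu_dir.iterdir() if p.is_file()}
    jp_names = {p.name for p in jp_dir.iterdir() if p.is_file()}
    report = {
        "identical": [],
        "different": [],
        "only_eu": sorted(eu_names - jp_names),
        "only_jp": sorted(jp_names - eu_names),
    }
    for name in sorted(eu_names & jp_names):
        # shallow=False forces a real content comparison instead of
        # trusting file size and modification time.
        same = filecmp.cmp(eu_dir / name, jp_dir / name, shallow=False)
        report["identical" if same else "different"].append(name)
    return report
```

The "different" list is the part that would matter for a "choose your own" release; everything under "identical" could be shared between the two.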
 
It’s a bit confusing what audio version of S2 they used. Yuan is male in the JP language mode (can’t bear the English dub of S2), but also seems to sometimes revert to the female EU voice in free battle.

They also modified the subtitles to use male pronouns. So it’s possible it’s the EU version but with the original JP Yuan put back in??
 
yeesh, alright then, I guess that just means we're DEFINITELY doing a choose your own type deal then (note: I do not believe I am converting free battle voices. I do not know where they are located, nor what format they are in. If anyone knows, point me towards them and I'll see what I can do)
 
What a fucking post. Absolute respect sir.
 