I am trying to convert some Cantonese audio clips into text with the Google Speech API. The example app at https://cloud.google.com/speech/docs/samples was good (I am using Java), and by following the instructions I got the example to convert the sample clip into text. But then I ran into trouble converting Cantonese.
First, I couldn't get it to convert a Cantonese clip at all; it just returned a blank result. I did two things to make it return a transcript.
- Setting the language code:
    RecognitionConfig config =
        RecognitionConfig.newBuilder()
            .setEncoding(AudioEncoding.LINEAR16)
            .setSampleRate(samplingRate)
            .setLanguageCode("yue-Hant-HK")
            .build();
- Using Audacity to record the clip in mono and exporting it as raw, header-less data:
- File type: Other uncompressed files
- Header: RAW (header-less)
- Encoding: Signed 16-bit PCM
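If the clip already exists as a WAV file, the same header-less data can probably be produced without Audacity. The following is only a sketch using the JDK's `javax.sound.sampled` package, not part of the Google sample; the class name `WavToRawPcm` and the method `toRawPcm` are hypothetical names made up for illustration. It checks that the audio is mono, signed 16-bit, little-endian PCM (which is what LINEAR16 expects) and returns the samples with the WAV header removed.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;

// Hypothetical helper, not part of the Google sample code.
class WavToRawPcm {

    // Reads a WAV stream and returns only the PCM sample bytes (no header).
    static byte[] toRawPcm(InputStream wav)
            throws IOException, UnsupportedAudioFileException {
        // BufferedInputStream provides the mark/reset support that
        // AudioSystem needs to sniff the file type.
        try (AudioInputStream in =
                 AudioSystem.getAudioInputStream(new BufferedInputStream(wav))) {
            AudioFormat format = in.getFormat();
            boolean linear16Mono =
                AudioFormat.Encoding.PCM_SIGNED.equals(format.getEncoding())
                    && format.getSampleSizeInBits() == 16
                    && format.getChannels() == 1
                    && !format.isBigEndian();
            if (!linear16Mono) {
                throw new IllegalArgumentException(
                    "Expected mono signed 16-bit little-endian PCM, got: " + format);
            }
            // Copy the sample data; the header has already been consumed.
            ByteArrayOutputStream raw = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                raw.write(buffer, 0, n);
            }
            return raw.toByteArray();
        }
    }
}
```

The sketch does no resampling or channel mixing; a stereo or 8-bit file is rejected rather than converted. The rate reported by `AudioFormat.getSampleRate()` on the input is the value that would have to match what is passed to `setSampleRate`.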
And finally I got a response like below:
INFO: Received response: results {
alternatives {
transcript: "\350\251\246\345\232\207\345\273\243\346\235\261\350\251\261\350\250\273\345\206\212\346\231\202\351\226\223"
confidence: 0.8150804
}
}
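It looked like JSON, but it wasn't. It was the text form of the gRPC (protobuf) response, in which the non-ASCII bytes of the transcript are printed as octal escapes. To get the Chinese transcript, we need to read the response through the gRPC client.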
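To see what the escaped transcript in the log actually says, the octal escapes can be turned back into bytes and decoded as UTF-8. This is just a minimal sketch for inspecting log output, not something the API or the sample provides; `OctalEscapeDecoder` and `decode` are hypothetical names, and the sketch only handles `\NNN` octal escapes (other escapes such as `\n` or `\"` are copied through unchanged).

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for reading escaped transcripts copied from the log.
class OctalEscapeDecoder {

    private static boolean isOctalDigit(char c) {
        return c >= '0' && c <= '7';
    }

    // Converts \NNN octal escapes back into bytes and decodes them as UTF-8.
    static String decode(String escaped) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        int i = 0;
        while (i < escaped.length()) {
            char c = escaped.charAt(i);
            boolean startsEscape = c == '\\'
                && i + 1 < escaped.length()
                && isOctalDigit(escaped.charAt(i + 1));
            if (startsEscape) {
                i++; // skip the backslash
                int value = 0;
                int digits = 0;
                // An escape has at most three octal digits.
                while (i < escaped.length() && digits < 3
                        && isOctalDigit(escaped.charAt(i))) {
                    value = value * 8 + (escaped.charAt(i) - '0');
                    i++;
                    digits++;
                }
                bytes.write(value); // only the low 8 bits are kept
            } else {
                // Anything else is copied through as its own UTF-8 bytes.
                int codePoint = escaped.codePointAt(i);
                byte[] raw = new String(Character.toChars(codePoint))
                    .getBytes(StandardCharsets.UTF_8);
                bytes.write(raw, 0, raw.length);
                i += Character.charCount(codePoint);
            }
        }
        return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // First three escapes of the transcript from the log above.
        System.out.println(decode("\\350\\251\\246"));
    }
}
```

For example, the first three escapes in the transcript, `\350\251\246`, are the UTF-8 bytes E8 A9 A6, which decode to the single character U+8A66. Whether it prints correctly on the console depends on the terminal's encoding.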
I modified the AsynClient a bit to make its recognize method return the AsyncRecognizeResponse, and then used code like the following:
AsyncRecognizeResponse result = null;
try {
    result = client.recognize();
} catch (Exception e) {
    e.printStackTrace();
} finally {
    client.shutdown();
}
// result stays null if recognize() threw, so guard before reading it
if (result != null) {
    // Typed list, so the for-each below compiles
    List<SpeechRecognitionResult> results = result.getResultsList();
    for (SpeechRecognitionResult srr : results) {
        // Take the first alternative of each result
        SpeechRecognitionAlternative alternative = srr.getAlternatives(0);
        String transcriptStr = alternative.getTranscript();
        System.out.println(transcriptStr);
    }
}
And the transcriptStr contains the correct Chinese characters now.