Pronunciation Analyses

Performing a pronunciation analysis is broken up into various calls.

Initialising a pronunciation analysis

This is the starting point for a pronunciation analysis. A unique analysis id is generated, which serves a leading role in the analysis. Each other call requires the analysis id as a parameter.

If other RPCs are called without this RPC being called first, the error nl.itslanguage.session_not_initialised will be returned.

URI

nl.itslanguage.pronunciation.init_analysis

Response

The unique analysis id is returned as a string. The id is used in other calls to identify the current analysis.

Initialise pronunciation challenge

Before performing the analysis, the reference audio of the challenge needs to be prepared. This involves aligning the audio. When the RPC is called, the challenge is initialised asynchronously. When the challenge is to be used, the server automatically waits for the challenge initialisation to finish. If the initialisation results in an error, the error is relayed to the client.

URI

nl.itslanguage.pronunciation.init_challenge

Fetch reference alignment

The challenge that will be done is registered with a reference audio file to compare the student's speech with. This will need to be fetched before being able to send audio. This reference alignment will be given back to the caller as an intermediate result.

URI

nl.itslanguage.pronunciation.alignment

Initialising audio for uploading

The audio that is to be uploaded for analysis is streamed to the server. Some information is required in order for the server to be able to store and analyse the audio correctly.

URI

nl.itslanguage.pronunciation.init_audio

Parameters

Name Type Description
analysis_id string Required The id of the analysis to initialise the audio for.
audio_format string Required The mimetype of the audio to upload.

Depending on the audio_format, additional parameters may be required. Please see the audio documentation for which audio formats are supported and what parameters they require.

Parameters

Name Type Description
analysis_id string Required The id of the analysis to prepare the challenge for.
organisation_id string Required The id of the organisation in which to find the challenge.
challenge_id string Required The id of the challenge to prepare.

Stream analysis audio

The audio to be analysed can now be streamed to the server. The streaming works by repeatedly calling this RPC. Each time the RPC is called, the data will be appended to an audio file on the server.

URI

nl.itslanguage.pronunciation.write

Parameters

Name Type Description
analysis_id string Required The id of the analysis for which to upload the audio.
data bytes Required The data to stream to the server. This can be any length.
encoding string Optional The encoding in which the data is sent. This should be base64 if the data is base64 encoded.

Perform the analysis

After completing the streaming of the audio, the analysis can be performed.

URI

nl.itslanguage.pronunciation.analyse

Parameters

Name Type Description
analysis_id string Required The id of the analysis to perform.

Response

A JSON object with the analysis results and metadata is returned.

Example:

{"id": "98",
 "audioUrl": "https://api-dot-itslanguage.nl:80/organisations/1/challenges/pronunciation/4/analyses/98/audio",
 "score": 8.4245,
 "words": [[{"graphemes": "B",
             "score": 9.8355,
             "verdict": "good",
             "phonemes": [{"start": 0.12,
                           "end": 0.33,
                           "ipa": "b",
                           "score": 9.8355,
                           "verdict": "good"}]},
             {"graphemes": "IR",
              "phonemes": [{"end": 0.5,
                            "ipa": "er",
                            "score": 5.9905,
                            "start": 0.33,
                            "verdict": "moderate"}],
              "score": 5.9905,
              "verdict": "moderate"},
             {"graphemes": "D",
              "phonemes": [{"end": 0.63,
                            "ipa": "d",
                            "score": 7.5173,
                            "start": 0.5,
                            "verdict": "good"}],
              "score": 7.5173,
              "verdict": "good"}]]}

The RPC can return various errors:

  • nl.itslanguage.alignment_failed If the alignment of the audio failed.
  • nl.itslanguage.analysis_failed If the analysis of the audio failed.
  • nl.itslanguage.no_such_organisation If the organisation doesn't exist.
  • nl.itslanguage.no_such_challenge If the challenge doesn't exist.