We have tested this feature extensively. However, it is still in beta.

Background

By now you know how to get overall rating statistics for each file per evaluation type. You may also wonder why the evaluators rated the files the way they did. In other words, there is one more question we can ask the evaluators: “Why do you think so?”

Tell me why you think so

With an additional configuration, you can ask the evaluators why they gave each rating. For example, if a rating is low, you can ask for the details behind that low rating.

Example

import podonos
from podonos import *
import boto3


# Generate a sample speech file.
script = "Hello, how is your day going?"
polly_client = boto3.Session().client('polly')
response = polly_client.synthesize_speech(
    VoiceId='Brian', OutputFormat='mp3',
    Text=script, Engine='neural'
)
filename = '/path/to/1.mp3'
# Save the synthesized audio stream to a local file.
with open(filename, 'wb') as f:
    f.write(response['AudioStream'].read())

client = podonos.init()
etor = client.create_evaluator(
    name="Speech AI Naturalness Test",
    desc="Naturalness test of a speech synthesis model",
    type="NMOS", lan="en-us", num_eval=10,
    use_annotation=True  # Set this to True to enable annotation.
)

# Provide the script so evaluators can mark the words or phrases behind their ratings.
etor.add_file(
    File(path=filename, model_tag='Polly', tags=["Brian", "neural"],
         script=script))
etor.close()

To use this feature, set use_annotation=True when creating an Evaluator object, and provide the script when adding files. This feature is only available for single-stimulus evaluations.
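
If you evaluate multiple utterances, each file needs its own script. Below is a minimal sketch that follows the same API as the example above; the second file path, the script texts, and the model tag are hypothetical placeholders.

import podonos
from podonos import *

client = podonos.init()

# Annotation works with single-stimulus types such as NMOS and requires use_annotation=True.
etor = client.create_evaluator(
    name="Speech AI Naturalness Test",
    desc="Naturalness test of a speech synthesis model",
    type="NMOS", lan="en-us", num_eval=10,
    use_annotation=True
)

# Each file carries the exact script that was synthesized, so evaluators can
# mark the words or phrases behind their ratings. Paths and scripts are placeholders.
utterances = {
    '/path/to/1.mp3': "Hello, how is your day going?",
    '/path/to/2.mp3': "The weather is lovely today, isn't it?",
}
for path, script in utterances.items():
    etor.add_file(File(path=path, model_tag='Polly', script=script))

etor.close()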

Evaluation with annotation

Once the evaluation finishes, click the analysis tab. At the top of the evaluation, you can see the files:

Click one of the files. You will then see the original text with marked words or phrases where the evaluators left the reasoning behind each rating, along with detailed descriptions like the following: