Podonos

This is the base module. You can import it as follows:

python
import podonos
from podonos import *

init()

Initialize the module.

api_key
string

API key you obtained from the workspace. For details, see Get API key. If this is not set, the package tries to read the PODONOS_API_KEY environment variable. Throws an error if neither is available.

Returns an instance of Client.

python
client = podonos.init(api_key='<API_KEY>')
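
If you prefer the environment-variable route described above, you can omit api_key. A minimal sketch (the key value is a placeholder):

python
import os
import podonos

# PODONOS_API_KEY is read from the environment when api_key is not passed.
os.environ['PODONOS_API_KEY'] = '<API_KEY>'
client = podonos.init()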

Client

Client manages one or more Evaluators and the evaluation history.

create_evaluator()

Create a new instance of Evaluator. One evaluator supports a single type of evaluation throughout its life cycle. If you want multiple types of evaluation, you can create multiple evaluators by calling create_evaluator() multiple times.

name
string

Name of this evaluation session. If empty, a random name is automatically generated and used.

desc
string

Description of this evaluation session. This field is purely for your record, so later you can see how you generated the output files or how you trained your model.

type
string
default:"NMOS"

Evaluation type. One of the following:

| Type | Description |
| ---- | ----------- |
| NMOS | Naturalness Mean Opinion Score |
| SMOS | Similarity Mean Opinion Score |
| QMOS | Quality Mean Opinion Score |
| P808 | Speech quality by ITU-T P.808 |
| PREF | Preference test between two audios/speeches |

Currently we support 5-point evaluation.

lan
string
default:"en-us"

Specific language and locale of the speech. Currently we support:

| Code | Description |
| ---- | ----------- |
| en-us | English (United States) |
| en-gb | English (United Kingdom) |
| en-au | English (Australia) |
| en-ca | English (Canada) |
| ko-kr | Korean (Korea) |
| zh-cn | Mandarin (China) |
| es-es | Spanish (Spain) |
| es-mx | Spanish (Mexico) |
| fr-fr | French (France) |
| de-de | German (Germany) |
| ja-jp | Japanese (Japan) |
| it-it | Italian (Italy) |
| pl-pl | Polish (Poland) |
| audio | General audio file |

We will add more soon. Please check back later.

num_eval
int
default:"10"

Number of evaluations per sample. For example, if this is 10 for an NMOS evaluation, each audio file will be assigned to 10 human evaluators, and the statistics of the evaluation output will be computed and presented in the final report.

due_hours
string
default:"12"

Expected due time of the final report, in hours. Depending on the hours, the pricing may change.

use_annotation
bool
default:"False"

True for requesting additional details of the rating. Only available for single-stimulus evaluations. The file script must be provided.

use_power_normalization
bool
default:"False"

Enable power normalization to ensure consistent audio volume levels during evaluation.

auto_start
bool
default:"False"

If True, the evaluation automatically starts when you finish uploading the files. If False, you go to Workspace, confirm the evaluation session, and manually start the evaluation.

max_upload_workers
int
default:"20"

Maximum number of upload worker threads. If you experience a slow upload, please increase the number of workers.

Returns an instance of Evaluator.

etor = client.create_evaluator()
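
For reference, a fuller call that sets the parameters described above might look like this (the values are illustrative, not recommendations):

etor = client.create_evaluator(
    name='my_nmos_session',      # session name; random if omitted
    desc='Epoch 10, alpha 0.1',  # free-form note for your own record
    type='NMOS',                 # one of NMOS, SMOS, QMOS, P808, PREF
    lan='en-us',                 # see the language table above
    num_eval=10,                 # evaluations per sample
    auto_start=False             # confirm and start manually in the workspace
)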

create_evaluator_from_template()

When you create an evaluation using a template, all the questions and options defined in the template are automatically assigned to the new evaluation. This ensures consistency and saves time by reusing pre-defined content.

name
string

Name of this evaluation session. If empty, a random name is automatically generated and used.

desc
string

Description of this evaluation session. This field is purely for your record, so later you can see how you generated the output files or how you trained your model.

num_eval
int
default:"10"

Number of evaluations per sample. For example, if this is 10 for an NMOS evaluation, each audio file will be assigned to 10 human evaluators, and the statistics of the evaluation output will be computed and presented in the final report.

use_power_normalization
bool
default:"False"

Enable power normalization to ensure consistent audio volume levels during evaluation.

template_id
string

The unique identifier of the template to base the new evaluation on. Required to specify the predefined structure and settings for the evaluation.

etor = client.create_evaluator_from_template(
  name="How natural the voice it is", 
  desc="new_model_vs_competitor_model", 
  num_eval=10, 
  template_id="abcdef"
)

create_evaluator_from_template_json()

Create a new evaluation using a JSON template. This allows you to define custom evaluation structures programmatically.

json
Dict

Template JSON as a dictionary. Optional if json_file is provided.

json_file
string

Path to the JSON template file. Optional if json is provided.

name
string

Name of this evaluation session. Required.

custom_type
string

Type of evaluation. Must be either “SINGLE” or “DOUBLE”.

desc
string

Description of this evaluation session. Optional.

lan
string
default:"en-us"

Language for evaluation. See supported languages in create_evaluator().

num_eval
int
default:"10"

Number of evaluations per sample.

use_annotation
bool
default:"False"

Enable annotation on the script for detailed rating reasoning. Only available for single-stimulus evaluations. The file script must be provided.

use_power_normalization
bool
default:"False"

Enable power normalization to ensure consistent audio volume levels during evaluation.

max_upload_workers
int
default:"20"

The maximum number of upload workers. Must be a positive integer.

# Using JSON dictionary
template = {
    "query": [
        {
            "type": "SCORED",
            "question": "How natural the voice it is",
            "description": "Rate the quality of the voice",
            "options": [
              {"label_text": "Excellent"},
              {"label_text": "Good"},
              {"label_text": "Fair"},
              {"label_text": "Poor"},
              {"label_text": "Bad"},
            ]
        }
    ]
}

evaluator = client.create_evaluator_from_template_json(
    json=template,
    name="Quality Test",
    custom_type="SINGLE"
)

Returns an instance of Evaluator.

Here’s the structure of the JSON template for reference:

Question: Represents the main question posed to evaluators about the audio being assessed. It guides the evaluators on what specific aspect of the audio they should focus on during the evaluation.

| Parameter | Description | Required | Notes |
| --- | --- | --- | --- |
| type | Type of question. Options: SCORED, NON_SCORED, COMPARISON | Yes | Determines the structure and requirements of the question |
| question | The main question text | Yes | Must be provided for all question types |
| description | Additional details or context for the question | No | Optional for all question types |
| options | List of possible options. Only for SCORED and NON_SCORED types | Conditional | Must have between 1 and 9 options for SCORED and NON_SCORED types |
| scale | Scale for comparison. Only for COMPARISON type | Conditional | Must be an integer between 2 and 9 for COMPARISON type |
| allow_multiple | Allows multiple selections. Only for NON_SCORED type | No | Enables multiple choice selection |
| has_other | Includes an “Other” option. Only for NON_SCORED type | No | Adds an option for evaluators to specify an unlisted choice |
| related_model | Related model for the question. Only for Double Evaluation type. | Conditional | Select which model the question is related to. |
| anchor_label | Labels for the ends of the comparison scale. Only for COMPARISON type. | Conditional | Provides context for what each end of the scale represents, enhancing evaluator understanding. |

Important Notes:

  • SCORED and NON_SCORED questions can have a maximum of 9 options.
  • COMPARISON type questions must have a scale between 2 and 9.
  • related_model consists of ALL, MODEL_A and MODEL_B. Default is ALL. The related_model is only used for the question (not for instructions).
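
Based on the parameter table above, a template combining a NON_SCORED question and a COMPARISON question might be sketched as follows. The question texts are illustrative, and the exact value shapes for allow_multiple and has_other are assumptions:

# A sketch only: field names follow the table above; the boolean shapes of
# allow_multiple and has_other are assumptions.
template = {
    "query": [
        {
            "type": "NON_SCORED",
            "question": "Which artifacts do you hear?",
            "options": [
                {"label_text": "Clicks"},
                {"label_text": "Hiss"},
                {"label_text": "Robotic timbre"},
            ],
            "allow_multiple": True,  # assumed boolean
            "has_other": True,       # assumed boolean
        },
        {
            "type": "COMPARISON",
            "question": "Which sample sounds more natural?",
            "scale": 7,              # integer between 2 and 9
        },
    ]
}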

get_evaluation_list()

Returns a JSON containing all your evaluations.

evaluations = client.get_evaluation_list()
print(evaluations)

The output JSON looks like:

[
  {
    'id': '<UUID>', 
    'title': 'How natural my synthetic voices are', 
    'internal_name': null, 
    'description': 'Used latest internal model. Epoch 10, alpha 0.1', 
    'status': 'ACTIVE', // DRAFT, ACTIVE, COMPLETED
    'created_time': '2024-06-25T01:40:43.429Z',
    'updated_time': '2024-06-26T13:21:34.801Z'
  },
  
  ...
]
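
Since each entry carries a status field (DRAFT, ACTIVE, or COMPLETED), you can filter the list in the usual way; for example:

# Keep only evaluations that have finished collecting responses.
completed = [e for e in client.get_evaluation_list() if e['status'] == 'COMPLETED']
for e in completed:
    print(e['id'], e['title'])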

get_stats_json_by_id()

Returns a list of JSON objects containing the statistics of each stimulus for the evaluation referenced by the id.

evaluation_id
string

Evaluation id. See get_evaluation_list().

group_by
string
default:"question"

Group by criteria. Options are “question”, “script”, or “model”. Default is “question”. Note that “script” and “model” are only available for single-question evaluations.

evaluations = client.get_evaluation_list()
for eval in evaluations:
    stats = client.get_stats_json_by_id(eval['id'], group_by='question')
    print(stats)

| Field | Description | SCORED | NON_SCORED |
| ----- | ----------- | ------ | ---------- |
| mean | Average score | ✓ | - |
| median | Median score | ✓ | - |
| std | Standard deviation | ✓ | - |
| sem | Standard error of the mean | ✓ | - |
| ci_95 | 95% confidence interval | ✓ | - |
| options | Each option name as key with integer value | - | ✓ |
| OTHER | The number of evaluators who selected “Other” | - | ✓ |

For NON_SCORED questions:

  • The integer value is the number of evaluators who selected the option.
  • All options are included in the response regardless of their value.

You can get the statistics of each question by calling get_stats_json_by_id() with group_by set to question, script, or model.

{
  "question": string,
  "description": string,
  "order": int,
  "responses": [
    {
      "name": string,
      "model_tag": string,
      "tags": string[],
      "type": "A" | "B" | "REF",
      "script": string | null,
      "mean": float | null, // null if the question is not SCORED
      "median": float | null, // null if the question is not SCORED
      "std": float | null, // null if the question is not SCORED
      "sem": float | null, // null if the question is not SCORED
      "ci_95": float | null, // null if the question is not SCORED
    }
  ]
}
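
Assuming the return value is a list of objects shaped like the schema above, you could, for instance, print the mean score per stimulus for SCORED questions:

stats = client.get_stats_json_by_id('<EVALUATION_ID>', group_by='question')
for question_stats in stats:
    print(question_stats['question'])
    for response in question_stats['responses']:
        if response['mean'] is not None:  # mean is null for non-SCORED questions
            print(f"  {response['name']} ({response['model_tag']}): "
                  f"mean={response['mean']:.2f}, ci_95={response['ci_95']:.2f}")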

download_evaluation_files_by_evaluation_id()

Download all files associated with a specific evaluation, identified by its evaluation_id, from the Podonos evaluation service. The files are saved to a specified directory on the local file system, and a metadata file describing the downloaded files is generated. Returns a string indicating the status of the download operation: a success message, or an error message if the download fails.

evaluation_id
string

Evaluation id. See get_evaluation_list().

output_dir
string

The directory path where the downloaded files will be saved. This should be a valid path on the local file system where the user has write permissions.

client.download_evaluation_files_by_evaluation_id(
  evaluation_id='12345', 
  output_dir='./output'
)

The generated metadata file contains the following fields:

| Field | Description |
| ----- | ----------- |
| file_path | The path to the downloaded file relative to the output_dir. |
| original_name | The original name of the file before downloading. |
| model_tag | The model tag associated with the file, used for categorization. |
| tags | A list of tags associated with the file, providing additional context or categorization. |

File Naming Convention:

Each downloaded file is saved in the format {output_dir}/{model_tag}/{file_name}. This means that files are organized into subdirectories named after their model_tag, and the original file names are stored in a hashed format.

File

A class representing one file, used for adding files to an Evaluator.

path
string
required

Path to the file to evaluate. For audio files, we support wav and mp3 formats.

model_tag
string
required

Name of your model (e.g., “WhisperTTS”) or any unique name (e.g., “human”).

tags
list[string]

A list of string tags for the file designated by path. You can use this field as properties of the file such as original, synthesized, tom, maria, and so on. Later you can look up or group files with particular tags in the output report.

script
string

Text script of the input audio file. If use_annotation is set to True, this field must be provided.

is_ref
bool
default:"False"

True if this file works as a reference in a comparative evaluation.
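
Putting the fields above together, a File for an annotated, comparative evaluation might be constructed like this (the path, tags, and script are illustrative):

ref_file = File(path='/path/to/original.wav',
                model_tag='human',                   # any unique model name
                tags=['original', 'male'],
                script='Hello, how are you today?',  # needed when use_annotation=True
                is_ref=True)                         # marks this file as the reference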

Evaluator

Evaluator manages a single type of evaluation.

add_file()

Add one file to evaluate in a single evaluation question. For a single-file evaluation such as NMOS or P808, one file is added.

file
File
required

Input File. This field is required if type is NMOS or P808.

etor.add_file(file=File(path='/path/to/speech_0_0.wav',
                        tags=['synthesized', 'male', 'ver1234']))

add_files()

Add multiple files for evaluations that require multiple files for comparison, such as SMOS.

file0
File
required

First Input File. This field is required if type is SMOS.

file1
File
required

Second Input File. This field is required if type is SMOS.

file0 = File(path='/path/to/speech0.wav', tags=['original', 'male', 'human'])
file1 = File(path='/path/to/speech1.wav', tags=['synthesized', 'male', 'ver1234'])
etor.add_files(file0=file0, file1=file1)

close()

Close the evaluation session. Once this function is called, all the evaluation files are sent to the Podonos evaluation service, go through a series of processing steps, and are delivered to evaluators.

Returns a JSON object containing the uploading status.

python
status = etor.close()
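
Putting it all together, a minimal NMOS session could look like the following sketch (the API key, paths, and names are placeholders):

python
import podonos
from podonos import *

client = podonos.init(api_key='<API_KEY>')
etor = client.create_evaluator(name='nmos_demo', type='NMOS', num_eval=10)

# Add each synthesized sample as its own file.
for i in range(3):
    etor.add_file(file=File(path=f'/path/to/speech_{i}.wav',
                            model_tag='my_model',
                            tags=['synthesized', 'ver1234']))

# Upload the files and hand the session over to the evaluation service.
status = etor.close()
print(status)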