Podonos

This is the base module. You can import it as follows:

python
import podonos
from podonos import *

init()

Initialize the module.

api_key
string

API key you obtained from the workspace. For details, see Get API key. If this is not set, the package tries to read PODONOS_API_KEY from the environment. Throws an error if neither is available.

Returns an instance of Client.

python
client = podonos.init(api_key='<API_KEY>')
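
The key lookup described above can be pictured with a minimal sketch. The helper `resolve_api_key` is hypothetical and not part of the podonos package; it only mirrors the documented order (explicit argument first, then the PODONOS_API_KEY environment variable, otherwise an error). The exception type the package actually raises is not stated here, so `ValueError` is an assumption.

```python
import os
from typing import Optional


def resolve_api_key(api_key: Optional[str] = None) -> str:
    """Hypothetical helper mirroring the documented lookup order."""
    # 1. An explicitly passed key takes priority.
    if api_key:
        return api_key
    # 2. Otherwise fall back to the PODONOS_API_KEY environment variable.
    env_key = os.environ.get("PODONOS_API_KEY")
    if env_key:
        return env_key
    # 3. Neither source is available. ValueError is an assumption; the
    #    package may raise a different exception type.
    raise ValueError("No API key: pass api_key or set PODONOS_API_KEY")
```

Since init() is documented to perform this lookup itself, exporting PODONOS_API_KEY and calling podonos.init() without arguments should have the same effect as passing the key explicitly.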

Client

Client manages one or more Evaluator instances and the evaluation history.

create_evaluator()

Create a new instance of Evaluator. One evaluator supports a single type of evaluation throughout its life cycle. If you want multiple types of evaluation, create multiple evaluators by calling create_evaluator() multiple times.

name
string

Name of this evaluation session. If empty, a random name is automatically generated and used.

desc
string

Description of this evaluation session. This field is purely for your record, so later you can see how you generated the output files or how you trained your model.

type
string
default: "NMOS"

Evaluation type. One of the following:

Type  Description
NMOS  Naturalness Mean Opinion Score
SMOS  Similarity Mean Opinion Score
QMOS  Quality Mean Opinion Score
P808  Speech quality by ITU-T P.808
PREF  Preference test between two audio/speech samples

Currently we support 5-point evaluation.

lan
string
default: "en-us"

Specific language and locale of the speech. Currently we support:

Code   Description
en-us  English (United States)
en-gb  English (United Kingdom)
en-au  English (Australia)
en-ca  English (Canada)
ko-kr  Korean (Korea)
zh-cn  Mandarin (China)
es-es  Spanish (Spain)
es-mx  Spanish (Mexico)
fr-fr  French (France)
de-de  German (Germany)
ja-jp  Japanese (Japan)
it-it  Italian (Italy)
pl-pl  Polish (Poland)
audio  General audio file

More languages will be added soon. Please check back later.

num_eval
int
default: "10"

Number of evaluations per sample. For example, if this is 10 for an NMOS type evaluation, each audio file is assigned to 10 human evaluators, and the statistics of their ratings are computed and presented in the final report.

due_hours
string
default: "12"

Expected number of hours until the final report is due. Pricing may change depending on this value.

use_annotation
bool
default: "False"

Set to True to request additional details about each rating.

auto_start
bool
default: "False"

If True, the evaluation starts automatically when you finish uploading the files. If False, you need to go to Workspace, confirm the evaluation session, and start the evaluation manually.

max_upload_workers
int
default: "20"

Maximum number of upload worker threads. If you experience a slow upload, please increase the number of workers.

Returns an instance of Evaluator.
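
The parameters above can be collected and checked before calling create_evaluator(). The following is a minimal sketch: `build_evaluator_kwargs` and `total_ratings` are hypothetical helpers, not part of the package, and the allowed values are simply copied from the tables in this section. The package may perform its own validation.

```python
from typing import Any, Dict

# Values copied from the tables in this section.
SUPPORTED_TYPES = {"NMOS", "SMOS", "QMOS", "P808", "PREF"}
SUPPORTED_LANGUAGES = {
    "en-us", "en-gb", "en-au", "en-ca", "ko-kr", "zh-cn", "es-es",
    "es-mx", "fr-fr", "de-de", "ja-jp", "it-it", "pl-pl", "audio",
}


def build_evaluator_kwargs(
    name: str,
    desc: str,
    type: str = "NMOS",  # shadows the builtin to match the documented name
    lan: str = "en-us",
    num_eval: int = 10,
) -> Dict[str, Any]:
    """Hypothetical helper: validate arguments for create_evaluator()."""
    if type not in SUPPORTED_TYPES:
        raise ValueError(f"Unsupported evaluation type: {type!r}")
    if lan not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language code: {lan!r}")
    if num_eval < 1:
        raise ValueError("num_eval must be a positive integer")
    return {"name": name, "desc": desc, "type": type, "lan": lan,
            "num_eval": num_eval}


def total_ratings(num_files: int, num_eval: int) -> int:
    """Each file is rated num_eval times, so ratings scale linearly."""
    return num_files * num_eval
```

With a client returned by init(), the result can be passed as client.create_evaluator(**build_evaluator_kwargs('tts-v2', 'Epoch 10')). For 50 files with num_eval set to 10, total_ratings(50, 10) gives 500 individual ratings, which is useful when estimating cost.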

get_evaluation_list()

Returns a JSON list containing all your evaluations.

The output JSON looks like:

[
  {
    'id': '<UUID>', 
    'title': 'How natural my synthetic voices are', 
    'internal_name': null, 
    'description': 'Used latest internal model. Epoch 10, alpha 0.1', 
    'status': 'ACTIVE', // DRAFT, ACTIVE, COMPLETED
    'created_time': '2024-06-25T01:40:43.429Z', 
    'updated_time': '2024-06-26T13:21:34.801Z'
  },
  
  ...
]
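
Once the list is available as Python data, it can be filtered by status and sorted by creation time. This is a minimal sketch that assumes the list has been parsed into dictionaries and that timestamps are ISO 8601 strings. The helper names are hypothetical, and the sample records are illustrative placeholders shaped like the output above.

```python
from datetime import datetime
from typing import Any, Dict, List


def parse_time(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2024-06-25T01:40:43.429Z."""
    # Older Python versions do not accept a trailing "Z", so map it to UTC.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def filter_by_status(
    evaluations: List[Dict[str, Any]], status: str
) -> List[Dict[str, Any]]:
    """Return evaluations with the given status, newest first."""
    matching = [e for e in evaluations if e["status"] == status]
    return sorted(
        matching, key=lambda e: parse_time(e["created_time"]), reverse=True
    )


# Illustrative records shaped like the output above (ids are placeholders).
sample = [
    {"id": "a", "status": "ACTIVE", "created_time": "2024-06-25T01:40:43.429Z"},
    {"id": "b", "status": "COMPLETED", "created_time": "2024-06-20T09:00:00.000Z"},
    {"id": "c", "status": "COMPLETED", "created_time": "2024-06-22T09:00:00.000Z"},
]
completed_ids = [e["id"] for e in filter_by_status(sample, "COMPLETED")]
print(completed_ids)  # ['c', 'b']
```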

get_stats_dict_by_id()

Returns a list of JSON objects containing the statistics of each stimulus for the evaluation referenced by evaluation_id.

evaluation_id
string

Evaluation id. See get_evaluation_list().

The returned list looks like:

[
  {
    'files': [
      {'name': 'ai.wav', 'tags': ['male', 'generated', 'ai'], 'type': 'A'}, 
      {'name': 'real.wav', 'tags': ['male', 'real', 'human'], 'type': 'B'}
    ],
    'mean': 3.4, 
    'median': 3.5, 
    'std': 1.07,
    'ci_90': 1.14, 
    'ci_95': 1.48, 
    'ci_99': 2.46
  }
]
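
The statistics can be post-processed with ordinary list operations, for example to keep only the entries whose files carry a given tag and rank them by mean score. This is a minimal sketch assuming each entry has the 'files' and 'mean' keys shown above. The helper names are hypothetical, and the second sample record is made up for illustration.

```python
from typing import Any, Dict, List


def entries_with_tag(stats: List[Dict[str, Any]], tag: str) -> List[Dict[str, Any]]:
    """Keep entries where at least one file carries the given tag."""
    return [
        entry for entry in stats
        if any(tag in f.get("tags", []) for f in entry["files"])
    ]


def rank_by_mean(stats: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Sort entries from highest to lowest mean score."""
    return sorted(stats, key=lambda entry: entry["mean"], reverse=True)


# Illustrative data: the second record is invented for this example.
stats = [
    {"files": [{"name": "ai.wav", "tags": ["male", "generated", "ai"]}],
     "mean": 3.4},
    {"files": [{"name": "real.wav", "tags": ["male", "real", "human"]}],
     "mean": 4.6},
]
for entry in rank_by_mean(entries_with_tag(stats, "male")):
    names = ", ".join(f["name"] for f in entry["files"])
    print(f"{names}: mean={entry['mean']}")
```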

download_stats_csv_by_id()

Download the statistics of the evaluation referenced by the id into a CSV file.

id
string

Evaluation id. See get_evaluation_list().

output_path
string

Path to the output CSV file.
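
get_evaluation_list() and download_stats_csv_by_id() can be combined to export one CSV per completed evaluation. The sketch below takes the client as an argument, so it works with any object exposing those two methods. The keyword names id and output_path are taken from the parameter list above; the helper name `export_completed`, the `<id>.csv` naming scheme, and the assumption that the list is returned as dictionaries with 'id' and 'status' keys are not confirmed by this page.

```python
import os
from typing import Any, List


def export_completed(client: Any, output_dir: str) -> List[str]:
    """Hypothetical helper: download a CSV for each COMPLETED evaluation.

    output_dir is assumed to exist already.
    """
    paths = []
    for evaluation in client.get_evaluation_list():
        # Skip drafts and evaluations that are still running.
        if evaluation["status"] != "COMPLETED":
            continue
        path = os.path.join(output_dir, f"{evaluation['id']}.csv")
        client.download_stats_csv_by_id(id=evaluation["id"], output_path=path)
        paths.append(path)
    return paths
```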

File

A class representing one file, used for adding files in Evaluator.

path
string
required

Path to the file to evaluate. For audio files, we support wav and mp3 formats.

model_tag
string
required

Name of your model (e.g., “WhisperTTS”) or any unique name (e.g., “human”).

tags
list[string]

A list of string tags for the file designated by path. You can use this field as properties of the file such as original, synthesized, tom, maria, and so on. Later you can look up or group files with particular tags in the output report.

script
string

Text script of the input audio file. If use_annotation is set to True,

is_ref
bool
default: "False"

True if this file works as a reference in a comparative evaluation.
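
When many files share the same model and tags, the constructor arguments can be prepared in bulk. The following is a minimal sketch: `collect_file_kwargs` is a hypothetical helper, and it assumes File accepts the fields documented above (path, model_tag, tags) as keyword arguments. Only wav and mp3 files are collected, matching the formats listed for path.

```python
from pathlib import Path
from typing import Any, Dict, List

SUPPORTED_SUFFIXES = {".wav", ".mp3"}  # formats listed for the path field


def collect_file_kwargs(
    directory: str, model_tag: str, tags: List[str]
) -> List[Dict[str, Any]]:
    """Hypothetical helper: build File keyword arguments for a directory."""
    items = []
    for path in sorted(Path(directory).iterdir()):
        # Ignore subdirectories and files in unsupported formats.
        if path.is_file() and path.suffix.lower() in SUPPORTED_SUFFIXES:
            items.append(
                {"path": str(path), "model_tag": model_tag, "tags": list(tags)}
            )
    return items
```

Under that assumption, the File objects could then be created with [File(**kw) for kw in collect_file_kwargs('./out', 'my_model', ['synthesized'])].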

Evaluator

Evaluator manages a single type of evaluation.

add_file()

Add one file to a single evaluation question. Use this for single-file evaluations such as NMOS, where each question contains exactly one file.

file
string
required

Input File. This field is required if type is NMOS or P808.

add_files()

Add multiple files to a single evaluation question, for evaluations that compare multiple files.

file0
string
required

First Input File. This field is required if type is SMOS.

file1
string
required

Second Input File. This field is required if type is SMOS.
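
For a comparison type such as SMOS, each question takes two files. The sketch below adds a sequence of pairs using the file0 and file1 keyword names documented above. `add_pairs` is a hypothetical helper, and it takes the evaluator as an argument so it works with any object exposing add_files(). Which slot should hold the reference file is not stated in this section, so the order used by the caller is an assumption.

```python
from typing import Any, Iterable, Tuple


def add_pairs(etor: Any, pairs: Iterable[Tuple[Any, Any]]) -> int:
    """Hypothetical helper: add one comparison question per pair."""
    count = 0
    for first, second in pairs:
        # Each call creates a single question containing both files.
        etor.add_files(file0=first, file1=second)
        count += 1
    return count
```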

close()

Close the evaluation session. Once this function is called, all the evaluation files are sent to the Podonos evaluation service, go through a series of processing steps, and are delivered to evaluators.

Returns a JSON object containing the upload status.

python
status = etor.close()
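
Putting add_file() and close() together, a single-file evaluation can be sketched as follows. `upload_and_close` is a hypothetical helper, and it accepts any evaluator-like object with add_file() and close() methods. close() is deliberately not placed in a finally block, on the assumption that a partially filled session should not be submitted if adding a file fails.

```python
from typing import Any, Iterable


def upload_and_close(etor: Any, files: Iterable[Any]) -> Any:
    """Hypothetical helper: add each file as its own question, then close."""
    for f in files:
        etor.add_file(file=f)
    # close() submits the files; its return value is the upload status.
    return etor.close()
```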