Agent

To evaluate a simultaneous translation system, users need to implement an agent class that runs the system logic. This section describes how to implement an agent.

Source-Target Types

First, declare the source and target types of the agent class. This can be done by inheriting from either of the following:

One of the four built-in agent types
- simuleval.agents.TextToTextAgent
- simuleval.agents.SpeechToTextAgent
- simuleval.agents.TextToSpeechAgent
- simuleval.agents.SpeechToSpeechAgent
Or simuleval.agents.GenericAgent, with an explicit declaration of source_type and target_type.

The following two examples are equivalent.

from simuleval import simuleval
from simuleval.agents import GenericAgent

class MySpeechToTextAgent(GenericAgent):
    source_type = "Speech"
    target_type = "Text"
    ...

from simuleval.agents import SpeechToTextAgent

class MySpeechToTextAgent(SpeechToTextAgent):
    ...

Policy

The agent must have a policy method that returns one of two actions, ReadAction or WriteAction. For example, an agent with a policy method looks like this:

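from simuleval.agents import SpeechToTextAgent
from simuleval.agents.actions import ReadAction, WriteAction

# do_we_need_more_input, generate_a_token and is_sentence_finished are
# placeholders for your own model logic.
class MySpeechToTextAgent(SpeechToTextAgent):
    def policy(self):
        if do_we_need_more_input(self.states):
            # Not enough context yet: ask for more source input.
            return ReadAction()
        else:
            prediction = generate_a_token(self.states)
            finished = is_sentence_finished(self.states)
            return WriteAction(prediction, finished=finished)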
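
To make the read/write loop concrete, here is a minimal self-contained sketch that does not import SimulEval. ReadAction, WriteAction, States, and the run_policy driver below are simplified stand-ins written for illustration. They are assumptions, not the library's actual classes or evaluation loop. The toy policy is a wait-k rule that copies each source token in upper case, and the driver assumes a non-empty sentence.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ReadAction:
    """Stand-in action: ask for one more source token."""


@dataclass
class WriteAction:
    """Stand-in action: emit one target token."""
    content: str
    finished: bool


@dataclass
class States:
    """Stand-in states: source read so far and target written so far."""
    source: List[str] = field(default_factory=list)
    target: List[str] = field(default_factory=list)
    source_finished: bool = False


def policy(states: States, k: int):
    """Toy wait-k policy that copies the source, upper-cased."""
    lagging = len(states.source) - len(states.target)
    if lagging >= k or states.source_finished:
        token = states.source[len(states.target)].upper()
        finished = states.source_finished and lagging <= 1
        return WriteAction(token, finished)
    return ReadAction()


def run_policy(sentence: List[str], k: int) -> Tuple[str, List[str]]:
    """Hypothetical driver: feed a token on READ, collect a token on WRITE."""
    states = States()
    remaining = list(sentence)
    trace = ""
    while True:
        action = policy(states, k)
        if isinstance(action, ReadAction):
            states.source.append(remaining.pop(0))
            states.source_finished = not remaining
            trace += "R"
        else:
            states.target.append(action.content)
            trace += "W"
            if action.finished:
                return trace, states.target


trace, target = run_policy(["a", "b", "c"], k=2)
print(trace, target)  # RRWRWW ['A', 'B', 'C']
```

The trace shows the policy reading two tokens before its first write, then alternating, and finally flushing the remaining output once the source is finished.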

States

Each agent has a states attribute that keeps track of decoding progress. The states attribute is reset at the beginning of each sentence. SimulEval provides a built-in states class, simuleval.agents.states.AgentStates, which has basic attributes such as the source and target sequences. Users can also define customized states with the Agent.build_states method:

from simuleval.agents import SpeechToTextAgent
from simuleval.agents.states import AgentStates
from dataclasses import dataclass

@dataclass
class MyComplicatedStates(AgentStates):
    some_very_useful_variable: int

    def reset(self):
        super().reset()
        # also remember to reset the value
        self.some_very_useful_variable = 0

class MySpeechToTextAgent(SpeechToTextAgent):
    def build_states(self):
        return MyComplicatedStates(0)

    def policy(self):
        some_very_useful_variable = self.states.some_very_useful_variable
        ...
        self.states.some_very_useful_variable = new_value
        ...
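
The reason for resetting custom fields can be seen in a small self-contained sketch. BaseStates and CountingStates below are hypothetical stand-ins and do not reproduce SimulEval's AgentStates. The sketch only assumes, as described above, that reset is called at the beginning of each sentence.

```python
from typing import List


class BaseStates:
    """Stand-in for a built-in states class (not SimulEval's AgentStates)."""

    def __init__(self) -> None:
        self.reset()

    def reset(self) -> None:
        self.source: List[str] = []
        self.target: List[str] = []
        self.source_finished = False


class CountingStates(BaseStates):
    """Hypothetical custom states that count policy calls per sentence."""

    def reset(self) -> None:
        super().reset()
        # Custom fields are reset here so they do not leak across sentences.
        self.policy_calls = 0


def count_calls(sentences: List[List[str]]) -> List[int]:
    """Simulate decoding several sentences with one states object."""
    states = CountingStates()
    calls = []
    for sentence in sentences:
        states.reset()  # beginning of a new sentence
        for token in sentence:
            states.source.append(token)
            states.policy_calls += 1  # pretend the policy ran once per token
        calls.append(states.policy_calls)
    return calls


calls = count_calls([["a", "b", "c"], ["d"]])
print(calls)  # [3, 1]
```

If reset did not set policy_calls back to 0, the second sentence would report 4 calls instead of 1.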

Pipeline

A simultaneous system can consist of several components. For instance, a simultaneous speech-to-text translation system can combine a streaming automatic speech recognition system with a simultaneous text-to-text translation system. SimulEval provides the agent pipeline to support this. The following is a minimal example that concatenates two wait-k systems with different rates (k=2 and k=4). Note that if more than one agent class is defined, the @entrypoint decorator must be used to mark the entry point.

import random
from simuleval import entrypoint
from simuleval.agents import TextToTextAgent
from simuleval.agents.actions import ReadAction, WriteAction
from simuleval.agents import AgentPipeline


class DummyWaitkTextAgent(TextToTextAgent):
    waitk = 0
    vocab = [chr(i) for i in range(ord("A"), ord("Z") + 1)]

    def policy(self):
        lagging = len(self.states.source) - len(self.states.target)

        if lagging >= self.waitk or self.states.source_finished:
            prediction = random.choice(self.vocab)

            return WriteAction(prediction, finished=(lagging <= 1))
        else:
            return ReadAction()


class DummyWait2TextAgent(DummyWaitkTextAgent):
    waitk = 2


class DummyWait4TextAgent(DummyWaitkTextAgent):
    waitk = 4


@entrypoint
class DummyPipeline(AgentPipeline):
    pipeline = [DummyWait2TextAgent, DummyWait4TextAgent]
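
To illustrate how chaining affects latency, the following self-contained sketch simulates two wait-k stages in plain Python. WaitKStage and run_pipeline are hypothetical stand-ins and do not reflect how AgentPipeline is implemented. The sketch assumes each stage reads at most one token and writes at most one token per step, and that each stage simply copies its input.

```python
from typing import List, Optional, Tuple


class WaitKStage:
    """Stand-in wait-k stage that copies its input with a lag of k tokens."""

    def __init__(self, k: int) -> None:
        self.k = k
        self.source: List[str] = []
        self.written = 0
        self.source_finished = False

    def step(self, token: Optional[str], finished: bool) -> Tuple[Optional[str], bool]:
        """Read at most one token, then write at most one token."""
        if token is not None:
            self.source.append(token)
        self.source_finished = self.source_finished or finished
        lagging = len(self.source) - self.written
        out = None
        if lagging > 0 and (lagging >= self.k or self.source_finished):
            out = self.source[self.written]
            self.written += 1
        # Done once the input has ended and everything has been written.
        done = self.source_finished and self.written == len(self.source)
        return out, done


def run_pipeline(stages: List[WaitKStage], sentence: List[str]) -> List[Tuple[str, int]]:
    """Feed each stage's output to the next; record source tokens read per output."""
    feed = list(sentence)
    outputs = []
    read = 0
    done = False
    while not done:
        token = feed.pop(0) if feed else None
        if token is not None:
            read += 1
        finished = not feed
        for stage in stages:
            token, finished = stage.step(token, finished)
        if token is not None:
            outputs.append((token, read))
        done = finished
    return outputs


outputs = run_pipeline([WaitKStage(2), WaitKStage(4)], list("abcdefg"))
delays = [n_read for _, n_read in outputs]
print(delays)  # [5, 6, 7, 7, 7, 7, 7]
```

In this simplified model the first output appears only after 2 + 4 - 1 = 5 source tokens have been read, so the lags of the two stages add up.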

Customized Arguments

We often need to pass customized arguments to the system to configure different settings. The agent class has a built-in static method, add_args, for this purpose. The following is an updated version of the dummy agent from Quick Start.

import random
from simuleval import entrypoint
from simuleval.agents import TextToTextAgent
from simuleval.agents.actions import ReadAction, WriteAction
from argparse import Namespace, ArgumentParser


@entrypoint
class DummyWaitkTextAgent(TextToTextAgent):
    def __init__(self, args: Namespace):
        """Initialize your agent here.
        For example loading model, vocab, etc
        """
        super().__init__(args)
        self.waitk = args.waitk
        with open(args.vocab) as f:
            self.vocab = [line.strip() for line in f]

    @staticmethod
    def add_args(parser: ArgumentParser):
        """Add customized command line arguments"""
        parser.add_argument("--waitk", type=int, default=3)
        parser.add_argument("--vocab", type=str)

    def policy(self):
        lagging = len(self.states.source) - len(self.states.target)

        if lagging >= self.waitk or self.states.source_finished:
            prediction = random.choice(self.vocab)

            return WriteAction(prediction, finished=(lagging <= 1))
        else:
            return ReadAction()

Then pass the arguments on the command line as follows.

# --source and --target are data arguments; --waitk and --vocab are agent arguments
simuleval \
    --source source.txt --target target.txt \
    --agent dummy_waitk_text_agent_v2.py \
    --waitk 3 --vocab data/dict.txt
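
The flow from add_args to __init__ can be sketched with the standard argparse module alone. ToyWaitkAgent and build_agent below are hypothetical and do not use SimulEval. They only illustrate the general pattern of letting an agent class extend a shared parser before the parsed Namespace is passed to its constructor.

```python
from argparse import ArgumentParser, Namespace
from typing import List


class ToyWaitkAgent:
    """Stand-in agent; it does not inherit from any SimulEval class."""

    def __init__(self, args: Namespace) -> None:
        self.waitk = args.waitk
        self.vocab_path = args.vocab

    @staticmethod
    def add_args(parser: ArgumentParser) -> None:
        """Register the agent-specific command line arguments."""
        parser.add_argument("--waitk", type=int, default=3)
        parser.add_argument("--vocab", type=str)


def build_agent(agent_class, argv: List[str]):
    """Hypothetical loader: extend the parser, parse, then build the agent."""
    parser = ArgumentParser()
    # Data arguments owned by the (hypothetical) evaluation tool.
    parser.add_argument("--source", type=str)
    parser.add_argument("--target", type=str)
    # Agent arguments owned by the agent class.
    agent_class.add_args(parser)
    args = parser.parse_args(argv)
    return agent_class(args)


agent = build_agent(
    ToyWaitkAgent,
    ["--source", "source.txt", "--target", "target.txt",
     "--waitk", "5", "--vocab", "data/dict.txt"],
)
print(agent.waitk, agent.vocab_path)  # 5 data/dict.txt
```

Because add_args is a static method, the parser can be extended before any agent instance exists, which is what allows the arguments to be parsed first and handed to the constructor afterwards.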