# biome.text.text_cleaning Module

# TextCleaning Class


class TextCleaning ()

Base class for text cleaning processors

# Ancestors

  • allennlp.common.registrable.Registrable
  • allennlp.common.from_params.FromParams

# Subclasses

  • DefaultTextCleaning

# TextCleaningRule Class


class TextCleaningRule (func: Callable[[str], str])

Registers a function as a rule for the default text cleaning implementation.

Use the @TextCleaningRule decorator to create custom text cleaning and pre-processing rules.

An example function to strip spaces (already included in the default TextCleaning processor):

from biome.text.text_cleaning import TextCleaningRule

@TextCleaningRule
def strip_spaces(text: str) -> str:
    return text.strip()

Parameters

func : Callable[[str], str]
The function to register

# registered_rules Static method


def registered_rules() -> Dict[str, Callable[[str], str]]

Returns the dictionary of registered rules, mapping each rule name to its function.
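
The registration pattern described above can be sketched with a small class-based decorator. This is a standalone illustration, not the biome.text source: RuleRegistry is a hypothetical stand-in for TextCleaningRule, collapse_whitespace is an invented rule, and keying the dictionary by the function's name is an assumption based on the Dict[str, ...] return type.

```python
from typing import Callable, Dict


class RuleRegistry:
    """Hypothetical stand-in for TextCleaningRule (assumed behavior only)."""

    # Class-level mapping shared by every decorated function
    _rules: Dict[str, Callable[[str], str]] = {}

    def __init__(self, func: Callable[[str], str]):
        # Register the function under its own name at decoration time
        self._func = func
        RuleRegistry._rules[func.__name__] = func

    def __call__(self, text: str) -> str:
        # Keep the decorated name usable as a normal function
        return self._func(text)

    @staticmethod
    def registered_rules() -> Dict[str, Callable[[str], str]]:
        # Expose the name -> function mapping
        return RuleRegistry._rules


@RuleRegistry
def strip_spaces(text: str) -> str:
    return text.strip()


@RuleRegistry
def collapse_whitespace(text: str) -> str:
    # Invented rule, for illustration only
    return " ".join(text.split())
```

Because registration happens when the decorator runs, importing a module that defines rules is enough to make them available by name.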

# DefaultTextCleaning Class


class DefaultTextCleaning (rules: List[str] = None)

Defines rules that can be applied to the text before it gets tokenized.

Each rule is a simple Python function that receives and returns a str.

Parameters

rules : List[str]
A list of names of registered rules to apply to text inputs
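
As a rough sketch of how a list of rule names might be turned into a cleaning step, the following standalone example looks each name up in a registry and applies the functions in order. SimpleTextCleaning, the RULES dictionary, and the lowercase rule are hypothetical; the callable interface, the in-order application, and the error on unknown names are assumptions, not documented behavior of DefaultTextCleaning.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical registry; in biome.text this role is played by
# TextCleaningRule.registered_rules()
RULES: Dict[str, Callable[[str], str]] = {
    "strip_spaces": lambda text: text.strip(),
    "lowercase": lambda text: text.lower(),  # invented rule, for illustration
}


class SimpleTextCleaning:
    """Illustrative stand-in for DefaultTextCleaning (assumed behavior only)."""

    def __init__(self, rules: Optional[List[str]] = None):
        self.rules = rules or []
        # Assumption: fail early on names that were never registered
        unknown = [name for name in self.rules if name not in RULES]
        if unknown:
            raise ValueError(f"Unregistered rules: {unknown}")

    def __call__(self, text: str) -> str:
        # Apply each rule in the order given, feeding the output forward
        for name in self.rules:
            text = RULES[name](text)
        return text


cleaner = SimpleTextCleaning(rules=["strip_spaces", "lowercase"])
cleaned = cleaner("  Hello World  ")
```

With no rules configured, the sketch returns the input unchanged, which mirrors the rules = None default in the signature above.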

# Ancestors

  • TextCleaning
  • allennlp.common.registrable.Registrable
  • allennlp.common.from_params.FromParams