Dataset settings

Rubrix datasets have certain settings that you can configure via the rb.*Settings classes, for example rb.TextClassificationSettings.

Define a labeling schema

You can define a labeling schema for your Rubrix dataset, which fixes the allowed labels for your predictions and annotations. Once a labeling schema is set, Rubrix validates the predictions and annotations of every record you log to that dataset and rejects any that do not comply with the schema.

import rubrix as rb

# Define labeling schema
settings = rb.TextClassificationSettings(label_schema=["A", "B", "C"])

# Apply settings to a new or already existing dataset
rb.configure_dataset(name="my_dataset", settings=settings)

# Logging to the configured dataset triggers the validation checks;
# the label "D" is not part of the schema, so the record is rejected
rb.log(rb.TextClassificationRecord(text="text", annotation="D"), name="my_dataset")
# BadRequestApiError: Rubrix server returned an error with http status: 400
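
Since the schema check above is reported by the server as a 400 error at logging time, it can be convenient to check labels locally before calling rb.log. The following is only a minimal sketch of that idea: find_invalid_labels is a hypothetical helper written for illustration, it is not part of the Rubrix API, and it assumes labels are plain strings (a single label or a list of labels).

```python
from typing import Iterable, List, Optional, Union


def find_invalid_labels(
    labels: Optional[Union[str, Iterable[str]]],
    schema: Iterable[str],
) -> List[str]:
    """Return the labels that are not in the schema (hypothetical helper, not Rubrix API)."""
    allowed = set(schema)
    if labels is None:
        # No annotation or prediction means nothing to check
        return []
    if isinstance(labels, str):
        # A single label is treated as a one-element list
        labels = [labels]
    return [label for label in labels if label not in allowed]


schema = ["A", "B", "C"]

# "D" is not in the schema, so it is reported as invalid
invalid = find_invalid_labels("D", schema)
if invalid:
    print(f"Labels outside the schema: {invalid}")
```

Only records for which the helper returns an empty list would then be passed to rb.log. The server-side validation still applies either way; a local check like this only surfaces the problem earlier.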