Python API Referenz

Die Python API Referenz dient als zentrale Anlaufstelle für Entwicklerinnen und Entwickler. Sie hilft dabei, Funktionen, Klassen, Module nachvollziehen und effizient nutzen zu können. Sie bietet eine strukturierte Übersicht über alle öffentlich zugänglichen Elemente des Codes und deren Verwendung.

In dieser Referenz finden Sie detaillierte Informationen zu:

Modulen und Paketen: Welche Module verfügbar sind und wie sie importiert werden.
Funktionen und Methoden: Beschreibung von Parametern, Rückgabewerten, möglichen Ausnahmen und Anwendungsbeispielen.
Klassen und Objekten: Informationen zu Konstruktoren, Attributen, geerbten Methoden.

Diese Dokumentation richtet sich sowohl an Einsteigerinnen und Einsteiger als auch an erfahrene Entwicklerinnen und Entwickler. Sie soll den Einstieg erleichtern, die Entwicklung beschleunigen und die Wiederverwendbarkeit des Codes fördern.

backend

MODULE	DESCRIPTION
`main`	Main module of the application.
`src`

main

Main module of the application.

This module serves as the entry point for the program. It imports necessary modules, sets up any initial configuration or data structures, and possibly defines main functions or classes that are used throughout the application.

src

MODULE	DESCRIPTION
`app`	Initialize the app.
`endpoints`	Define all endpoints of the FastAPI app.
`models`
`services`
`settings`	Load all settings from a central place, not hidden in utils.
`utils`

app

Initialize the app.

FUNCTION	DESCRIPTION
`lifespan`	Sets up a scheduler and updates available llms.

lifespan `async`

lifespan(_app)

Sets up a scheduler and updates available llms.

This lifespan function is started on startup of FastAPI. The first part - till yield is executed on startup and initializes a scheduler to regulary check the LLM-API. The second part is executed on shutdown and is used to clean up the scheduler.

The available LLMs - i.e. the LLMs where API-checks passed - are cached in FastAPI state object as app.state.available_llms.

Source code in docs/repositories-clones/backend/src/app.py

@asynccontextmanager
async def lifespan(_app: FastAPI) -> AsyncGenerator[None, None]:
    """Sets up a scheduler and updates available llms.

    This lifespan function is started on startup of FastAPI. The first part
    - till `yield` is executed on startup and initializes a scheduler to regulary
    check the LLM-API. The second part is executed on shutdown and is used to
    clean up the scheduler.

    The available LLMs - i.e. the LLMs where API-checks passed - are cached in
    FastAPI state object as `app.state.available_llms`.
    """

    async def update_llm_state() -> None:
        _app.state.available_llms = await get_available_llms()

    # store available LLMs in FastAPI app state
    _app.state.available_llms = await get_available_llms()

    # setup a scheduler
    scheduler = AsyncIOScheduler()
    scheduler.add_job(
        update_llm_state,
        "interval",
        seconds=settings.check_llm_api_interval_in_s,
    )
    scheduler.start()

    yield

    # cleanup
    scheduler.shutdown()

endpoints

Define all endpoints of the FastAPI app.

FUNCTION	DESCRIPTION
`get_llms`	Return model information of available LLMs.
`health`	Return a health check message.
`save_feedback`	Get user feedback and write feedback to DB.
`simplify_user_text`	Simplify the user text based on the provided parameters.
`simplify_user_text_stream`	Simplify the user text based on the provided parameters and return the result as a stream.

get_llms `async`

get_llms(request)

Return model information of available LLMs.

PARAMETER	DESCRIPTION
`request`	Request-Data. TYPE: `Request`

RETURNS	DESCRIPTION
`list[dict]`	The list of available models.

Source code in docs/repositories-clones/backend/src/endpoints.py

@router.get(
    "/llms",
    summary="List available language models",
    description=("Returns a list of available language models (LLMs).\n\n"),
    responses={
        200: {
            "description": "List of available LLMs.",
            "content": {
                "application/json": {
                    "example": [
                        {
                            "label": "gemma3:27b",
                            "is_remote": False,
                            "max_input_length": 9000,
                            "name": "gemma_3_27b",
                        },
                    ]
                }
            },
        },
        500: {
            "description": "Internal server error",
        },
    },
)
async def get_llms(request: Request) -> list[dict]:
    """Return model information of available LLMs.

    Args:
        request (Request): Request-Data.

    Returns:
        The list of available models.
    """
    app = request.app  # indirectly access the FastAPI app object
    return app.state.available_llms

health `async`

health()

Return a health check message.

RETURNS	DESCRIPTION
`dict[str, str]`	The health check message as a dictonary.

Source code in docs/repositories-clones/backend/src/endpoints.py

@router.get(
    "/",
    summary="Health check",
    description=(
        "Returns a simple message indicating that the Behörden-KlarText service is running.\n\n"
        "Use this endpoint to verify that the service is alive and responsive."
    ),
    responses={
        200: {
            "description": "Health check successful",
            "content": {
                "application/json": {"example": {"message": "Behörden-KlarText is running"}}
            },
        },
        500: {"description": "Internal server error"},
    },
)
@router.get(
    "/health",
    summary="Health check",
    description=(
        "Returns a simple message indicating that the Behörden-KlarText service is running.\n\n"
        "Use this endpoint to verify that the service is alive and responsive."
    ),
    responses={
        200: {
            "description": "Health check successful",
            "content": {
                "application/json": {"example": {"message": "Behörden-KlarText is running"}}
            },
        },
        500: {"description": "Internal server error"},
    },
)
async def health() -> dict[str, str]:
    """Return a health check message.

    Returns:
        The health check message as a dictonary.
    """
    return {"message": f"{settings.service_name} is running"}

save_feedback `async`

save_feedback(feedback_input, session)

Get user feedback and write feedback to DB.

PARAMETER	DESCRIPTION
`feedback_input`	Feedback the user entered in the UI TYPE: `FeedbackBase`
`session`	Database session as a dependency TYPE: `SessionDep`

RETURNS	DESCRIPTION
`Feedback`	dict[str, str]: Status code of the opparation.

Source code in docs/repositories-clones/backend/src/endpoints.py

@router.post(
    "/feedback",
    response_model=Feedback,
    summary="Submit Feedback",
    description=(
        "Allows users to submit feedback on a text or an optimization.\n\n"
        "The response includes the stored feedback ID and the timestamp."
    ),
    openapi_extra={
        "requestBody": {
            "content": {
                "application/json": {
                    "examples": FeedbackBase.model_config["json_schema_extra"]["openapi_examples"]
                }
            }
        }
    },
    responses={
        200: {
            "description": "Feedback successfully saved.",
            "content": {
                "application/json": {
                    "examples": Feedback.model_config["json_schema_extra"]["openapi_examples"]
                }
            },
        },
    },
)
async def save_feedback(feedback_input: FeedbackBase, session: SessionDep) -> Feedback:
    """Get user feedback and write feedback to DB.

    Args:
        feedback_input (FeedbackBase): Feedback the user entered in the UI
        session (SessionDep): Database session as a dependency

    Returns:
        dict[str, str]: Status code of the opparation.
    """
    db_feedback = Feedback.model_validate(feedback_input)
    db_feedback = add_feedback_to_db(db_feedback, session)

    return db_feedback

simplify_user_text `async`

simplify_user_text(request, simplify_input)

Simplify the user text based on the provided parameters.

PARAMETER	DESCRIPTION
`request`	Request-Data. TYPE: `Request`
`simplify_input`	The input text along with any additional parameters for simplification. TYPE: `SimplifyInput`

RETURNS	DESCRIPTION
`SimplifyOutput`	An object containing both the original input text and the simplified text. TYPE: `SimplifyOutput`

Source code in docs/repositories-clones/backend/src/endpoints.py

@router.post(
    "/simplify",
    response_model=SimplifyOutput,
    summary="Text Simplification",
    description=(
        "Simplifies an input text according to the selected language model.\n\n"
        "The endpoint returns a simplified version of the input text.\n\n"
        "The text is processed in full and not streamed.\n\n"
    ),
    openapi_extra={
        "requestBody": {
            "content": {
                "application/json": {
                    "examples": SimplifyInput.model_config["json_schema_extra"]["openapi_examples"],
                }
            },
        }
    },
    responses={
        200: {
            "description": "Text successfully simplified.",
            "content": {
                "application/json": {
                    "examples": SimplifyOutput.model_config["json_schema_extra"][
                        "openapi_examples"
                    ],
                }
            },
        },
        502: {
            "description": "Internal error when calling the language model.",
            "content": {
                "application/json": {
                    "schema": {
                        "type": "object",
                        "properties": {"detail": {"type": "string"}},
                        "example": {
                            "detail": (
                                "Interner Fehler beim Aufruf des Sprachmodells. "
                                "Bitte versuchen Sie es später erneut."
                            )
                        },
                    }
                }
            },
        },
        503: {
            "description": "Language model not available.",
            "content": {
                "application/json": {
                    "schema": {
                        "type": "object",
                        "properties": {"detail": {"type": "string"}},
                        "example": {
                            "detail": (
                                "Das Sprachmodell ist nicht verfügbar. "
                                "Bitte versuchen Sie es später erneut."
                            )
                        },
                    }
                }
            },
        },
    },
)
async def simplify_user_text(request: Request, simplify_input: SimplifyInput) -> SimplifyOutput:
    """Simplify the user text based on the provided parameters.

    Args:
        request (Request): Request-Data.
        simplify_input (SimplifyInput):
            The input text along with any additional parameters for simplification.

    Returns:
        SimplifyOutput:
            An object containing both the original input text and the simplified text.
    """
    language_model = get_valid_language_model(
        simplify_input.language_model, request.app.state.available_llms
    )
    simplify_model = simplify_registry.simplify_models.get(language_model)

    simplified_text = await simplify_registry.simplify(simplify_model, simplify_input)

    simplify_output = SimplifyOutput(
        input_text=simplify_input.input_text, simplified_text=simplified_text
    )

    return simplify_output

simplify_user_text_stream `async`

simplify_user_text_stream(request, simplify_input)

Simplify the user text based on the provided parameters and return the result as a stream.

PARAMETER	DESCRIPTION
`request`	Request-Data. TYPE: `Request`
`simplify_input`	The input text along with any additional parameters for simplification. TYPE: `SimplifyInput`

RETURNS	DESCRIPTION
`StreamingResponse`	A stream of NDJSON objects containing either simplified text chunks or error messages. TYPE: `StreamingResponse`

Source code in docs/repositories-clones/backend/src/endpoints.py

@router.post(
    "/simplify-stream",
    response_class=StreamingResponse,
    summary="Streaming Text Simplification",
    description=(
        "Starts streaming the simplified text output.\n\n"
        "Each simplified text segment is sent sequentially until the full simplification is completed.\n"
        "Note: Unlike /simplify, the text is streamed here and not returned as a complete response.\n\n"
    ),
    openapi_extra={
        "requestBody": {
            "content": {
                "application/json": {
                    "examples": SimplifyInput.model_config["json_schema_extra"]["openapi_examples"]
                }
            }
        }
    },
    responses={
        200: {
            "description": "Streaming successfully started.",
            "content": {
                "application/x-ndjson": {
                    "schema": SimplifyStreamOutput.model_json_schema(),
                    "examples": SimplifyStreamOutput.model_config["json_schema_extra"][
                        "openapi_examples"
                    ],
                }
            },
        },
        503: {
            "description": "LLM provider not available according to the regular API check.",
            "content": {
                "application/json": {
                    "schema": {
                        "type": "object",
                        "properties": {"detail": {"type": "string"}},
                        "example": {
                            "detail": (
                                "Das Sprachmodell ist nicht verfügbar. "
                                "Bitte versuchen Sie es später erneut."
                            )
                        },
                    }
                }
            },
        },
    },
)
async def simplify_user_text_stream(
    request: Request,
    simplify_input: SimplifyInput,
) -> StreamingResponse:
    """Simplify the user text based on the provided parameters and return the result as a stream.

    Args:
        request (Request): Request-Data.
        simplify_input (SimplifyInput):
            The input text along with any additional parameters for simplification.

    Returns:
        StreamingResponse:
            A stream of NDJSON objects containing either simplified text chunks or error messages.
    """
    language_model = get_valid_language_model(
        simplify_input.language_model, request.app.state.available_llms
    )
    simplify_model = simplify_registry.simplify_models.get(language_model)

    return StreamingResponse(
        simplify_registry.simplify_stream(simplify_model, simplify_input),
        media_type="application/x-ndjson",
    )

models

MODULE	DESCRIPTION
`api_input`	pydantic Models for API input parameters.
`api_output`	pydantic Models for API output parameters.
`general`	Load and check Settings from yml.
`llms`	Pydantic models for LLM configuration.

api_input

pydantic Models for API input parameters.

CLASS	DESCRIPTION
`Feedback`	Feedback model for storing user feedback in a database.
`FeedbackBase`	Base class for the user feedback to expose in the API.
`SimplifyInput`	Input model for /simplify endpoint to simplify text input.

FUNCTION	DESCRIPTION
`get_formated_now`	Get the current time in UTC formatted as a string.

Feedback

Bases: FeedbackBase

Feedback model for storing user feedback in a database.

tablename (str): The name of the table in the database. Defaults to "behoerden_klartext_feedback" id (int, optional): The unique identifier for each feedback entry. Will be set by the database date_time_utc (str): The timestamp when the feedback was recorded in UTC format.

Source code in docs/repositories-clones/backend/src/models/api_input.py

class Feedback(FeedbackBase, table=True):
    """Feedback model for storing user feedback in a database.

    __tablename__ (str): The name of the table in the database. Defaults to "behoerden_klartext_feedback"
    id (int, optional): The unique identifier for each feedback entry. Will be set by the database
    date_time_utc (str): The timestamp when the feedback was recorded in UTC format.
    """

    __tablename__: str = "behoerden_klartext_feedback"  # type: ignore

    id: int | None = Field(default=None, primary_key=True)
    date_time_utc: str = Field(default_factory=get_formated_now)

    model_config = ConfigDict(
        json_schema_extra={
            "openapi_examples": {
                "full_feedback": {
                    "summary": "Example of complete feedback with ID and timestamp",
                    "description": "A complete feedback entry with all evaluation fields, including ID and date.",
                    "value": {
                        "id": 1,
                        "date_time_utc": "2025-12-08T12:00:00Z",
                        "understandability": "Sehr verständlich",
                        "style": "Flüssig und klar",
                        "accuracy": "Inhaltlich korrekt",
                        "grammar": "Keine Fehler",
                        "structure": "Gut strukturiert",
                        "other": "Keine zusätzlichen Anmerkungen",
                        "original": "Der Versicherungsfall wird geprüft.",
                        "optimized": "Wir prüfen den Versicherungsfall gerade.",
                    },
                },
            }
        }
    )

FeedbackBase

Bases: SQLModel

Base class for the user feedback to expose in the API.

ATTRIBUTE	DESCRIPTION
`understandability`	Bewertung der Verständlichkeit. TYPE: `str`
`style`	Bewertung des Stils. TYPE: `str`
`accuracy`	Bewertung der inhaltlichen Genauigkeit. TYPE: `str`
`grammar`	Bewertung der Grammatik. TYPE: `str`
`structure`	Bewertung der Struktur. TYPE: `str`
`other`	Sonstige Anmerkungen. TYPE: `str`
`original`	Originaltext. TYPE: `str`
`optimized`	Optimierter Text. TYPE: `str`

Source code in docs/repositories-clones/backend/src/models/api_input.py

class FeedbackBase(SQLModel):
    """Base class for the user feedback to expose in the API.

    Attributes:
        understandability (str): Bewertung der Verständlichkeit.
        style (str): Bewertung des Stils.
        accuracy (str): Bewertung der inhaltlichen Genauigkeit.
        grammar (str): Bewertung der Grammatik.
        structure (str): Bewertung der Struktur.
        other (str): Sonstige Anmerkungen.
        original (str): Originaltext.
        optimized (str): Optimierter Text.
    """

    understandability: str
    style: str
    accuracy: str
    grammar: str
    structure: str
    other: str
    original: str
    optimized: str

    model_config = ConfigDict(
        json_schema_extra={
            "openapi_examples": {
                "full_feedback": {
                    "summary": "Example of complete feedback",
                    "description": "A complete feedback entry with all evaluation fields.",
                    "value": {
                        "understandability": "Sehr verständlich",
                        "style": "Flüssig und klar",
                        "accuracy": "Inhaltlich korrekt",
                        "grammar": "Keine Fehler",
                        "structure": "Gut strukturiert",
                        "other": "Keine zusätzlichen Anmerkungen",
                        "original": "Der Versicherungsfall wird geprüft.",
                        "optimized": "Wir prüfen den Versicherungsfall.",
                    },
                }
            }
        }
    )

SimplifyInput

Bases: BaseModel

Input model for /simplify endpoint to simplify text input.

input_text (str): The text to be simplified. language_model (str | None): The identifier of the language model to use. If None, the first model in the list of available LLMs will be used.

Source code in docs/repositories-clones/backend/src/models/api_input.py

class SimplifyInput(BaseModel):
    """Input model for /simplify endpoint to simplify text input.

    input_text (str): The text to be simplified.
    language_model (str | None): The identifier of the language model to use.
        If None, the first model in the list of available LLMs will be used.
    """

    input_text: str
    language_model: str | None = None

    model_config = ConfigDict(
        json_schema_extra={
            "openapi_examples": {
                "simple_example": {
                    "summary": "Simple text simplification request",
                    "description": "A straightforward request to simplify a short text.",
                    "value": {
                        "input_text": (
                            "Die Anfrage wird derzeit unter Berücksichtigung aller "
                            "rechtlich relevanten Aspekte geprüft."
                        ),
                        "language_model": "test_model_mock",
                    },
                }
            }
        }
    )

get_formated_now

get_formated_now()

Get the current time in UTC formatted as a string.

Source code in docs/repositories-clones/backend/src/models/api_input.py

def get_formated_now() -> str:
    """Get the current time in UTC formatted as a string."""
    return datetime.now(UTC).strftime("%Y-%m-%d %H:%M:%S")

api_output

pydantic Models for API output parameters.

CLASS	DESCRIPTION
`SimplifyOutput`	Represents the result of a text simplification process.
`SimplifyStreamOutput`	Represents the streamed result of a text simplification process.

SimplifyOutput

Bases: BaseModel

Represents the result of a text simplification process.

ATTRIBUTE	DESCRIPTION
`input_text`	The original input text. TYPE: `str`
`simplified_text`	The simplified version of the input text. TYPE: `str`

Source code in docs/repositories-clones/backend/src/models/api_output.py

class SimplifyOutput(BaseModel):
    """Represents the result of a text simplification process.

    Attributes:
        input_text (str): The original input text.
        simplified_text (str): The simplified version of the input text.
    """

    input_text: str
    simplified_text: str

    model_config = ConfigDict(
        json_schema_extra={
            "openapi_examples": {
                "simple_example": {
                    "summary": "Simplified text",
                    "description": "Typical output showing the original text and the simplified version.",
                    "value": {
                        "input_text": (
                            "Die Anfrage wird derzeit unter Berücksichtigung aller "
                            "rechtlich relevanten Aspekte geprüft."
                        ),
                        "simplified_text": "Die Anfrage wird geprüft.",
                    },
                },
            }
        }
    )

SimplifyStreamOutput

Bases: BaseModel

Represents the streamed result of a text simplification process.

ATTRIBUTE	DESCRIPTION
`output_type`	Indicates whether the output is a simplification response or an error. TYPE: `Literal['response', 'error']`
`simplified_text`	The simplified text if output_type is "response". TYPE: `Optional[str]`
`error_message`	The error description if output_type is "error". TYPE: `Optional[str]`

Source code in docs/repositories-clones/backend/src/models/api_output.py

class SimplifyStreamOutput(BaseModel):
    """Represents the streamed result of a text simplification process.

    Attributes:
        output_type (Literal["response", "error"]):
            Indicates whether the output is a simplification response or an error.
        simplified_text (Optional[str]):
            The simplified text if output_type is "response".
        error_message (Optional[str]):
            The error description if output_type is "error".
    """

    output_type: Literal["response", "error"]
    simplified_text: str | None = None
    error_message: str | None = None

    model_config = ConfigDict(
        json_schema_extra={
            "openapi_examples": {
                "response_chunk": {
                    "summary": "Simplification response chunk",
                    "description": (
                        "Represents a single segment of the simplified text that is returned during streaming."
                    ),
                    "value": {
                        "output_type": "response",
                        "simplified_text": "Die Anfrage wird geprüft.",
                        "error_message": None,
                    },
                },
                "internal_error": {
                    "summary": "Internal error",
                    "description": "Returned when an unexpected internal error occurs during streaming.",
                    "value": {
                        "output_type": "error",
                        "simplified_text": None,
                        "error_message": "Interner Fehler während des Streamings.",
                    },
                },
            }
        }
    )

general

Load and check Settings from yml.

CLASS	DESCRIPTION
`ActiveLLMs`	Selection of available models for respective use cases.
`LogLevel`	Enum class specifying possible log levels.
`OpenTelemetry`	Settings for OpenTelemetry integration.
`Settings`	General Settings for the service.

ActiveLLMs

Bases: BaseModel

Selection of available models for respective use cases.

ATTRIBUTE	DESCRIPTION
`model_config`	Used to ignore other services, which are defined in the config. TYPE: `ConfigDict`
`behoerden_klartext`	List containing available models for Behörden-KlarText. It may contain only a subset of all models in llm_models.yml. TYPE: `list[str]`

Source code in docs/repositories-clones/backend/src/models/general.py

class ActiveLLMs(BaseModel):
    """Selection of available models for respective use cases.

    Attributes:
        model_config (ConfigDict): Used to ignore other services, which are defined in the config.
        behoerden_klartext (list[str]): List containing available models for Behörden-KlarText.
            It may contain only a subset of all models in llm_models.yml.
    """

    model_config = ConfigDict(extra="ignore")

    behoerden_klartext: list[str]

LogLevel

Bases: StrEnum

Enum class specifying possible log levels.

Source code in docs/repositories-clones/backend/src/models/general.py

class LogLevel(StrEnum):
    """Enum class specifying possible log levels."""

    CRITICAL = "CRITICAL"
    ERROR = "ERROR"
    WARNING = "WARNING"
    INFO = "INFO"
    DEBUG = "DEBUG"

    @classmethod
    def _missing_(cls, value: object) -> None:
        """Convert strings to uppercase and recheck for existance."""
        if isinstance(value, str):
            value = value.upper()
            for level in cls:
                if level == value:
                    return level
        return None

OpenTelemetry

Bases: BaseModel

Settings for OpenTelemetry integration.

ATTRIBUTE	DESCRIPTION
`model_config`	Used to ignore other services, which are defined in the config. TYPE: `ConfigDict`
`enabled`	Whether to enable OpenTelemetry exporting. TYPE: `bool`
`otlp_grpc_endpoint`	Endpoint of the OTLP receiver to export signals to. TYPE: `str`
`export_interval_in_ms`	Interval in milliseconds to export signals. TYPE: `PositiveInt`

Source code in docs/repositories-clones/backend/src/models/general.py

class OpenTelemetry(BaseModel):
    """Settings for OpenTelemetry integration.

    Attributes:
        model_config (ConfigDict): Used to ignore other services, which are defined in the config.
        enabled (bool): Whether to enable OpenTelemetry exporting.
        otlp_grpc_endpoint (str): Endpoint of the OTLP receiver to export signals to.
        export_interval_in_ms (PositiveInt): Interval in milliseconds to export signals.
    """

    model_config = ConfigDict(extra="ignore")

    enabled: bool = False
    otlp_grpc_endpoint: AnyHttpUrl = "http://otel-collector:4317"
    export_interval_in_ms: PositiveInt = 60_000

Settings

Bases: BaseModel

General Settings for the service.

ATTRIBUTE	DESCRIPTION
`model_config`	Used to ignore other services, which are defined in the config. TYPE: `ConfigDict`
`service_name`	Name of service, i.e. 'behoerden_klartext' TYPE: `str`
`service_description`	Description of the service. TYPE: `str`
`api_prefix`	Prefix for all API endpoints. TYPE: `str`
`active_llms`	Selection of available models for respective use cases. TYPE: `ActiveLLMs`
`check_llm_api_interval_in_s`	Interval for checking all LLM APIs (seconds). TYPE: `PositiveInt`
`log_level`	Minimal level of logging output given. TYPE: `LogLevel`
`exit_log_level`	LogLevel at which the app will exit; should not be lower than ERROR. TYPE: `LogLevel`
`debug_mode_server`	Should be False in production. TYPE: `bool`
`log_file_max_bytes`	(PositiveInt): Max file size for logfile TYPE: `PositiveInt`
`log_file_backup_count`	Number of log-files to loop over TYPE: `PositiveInt`
`log_file`	Write logfile there. TYPE: `FilePath`
`n_uvicorn_workers`	Number of parallel uvicorn instances. TYPE: `PositiveInt`
`path_container_feedback_db`	Path of Feedback-DB inside the container (relative to backend folder). TYPE: `Path`
`name_feedback_db`	Name of the feedback database file. TYPE: `str`

METHOD	DESCRIPTION
`ensure_log_dir`	Create the log and feedback directory after validation.

Source code in docs/repositories-clones/backend/src/models/general.py

class Settings(BaseModel):
    """General Settings for the service.

    Attributes:
        model_config (ConfigDict): Used to ignore other services, which are defined in the config.
        service_name (str): Name of service, i.e. 'behoerden_klartext'
        service_description (str): Description of the service.
        api_prefix (str): Prefix for all API endpoints.
        active_llms (ActiveLLMs): Selection of available models for respective use cases.
        check_llm_api_interval_in_s (PositiveInt): Interval for checking all LLM APIs (seconds).
        log_level (LogLevel): Minimal level of logging output given.
        exit_log_level (LogLevel): LogLevel at which the app will exit; should not be lower than ERROR.
        debug_mode_server (bool): Should be False in production.
        log_file_max_bytes: (PositiveInt): Max file size for logfile
        log_file_backup_count (PositiveInt): Number of log-files to loop over
        log_file (FilePath): Write logfile there.
        n_uvicorn_workers (PositiveInt): Number of parallel uvicorn instances.
        path_container_feedback_db (Path): Path of Feedback-DB inside the container (relative to backend folder).
        name_feedback_db (str): Name of the feedback database file.
    """

    model_config = ConfigDict(extra="ignore")

    service_name: str = "Behörden-KlarText"
    service_description: str = "Vereinfachung von Behördentexten mit Hilfe von Sprachmodellen."
    api_prefix: str = ""

    active_llms: ActiveLLMs
    check_llm_api_interval_in_s: PositiveInt = 120

    otel: OpenTelemetry = OpenTelemetry()

    log_level: LogLevel = LogLevel.INFO
    exit_log_level: LogLevel = LogLevel.CRITICAL
    debug_mode_server: bool = False

    log_file_max_bytes: PositiveInt = 1 * 1024 * 1024
    log_file_backup_count: PositiveInt = 3
    log_file: FilePath = Path("/backend/logs/log")

    n_uvicorn_workers: PositiveInt = 1

    path_container_feedback_db: Path = Path("/backend/feedback")
    name_feedback_db: str = "feedback.sqlite"

    @model_validator(mode="after")
    def ensure_log_dir(self) -> "Settings":
        """Create the log and feedback directory after validation."""
        self.log_file.parent.mkdir(parents=True, exist_ok=True)
        self.path_container_feedback_db.mkdir(parents=True, exist_ok=True)
        return self

ensure_log_dir

ensure_log_dir()

Create the log and feedback directory after validation.

Source code in docs/repositories-clones/backend/src/models/general.py

@model_validator(mode="after")
def ensure_log_dir(self) -> "Settings":
    """Create the log and feedback directory after validation."""
    self.log_file.parent.mkdir(parents=True, exist_ok=True)
    self.path_container_feedback_db.mkdir(parents=True, exist_ok=True)
    return self

llms

Pydantic models for LLM configuration.

CLASS	DESCRIPTION
`APIAuth`	Defines Authentification settings for LLM.
`LLM`	Configuration of a Large Langauge Model.
`LLMAPI`	Defines API-Connection to LLM.
`LLMConfig`	Base class as loaded from model_configs.yml.
`LLMInference`	Defines Inference parameters.
`LLMPromptConfig`	Defines the structure of a LLM prompt configuration.
`LLMPromptMaps`	Defines complete LLM prompt config.
`LLMPrompts`	Defines the selectable LLM Prompts.

APIAuth

Bases: BaseModel

Defines Authentification settings for LLM.

ATTRIBUTE	DESCRIPTION
`type`	Either 'token' or 'basic_auth'. TYPE: `Literal`
`secret_path`	File path where the api token or credentials are stored. TYPE: `FilePath`

METHOD	DESCRIPTION
`get_auth_header`	Generate auth part of header for http request.

Source code in docs/repositories-clones/backend/src/models/llms.py

class APIAuth(BaseModel):
    """Defines Authentification settings for LLM.

    Attributes:
        type (Literal): Either 'token' or 'basic_auth'.
        secret_path (FilePath): File path where the api token or credentials are stored.
    """

    type: Literal["token", "basic_auth"]
    secret_path: FilePath

    @property
    def secret(self) -> SecretStr:
        """Load secret variable as 'secret'."""
        with open(self.secret_path) as file:
            return SecretStr(file.read().strip())

    def get_auth_header(self) -> str:
        """Generate auth part of header for http request.

        Returns:
            The auth header.
        """
        auth_header = ""

        if self.type == "basic_auth":
            auth_header = (
                f"Basic {base64.b64encode(self.secret.get_secret_value().encode()).decode()}"
            )
        elif self.type == "token":
            auth_header = f"Bearer {self.secret.get_secret_value()}"

        return auth_header

secret property

secret

Load secret variable as 'secret'.

get_auth_header

get_auth_header()

Generate auth part of header for http request.

RETURNS	DESCRIPTION
`str`	The auth header.

Source code in docs/repositories-clones/backend/src/models/llms.py

def get_auth_header(self) -> str:
    """Generate auth part of header for http request.

    Returns:
        The auth header.
    """
    auth_header = ""

    if self.type == "basic_auth":
        auth_header = (
            f"Basic {base64.b64encode(self.secret.get_secret_value().encode()).decode()}"
        )
    elif self.type == "token":
        auth_header = f"Bearer {self.secret.get_secret_value()}"

    return auth_header

LLM

Bases: BaseModel

Configuration of a Large Langauge Model.

ATTRIBUTE	DESCRIPTION
`label`	Human-readable model name that can be presented to users. TYPE: `str`
`model`	Model name which is used in API call, e.g. ollama tag. TYPE: `str`
`prompt_map`	Prompt map name to load LLMPromptMaps and LLMProtectedWordsMaps from. TYPE: `str`
`is_remote`	Is this LLM hosted at an external API? TYPE: `bool`
`max_input_length`	Total context length of the LLM. TYPE: `int`
`api`	API information. TYPE: `LLMAPI`
`inference`	Inference parameters. TYPE: `LLMInference`
`prompt_config`	Prompts. TYPE: `LLMPromptConfig`

Source code in docs/repositories-clones/backend/src/models/llms.py

class LLM(BaseModel):
    """Configuration of a Large Langauge Model.

    Attributes:
        label (str): Human-readable model name that can be presented to users.
        model (str): Model name which is used in API call, e.g. ollama tag.
        prompt_map (str): Prompt map name to load LLMPromptMaps and LLMProtectedWordsMaps from.
        is_remote (bool): Is this LLM hosted at an external API?
        max_input_length (int): Total context length of the LLM.
        api (LLMAPI): API information.
        inference (LLMInference): Inference parameters.
        prompt_config (LLMPromptConfig): Prompts.
    """

    label: str
    model: str
    prompt_map: str
    is_remote: bool
    max_input_length: int = 9_000
    api: LLMAPI
    inference: LLMInference
    prompt_config: LLMPromptConfig = None

LLMAPI

Bases: BaseModel

Defines API-Connection to LLM.

ATTRIBUTE	DESCRIPTION
`url`	URL to model. TYPE: `AnyHttpUrl`
`health_check`	Relative path to health check, i.e. '/models' TYPE: `str \| None`
`auth`	Authentification settings for LLM TYPE: `APIAuth \| None`

METHOD	DESCRIPTION
`get_health_check_url`	Get the URL to check if API is available.

Source code in docs/repositories-clones/backend/src/models/llms.py

class LLMAPI(BaseModel):
    """Defines API-Connection to LLM.

    Attributes:
        url (AnyHttpUrl): URL to model.
        health_check (str | None): Relative path to health check, i.e. '/models'
        auth (APIAuth | None): Authentification settings for LLM
    """

    url: AnyHttpUrl
    health_check: str | None = None
    auth: APIAuth | None = None

    def get_health_check_url(self) -> str:
        """Get the URL to check if API is available."""
        if self.health_check:
            # make sure to remove trailing and leading slashes to not override path
            return urljoin(
                str(self.url).rstrip("/") + "/",
                self.health_check.lstrip("/"),
            )
        return str(self.url)

get_health_check_url

get_health_check_url()

Get the URL to check if API is available.

Source code in docs/repositories-clones/backend/src/models/llms.py

def get_health_check_url(self) -> str:
    """Get the URL to check if API is available."""
    if self.health_check:
        # make sure to remove trailing and leading slashes to not override path
        return urljoin(
            str(self.url).rstrip("/") + "/",
            self.health_check.lstrip("/"),
        )
    return str(self.url)

LLMConfig

Bases: BaseModel

Base class as loaded from model_configs.yml.

ATTRIBUTE	DESCRIPTION
`model_config`	Used to ignore other services, which are defined in the config. TYPE: `ConfigDict`
`behoerden_klartext`	Dictionary containing a name and definition of available LLMs. The first entry in the dictionary will be used as the default model. TYPE: `dict[str, LLM]`

Source code in docs/repositories-clones/backend/src/models/llms.py

class LLMConfig(BaseModel):
    """Base class as loaded from model_configs.yml.

    Attributes:
        model_config (ConfigDict): Used to ignore other services, which are defined in the config.
        behoerden_klartext (dict[str, LLM]): Dictionary containing a name and definition of available LLMs.
            The first entry in the dictionary will be used as the default model.
    """

    # if there are more services defined in the config: just ignore them
    model_config = ConfigDict(extra="ignore")

    behoerden_klartext: dict[str, LLM]

    def __iter__(self) -> Iterator[str]:
        """Get 'keys' for automatic merge with i.e. LLMPromptConfig."""
        return iter(self.__dict__.keys())

    def __getitem__(self, service: str) -> dict[str, LLM]:
        """Get all LLMs for a given service (e.g. "behoerden_klartext", "rag").

        Args:
            service (str): The service name (e.g., "behoerden_klartext", "rag").

        Returns:
            All configered LLMs for the given service.
        """
        return self.__getattribute__(service)

LLMInference

Bases: BaseModel

Defines Inference parameters.

ATTRIBUTE	DESCRIPTION
`temperature`	Randomness / variation of the output High values indicate more creativity. TYPE: `PositiveFloat`
`top_p`	Threshold for sampling only from the most likely tokens. TYPE: `PositiveFloat`
`frequency_penalty`	Reduces the likelihood of repeating tokens based on their existing frequency in the text. TYPE: `float`
`presence_penalty`	Encourages the model to introduce new tokens by penalizing tokens that have already appeared. TYPE: `float`

Source code in docs/repositories-clones/backend/src/models/llms.py

class LLMInference(BaseModel):
    """Defines Inference parameters.

    Attributes:
        temperature (PositiveFloat): Randomness / variation of the output High values indicate more creativity.
        top_p (PositiveFloat): Threshold for sampling only from the most likely tokens.
        frequency_penalty (float): Reduces the likelihood of repeating tokens based on their existing frequency
            in the text.
        presence_penalty (float): Encourages the model to introduce new tokens by penalizing tokens that have
            already appeared.
    """

    temperature: float = 0.7
    top_p: float = 1.0
    frequency_penalty: float = 0.0
    presence_penalty: float = 0.0

LLMPromptConfig

Bases: BaseModel

Defines the structure of a LLM prompt configuration.

ATTRIBUTE	DESCRIPTION
`model_config`	Used to ignore other services, which are defined in the config. TYPE: `ConfigDict`
`system`	System prompt. TYPE: `LLMPrompts`
`user`	User prompt. TYPE: `LLMPrompts`
`protected_words`	Protected words. TYPE: `list`
`protected_words_stem`	Wpord stem of protected words. TYPE: `list`

Source code in docs/repositories-clones/backend/src/models/llms.py

class LLMPromptConfig(BaseModel):
    """Defines the structure of a LLM prompt configuration.

    Attributes:
        model_config (ConfigDict): Used to ignore other services, which are defined in the config.
        system (LLMPrompts): System prompt.
        user (LLMPrompts): User prompt.
        protected_words (list): Protected words.
        protected_words_stem (list): Wpord stem of protected words.
    """

    # if there are more prompt types defined that are not used in this service: just ignore them
    model_config = ConfigDict(extra="ignore")

    system: LLMPrompts
    user: LLMPrompts
    protected_words: list[str]
    protected_words_stem: list[list[str]] = []

    def __init__(self, **data: object) -> None:
        """Initializes the config and computes the stemmed protected words."""
        super().__init__(**data)
        self.protected_words_stem = [stem_string(word) for word in self.protected_words]

LLMPromptMaps

Bases: BaseModel

Defines complete LLM prompt config.

ATTRIBUTE	DESCRIPTION
`model_config`	Used to ignore other services, which are defined in the config. TYPE: `ConfigDict`
`behoerden_klartext`	Dictionary containing a name and prompts of LLMs's available for behoerden_klartext. TYPE: `dict[str, LLMPromptConfig]`

Source code in docs/repositories-clones/backend/src/models/llms.py

class LLMPromptMaps(BaseModel):
    """Defines complete LLM prompt config.

    Attributes:
        model_config (ConfigDict): Used to ignore other services, which are defined in the config.
        behoerden_klartext (dict[str, LLMPromptConfig]): Dictionary containing a name and prompts of LLMs's
            available for behoerden_klartext.
    """

    model_config = ConfigDict(extra="ignore")

    behoerden_klartext: dict[str, LLMPromptConfig]

    def __iter__(self) -> Iterator[str]:
        """Get 'keys' for automatic merge with i.e. LLMConfig."""
        return iter(self.__dict__.keys())

LLMPrompts

Bases: BaseModel

Defines the selectable LLM Prompts.

ATTRIBUTE	DESCRIPTION
`model_config`	Used to ignore other services, which are defined in the config. TYPE: `ConfigDict`
`simplify`	Prompt for model. TYPE: `str`

Source code in docs/repositories-clones/backend/src/models/llms.py

class LLMPrompts(BaseModel):
    """Defines the selectable LLM Prompts.

    Attributes:
        model_config (ConfigDict): Used to ignore other services, which are defined in the config.
        simplify (str): Prompt for model.
    """

    # if there are more prompts defined that are not used in this service: just ignore them
    model_config = ConfigDict(extra="ignore")

    simplify: str = ""

services

MODULE	DESCRIPTION
`feedback`	Handle feedback from users.
`simplify`
`stem`	Stem service for calculation of word stems.

feedback

Handle feedback from users.

FUNCTION	DESCRIPTION
`add_feedback_to_db`	Add a new feedback entry to the database.
`get_db_session`	Create and provide a new Session for each request for interacting with the database.
`init_feedback_db`	Initialize the feedback database. If it does not exist, create it.

add_feedback_to_db

add_feedback_to_db(feedback, session)

Add a new feedback entry to the database.

PARAMETER	DESCRIPTION
`feedback`	Feedback text given by the user TYPE: `Feedback`
`session`	SQLAlchemy/SQLModel session object TYPE: `Session`

Retuns

Feedback: The feedback instance with ID in the database and time stamp

Source code in docs/repositories-clones/backend/src/services/feedback.py

def add_feedback_to_db(feedback: Feedback, session: Session) -> Feedback:
    """Add a new feedback entry to the database.

    Args:
        feedback (Feedback): Feedback text given by the user
        session (Session): SQLAlchemy/SQLModel session object

    Retuns:
        Feedback: The feedback instance with ID in the database and time stamp
    """
    logger.info("Adding feedback to DB...")

    session.add(feedback)
    session.commit()
    session.refresh(feedback)
    return feedback

get_db_session

get_db_session()

Create and provide a new Session for each request for interacting with the database.

YIELDS	DESCRIPTION
`Session`	A new SQLAlchemy/SQLModel session object TYPE:: `Session`

Source code in docs/repositories-clones/backend/src/services/feedback.py

def get_db_session() -> Generator[Session, Any, None]:
    """Create and provide a new Session for each request for interacting with the database.

    Yields:
        Session: A new SQLAlchemy/SQLModel session object
    """
    with Session(db_engine) as session:
        yield session

init_feedback_db

init_feedback_db()

Initialize the feedback database. If it does not exist, create it.

Source code in docs/repositories-clones/backend/src/services/feedback.py

def init_feedback_db() -> None:
    """Initialize the feedback database. If it does not exist, create it."""
    logger.info("Initializing feedback database...")
    logger.debug(
        f"Creating database at {settings.path_container_feedback_db / settings.name_feedback_db}"
    )

    settings.path_container_feedback_db.mkdir(parents=True, exist_ok=True)

    # create table if not exists
    SQLModel.metadata.create_all(db_engine)

    logger.info(
        f"Feedback DB initialised: {(settings.path_container_feedback_db / settings.name_feedback_db).absolute()}"
    )

simplify

MODULE	DESCRIPTION
`openai`	Class for text simplification using OpenAI.
`registry`	Simplify-Registry class for storing and accessing Simplify-Providers (ProviderOpenAiLike).

openai

Class for text simplification using OpenAI.

CLASS	DESCRIPTION
`ProviderOpenAiLike`	Class for chat completions to simplify texts with OpenAI-compatible LLM provider.

ProviderOpenAiLike

Class for chat completions to simplify texts with OpenAI-compatible LLM provider.

ATTRIBUTE	DESCRIPTION
`llm`	Object describing the LLM. TYPE: `LLM`
`auth_client`	Authentication client for various APIs. TYPE: `CustomAuthClient`
`llm_client`	LLM client using AsnycOpenAI API. TYPE: `AsyncOpenAI`

METHOD	DESCRIPTION
`generate`	Take an object of type SimplifyInput as input and return a model-generated message as output.
`_generate`	Take a list of messages as input and return a model-generated message as output.

Source code in docs/repositories-clones/backend/src/services/simplify/openai.py

class ProviderOpenAiLike:
    """Class for chat completions to simplify texts with OpenAI-compatible LLM provider.

    Attributes:
        llm (LLM): Object describing the LLM.
        auth_client (CustomAuthClient): Authentication client for various APIs.
        llm_client (AsyncOpenAI): LLM client using AsnycOpenAI API.

    Methods:
        generate: Take an object of type SimplifyInput as input and return a model-generated message as output.
        _generate: Take a list of messages as input and return a model-generated message as output.
    """

    def __init__(self, llm: LLM) -> None:
        """Initializes the model with the LLM and the credentials."""
        self.llm: LLM = llm
        self.auth_client: CustomAuthClient = self._setup_auth_client()
        self.llm_client: AsyncOpenAI = self._setup_llm_client()

    async def generate_stream(
        self, simplify_input: SimplifyInput
    ) -> AsyncGenerator[ChatCompletionChunk | dict, None]:
        """Take an object of type SimplifyInput as input and return a stream of model-generated messages as output.

        Args:
            simplify_input (SimplifyInput): Input to the model.

        Yields:
            ChatCompletionChunk | dict: Individual chunks of the LLM streaming response
            or a dict with keys ``type`` and ``message`` in case of errors.
        """
        matched_protected_words = self._find_matched_protected_words(
            input_text=simplify_input.input_text
        )
        protected_words = (
            ", ".join(matched_protected_words)
            if matched_protected_words
            else "Keine Begriffe vorhanden."
        )
        logger.debug(f"Geschützte Begriffe: {protected_words}")

        messages = [
            {
                "role": "system",
                "content": self.llm.prompt_config.system.simplify.format(
                    protected_words=protected_words
                ),
            },
            {
                "role": "user",
                "content": self.llm.prompt_config.user.simplify.format(
                    input_text=simplify_input.input_text
                ),
            },
        ]
        logger.debug(f"Messages: {messages}")

        async for chunk in self._generate_stream(messages=messages, response_format="text"):
            yield chunk

    async def generate(self, simplify_input: SimplifyInput) -> str:
        """Take an object of type SimplifyInput as input and return a model-generated message as output.

        Args:
            simplify_input (SimplifyInput): Input to the model.

        Returns:
            str: Model-generated response text.
        """
        matched_protected_words = self._find_matched_protected_words(
            input_text=simplify_input.input_text
        )
        protected_words = (
            ", ".join(matched_protected_words)
            if matched_protected_words
            else "Keine Begriffe vorhanden."
        )
        logger.debug(f"Geschützte Begriffe: {protected_words}")

        messages = [
            {
                "role": "system",
                "content": self.llm.prompt_config.system.simplify.format(
                    protected_words=protected_words
                ),
            },
            {
                "role": "user",
                "content": self.llm.prompt_config.user.simplify.format(
                    input_text=simplify_input.input_text
                ),
            },
        ]
        logger.debug(f"Messages: {messages}")

        response: ChatCompletion = await self._generate(messages, response_format="text")
        content: str = response.choices[0].message.content  # type: ignore

        logger.debug(f"Content: {content}")

        return content

    async def _generate_stream(
        self,
        messages: list,
        response_format: str = "text",
    ) -> AsyncGenerator[ChatCompletionChunk | dict, None]:
        """Take a list of messages as input and return the stream of a model-generated message as output.

        Args:
            messages (list): Messages as input to the model.
            response_format (str): Format of the response.

        Yields:
            ChatCompletionChunk | dict: The model-generated chunks,
            or a dict with keys ``type`` and ``message`` in case of errors.
        """
        try:
            response = await self.llm_client.chat.completions.create(
                model=self.llm.model,
                messages=messages,
                response_format={"type": response_format},
                frequency_penalty=self.llm.inference.frequency_penalty,
                presence_penalty=self.llm.inference.presence_penalty,
                temperature=self.llm.inference.temperature,
                top_p=self.llm.inference.top_p,
                stream=True,
            )

            async for chunk in response:
                yield chunk

        except Exception as e:
            logger.error(f"Error during streaming: {e}")
            yield {
                "type": "error",
                "message": "Interner Fehler während des Streamings.",
            }

    async def _generate(self, messages: list, response_format: str = "text") -> ChatCompletion:
        """Take a list of messages as input and return a model-generated message as output.

        Args:
            messages (list): Messages as input to the model.
            response_format (str): Format of the response.

        Returns:
            ChatCompletion: Model-generated response.
        """
        try:
            response: ChatCompletion = await self.llm_client.chat.completions.create(
                model=self.llm.model,
                messages=messages,
                response_format={"type": response_format},
                frequency_penalty=self.llm.inference.frequency_penalty,
                presence_penalty=self.llm.inference.presence_penalty,
                temperature=self.llm.inference.temperature,
                top_p=self.llm.inference.top_p,
                stream=False,
            )
            logger.debug(f"Response from LLM-Client: {response}")

        except Exception as e:
            logger.error(f"{self.llm.label} API call of Chat-Completion to LLM failed: {e}")
            raise HTTPException(
                status_code=status.HTTP_502_BAD_GATEWAY,
                detail="Interner Fehler beim Aufruf des Sprachmodells. Bitte versuchen Sie es später erneut.",
            )

        return response

    def _find_matched_protected_words(self, input_text: str) -> list[str]:
        """Finds all protected words whose stem patterns appear in the stemmed input text.

        Args:
            input_text (str): The text to check for protected words.

        Returns:
            list[str]: List of protected words whose stem patterns were fully matched in the input text.
        """
        input_text_stems = set(stem_string(input_text))

        matched_words = [
            word
            for word, pattern in zip(
                self.llm.prompt_config.protected_words,
                self.llm.prompt_config.protected_words_stem,
            )
            if set(pattern).issubset(input_text_stems)
        ]

        return matched_words

    def _setup_auth_client(self) -> CustomAuthClient:
        """Set up authentication client for various APIs.

        Sets up an authentication client using either a token, credentials or no authentication method.

        Returns:
            Authentication client.
        """
        if self.llm.api.auth:
            auth_client = CustomAuthClient(
                secret=self.llm.api.auth.secret.get_secret_value(),
                auth_type=self.llm.api.auth.type,
            )
        else:
            auth_client = CustomAuthClient()

        return auth_client

    def _setup_llm_client(self) -> AsyncOpenAI:
        """Initializing the LLM client using AsnycOpenAI API.

        Returns:
            Asynchronous OpenAI client.
        """
        max_retries = 3
        timeout = 60

        llm_client = AsyncOpenAI(
            api_key=" ",
            http_client=self.auth_client,
            base_url=str(self.llm.api.url),
            timeout=timeout,
            max_retries=max_retries,
        )

        return llm_client

generate async

generate(simplify_input)

Take an object of type SimplifyInput as input and return a model-generated message as output.

PARAMETER	DESCRIPTION
`simplify_input`	Input to the model. TYPE: `SimplifyInput`

RETURNS	DESCRIPTION
`str`	Model-generated response text. TYPE: `str`

Source code in docs/repositories-clones/backend/src/services/simplify/openai.py

async def generate(self, simplify_input: SimplifyInput) -> str:
    """Take an object of type SimplifyInput as input and return a model-generated message as output.

    Args:
        simplify_input (SimplifyInput): Input to the model.

    Returns:
        str: Model-generated response text.
    """
    matched_protected_words = self._find_matched_protected_words(
        input_text=simplify_input.input_text
    )
    protected_words = (
        ", ".join(matched_protected_words)
        if matched_protected_words
        else "Keine Begriffe vorhanden."
    )
    logger.debug(f"Geschützte Begriffe: {protected_words}")

    messages = [
        {
            "role": "system",
            "content": self.llm.prompt_config.system.simplify.format(
                protected_words=protected_words
            ),
        },
        {
            "role": "user",
            "content": self.llm.prompt_config.user.simplify.format(
                input_text=simplify_input.input_text
            ),
        },
    ]
    logger.debug(f"Messages: {messages}")

    response: ChatCompletion = await self._generate(messages, response_format="text")
    content: str = response.choices[0].message.content  # type: ignore

    logger.debug(f"Content: {content}")

    return content

generate_stream async

generate_stream(simplify_input)

Take an object of type SimplifyInput as input and return a stream of model-generated messages as output.

PARAMETER	DESCRIPTION
`simplify_input`	Input to the model. TYPE: `SimplifyInput`

YIELDS	DESCRIPTION
`AsyncGenerator[ChatCompletionChunk \| dict, None]`	ChatCompletionChunk \| dict: Individual chunks of the LLM streaming response
`AsyncGenerator[ChatCompletionChunk \| dict, None]`	or a dict with keys `type` and `message` in case of errors.

Source code in docs/repositories-clones/backend/src/services/simplify/openai.py

async def generate_stream(
    self, simplify_input: SimplifyInput
) -> AsyncGenerator[ChatCompletionChunk | dict, None]:
    """Take an object of type SimplifyInput as input and return a stream of model-generated messages as output.

    Args:
        simplify_input (SimplifyInput): Input to the model.

    Yields:
        ChatCompletionChunk | dict: Individual chunks of the LLM streaming response
        or a dict with keys ``type`` and ``message`` in case of errors.
    """
    matched_protected_words = self._find_matched_protected_words(
        input_text=simplify_input.input_text
    )
    protected_words = (
        ", ".join(matched_protected_words)
        if matched_protected_words
        else "Keine Begriffe vorhanden."
    )
    logger.debug(f"Geschützte Begriffe: {protected_words}")

    messages = [
        {
            "role": "system",
            "content": self.llm.prompt_config.system.simplify.format(
                protected_words=protected_words
            ),
        },
        {
            "role": "user",
            "content": self.llm.prompt_config.user.simplify.format(
                input_text=simplify_input.input_text
            ),
        },
    ]
    logger.debug(f"Messages: {messages}")

    async for chunk in self._generate_stream(messages=messages, response_format="text"):
        yield chunk

registry

Simplify-Registry class for storing and accessing Simplify-Providers (ProviderOpenAiLike).

CLASS	DESCRIPTION
`SimplifyRegistry`	Manages and stores Simplify-Providers (ProviderOpenAiLike) and makes access possible.

SimplifyRegistry

Manages and stores Simplify-Providers (ProviderOpenAiLike) and makes access possible.

ATTRIBUTE	DESCRIPTION
`simplify_models`	Simplify models. TYPE: `dict[str, ProviderOpenAiLike]`
`llm_config`	Model configuration for simplify initialzation. TYPE: `LLMConfig`

METHOD	DESCRIPTION
`simplify`	Start the simplification of selected language model.
`simplify_stream`	Start the simplification of selected language model.

Source code in docs/repositories-clones/backend/src/services/simplify/registry.py

class SimplifyRegistry:
    """Manages and stores Simplify-Providers (ProviderOpenAiLike) and makes access possible.

    Attributes:
        simplify_models (dict[str, ProviderOpenAiLike]): Simplify models.
        llm_config (LLMConfig): Model configuration for simplify initialzation.
    """

    def __init__(self, llm_config: LLMConfig) -> None:
        """Initializes the list of simplify models."""
        self.llm_config: LLMConfig = llm_config
        self.simplify_models: dict[str, ProviderOpenAiLike] = self._initialize_models()

    def _initialize_models(self) -> dict[str, ProviderOpenAiLike]:
        """Load all available simplify models based on custom configuration.

        Returns:
            All model objects with custom configuration.
        """
        models = {}

        for model_name, llm in self.llm_config.behoerden_klartext.items():
            models[model_name] = ProviderOpenAiLike(llm=llm)

        logger.debug(f"Initialized {len(models)} simplify models")
        return models

    async def simplify(
        self, simplify_model: ProviderOpenAiLike, simplify_input: SimplifyInput
    ) -> str:
        """Start the simplification of selected language model.

        Args:
            simplify_input (SimplifyInput): Defines the input to the endpoint including the text input.
            simplify_model (ProviderOpenAiLike): Initialised model for simplify task.

        Return:
            str: The simplified text and any error messages.
        """
        result = await simplify_model.generate(simplify_input)

        logger.info(f"Simplify successfully completed with model: {simplify_input.language_model}")

        return result

    async def simplify_stream(
        self,
        simplify_model: ProviderOpenAiLike,
        simplify_input: SimplifyInput,
    ) -> AsyncGenerator[str, None]:
        """Start the simplification of selected language model.

        Args:
            simplify_model (ProviderOpenAiLike): Initialised model for simplify task.
            simplify_input (SimplifyInput): Defines the input to the endpoint including the text input.

        Yields:
            str: JSON-encoded ``SimplifyStreamOutput`` strings. Each line
            represents either a response chunk or an error message.
        """
        async for chunk in simplify_model.generate_stream(simplify_input):
            if isinstance(chunk, dict) and chunk.get("type") == "error":
                yield (
                    SimplifyStreamOutput(
                        output_type="error",
                        error_message=f"{chunk['message']}",
                    ).model_dump_json()
                )
                break

            current_content = getattr(chunk.choices[0].delta, "content", None)
            if current_content:
                yield (
                    SimplifyStreamOutput(
                        output_type="response",
                        simplified_text=current_content,
                    ).model_dump_json()
                    + "\n"
                )

simplify async

simplify(simplify_model, simplify_input)

Start the simplification of selected language model.

PARAMETER	DESCRIPTION
`simplify_input`	Defines the input to the endpoint including the text input. TYPE: `SimplifyInput`
`simplify_model`	Initialised model for simplify task. TYPE: `ProviderOpenAiLike`

Return

str: The simplified text and any error messages.

Source code in docs/repositories-clones/backend/src/services/simplify/registry.py

async def simplify(
    self, simplify_model: ProviderOpenAiLike, simplify_input: SimplifyInput
) -> str:
    """Start the simplification of selected language model.

    Args:
        simplify_input (SimplifyInput): Defines the input to the endpoint including the text input.
        simplify_model (ProviderOpenAiLike): Initialised model for simplify task.

    Return:
        str: The simplified text and any error messages.
    """
    result = await simplify_model.generate(simplify_input)

    logger.info(f"Simplify successfully completed with model: {simplify_input.language_model}")

    return result

simplify_stream async

simplify_stream(simplify_model, simplify_input)

Start the simplification of selected language model.

PARAMETER	DESCRIPTION
`simplify_model`	Initialised model for simplify task. TYPE: `ProviderOpenAiLike`
`simplify_input`	Defines the input to the endpoint including the text input. TYPE: `SimplifyInput`

YIELDS	DESCRIPTION
`str`	JSON-encoded `SimplifyStreamOutput` strings. Each line TYPE:: `AsyncGenerator[str, None]`
`AsyncGenerator[str, None]`	represents either a response chunk or an error message.

Source code in docs/repositories-clones/backend/src/services/simplify/registry.py

async def simplify_stream(
    self,
    simplify_model: ProviderOpenAiLike,
    simplify_input: SimplifyInput,
) -> AsyncGenerator[str, None]:
    """Start the simplification of selected language model.

    Args:
        simplify_model (ProviderOpenAiLike): Initialised model for simplify task.
        simplify_input (SimplifyInput): Defines the input to the endpoint including the text input.

    Yields:
        str: JSON-encoded ``SimplifyStreamOutput`` strings. Each line
        represents either a response chunk or an error message.
    """
    async for chunk in simplify_model.generate_stream(simplify_input):
        if isinstance(chunk, dict) and chunk.get("type") == "error":
            yield (
                SimplifyStreamOutput(
                    output_type="error",
                    error_message=f"{chunk['message']}",
                ).model_dump_json()
            )
            break

        current_content = getattr(chunk.choices[0].delta, "content", None)
        if current_content:
            yield (
                SimplifyStreamOutput(
                    output_type="response",
                    simplified_text=current_content,
                ).model_dump_json()
                + "\n"
            )

stem

Stem service for calculation of word stems.

FUNCTION	DESCRIPTION
`stem_string`	Returns stems of substrings of the protected word as list.

stem_string

stem_string(string)

Returns stems of substrings of the protected word as list.

PARAMETER	DESCRIPTION
`string`	Portected word as input string. TYPE: `str`

RETURNS	DESCRIPTION
`list`	Stemmed words as list. TYPE: `list[str]`

Source code in docs/repositories-clones/backend/src/services/stem.py

def stem_string(string: str) -> list[str]:
    """Returns stems of substrings of the protected word as list.

    Args:
        string (str): Portected word as input string.

    Returns:
        list: Stemmed words as list.
    """
    words = re.findall(r"\w+", string.lower())
    stems = stemmer.stemWords(words)
    return stems

settings

Load all settings from a central place, not hidden in utils.

utils

MODULE	DESCRIPTION
`base_logger`	Define settings for the logger.
`check_model_api_availability`	This module provides functions to check LLM-APIs for availability.
`model_selector`	This module provides functions for selecting language models based on API input.
`openai_custom_auth`	Costumized Httpx Authentication Client.
`otel`	Setup OpenTelemetry collections and exporters.
`process_configs`	Methods to load and config and start checks of config integrity.

base_logger

Define settings for the logger.

CLASS	DESCRIPTION
`ExitHandler`	Logging handler that reacts to critical log events by calling sys.exit.

FUNCTION	DESCRIPTION
`setup_logger`	Initialize the logger with the desired log level and add handlers.

ExitHandler

Bases: StreamHandler

Logging handler that reacts to critical log events by calling sys.exit.

When a critical error occurs, this handler terminates the program.

METHOD	DESCRIPTION
`emit`	Override the inherited emit method to exit the program on a critical error.

Source code in docs/repositories-clones/backend/src/utils/base_logger.py

class ExitHandler(logging.StreamHandler):
    """Logging handler that reacts to critical log events by calling sys.exit.

    When a critical error occurs, this handler terminates the program.
    """

    def emit(self, record: logging.LogRecord) -> None:
        """Override the inherited emit method to exit the program on a critical error.

        Args:
            record (logging.LogRecord): log entry
        """
        if record.levelno >= logging.getLevelNamesMapping()[settings.exit_log_level]:
            sys.exit(1)

emit

emit(record)

Override the inherited emit method to exit the program on a critical error.

PARAMETER	DESCRIPTION
`record`	log entry TYPE: `LogRecord`

Source code in docs/repositories-clones/backend/src/utils/base_logger.py

def emit(self, record: logging.LogRecord) -> None:
    """Override the inherited emit method to exit the program on a critical error.

    Args:
        record (logging.LogRecord): log entry
    """
    if record.levelno >= logging.getLevelNamesMapping()[settings.exit_log_level]:
        sys.exit(1)

setup_logger

setup_logger()

Initialize the logger with the desired log level and add handlers.

Sets up the root logger, which all other loggers inherit from. Adds file, console and exit handlers to the logger and sets the format.

Source code in docs/repositories-clones/backend/src/utils/base_logger.py

def setup_logger() -> None:
    """Initialize the logger with the desired log level and add handlers.

    Sets up the root logger, which all other loggers inherit from.
    Adds file, console and exit handlers to the logger and sets the format.
    """
    # root logger, all other loggers inherit from this
    logger = logging.getLogger()

    # create different handlers for log file and console
    file_handler = logging.handlers.RotatingFileHandler(
        filename=settings.log_file,
        maxBytes=settings.log_file_max_bytes,
        backupCount=settings.log_file_backup_count,
    )

    console_handler = logging.StreamHandler()
    exit_handler = ExitHandler()

    # define log format and set for each handler
    formatter = logging.Formatter(
        fmt="%(asctime)s - %(levelname)8s - %(module)s - %(funcName)s: %(message)s",
        datefmt="%Y-%m-%d %H:%M:%S%z",
    )
    file_handler.setFormatter(formatter)
    console_handler.setFormatter(formatter)

    # add handlers to the logger
    logger.addHandler(file_handler)
    logger.addHandler(console_handler)
    logger.addHandler(exit_handler)

    logger.setLevel(settings.log_level)

check_model_api_availability

This module provides functions to check LLM-APIs for availability.

To check a certain LLM use await check_model_api(llm). To get all LLMs that are activated in configs/general.yml, use await get_available_llms().

FUNCTION	DESCRIPTION
`get_available_llms`	Returns a list of available LLMs.
`is_model_api_available`	Check if API is available using credentials.

get_available_llms `async`

get_available_llms()

Returns a list of available LLMs.

RETURNS	DESCRIPTION
`list[dict[str, str]]`	List of available LLMs with selected infos.

Source code in docs/repositories-clones/backend/src/utils/check_model_api_availability.py

async def get_available_llms() -> list[dict[str, str]]:
    """Returns a list of available LLMs.

    Returns:
        List of available LLMs with selected infos.
    """
    available_llms = []

    # iterate over model_groups (services), i.e. behoerden_klartext, RAG, embedding, ...
    for model_group_key in llm_config:
        logger.debug(f"Checking APIs for {model_group_key}-LLMs.")
        model_group = llm_config[model_group_key]

        for llm_name, llm in model_group.items():
            logger.debug(f"Checking availability of {llm_name}")
            if await is_model_api_available(llm.api, llm_name):
                llm_dict = llm.model_dump(include=["label", "is_remote", "max_input_length"])
                llm_dict["name"] = llm_name

                available_llms.append(llm_dict)

    return available_llms

is_model_api_available `async`

is_model_api_available(llm_api, llm_name, timeout_in_s=10)

Check if API is available using credentials.

Availability is checked by sending a HEAD, GET, or POST request. If a health_check endpoint is provided, the request is sent to that endpoint; otherwise, it is sent to the main API URL.

PARAMETER	DESCRIPTION
`llm_api`	the LLMAPI instance to check TYPE: `LLMAPI`
`llm_name`	ID of the LLM as used in the config file as reference TYPE: `str`
`timeout_in_s`	http timeout in seconds; defaults to 10 TYPE: `int` DEFAULT: `10`

RETURNS	DESCRIPTION
`bool`	Whether the model API is available or not - True if the API is available.

Source code in docs/repositories-clones/backend/src/utils/check_model_api_availability.py

async def is_model_api_available(
    llm_api: LLMAPI,
    llm_name: str,
    timeout_in_s: int = 10,
) -> bool:
    """Check if API is available using credentials.

    Availability is checked by sending a HEAD, GET, or POST request. If a health_check endpoint is provided,
    the request is sent to that endpoint; otherwise, it is sent to the main API URL.

    Args:
        llm_api (LLMAPI): the LLMAPI instance to check
        llm_name (str): ID of the LLM as used in the config file as reference
        timeout_in_s (int): http timeout in seconds; defaults to 10

    Returns:
        Whether the model API is available or not - True if the API is available.
    """
    headers = {"Content-type": "application/json"}

    # Authorization is not always needed
    if llm_api.auth:
        headers["Authorization"] = llm_api.auth.get_auth_header()

    url = llm_api.get_health_check_url()

    # test health check endpoint with GET, HEAD and POST
    try:
        async with httpx.AsyncClient() as client:
            response = await client.get(
                url,
                headers=headers,
                timeout=timeout_in_s,
            )
        logger.debug(
            f"{url} health check via GET request: {response.status_code=}, LLM: '{llm_name}"
        )

        # test with HEAD
        if response.status_code != HTTPStatus.OK:
            async with httpx.AsyncClient() as client:
                response = await client.head(
                    url,
                    headers=headers,
                    timeout=timeout_in_s,
                )
            logger.debug(
                f"{url} health check via HEAD request: {response.status_code=}, LLM: '{llm_name}"
            )

        # test with POST
        if response.status_code != HTTPStatus.OK:
            async with httpx.AsyncClient() as client:
                response = await client.post(
                    url,
                    headers=headers,
                    timeout=timeout_in_s,
                )
            logger.debug(
                f"{url} health check via POST request: {response.status_code=}, LLM: '{llm_name}"
            )

    except Exception as e:
        logger.warning(f"Exception when trying to reach LLM API. Error: {e}, LLM: '{llm_name}")
        return False

    if response.status_code != HTTPStatus.OK:
        logger.warning(
            f"LLM unavailable: Could not establish connection to LLM-API. LLM: '{llm_name}"
        )

    return response.status_code == HTTPStatus.OK

model_selector

This module provides functions for selecting language models based on API input.

FUNCTION	DESCRIPTION
`get_valid_language_model`	Returns a valid language model name to use for processing.

get_valid_language_model

get_valid_language_model(language_model, available_llms)

Returns a valid language model name to use for processing.

This function ensures that a valid language model is selected from the list of currently available models. If no model is specified (language_model=None), it automatically selects the first model in the list.

The function also checks that the requested model exists in available_llms. If no models are available or the chosen model is not in the list, an HTTPException is raised with status 503.

PARAMETER	DESCRIPTION
`language_model`	The requested language model or None. TYPE: `str \| None`
`available_llms`	List of available LLMs (each with 'name'). TYPE: `list[dict]`

RETURNS	DESCRIPTION
`str`	Valid language model name. TYPE: `str`

RAISES	DESCRIPTION
`HTTPException`	If no models are available or the chosen model is not in the list.

Source code in docs/repositories-clones/backend/src/utils/model_selector.py

def get_valid_language_model(language_model: str | None, available_llms: list[dict]) -> str:
    """Returns a valid language model name to use for processing.

    This function ensures that a valid language model is selected from the
    list of currently available models. If no model is specified (`language_model=None`),
    it automatically selects the first model in the list.

    The function also checks that the requested model exists in `available_llms`.
    If no models are available or the chosen model is not in the list, an HTTPException
    is raised with status 503.

    Args:
        language_model (str | None): The requested language model or None.
        available_llms (list[dict]): List of available LLMs (each with 'name').

    Returns:
        str: Valid language model name.

    Raises:
        HTTPException: If no models are available or the chosen model is not in the list.
    """
    if not available_llms:
        error_msg = "Keine Sprachmodelle verfügbar. Bitte versuchen Sie es später erneut."
        logger.error(error_msg)
        raise HTTPException(
            status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
            detail=error_msg,
        )

    if not language_model:
        language_model = available_llms[0]["name"]
        logger.debug(f"No language_model specified. Automatically selected: {language_model}")

    if not any(llm["name"] == language_model for llm in available_llms):
        error_msg = f"Das Sprachmodell '{language_model}' ist nicht verfügbar."
        logger.error(error_msg)
        raise HTTPException(
            status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
            detail=f"{error_msg} Bitte versuchen Sie es später erneut.",
        )

    return language_model

openai_custom_auth

Costumized Httpx Authentication Client.

CLASS	DESCRIPTION
`CustomAuthClient`	Custom HTTP transport for OpenAI client.

CustomAuthClient

Bases: AsyncClient

Custom HTTP transport for OpenAI client.

This class supports both Bearer Token Authentication and Basic Authentication. If auth_type is 'token', the secret is expected to be the API key. If auth_type is 'basic_auth', the secret is expected to be a base64-encoded string of 'username:password'.

ATTRIBUTE	DESCRIPTION
`auth_header`	Authentication header for the httpx client. TYPE: `str`

METHOD	DESCRIPTION
`a_send`	Asynchronous method for sending HTTP requests.
`send`	Synchronous method for sending HTTP requests.

Source code in docs/repositories-clones/backend/src/utils/openai_custom_auth.py

class CustomAuthClient(httpx.AsyncClient):
    """Custom HTTP transport for OpenAI client.

    This class supports both Bearer Token Authentication and Basic Authentication.
    If `auth_type` is 'token', the `secret` is expected to be the API key.
    If `auth_type` is 'basic_auth', the `secret` is expected to be a base64-encoded string of 'username:password'.

    Attributes:
        auth_header (str): Authentication header for the httpx client.

    Methods:
        a_send(request, *args, **kwargs): Asynchronous method for sending HTTP requests.
        send(request, *args, **kwargs): Synchronous method for sending HTTP requests.
    """

    def __init__(
        self,
        secret: str | None = None,
        auth_type: Literal["token", "basic_auth"] | None = None,
        *args: object,
        **kwargs: object,
    ) -> None:
        """Initialize the custom HTTP transport for OpenAI client.

        Args:
            secret (str, optional): OpenAI API Key or Basic Auth credentials (username:password).
                                     This is required depending on the `auth_type`. If `auth_type`
                                     is 'token', the `secret` should be the API key. If
                                     `auth_type` is 'basic_auth', the `secret` should be a
                                     base64-encoded string of 'username:password'.
            auth_type (str, optional): The type of authentication to use. It can be 'token' or 'basic_auth'.
            *args: Variable length argument list.
            **kwargs: Arbitrary keyword arguments.

        Raises:
            ValueError: If `auth_type` is provided but `secret` is not provided.
        """
        super().__init__(*args, **kwargs)

        self.auth_header = ""

        if auth_type and not secret:
            raise ValueError("API credentials are required but missing.")

        if auth_type == "token":
            self.auth_header = f"Bearer {secret}"

        elif auth_type == "basic_auth":
            encoded_credentials = base64.b64encode(secret.encode()).decode()
            self.auth_header = f"Basic {encoded_credentials}"

    async def a_send(
        self,
        request: httpx.Request,
        *args: object,
        **kwargs: object,
    ) -> httpx.Response:
        """Asynchronous version of the send method to handle requests asynchronously."""
        if "Authorization" in request.headers:
            del request.headers["Authorization"]
        if self.auth_header:
            request.headers["Authorization"] = self.auth_header
        return await super().a_send(request, *args, **kwargs)

    def send(
        self,
        request: httpx.Request,
        *args: object,
        **kwargs: object,
    ) -> httpx.Response:
        """Version of the send method to handle requests asynchronously."""
        if "Authorization" in request.headers:
            del request.headers["Authorization"]
        if self.auth_header:
            request.headers["Authorization"] = self.auth_header
        return super().send(request, *args, **kwargs)

a_send async

a_send(request, *args, **kwargs)

Asynchronous version of the send method to handle requests asynchronously.

Source code in docs/repositories-clones/backend/src/utils/openai_custom_auth.py

async def a_send(
    self,
    request: httpx.Request,
    *args: object,
    **kwargs: object,
) -> httpx.Response:
    """Asynchronous version of the send method to handle requests asynchronously."""
    if "Authorization" in request.headers:
        del request.headers["Authorization"]
    if self.auth_header:
        request.headers["Authorization"] = self.auth_header
    return await super().a_send(request, *args, **kwargs)

send

send(request, *args, **kwargs)

Version of the send method to handle requests asynchronously.

Source code in docs/repositories-clones/backend/src/utils/openai_custom_auth.py

def send(
    self,
    request: httpx.Request,
    *args: object,
    **kwargs: object,
) -> httpx.Response:
    """Version of the send method to handle requests asynchronously."""
    if "Authorization" in request.headers:
        del request.headers["Authorization"]
    if self.auth_header:
        request.headers["Authorization"] = self.auth_header
    return super().send(request, *args, **kwargs)

otel

Setup OpenTelemetry collections and exporters.

FUNCTION	DESCRIPTION
`setup_otel`	Setup OpenTelemetry metrics exporter and meter provider.

setup_otel

setup_otel()

Setup OpenTelemetry metrics exporter and meter provider.

Source code in docs/repositories-clones/backend/src/utils/otel.py

def setup_otel() -> None:
    """Setup OpenTelemetry metrics exporter and meter provider."""
    logger.info("Setting up OpenTelemetry")
    logger.debug(f"OpenTelemetry settings: {settings.otel}")

    # traces
    span_exporter = OTLPSpanExporter(endpoint=str(settings.otel.otlp_grpc_endpoint))

    # use BatchspanProcessor to export spans in batches for better performance
    batch_span_processor = BatchSpanProcessor(
        span_exporter,
        schedule_delay_millis=settings.otel.export_interval_in_ms,
    )

    tracer_provider = TracerProvider()
    tracer_provider.add_span_processor(batch_span_processor)
    trace.set_tracer_provider(tracer_provider)

    # metrics
    metric_exporter = OTLPMetricExporter(
        endpoint=str(settings.otel.otlp_grpc_endpoint),
        insecure=True,
    )

    metrics_reader = PeriodicExportingMetricReader(
        metric_exporter,
        export_interval_millis=settings.otel.export_interval_in_ms,
    )
    metrics.set_meter_provider(MeterProvider(metric_readers=[metrics_reader]))

process_configs

Methods to load and config and start checks of config integrity.

FUNCTION	DESCRIPTION
`deep_merge_dicts`	Recursively merge two dictionaries.
`load_all_configs`	Load config settings from respective paths.
`load_from_yml_in_pydantic_model`	Load one ore multiple configs from 'yaml_paths' into given pydantic-Model.
`load_yaml`	Load yaml.
`merge_specific_cfgs_in_place`	Copy Prompt-config to apropriate section in general llm_config. Edit in-place!
`postprocess_configs`	Post-Process loaded configs.
`remove_unactive_models`	Remove models from all useacases, if they are not in 'active_models'. Edit in-place!

deep_merge_dicts

deep_merge_dicts(base, add)

Recursively merge two dictionaries.

PARAMETER	DESCRIPTION
`base`	The base dictionary that will be updated. TYPE: `dict`
`add`	The dictionary with values that extend the base. TYPE: `dict`

RETURNS	DESCRIPTION
`dict`	A new dictionary containing the merged keys and values. TYPE: `dict`

Source code in docs/repositories-clones/backend/src/utils/process_configs.py

def deep_merge_dicts(base: dict, add: dict) -> dict:
    """Recursively merge two dictionaries.

    Args:
        base (dict): The base dictionary that will be updated.
        add (dict): The dictionary with values that extend the base.

    Returns:
        dict: A new dictionary containing the merged keys and values.
    """
    for key, value in add.items():
        if isinstance(value, dict) and key in base and isinstance(base[key], dict):
            base[key] = deep_merge_dicts(base[key], value)
        else:
            base[key] = value
    return base

load_all_configs

load_all_configs(general_config_paths, path_to_llm_prompts, path_to_llm_protected_words, path_to_llm_model_configs)

Load config settings from respective paths.

PARAMETER	DESCRIPTION
`general_config_paths`	Path to config, matching 'Settings' TYPE: `Path`
`path_to_llm_prompts`	Path to config, matching 'LLMPromptMaps' (system and user prompt) TYPE: `Path`
`path_to_llm_protected_words`	Path to config, matching 'LLMPromptMaps' (protected words) TYPE: `Path`
`path_to_llm_model_configs`	Path to config, matching 'LLMConfig' TYPE: `Path`

RETURNS	DESCRIPTION
`tuple[Settings, LLMConfig]`	Config loaded into their Pydantic Model.

Source code in docs/repositories-clones/backend/src/utils/process_configs.py

def load_all_configs(
    general_config_paths: Path,
    path_to_llm_prompts: Path,
    path_to_llm_protected_words: Path,
    path_to_llm_model_configs: Path,
) -> tuple[Settings, LLMConfig]:
    """Load config settings from respective paths.

    Args:
        general_config_paths (Path): Path to config, matching 'Settings'
        path_to_llm_prompts (Path): Path to config, matching 'LLMPromptMaps' (system and user prompt)
        path_to_llm_protected_words (Path): Path to config, matching 'LLMPromptMaps' (protected words)
        path_to_llm_model_configs (Path): Path to config, matching 'LLMConfig'

    Returns:
        Config loaded into their Pydantic Model.

    """
    settings = load_from_yml_in_pydantic_model(general_config_paths, Settings)
    llm_prompts = load_from_yml_in_pydantic_model(
        [path_to_llm_prompts, path_to_llm_protected_words], LLMPromptMaps
    )
    llm_config = load_from_yml_in_pydantic_model(path_to_llm_model_configs, LLMConfig)

    postprocess_configs(settings, llm_prompts, llm_config)

    return settings, llm_config

load_from_yml_in_pydantic_model

load_from_yml_in_pydantic_model(yaml_paths, pydantic_reference_model)

Load one ore multiple configs from 'yaml_paths' into given pydantic-Model.

PARAMETER	DESCRIPTION
`yaml_paths`	Yaml(s) to load TYPE: `Union[Path, Iterable[Path]]`
`pydantic_reference_model`	pydantic model to load yaml(s) into TYPE: `BaseModel`

RETURNS	DESCRIPTION
`BaseModel`	BaseModel derived pydantic data class.

Source code in docs/repositories-clones/backend/src/utils/process_configs.py

def load_from_yml_in_pydantic_model(
    yaml_paths: Path | Iterable[Path],
    pydantic_reference_model: BaseModel,
) -> BaseModel:
    """Load one ore multiple configs from 'yaml_paths' into given pydantic-Model.

    Args:
        yaml_paths (Union[Path, Iterable[Path]]): Yaml(s) to load
        pydantic_reference_model (BaseModel): pydantic model to load yaml(s) into

    Returns:
        BaseModel derived pydantic data class.

    """
    if isinstance(yaml_paths, str | Path):
        yaml_paths = [yaml_paths]

    merged_data = {}

    for path in yaml_paths:
        data = load_yaml(path)
        merged_data = deep_merge_dicts(merged_data, data)
        logger.debug(f"Loaded partial config from: '{path}'")

    logger.info(f"Merged YAML data: {merged_data}")

    try:
        pydantic_class = pydantic_reference_model(**merged_data)
        logger.debug("Final merged config successfully validated.")
        return pydantic_class

    except ValidationError as e:
        logger.critical(f"Error loading config: '{e}'")
        raise e

load_yaml

load_yaml(yaml_path)

Load yaml.

PARAMETER	DESCRIPTION
`yaml_path`	Path to yaml TYPE: `list[Path]`

RETURNS	DESCRIPTION
`dict[str, Any]`	Content of the loaded yaml.

Source code in docs/repositories-clones/backend/src/utils/process_configs.py

def load_yaml(yaml_path: Path) -> dict[str, Any]:
    """Load yaml.

    Args:
        yaml_path (list[Path]): Path to yaml

    Returns:
        Content of the loaded yaml.
    """
    if not yaml_path.exists():
        logger.error(f"Invalid path: '{yaml_path}'")
        raise FileNotFoundError

    with open(yaml_path) as file:
        return yaml.safe_load(file)

merge_specific_cfgs_in_place

merge_specific_cfgs_in_place(llm_config, llm_prompts)

Copy Prompt-config to apropriate section in general llm_config. Edit in-place!

Only if 'prompt_map' in LLMConfig can be found in LLMPromptMaps, it will be merged. i.e. try to generalize sth. like this:

cfg["phi3:mini"].prompts = prompt[cfg["phi3:mini"].prompt_map]

PARAMETER	DESCRIPTION
`llm_config`	Target for merge of Prompt parameter TYPE: `LLMConfig`
`llm_prompts`	Source to merge Prompt parameter from TYPE: `LLMPromptMaps`

RETURNS	DESCRIPTION
`bool`	True if no problems occurred.

Source code in docs/repositories-clones/backend/src/utils/process_configs.py

def merge_specific_cfgs_in_place(llm_config: LLMConfig, llm_prompts: LLMPromptMaps) -> bool:
    """Copy Prompt-config to apropriate section in general llm_config. Edit in-place!

    Only if 'prompt_map' in LLMConfig can be found in LLMPromptMaps, it will be merged.
    i.e. try to generalize sth. like this:

    cfg["phi3:mini"].prompts = prompt[cfg["phi3:mini"].prompt_map]

    Args:
        llm_config (LLMConfig): Target for merge of Prompt parameter
        llm_prompts (LLMPromptMaps): Source to merge Prompt parameter from

    Returns:
        True if no problems occurred.
    """
    no_issues_occurred = True
    for usecase in llm_config:
        # load identical usecases, i.e. behoerden_klartext, RAG
        try:
            cfg = getattr(llm_config, usecase)
            prompt = getattr(llm_prompts, usecase)
        except AttributeError:
            logger.warning(
                f"Usecase '{usecase}' not matching between prompt- and general llm config. \
                    Skipping cfg-merge for '{usecase}' .."
            )
            no_issues_occurred = False
            continue

        # copy prompt config to its usecase- and model-counterpart
        for model in cfg:
            prompt_map_to_use = cfg[model].prompt_map
            if prompt_map_to_use in prompt:
                cfg[model].prompt_config = prompt[prompt_map_to_use]
            else:
                logger.warning(
                    f"'prompt_map: {prompt_map_to_use}' from LLM-config not in prompt-config for '{usecase}'. \
                        Skipping .."
                )
                no_issues_occurred = False
                continue

    return no_issues_occurred

postprocess_configs

postprocess_configs(settings, llm_prompts, llm_config)

Post-Process loaded configs.

Remove unused models (from settings.active_llms), merge LLMPromptMaps into LLMConfig.

PARAMETER	DESCRIPTION
`settings`	Config matching pydantic 'Settings'. TYPE: `Settings`
`llm_prompts`	Config matching pydantic 'LLMPromptMaps'. TYPE: `LLMPromptMaps`
`llm_config`	Config matching pydantic 'LLMConfig'. TYPE: `LLMConfig`

RETURNS	DESCRIPTION
`LLMConfig`	Merged and filtered LLM configuration.

Source code in docs/repositories-clones/backend/src/utils/process_configs.py

def postprocess_configs(
    settings: Settings,
    llm_prompts: LLMPromptMaps,
    llm_config: LLMConfig,
) -> LLMConfig:
    """Post-Process loaded configs.

    Remove unused models (from settings.active_llms), merge LLMPromptMaps into LLMConfig.

    Args:
        settings (Settings): Config matching pydantic 'Settings'.
        llm_prompts (LLMPromptMaps): Config matching pydantic 'LLMPromptMaps'.
        llm_config (LLMConfig): Config matching pydantic 'LLMConfig'.

    Returns:
        Merged and filtered LLM configuration.
    """
    remove_unactive_models(llm_config, settings.active_llms)
    merge_specific_cfgs_in_place(llm_config, llm_prompts)

    return llm_config

remove_unactive_models

remove_unactive_models(input_config, active_models)

Remove models from all useacases, if they are not in 'active_models'. Edit in-place!

PARAMETER	DESCRIPTION
`input_config`	Config to change TYPE: `LLMConfig`
`active_models`	Models to keep - remove other TYPE: `list[str]`

Source code in docs/repositories-clones/backend/src/utils/process_configs.py

def remove_unactive_models(input_config: LLMConfig, active_models: list[str]) -> None:
    """Remove models from all useacases, if they are not in 'active_models'. Edit in-place!

    Args:
        input_config (LLMConfig): Config to change
        active_models (list[str]): Models to keep - remove other
    """
    for usecase in input_config:
        cfg = getattr(input_config, usecase)
        active_models_for_usecase = getattr(active_models, usecase)
        for model in list(cfg):
            if model not in active_models_for_usecase:
                cfg.pop(model)

Python API Referenz

backend

main

src

app

lifespan async

endpoints

get_llms async

health async

save_feedback async

simplify_user_text async

simplify_user_text_stream async

models

api_input

Feedback

FeedbackBase

SimplifyInput

get_formated_now

api_output

SimplifyOutput

SimplifyStreamOutput

general

ActiveLLMs

LogLevel

OpenTelemetry

Settings

llms

APIAuth

LLM

LLMAPI

LLMConfig

LLMInference

LLMPromptConfig

LLMPromptMaps

LLMPrompts

services

feedback

add_feedback_to_db

get_db_session

init_feedback_db

simplify

openai

registry

stem

stem_string

settings

utils

base_logger

ExitHandler

setup_logger

check_model_api_availability

get_available_llms async

is_model_api_available async

model_selector

get_valid_language_model

openai_custom_auth

CustomAuthClient

otel

setup_otel

process_configs

deep_merge_dicts

load_all_configs

load_from_yml_in_pydantic_model

load_yaml

merge_specific_cfgs_in_place

postprocess_configs

remove_unactive_models

lifespan `async`

get_llms `async`

health `async`

save_feedback `async`

simplify_user_text `async`

simplify_user_text_stream `async`

get_available_llms `async`

is_model_api_available `async`