Schema Generation

checkedframe._schema_generation.generate_schema_repr(df: nwt.IntoFrame, lazy: bool = False, class_name: str = 'MySchema', header: str | None = 'import checkedframe as cf', import_alias: str = 'cf.') → SchemaRepr

Generate a schema definition from an existing DataFrame.

Parameters:
  • df (nwt.IntoFrame) – The DataFrame to draw the schema from

  • lazy (bool, optional) – If True, only inspects the metadata of the DataFrame and has no visibility into the actual values. Useful when generating a schema from a lazy DataFrame. However, parameters that rely on the values, like “nullable”, cannot be generated in this mode. By default False

  • class_name (str, optional) – The name of the schema, by default “MySchema”

  • header (Optional[str], optional) – The header at the top of the file. If None, no header is generated, by default “import checkedframe as cf”

  • import_alias (str, optional) – The string to put in front of the dtypes, by default “cf.”

Return type:

SchemaRepr

Examples

import checkedframe as cf
import polars as pl

df = pl.DataFrame({"customer_id": ["TVU8X", "BB235"], "balance": [322.5, None]})

schema_repr = cf.generate_schema_repr(df)

# Write to file
# schema_repr.write_text("my_schema.py")

# Send to clipboard (requires pyperclip)
# schema_repr.write_clipboard()

print(schema_repr.schema_repr)

Output:

import checkedframe as cf

class MySchema(cf.Schema):
    customer_id = cf.String()
    balance = cf.Float64(nullable=True)
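
To make the effect of the parameters more concrete, here is a minimal plain-Python sketch of how such a generator could assemble its output. It is not checkedframe's implementation: the function name sketch_schema_repr, the dtypes/values dictionaries, and the rule used to detect nullability are all assumptions made for illustration. It only shows why value-dependent options such as “nullable” drop out when lazy is True, and where class_name, header, and import_alias end up in the generated text.

```python
def sketch_schema_repr(
    dtypes,
    values=None,
    lazy=False,
    class_name="MySchema",
    header="import checkedframe as cf",
    import_alias="cf.",
):
    """Illustrative stand-in, NOT checkedframe's actual implementation.

    dtypes: mapping of column name -> dtype name (the frame's "metadata").
    values: mapping of column name -> list of values; ignored when lazy=True.
    """
    lines = []
    if header is not None:
        # The header goes at the top, followed by a blank line.
        lines += [header, ""]
    lines.append(f"class {class_name}({import_alias}Schema):")
    if not dtypes:
        lines.append("    pass")
    for name, dtype in dtypes.items():
        args = ""
        # Value-dependent options need the actual data, so they can only
        # be filled in when lazy is False and values are available.
        if not lazy and values is not None:
            if any(v is None for v in values[name]):
                args = "nullable=True"
        lines.append(f"    {name} = {import_alias}{dtype}({args})")
    return "\n".join(lines)


# Hypothetical inputs mirroring the example DataFrame above.
dtypes = {"customer_id": "String", "balance": "Float64"}
values = {"customer_id": ["TVU8X", "BB235"], "balance": [322.5, None]}

eager = sketch_schema_repr(dtypes, values)
lazy = sketch_schema_repr(dtypes, lazy=True, class_name="LazySchema", header=None)

print(eager)
print()
print(lazy)
```

In this sketch the eager call reproduces the text shown under Output, while the lazy call emits the same columns without nullable=True and, because header is None, without the import line.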