Categories: AI
AI_GENERATE
Flexible general-purpose function for processing unstructured data, primarily used for complex data extraction requiring multiple fields from source files.
Syntax
AI_GENERATE( [model_name VARCHAR,] prompt VARCHAR | (prompt VARCHAR, file_reference) [WITH SCHEMA data_type] ) → ANY|ROW
- model_name (optional): Optional model specification in format 'modelProvider.modelName' (e.g., 'gpt.4o'). modelProvider is the user-defined name for model provider configuration added in the preferences section (see Model Provider Configuration). modelName is one of the models supported and provided by that provider. If not provided, uses the default model for the organization.
- prompt: Natural language instruction for the LLM. Use (prompt, file_reference) to process files from LIST_FILES.
- WITH SCHEMA data_type: Output structure specification (optional). When provided, supports any SQL type that is valid as JSON (
VARCHAR,INT,BOOLEAN,FLOAT,DOUBLE) and a flatLISTof any of these types, plus flatROWtypes with column names and data types for multiple columns.
Examples
AI_GENERATE exampleWITH recipe_analysis AS (
SELECT file['path'] AS recipe_file,
AI_GENERATE( 'gpt.4o', ('Extract recipe details', file)
WITH SCHEMA ROW( recipe_name VARCHAR, cuisine_type VARCHAR)
) AS recipe_info
FROM TABLE(LIST_FILES('@Cookbooks/cookbook_recipes'))
WHERE file['path'] LIKE '%.pdf'
)
SELECT
recipe_file,
recipe_info['recipe_name'] AS recipe,
recipe_info['cuisine_type'] AS cuisine
FROM recipe_analysis
ORDER BY recipe ASC;
-- Sample output:
-- recipe_file | recipe | cuisine
-- italian_recipes.pdf | Spaghetti Carbonara | Italian
-- asian_cookbook.pdf | Pad Thai | Thai
-- desserts_guide.pdf | Tiramisu | Italian
Usage Notes
• AI_GENERATE is the most flexible AI function, supporting both simple text generation and complex structured data extraction.
• Use (prompt, file_reference) to process files from LIST_FILES.
• When WITH SCHEMA ROW is provided, the output is explicitly structured according to that schema including column names and data types.
• Without WITH SCHEMA, the output would be ANY type.
• If no model is specified, uses the default model set for the organization.
• Model specification format: 'modelProvider.modelName' (e.g., 'gpt.4o').