Originally posted on Medium.
Invoke generative AI over a million rows in batches with Google BigQuery SQL, without hard coding.
What is an LLM?
A large language model (LLM) is a type of artificial intelligence (AI) algorithm that uses deep learning techniques and massively large data sets to understand, summarize, generate, and predict new content.
Some well-known providers and models:
- OpenAI: GPT-3.5 | GPT-4 — ChatGPT
- Google Bard: PaLM 2
- Google Cloud: text-bison | chat-bison — PaLM 2
- Anthropic: Claude

The typical hand-coded way to invoke LLM models
For non-UI usage, the only way to invoke LLM models is programmatically. Here are some examples of a single request that gets an AI response:
- OpenAI Python SDK (https://github.com/openai/openai-python)
import openai

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
        {"role": "user", "content": "Where was it played?"}
    ]
)
- Claude Python SDK (https://github.com/anthropics/anthropic-sdk-python)
from anthropic import Anthropic, HUMAN_PROMPT, AI_PROMPT

anthropic = Anthropic(
    # defaults to os.environ.get("ANTHROPIC_API_KEY")
    api_key="my api key",
)
completion = anthropic.completions.create(
    model="claude-2",
    max_tokens_to_sample=300,
    prompt=f"{HUMAN_PROMPT} how does a court case get to the Supreme Court?{AI_PROMPT}",
)
print(completion.completion)
If you want AI responses for a large offline prompt set (say, over a million requests), you have to write much more code: for loops, thread pools, and so on.
You will quickly hit requests-per-minute or tokens-per-minute quotas, along with other unexpected errors.
To deal with these awkward cases, you have to handle each kind of exception just to get all of your AI requests completed.
Hard Working! Hard Coding!
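The hand-coded path usually ends up looking something like the following minimal sketch: a retry loop with exponential backoff around each request. `call_llm` here is a hypothetical stand-in for whichever SDK call you use; the delays and retry count are illustrative, not taken from any provider's guidance.

```python
import random
import time

def call_with_backoff(call_llm, prompt, max_retries=5, base_delay=1.0):
    """Retry a single LLM request, backing off exponentially on errors.

    call_llm: any function taking a prompt and returning a response
    (hypothetical stand-in for an SDK call that may hit rate limits).
    """
    for attempt in range(max_retries):
        try:
            return call_llm(prompt)
        except Exception:
            # Back off: base_delay * 1, 2, 4, ... seconds, plus jitter.
            time.sleep(base_delay * (2 ** attempt) + random.random() * base_delay)
    raise RuntimeError(f"giving up on prompt after {max_retries} retries")
```

And this handles only a single request; for a million rows you would still need to shard the prompts, run a thread pool, and checkpoint partial results yourself.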
What is BigQuery?
Google BigQuery is a serverless and cost-effective enterprise data warehouse that works across clouds and scales with your data. Use built-in ML/AI and BI for insights at scale.
There are some competitors, such as Amazon Redshift, Snowflake, and Databricks.
A data warehouse can easily store much more than 1 billion rows in a single table.
Combining LLMs with BigQuery
BigQuery ML
Usually, performing ML on large datasets requires extensive programming and knowledge of ML frameworks. These requirements restrict solution development to a very small set of people within each company, and they exclude data analysts who understand the data but have limited ML knowledge and programming expertise. However, with BigQuery ML, SQL practitioners can use existing SQL tools and skills to build and evaluate models.
No hard coding!
SQL alone is enough to train and invoke machine learning models in BigQuery.
Supported LLMs in BigQuery ML
The models below are all backed by Google PaLM 2:
- Generate text with text-bison
- Embed text with textembedding-gecko
Invoke LLMs in BigQuery SQL
Basic syntax:
ML.GENERATE_TEXT(
  MODEL `project_id.dataset.model_name`,
  { TABLE table_name | (query_statement) },
  STRUCT(
    [number_of_output_tokens AS max_output_tokens]
    [, top_k_value AS top_k]
    [, top_p_value AS top_p]
    [, temperature AS temperature]
    [, flatten_json_output AS flatten_json_output])
)
- model_name is the name of a remote model that specifies CLOUD_AI_LARGE_LANGUAGE_MODEL_V1 for REMOTE_SERVICE_TYPE.
- table_name | query_statement identifies the BigQuery table that contains the prompt data; it can be a table URI or a subquery.
- The STRUCT body holds pairs of model parameters.
BigQuery SQL example:
SELECT *
FROM
  ML.GENERATE_TEXT(
    MODEL `mydataset.llm_model`,
    (SELECT 'What is the purpose of dreams?' AS prompt),
    STRUCT(
      0.8 AS temperature,
      1024 AS max_output_tokens,
      0.95 AS top_p,
      40 AS top_k));
Output example:
- table schema

- Result JSON schema
{
  "predictions": [
    {
      "citationMetadata": {
        "citations": []
      },
      "content": "this is ai's output",
      "safetyAttributes": {
        "blocked": false,
        "categories": [
          "Finance",
          "Health",
          "Legal",
          "Religion & Belief",
          "Toxic"
        ],
        "scores": [0.3, 0.2, 0.1, 0.7, 0.1]
      }
    }
  ]
}
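When flatten_json_output is false, the raw ml_generate_text_result comes back as JSON shaped like the schema above. A minimal Python sketch (stdlib only, field names taken from the example output) that pulls out the generated content and pairs each safety category with its score:

```python
import json

def parse_generate_text_result(raw: str):
    """Extract content, blocked flag, and safety scores from a prediction JSON."""
    result = json.loads(raw)
    prediction = result["predictions"][0]
    attrs = prediction["safetyAttributes"]
    # Pair each safety category with its score, e.g. {"Toxic": 0.1, ...}
    scores = dict(zip(attrs["categories"], attrs["scores"]))
    return prediction["content"], attrs["blocked"], scores
```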
Real case:
SELECT
  ml_generate_text_result['predictions'][0]['content'] AS generated_text,
  ml_generate_text_result['predictions'][0]['safetyAttributes'] AS safety_attributes,
  * EXCEPT (ml_generate_text_result)
FROM
  ML.GENERATE_TEXT(
    MODEL `bqml_tutorial.llm_model`,
    (
      SELECT
        CONCAT(
          'perform sentiment analysis on the following text, return one of the following categories: positive, negative: ',
          review) AS prompt,
        *
      FROM
        `bigquery-public-data.imdb.reviews`
      LIMIT 5
    ),
    STRUCT(
      0.2 AS temperature,
      100 AS max_output_tokens));
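The generated_text column comes back as free-form model output, which may carry stray whitespace, capitalization, or punctuation, so a small normalization step is usually needed before aggregating the labels. A minimal sketch, assuming outputs like " Positive\n":

```python
def normalize_sentiment(generated_text: str) -> str:
    """Map free-form model output to one of: positive, negative, unknown."""
    label = generated_text.strip().lower().rstrip(".")
    return label if label in ("positive", "negative") else "unknown"
```

The "unknown" fallback keeps rows where the model ignored the instruction from silently polluting either category.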

What’s more
What if your offline data lives in another system, such as AWS S3, a local disk, or any non-BigQuery data store?
How do you make this pattern work?
Just import the data into BigQuery:
Introduction to loading data | BigQuery | Google Cloud (cloud.google.com)
Batch prediction for a million+ rows in BigQuery
If you want to predict for over a million rows in one go, the only two things you need to do are:
- Run a simple BigQuery SQL statement like the example above
- Increase your model's quota, such as requests per minute (RPM)
Then all you have to do is wait for the result table.
Everything else is handled by BigQuery.
What a simple and elegant way to invoke LLM models at huge request volumes!
About the quota of LLM models in BigQuery:
- Default quota:

- How to increase the quota:
Submit an increase request in the quota panel

Email bqml-feedback@google.com
To request more quota for these functions, first adjust the quota for the associated Cloud AI service, then send an email to bqml-feedback@google.com including information about the adjusted Cloud AI service quota.
Ending
BigQuery ML with generative AI models can save you significant time on batch prediction, with no hand-coding at all.
The prediction results land, neatly processed, in a BigQuery table.
SQL is enough! No Python! No hard coding!