Skip to main content

How to connect to Google BigQuery

Learn how to connect elvex to Google BigQuery

Updated yesterday

This guide provides specific instructions for preparing a Google BigQuery database for connection to elvex.

Before You Begin

Before connecting elvex to your BigQuery database, you should:

  • Have administrator access to your Google Cloud project or work with someone who does

  • Understand which datasets and tables you want to make accessible

  • Identify which users or groups will need access to this data through elvex

  • Consider any data security or compliance requirements that apply to your data

Step 1: Create a Dedicated Service Account

For optimal security, create a dedicated service account in Google Cloud that elvex will use to connect to your BigQuery database:

  1. Go to the Google Cloud Console

  2. Navigate to IAM & Admin > Service Accounts

  3. Click Create Service Account

  4. Enter a name (e.g., "elvex-bigquery-service")

  5. Add a description (e.g., "Service account for elvex to access BigQuery data")

  6. Click Create and Continue

Step 2: Set Appropriate Permissions

Assign the minimum necessary permissions to the service account:

  1. In the Grant this service account access to project section, add the following roles:

    • BigQuery Data Viewer (for read-only access to data)

    • BigQuery Job User (to run queries)

Step 3: Generate a Service Account Key

Create a key file that elvex will use to authenticate as the service account:

  1. In the Google Cloud Console, navigate to "IAM & Admin" > "Service Accounts"

  2. Find the service account you created and click on it

  3. Go to the "Keys" tab

  4. Click "Add Key" > "Create new key"

  5. Select "JSON" as the key type

  6. Click "Create"

  7. The key file will be downloaded to your computer

Important: This key file contains sensitive information. Keep it secure and do not share it publicly.

Step 4: Create Views for Complex Data Models (Optional)

If your data model is complex or you want to simplify access, consider creating views:

CREATE VIEW project.dataset.customer_summary AS
SELECT
c.customer_id,
c.name,
c.email,
SUM(o.total_amount) AS total_spent,
COUNT(o.order_id) AS order_count
FROM
project.dataset.customers c
JOIN
project.dataset.orders o
ON
c.customer_id = o.customer_id
GROUP BY
c.customer_id, c.name, c.email;

Then grant the service account access to this view by setting dataset-level permissions.

Step 5: Add Descriptions to Tables and Columns (Recommended)

Adding descriptions to your tables and columns will help elvex's AI better understand your data:

  1. In the BigQuery console, navigate to the table you want to describe

  2. To add a table description:

    1. Click Details

    2. Click Edit details

    3. Add a description to the table

    4. Click Save

  3. To add column descriptions:

    • Click on the Schema tab

    • Click Edit schema

    • Add descriptions for each column

    • Click Save

Alternatively, you can use the bq update command:

# Update table description bq update
bq update --description "Customer information including contact details" project:dataset.customers

# Update column descriptions (requires a JSON schema file)
bq update --schema column1:type1:description1,column2:type2:description2 project:dataset.customers

Step 6: Prepare Connection Information

Gather the following information which you'll need when setting up the connection in elvex:

  • Google Cloud Project ID

  • BigQuery Dataset name

    • This is not the Dataset ID. If the Dataset ID is instant-gecko-454220-j8.sales the Dataset name is sales

  • Service account JSON key file

Step 7: Test the Connection (Optional)

Before configuring elvex, test that your service account can connect to BigQuery and access the intended tables:

  1. Install the Google Cloud SDK if you haven't already

  2. Set the service account key file as your active credentials:

    export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account-key.json"
  3. Run a test query:

    bq query --use_legacy_sql=false 'SELECT * FROM `project.dataset.table_name` LIMIT 5'

Step 8: Configure the Connection in elvex

Once you've completed the preparation steps, you're ready to configure the BigQuery connection in elvex:

  1. In elvex, navigate to the Datasources section

  2. Click Create New Datasource

  3. Select Database Connection as the datasource type

  4. Choose BigQuery as the database type

  5. Enter the connection details you gathered in Step 6:

    • Project ID

    • Dataset name

    • Upload the service account JSON key file

  6. Select the tables you want to include in your datasource

  7. Review the schema and add any additional descriptions if needed

  8. Configure access permissions for the datasource

  9. Save the datasource

Note: if no description is found for the table or the columns, elvex will intelligently create descriptions for them by looking at the first few rows of the table to understand the data. Be sure to verify the listed descriptions, as elvex may incorrectly identify information in the table.

BigQuery-Specific Considerations

Nested and Repeated Fields

BigQuery supports nested and repeated fields (STRUCT and ARRAY types). elvex can query these fields using BigQuery's standard syntax, but you should add clear descriptions to help the AI understand the structure of these complex fields.

Query Costs

Remember that BigQuery charges based on the amount of data processed by queries. To minimize costs:

  • Consider creating aggregated views for common analytical queries

  • Use table partitioning and clustering where appropriate

  • Set up BigQuery cost controls and monitoring

Security Best Practices for BigQuery

In addition to the general security best practices, consider these BigQuery-specific recommendations:

  • Regularly rotate the service account key

  • Use VPC Service Controls if you need additional network security

  • Consider using BigQuery Column-level Security for sensitive data

  • Enable Cloud Audit Logs to monitor access to your BigQuery resources

Troubleshooting BigQuery Connections

Common Issues

  • Authentication failures: Verify the service account key file is valid and has not expired

  • Permission errors: Ensure the service account has the necessary roles assigned

  • Dataset not found: Verify the project ID and dataset ID are correct

  • Quota exceeded: Check your BigQuery quotas and usage limits

Checking Permissions

To verify the permissions of your service account:

  1. In the Google Cloud Console, navigate to "IAM & Admin" > "IAM"

  2. Filter for your service account to see its project-level roles

  3. To check dataset-level permissions:

    • Go to BigQuery in the Google Cloud Console

    • Select the dataset

    • Click "Share" to see who has access

Next Steps

After successfully connecting your BigQuery database to elvex, you can:

  1. Create assistants that use this datasource

  2. Test natural language queries against your database

  3. Share the assistants with appropriate users in your organization

Did this answer your question?