Create an AI Data Engineer
AI Data Engineers automate complex data tasks with the predictability of code. They autonomously plan, build, test, and generate production-ready notebooks to clean, standardize, and structure your data. These AI agents adapt to new data sources and formats, and can be scheduled or triggered on demand.
Here are the steps to create a new AI Data Engineer.
Go to AI Data Engineer and select New AI Data Engineer.

Populate the first section of the Create a new AI Data Engineer fields.
Enter the AI Data Engineer name
Enter the Databricks Workspace URL
Select the credential from the drop-down
Ensure you have followed the steps in Databricks Credential creation
If you don't see the credential in the drop-down, this means it has not been shared with Osmos.
Select Validate
Populate the second section of the Create a new AI Data Engineer fields.
Choose Git repo - Each Osmos task will operate against this Git folder in your Databricks Workspace.
A Git repo is required, and this repo must be shared with the Databricks credentials for it to appear in the list.
If you don't see the Git repo you expect, make sure you have shared the Git repo with the Service Principal.
Home folder - Enter the email associated with your Databricks user folder.
Select Cluster - Make sure your workspace has a running all-purpose cluster started via the Databricks UI or API.
The cluster must be shared with your Service Principal, which needs "can attach" permissions on it.
Serverless and job clusters are not supported.
Select Save.
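If the expected cluster does not appear in the Select Cluster list, it can help to check the cluster requirements above outside the Osmos UI. The following is a minimal sketch, not part of Osmos. It assumes you have already fetched the JSON responses from the Databricks REST API (the cluster list and the permissions for one cluster) and that they use the fields `state`, `cluster_source`, `access_control_list`, `service_principal_name`, and `permission_level`. The function names, the sample data, and the assumption that CAN_RESTART and CAN_MANAGE also allow attaching are illustrative.

```python
from typing import Any

# Cluster sources assumed to correspond to all-purpose clusters
# started via the Databricks UI or API (job clusters report "JOB").
ALL_PURPOSE_SOURCES = {"UI", "API"}

# Permission levels assumed to allow attaching to a cluster.
ATTACH_LEVELS = {"CAN_ATTACH_TO", "CAN_RESTART", "CAN_MANAGE"}


def eligible_clusters(clusters: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep only running all-purpose clusters from a cluster-list response."""
    return [
        c for c in clusters
        if c.get("state") == "RUNNING"
        and c.get("cluster_source") in ALL_PURPOSE_SOURCES
    ]


def can_attach(permissions: dict[str, Any], service_principal: str) -> bool:
    """Return True if the Service Principal has an attach-capable permission."""
    for entry in permissions.get("access_control_list", []):
        if entry.get("service_principal_name") != service_principal:
            continue
        levels = {p.get("permission_level") for p in entry.get("all_permissions", [])}
        if levels & ATTACH_LEVELS:
            return True
    return False


# Hypothetical sample data shaped like the API responses described above.
sample_clusters = [
    {"cluster_id": "a1", "state": "RUNNING", "cluster_source": "UI"},
    {"cluster_id": "b2", "state": "TERMINATED", "cluster_source": "API"},
    {"cluster_id": "c3", "state": "RUNNING", "cluster_source": "JOB"},
]
sample_permissions = {
    "access_control_list": [
        {
            "service_principal_name": "example-sp-id",
            "all_permissions": [{"permission_level": "CAN_ATTACH_TO"}],
        }
    ]
}

usable = [c["cluster_id"] for c in eligible_clusters(sample_clusters)]
print(usable)
print(can_attach(sample_permissions, "example-sp-id"))
```

In this sample only cluster `a1` passes: `b2` is not running and `c3` is a job cluster. A cluster that passes both checks but still does not appear in the list may point to a credential or sharing issue instead.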
