# Databricks Permissions for AI Data Engineer

### Overview

The **Osmos AI Data Engineer for Databricks** connects securely to your Databricks workspace through a **service principal** that you create and manage.\
All compute runs on your own Databricks clusters, and code can be version-controlled in a Git-backed Databricks Repo, giving you complete transparency and control.

### Quickstart Guide

1. **Create a Service Principal** in your Databricks workspace.
2. **Grant Permissions**:
   * **Repos**: `Read`/`Write` on the target repo
   * **Workspace**: `Read`/`Write` on a folder for artifacts
   * **Clusters**: `Can Attach To` on your target classic (non-serverless) cluster
   * **Data**: `SELECT` on source tables, and `INSERT`/`UPDATE` on the desired output schema
3. **Provide Resources to Osmos**: Share your Databricks service principal name and your Databricks workspace URL.
4. **Networking Check**: If your workspace uses a private VNet or VPC, contact Osmos Support for onboarding options.

### Required Permissions

The service principal needs these baseline permissions:

* **Databricks Repos** – `Read` & `Write` access to the designated repo path
* **Workspace** – `Read` & `Write` access to a folder for storing artifacts
* **Clusters** – `Can Attach To` on a designated, active classic (non-serverless) cluster
* **Unity Catalog Data** – `SELECT` on source schemas/tables and `INSERT`/`UPDATE` on the schema where outputs are written
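
As a rough illustration of how the workspace-level grants above might be scripted, the sketch below builds (but does not send) requests for the Databricks Permissions API. It rests on assumptions: `Read` & `Write` is mapped to the `CAN_EDIT` permission level, the service principal is identified by its application ID, and the workspace URL, token, and object IDs are placeholders you would supply. Confirm the exact levels against your workspace before using anything like this.

```python
import json
import urllib.request

# Assumed mapping from this page's terms to Databricks permission levels.
# "Read & Write" is interpreted as CAN_EDIT; adjust if your setup differs.
ASSUMED_LEVELS = {
    "repos": "CAN_EDIT",
    "directories": "CAN_EDIT",
    "clusters": "CAN_ATTACH_TO",
}


def build_permission_request(workspace_url, token, object_type, object_id,
                             application_id):
    """Build a PATCH request that adds one ACL entry for the service principal.

    The request is only constructed here; nothing is sent to Databricks.
    """
    body = {
        "access_control_list": [
            {
                "service_principal_name": application_id,
                "permission_level": ASSUMED_LEVELS[object_type],
            }
        ]
    }
    base = workspace_url.rstrip("/")
    return urllib.request.Request(
        url=f"{base}/api/2.0/permissions/{object_type}/{object_id}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="PATCH",
    )
```

Passing the returned object to `urllib.request.urlopen` would apply the grant. Unity Catalog data privileges are not covered by this API and are granted separately.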

### Configuration in Context

#### Code Management in Databricks Repos

Use a Git-backed Databricks Repo so the agent can create feature branches and commits.\
From there, push changes to your remote Git provider and create pull requests for review.\
Alternatively, grant write access to a Unity Catalog Volume if you prefer direct file storage.

#### Workspace for Artifacts

Provide a folder with `Read` & `Write` access (e.g., `/Users/your.email@company.com/`) for logs and temporary files.\
You may also grant `Read` access to other workspace artifacts you want the agent to reference.

#### Cluster Usage

Provision a **Classic (Non-Serverless) cluster** and grant the service principal `Can Attach To`.\
The agent will not create, start, or stop clusters; it uses the cluster you specify and inherits its policies and libraries.
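
Because the agent does not start clusters, it can help to confirm the cluster is running before handing work to it. The sketch below is one possible pre-flight check, assuming you have already fetched the cluster description as a dictionary (for example, from the Clusters API `get` endpoint, whose response includes a `state` field). The function name and messages are hypothetical.

```python
def check_cluster_ready(cluster_info):
    """Return (ok, message) based on the cluster's reported state.

    `cluster_info` is assumed to be the parsed JSON description of a cluster,
    containing a "state" field such as "RUNNING" or "TERMINATED".
    """
    state = cluster_info.get("state", "UNKNOWN")
    if state == "RUNNING":
        return (True, "cluster is running")
    # The agent will not start the cluster, so a person must do it.
    return (False, f"cluster state is {state}; start it before running the agent")
```

For example, `check_cluster_ready({"state": "TERMINATED"})` returns a failing result whose message names the state, which you could surface in an onboarding checklist.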

#### Data Governance with Unity Catalog

Grant `SELECT` permissions on the source data and `INSERT`/`UPDATE` on a designated schema for outputs.\
A common best practice is to provide write access only to a development schema to isolate outputs from production tables.
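
One way to keep these grants reviewable is to generate the SQL rather than type it by hand. The sketch below is a minimal example under stated assumptions: Unity Catalog expresses insert and update rights through the `MODIFY` privilege, access also requires `USE CATALOG` and `USE SCHEMA` on the parent objects, and the schema names and principal identifier are hypothetical placeholders. It only returns strings; nothing is executed.

```python
def build_grants(principal, source_schemas, output_schema):
    """Return Unity Catalog GRANT statements for read sources and one output schema.

    Schemas are given as "catalog.schema". `principal` is the identifier of the
    service principal (assumed here to be its application ID).
    """
    target = f"`{principal}`"
    statements = []

    # USE CATALOG is needed once per catalog that holds a referenced schema.
    seen_catalogs = []
    for schema in [*source_schemas, output_schema]:
        catalog = schema.split(".", 1)[0]
        if catalog not in seen_catalogs:
            seen_catalogs.append(catalog)
            statements.append(f"GRANT USE CATALOG ON CATALOG {catalog} TO {target}")

    # Read-only access to each source schema.
    for schema in source_schemas:
        statements.append(f"GRANT USE SCHEMA ON SCHEMA {schema} TO {target}")
        statements.append(f"GRANT SELECT ON SCHEMA {schema} TO {target}")

    # Write access is limited to the single output schema.
    statements.append(f"GRANT USE SCHEMA ON SCHEMA {output_schema} TO {target}")
    statements.append(f"GRANT MODIFY ON SCHEMA {output_schema} TO {target}")
    return statements
```

Calling `build_grants("app-id", ["prod.sales"], "dev.agent_outputs")` yields statements that give write access only to the development schema, matching the isolation practice described above. Depending on your workload, the output schema may need further privileges (for instance, to create new tables); treat this list as a starting point.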

### Summary

The **Osmos AI Data Engineer for Databricks** operates entirely within the permissions you define.\
Think of it as a new engineer on your team: it can only read or write where you allow and will follow the instructions you provide.\
By configuring service principal access carefully across **Repos**, **Workspace**, **Clusters**, and **Unity Catalog Data**, you maintain full control and governance while enabling powerful autonomous data engineering capabilities.

