# Generate Notebook

### Overview

The **Generate Notebook** feature is at the heart of how the Osmos **AI Data Engineer** turns your configuration into powerful, reusable Python code. With one click, it produces a fully functional Spark-based notebook that is ready to run, schedule, version, and integrate into pipelines.

These notebooks do more than ingest and transform data: they are **long-lived, production-grade workflows** that evolve with your needs while keeping human reviewers in control.

### What It Is

**Generate Notebook** triggers the AI Data Engineer to build a ready-to-run Python notebook based on your configuration instructions, source files, and destination schemas. The notebook is:

* **Execution-ready**: Includes logic for ingestion, transformation, and validation
* **Reusable**: Can be versioned, re-executed, and adapted for new data
* **Pipeline-ready**: Built for integration into orchestration systems (e.g., Fabric, Airflow)
* **Autonomous but supervised**: All actions are user-initiated, ensuring complete control

> Think of it as saying:\
> *“Hey engineer, write me a Python notebook for this job.”*\
> And the AI does it—intelligently, iteratively, and at scale.

### What the AI Does Behind the Scenes

When you click **Generate Notebook**, the AI Data Engineer will:

1. **Sample & Analyze Files**\
   It inspects your input data (CSV, JSON, XML, Parquet, etc.) to infer schemas, spot anomalies, and identify the transformations required.
2. **Write the Code**\
   It generates Spark-based Python code that:
   * Ingests your data
   * Transforms it according to your instructions
   * Includes built-in schema checks and validation logic
3. **Write Its Tests**\
   The notebook includes test cases to catch data issues, logic gaps, or structural inconsistencies.
4. **Handle Errors Automatically**\
   If tests fail, the AI:
   * Resamples the data
   * Revises the code
   * Regenerates the logic until a working solution is found
5. **Add Bookkeeping**\
   Built-in logic tracks what data has been processed, avoiding duplicates or reprocessing in future runs.
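
The bookkeeping described in step 5 can be pictured with a minimal sketch. This is not the code Osmos generates: it assumes a JSON ledger file and treats file names as unique keys, and the names `load_processed`, `select_new_files`, and `mark_processed` are hypothetical.

```python
import json
from pathlib import Path


def load_processed(ledger_path: Path) -> set[str]:
    """Return the file names recorded in the ledger (empty set if no ledger yet)."""
    if not ledger_path.exists():
        return set()
    return set(json.loads(ledger_path.read_text()))


def select_new_files(source_dir: Path, processed: set[str]) -> list[Path]:
    """List source files not yet recorded as processed, sorted by path."""
    return sorted(
        p for p in source_dir.iterdir()
        if p.is_file() and p.name not in processed
    )


def mark_processed(ledger_path: Path, processed: set[str], files: list[Path]) -> set[str]:
    """Add the given files to the ledger on disk and return the updated set."""
    updated = processed | {p.name for p in files}
    ledger_path.write_text(json.dumps(sorted(updated)))
    return updated
```

A run would call `load_processed`, process only what `select_new_files` returns, and call `mark_processed` once that batch succeeds. Keeping the ledger outside the source folder stops it from being picked up as input; a generated notebook may well track this state differently, for example in a table.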

### Iteration & Feedback

* After reviewing the generated instructions, you can:
  * Edit them inline
  * Add edge-case handling
  * Strengthen constraints (e.g., "fail if source columns change")
* If the result isn't correct, update your instructions and **regenerate**
* Use real-time feedback to refine and guide the AI’s behavior
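
A constraint such as "fail if source columns change" could translate into a check along the following lines. This is an illustrative sketch in plain Python, not output from the AI Data Engineer: the column names, `SchemaDriftError`, `read_header`, and `check_columns` are all assumptions made for the example.

```python
import csv
import io

# Hypothetical expected schema for the example
EXPECTED_COLUMNS = ["order_id", "customer_id", "amount"]


class SchemaDriftError(ValueError):
    """Raised when the source header no longer matches the expected columns."""


def read_header(csv_text: str) -> list[str]:
    """Return the first row of the CSV text, or an empty list if there is none."""
    return next(csv.reader(io.StringIO(csv_text)), [])


def check_columns(actual: list[str], expected: list[str]) -> None:
    """Raise SchemaDriftError if any column is missing or unexpected."""
    missing = [c for c in expected if c not in actual]
    unexpected = [c for c in actual if c not in expected]
    if missing or unexpected:
        raise SchemaDriftError(f"missing={missing}, unexpected={unexpected}")


# Passes silently: the header matches the expected columns
check_columns(read_header("order_id,customer_id,amount\n1,7,3.50\n"), EXPECTED_COLUMNS)
```

Failing fast here stops a run before any rows are transformed. In a Spark-based notebook the same comparison could be made against the DataFrame's `columns` attribute instead of a parsed CSV header.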

### Key Capabilities

| Capability                 | Description                                                               |
| -------------------------- | ------------------------------------------------------------------------- |
| **Reusable & Versionable** | Notebooks are long-lived and can be stored, shared, and reused            |
| **Fully Tested**           | AI includes test scripts and validation checks                            |
| **Pipeline Integration**   | Designed to plug into workflows and orchestration platforms               |
| **Bookkeeping Logic**      | Automatically tracks processed files for repeatable, safe operations      |
| **Performance Optimized**  | Passes through an AI profiler for better runtime and scaling              |
| **Human-in-the-Loop**      | All notebook generation and execution are initiated and reviewed by users |

