Getting Started¶
This guide walks you through installing the Fabric Warehouse Advisor and running your first analysis.
Prerequisites¶
| Requirement | Notes |
|---|---|
| Microsoft Fabric Workspace | With at least one Fabric Warehouse or Lakehouse SQL Endpoint |
| Fabric Notebook | The advisors run inside a Fabric Spark notebook |
| Warehouse / SQL Endpoint access | The identity running the notebook (your Entra ID / service principal) must have at least Read access on the target Warehouse / SQL Endpoint |
| Same tenant | The target Warehouse / SQL Endpoint must be in the same Fabric tenant |
| Python 3.9+ | Only needed if building from source (not required on Fabric) |
Prerequisite Note
A Lakehouse is only required if you plan to upload the .whl file to the Lakehouse Files section.
The Microsoft Fabric Data Warehouse connector is pre-installed in the
Fabric Spark runtime, and Query Insights is enabled by default on every Fabric Warehouse.
Getting the Wheel¶
Option A: Install directly from PyPI (Recommended)¶
For version information, dependencies, and release notes, see the package page on PyPI.
Option B: Download Pre-Built¶
Download the latest .whl file from the
GitHub Releases
page — no build tools required.
Option C: Build from Source¶
On your local machine (or in any Python 3.9+ environment):
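A typical build uses the standard `build` frontend; the exact commands depend on your environment, but a minimal sketch (run from the root of a cloned repository) is:

```shell
# Install the PEP 517 build frontend, then build the wheel and sdist
python -m pip install --upgrade build
python -m build
```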
This produces two files in dist/:
- fabric_warehouse_advisor-1.0.3-py3-none-any.whl — the installable wheel
- fabric_warehouse_advisor-1.0.3.tar.gz — the source distribution
You only need the .whl file for Fabric.
Installing in Fabric¶
Option A: Per-Notebook Install (Recommended)¶
Install the package from PyPI with a %pip install in the first cell of your notebook.
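For example, as the first cell of the notebook (assuming the package is published under the name fabric-warehouse-advisor, matching the wheel's distribution name):

```python
%pip install fabric-warehouse-advisor
```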
Option B: Per-Notebook Install from a Lakehouse¶
- Upload the .whl file to your Lakehouse Files area (see the official documentation for detailed upload instructions).
- In the first cell of your notebook, run a %pip install that points to the uploaded wheel.
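Assuming the wheel was uploaded to the root of the default Lakehouse's Files area, the install cell might look like this (adjust the path to wherever you placed the file):

```python
%pip install /lakehouse/default/Files/fabric_warehouse_advisor-1.0.3-py3-none-any.whl
```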
This is the quickest way to get up and running.
Option C: Fabric Environment¶
For a more permanent setup, you can attach the library to a Fabric Environment (see the official documentation):
- In your Fabric Workspace, create or open an Environment resource
- Under Libraries, upload the .whl file
- Publish the Environment
- Attach the Environment to your notebook (Settings → Environment)
With this approach the library is pre-installed on every session that uses
the Environment — no %pip install needed.
Running an Advisor¶
Data Clustering Advisor¶
Analyses your warehouse and recommends which tables and columns should
use CLUSTER BY:
```python
from fabric_warehouse_advisor import DataClusteringAdvisor, DataClusteringConfig

config = DataClusteringConfig(
    warehouse_name="MyWarehouse",
)

advisor = DataClusteringAdvisor(spark, config)
result = advisor.run()
```
See the full Data Clustering documentation for configuration options, scoring details, and report formats.
Performance Check Advisor¶
Detects performance anti-patterns across data types, caching, V-Order, and statistics:
```python
from fabric_warehouse_advisor import PerformanceCheckAdvisor, PerformanceCheckConfig

config = PerformanceCheckConfig(
    warehouse_name="MyWarehouse",
)

advisor = PerformanceCheckAdvisor(spark, config)
result = advisor.run()
```
See the full Performance Check documentation for configuration options, check categories, and report formats.
Working with Results¶
Both advisors return a result object with pre-formatted reports in three formats.
Viewing Reports¶
```python
# Rich HTML report (recommended in Fabric notebooks)
displayHTML(result.html_report)

# Plain text
print(result.text_report)

# Markdown
print(result.markdown_report)
```
Saving Reports¶
```python
# Save as HTML (default)
result.save("/lakehouse/default/Files/reports/report.html")

# Save as Markdown
result.save("/lakehouse/default/Files/reports/report.md", "md")

# Save as plain text
result.save("/lakehouse/default/Files/reports/report.txt", "txt")
```
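When saving on every run, a timestamped filename keeps a history of reports instead of overwriting the previous one. A small self-contained sketch (the `report_path` helper and the reports directory are illustrative, not part of the library):

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath

def report_path(base_dir: str, name: str, ext: str = "html") -> str:
    """Build a timestamped report path like .../report_20250101T120000Z.html."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return str(PurePosixPath(base_dir) / f"{name}_{stamp}.{ext}")

path = report_path("/lakehouse/default/Files/reports", "report")
print(path)
```

The generated path can then be passed straight to result.save(path).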
Data Clustering — Exploring Scores¶
The Data Clustering advisor also provides a Spark DataFrame:
```python
result.scores_df.show()

# Persist to Delta for tracking over time
result.scores_df.write.mode("overwrite").format("delta").saveAsTable(
    "yourschema.clustering_advisor_scores"
)
```
Performance Check — Exploring Findings¶
The Performance Check advisor provides structured findings:
```python
# Summary counts by severity
print(f"Critical: {result.critical_count}")
print(f"High: {result.high_count}")
print(f"Medium: {result.medium_count}")
print(f"Low: {result.low_count}")
print(f"Info: {result.info_count}")

# Iterate findings (sorted by severity)
for f in result.findings:
    print(f"[{f.level}] {f.object_name}: {f.message}")

# Filter to actionable findings only (excludes INFO)
for f in result.findings:
    if f.is_actionable:
        print(f"[{f.level}] {f.object_name}: {f.message}")
        if f.recommendation:
            print(f"  → {f.recommendation}")
```
Next Steps¶
- Data Clustering Advisor — full documentation
- Performance Check Advisor — full documentation
- Cross-Workspace — analyse warehouses in other workspaces
- Troubleshooting — common issues and solutions