Getting Started¶
This guide walks you through installing the Fabric Warehouse Advisor and running your first analysis.
Prerequisites¶
| Requirement | Notes |
|---|---|
| Microsoft Fabric Workspace | With at least one Fabric Warehouse or Lakehouse SQL Endpoint |
| Fabric Notebook | The advisors run inside a Fabric Spark notebook |
| Warehouse / SQL Endpoint access | The identity running the notebook (your Entra ID / service principal) must have at least Read access on the target Warehouse / SQL Endpoint |
| Same tenant | The target Warehouse / SQL Endpoint must be in the same Fabric tenant |
| Python 3.9+ | Only needed if building from source (not required on Fabric) |
Prerequisite Note
A Lakehouse is only required if you plan to upload the .whl file to the Lakehouse Files section.
The Microsoft Fabric Data Warehouse connector is pre-installed in the
Fabric Spark runtime, and Query Insights is enabled by default on every Fabric Warehouse.
Getting the Wheel¶
Option A: Install directly from PyPI (Recommended)¶
For version information, dependencies, and release notes, see the package page on PyPI.
Option B: Download Pre-Built¶
Download the latest .whl file from the
GitHub Releases
page — no build tools required.
Option C: Build from Source¶
On your local machine (or in any Python 3.9+ environment):
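A typical build uses the standard `build` frontend; the exact commands depend on your environment, but a minimal sketch (run from the root of a cloned repository) is:

```shell
# Install the PEP 517 build frontend, then build the wheel and sdist
python -m pip install --upgrade build
python -m build
```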
This produces two files in dist/:
- fabric_warehouse_advisor-1.0.3-py3-none-any.whl — the installable wheel
- fabric_warehouse_advisor-1.0.3.tar.gz — the source distribution
You only need the .whl file for Fabric.
Installing in Fabric¶
Option A: Per-Notebook Install (Recommended)¶
Install the package from PyPI with a %pip install in the first cell of your notebook.
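For example, as the first cell of the notebook (assuming the package is published under the name fabric-warehouse-advisor, matching the wheel's distribution name):

```python
%pip install fabric-warehouse-advisor
```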
Option B: Per-Notebook Install from a Lakehouse¶
- Upload the .whl file to your Lakehouse Files area (see the official documentation for detailed upload instructions).
- In the first cell of your notebook, run a %pip install that points to the uploaded wheel.
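Assuming the wheel was uploaded to the root of the default Lakehouse's Files area, the install cell might look like this (adjust the path to wherever you placed the file):

```python
%pip install /lakehouse/default/Files/fabric_warehouse_advisor-1.0.3-py3-none-any.whl
```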
This is the quickest way to get up and running.
Option C: Fabric Environment¶
For a more permanent setup, you can attach the library to a Fabric Environment (see the official documentation):
- In your Fabric Workspace, create or open an Environment resource
- Under Libraries, upload the .whl file
- Publish the Environment
- Attach the Environment to your notebook (Settings → Environment)
With this approach the library is pre-installed on every session that uses
the Environment — no %pip install needed.
Running an Advisor¶
Data Clustering Advisor¶
Analyses your warehouse and recommends which tables and columns should
use CLUSTER BY:
```python
from fabric_warehouse_advisor import DataClusteringAdvisor, DataClusteringConfig

config = DataClusteringConfig(
    warehouse_name="MyWarehouse",
)

advisor = DataClusteringAdvisor(spark, config)
result = advisor.run()
```
See the full Data Clustering documentation for configuration options, scoring details, and report formats.
Performance Check Advisor¶
Detects performance anti-patterns across data types, caching, V-Order, and statistics:
```python
from fabric_warehouse_advisor import PerformanceCheckAdvisor, PerformanceCheckConfig

config = PerformanceCheckConfig(
    warehouse_name="MyWarehouse",
)

advisor = PerformanceCheckAdvisor(spark, config)
result = advisor.run()
```
See the full Performance Check documentation for configuration options, check categories, and report formats.
Working with Results¶
Both advisors return a result object with pre-formatted reports in three formats.
Viewing Reports¶
```python
# Rich HTML report (recommended in Fabric notebooks)
displayHTML(result.html_report)

# Plain text
print(result.text_report)

# Markdown
print(result.markdown_report)
```
Saving Reports¶
```python
# Save as HTML (default)
result.save("/lakehouse/default/Files/reports/report.html")

# Save as Markdown
result.save("/lakehouse/default/Files/reports/report.md", "md")

# Save as plain text
result.save("/lakehouse/default/Files/reports/report.txt", "txt")
```
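When saving on every run, a timestamped filename keeps a history of reports instead of overwriting the previous one. A small self-contained sketch (the `report_path` helper and the reports directory are illustrative, not part of the library):

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath

def report_path(base_dir: str, name: str, ext: str = "html") -> str:
    """Build a timestamped report path like .../report_20250101T120000Z.html."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return str(PurePosixPath(base_dir) / f"{name}_{stamp}.{ext}")

path = report_path("/lakehouse/default/Files/reports", "report")
print(path)
```

The generated path can then be passed straight to result.save(path).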
Data Clustering — Exploring Scores¶
The Data Clustering advisor also provides a Spark DataFrame:
```python
result.scores_df.show()

# Persist to Delta for tracking over time
result.scores_df.write.mode("overwrite").format("delta").saveAsTable(
    "yourschema.clustering_advisor_scores"
)
```
Performance Check — Exploring Findings¶
The Performance Check advisor provides structured findings:
```python
# Summary counts by severity
print(f"Critical: {result.critical_count}")
print(f"High: {result.high_count}")
print(f"Medium: {result.medium_count}")
print(f"Low: {result.low_count}")
print(f"Info: {result.info_count}")

# Iterate findings (sorted by severity)
for f in result.findings:
    print(f"[{f.level}] {f.object_name}: {f.message}")

# Filter to actionable findings only (excludes INFO)
for f in result.findings:
    if f.is_actionable:
        print(f"[{f.level}] {f.object_name}: {f.message}")
        if f.recommendation:
            print(f"  → {f.recommendation}")
```
Next Steps¶
- Data Clustering Advisor — full documentation
- Performance Check Advisor — full documentation
- Cross-Workspace — analyse warehouses in other workspaces
- Troubleshooting — common issues and solutions