From raw blocks to data diamonds

Our road from raw blocks to data diamonds

The Medallion Architecture

The medallion architecture is a data organization pattern that structures your data platform into three distinct layers, each representing a different level of data quality and refinement.

Think of it like refining raw ore into jewelry: you start with rough material and progressively transform it into something valuable and ready to use.

Why Use It?

It creates a clear, logical flow for data processing. Each layer has a single responsibility, making the system easier to understand, debug, and maintain. You always know where to find data at any stage of its journey.

The Three Layers

Bronze — The raw landing zone. Data arrives here exactly as it came from source systems, unchanged and unvalidated. It’s your safety net and audit trail.

Silver — The cleaned and conformed layer. Data is validated, deduplicated, and standardized here. Think of it as your “single source of truth” where business rules are applied.

Gold — The business-ready layer. Data is aggregated, enriched, and shaped for specific use cases like reports, dashboards, or machine learning models. This is what end users consume.
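To make the three responsibilities concrete, here is a minimal sketch in plain Python. It is not how Fabric implements the layers; the rows, field names, and the functions `to_silver` and `to_gold` are all made up for illustration.

```python
# Hypothetical raw rows, stored exactly as a source system might deliver them.
bronze = [
    {"id": "1", "name": " Dream A ", "budget": "100"},
    {"id": "1", "name": " Dream A ", "budget": "100"},  # duplicate row
    {"id": None, "name": "Orphan", "budget": "999"},    # fails validation (no id)
    {"id": "3", "name": "Dream C", "budget": "250"},
]


def to_silver(rows):
    """Validate, deduplicate and standardize raw rows."""
    seen = set()
    cleaned = []
    for row in rows:
        # Validation: skip rows without an id. Deduplication: skip repeated ids.
        if row["id"] is None or row["id"] in seen:
            continue
        seen.add(row["id"])
        # Standardization: typed values and trimmed text.
        cleaned.append(
            {
                "id": int(row["id"]),
                "name": row["name"].strip(),
                "budget": float(row["budget"]),
            }
        )
    return cleaned


def to_gold(rows):
    """Aggregate cleaned rows into a shape a report can consume."""
    return {
        "project_count": len(rows),
        "total_budget": sum(row["budget"] for row in rows),
    }


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

Note that `bronze` is never modified: the raw copy stays available as the audit trail, and each later layer is derived from the one before it.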

Let's get started

First things first, let's create our Workspaces:

  • ACDC 2026 Dev
  • ACDC 2026 Test
  • ACDC 2026 Production

Let's start by creating a Lakehouse for our Bronze layer.

New Lake for the Bronze 🥉

Then we create a Dataflow Gen2 to retrieve data from Dataverse into our bronze lake. Creating the connection to Dataverse from Fabric using Dataflow Gen2:

Selecting the tables we want to report on. In our case it is the Dream Project (basstards_dreamproject) table.

We are adding this raw data to our bronze data lake (after creating it doh).

Adding the data to the Bronze lake:

Choosing the destination to the bronze db.

Use the settings as is with automatic mapping. Works for now:

Save the settings, and the data is retrieved and added to our bronze lake as raw data. Now the data is in our Bronze lake🤓

Let's move on to the silver medal. Creating a new Lakehouse for the silver layer:

Next, create a new Dataflow connected to our bronze lake. We will transform the data there and then update the Silver lake:

Finding our bronze lake

Finding our Lakehouse

And connecting to the bronze lake:

Choosing the data we want to work with:

Now, the transformation begins:

  • Removing unwanted columns
  • Cleaning data
  • Renaming fields
  • Etc.

Removing some unwanted columns and renaming the rest to make the data cleaner for the silver layer
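In the Dataflow these steps are done by clicking in the Power Query editor. As a rough sketch of the same idea in plain Python, the two operations are a filter and a mapping over column names. The column names below and the helper `clean_row` are assumptions for illustration, not the actual schema of the Dream Project table.

```python
# Hypothetical column names; the real Dataverse table has its own schema.
COLUMNS_TO_DROP = {"versionnumber", "timezoneruleversionnumber"}
RENAMES = {
    "basstards_name": "ProjectName",
    "basstards_budget": "Budget",
}


def clean_row(row):
    """Drop unwanted columns and rename the remaining ones."""
    return {
        RENAMES.get(column, column): value  # keep the old name if no rename is defined
        for column, value in row.items()
        if column not in COLUMNS_TO_DROP
    }


bronze_row = {
    "basstards_name": "Build a castle",
    "basstards_budget": 1000,
    "versionnumber": 42,
}
print(clean_row(bronze_row))
```

Keeping the drop list and the rename map as data makes it easy to see at a glance what the silver layer changes compared with bronze.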

Then adding the changes to the silver lake

Creating a connection for the silver lake

Then choosing the destination of the transformed table

Then using automapping. Doing the magic for us and saving the settings.

If we go to our ACDC_Silver lake, we should see the data updated (after the Dataflow Gen2 has run). For now we are refreshing it manually. You need to click Save and Run, obviously. Oops..

Now the silver lake is updated with the transformed columns

Now let's move on to the gold layer:

And again we need to create the lake for the gold layer

Then a new Dataflow Gen2 for the last transformations for the gold layer

Then we need to connect to the silver lake like we did for the “FromBronzeToSilver” dataflow

And finding the lakehouse source

Next connecting to the silver lake

Selecting the Silver lake data and clicking Create:

Now we retrieve the data from the Silver table

Now we have the same data from the silver lake

Now let's add some aggregated data or a business rule to the dataset that can be used in dashboards, reports, or other subscribing systems.

For this example, let's just create a column that shows the difference between the budget and the estimated cost. And for the fun of it, let's see what AI Prompt can help us with:

It looks like we need to connect to FabricAI:

Ooops.. that didn't work…

Let's do it another way

Now we have a new column with the variance between the budget and the estimated cost. It contains null values, which we can clean up by replacing them with 0, as seen in the next steps.

By replacing null values with zeros it looks a bit cleaner.

After replacing:
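The same two steps can be sketched in plain Python to show why the nulls appear in the first place: when either input is missing, the subtraction has no defined result. The column names `Budget`, `EstimatedCost`, and `Variance`, the helper functions, and the direction of the subtraction (budget minus estimated cost) are assumptions for illustration.

```python
def add_variance(rows):
    """Add a Variance column: budget minus estimated cost, or None if either is missing."""
    result = []
    for row in rows:
        budget = row.get("Budget")
        cost = row.get("EstimatedCost")
        variance = None if budget is None or cost is None else budget - cost
        result.append({**row, "Variance": variance})
    return result


def replace_nulls(rows, column, default=0):
    """Replace None in the given column with a default value."""
    return [
        {**row, column: default if row[column] is None else row[column]}
        for row in rows
    ]


# Hypothetical silver rows.
silver_rows = [
    {"ProjectName": "Castle", "Budget": 1000, "EstimatedCost": 800},
    {"ProjectName": "Bridge", "Budget": 500, "EstimatedCost": None},
]

gold_rows = replace_nulls(add_variance(silver_rows), "Variance")
for row in gold_rows:
    print(row["ProjectName"], row["Variance"])
```

One thing to keep in mind: after the replacement, a variance of 0 can mean either "exactly on budget" or "unknown", so this is a cosmetic choice for the report rather than a general rule.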

And now we want to move this to the gold layer.

Like this:

Now adding the data to the gold lake destination

Using automatic settings to the destination again and then saving the settings:

Next we need to Save and Run the Dataflow Gen2, as we learned from the FromBronzeToSilver 🤓

After the Dataflow is saved and run, we should see the data in the Gold lake. And look at that, it actually worked. Woohoo! Now we have our Gold medallion ready, or the diamond data we want 💎

Then creating a new semantic model for use in Power BI

Now, we have a semantic model:

Let's try to make a report out of it

And then we use the semantic model we just created

And then we select the semantic model based on our gold lake:

Aaaaaaaaand 🥁 There we have a report📈

And we're out of time this year…

I hope this gave some insight into how we created a very simple data platform based on the medallion structure, with an example of each step from start to finish.

That was the end of this post. I hope this warms the diamond heart (💎) of Catherine Wilhelmsen, and I hope you give us plenty of points in this category. Best regards, Fredrik Engseth🫶