One of my first posts on this blog was about creating an automated data dictionary. Previously, I thought this was the best method. However, after delving into coding in C# more in the past year, I believe there is a better way to go about it.
What we left behind
This new method does not rely on linked servers as the previous method did. And, as a result, this method is suitable for Power BI Premium models (I don't think it's possible to create a linked server connection to a Power BI Premium workspace - even with XMLA R/W endpoints enabled). This method also does not use DMVs. DMVs are great, however they can be a bit cumbersome to wrangle.
The new method
As I alluded to earlier, this new method uses C#. In fact, all you have to do is enter a few parameters into the C# script and off you go. One point to clarify is that this script only works for models that use legacy partitions (not M partitions). And, if using Power BI Premium, you must enable XMLA R/W endpoints.
Here's a breakdown of what the C# script does.
Creates a Data Dictionary table in your model (if one does not already exist) with the proper column names, source columns, data type, partition query and data source (provided by the user).
Drops (if it exists) and recreates a Data Dictionary table within a SQL environment (server, database, schema, table name provided by the user).
Extracts the metadata from the model file and inserts it into the table created in Step 2.
Now, all that is left is to deploy and process the model. However, we can take it a step further!
Integrated deployment
Since Tabular Editor allows C# scripts to be executed during a command line deployment, you can run this script each time you deploy your model. That way, it's literally no touch (well, except for entering the descriptions of objects in your model). The fully automated approach would be to integrate this into an automated deployment (via Azure DevOps) as outlined here.
Here is an example of the code you would run to execute this (or other) C# scripts while deploying your model. For deploying other file types (i.e. database.json) or to other incarnations of tabular, see Step 5 within this post.
start /wait /d "C:\Program Files (x86)\Tabular Editor" TabularEditor.exe "<Model.bim File>" -S "<C# Script File>" -D "<Server Name>" "<Model Name>"
Steps
Download and install Tabular Editor.
Download the C# script.
Copy and paste the C# script into the Advanced Scripting window in Tabular Editor.
Modify the following parameters at the top of the C# script.
string dataSource = @"";
string serverName = @"";
string databaseName = "";
string schemaName = "";
string dbTableName = "";
dataSource: This is the name of the data source in your model for your Data Dictionary.
serverName: This is the name of the server (in your data warehouse) where the Data Dictionary will reside (note that this is the same server that is referenced by the dataSource.
databaseName: This is the database within the data warehouse for the Data Dictionary table.
schemaName: This is the schema for the Data Dictionary table within the data warehouse.
dbTableName: This is the name of the Data Dictionary table in the data warehouse. I recommend 'META_DataDictionary' to adhere to the Tabular Style Guide.
5. Click the 'play' button (or press F5).
Conclusion
This method is more concise and I believe more elegant than the previous method. Additionally, it simultaneously creates both the data warehouse side, the model side, and the data side in one go. Furthermore, it enables this self-documenting feature to work for Power BI Premium models as well.
Hi Michael,
Using Tabular editor 3.2.2 I get an error Incorrect syntax near ',' on line 250 which is
cmdInsert.ExecuteNonQuery();
Any idea how I can make this working?
Thx,
Frans
Hi Micheal,
Thanks for your great posts, they're really appreciated 😁. You created a automated data dictionary in a DWH and in Excel, now i have found another way to do it and what it solves is that you dont need an external source to store the data. I would like to share it with you, how can i do this ? Kind regards, Benito2313
Is it possible to create a data dictionary based on its own? So add datasource (the analyse services) in the model. And use a DMV statement to extract the tables, columns and so on. This way you dont need to add it to the datawarehouse and you always have the updated data. Kind regards, Benito2313
This is excellent, been looking for an approach like this!
Would a similar approach be possible where the metadata is written to a calculated table in the model using the DATATABLE syntax?
That would enable us to use this for our models that don't use a SQL Server.