Consolidating Multiple PowerQuery connections to a single connection
I encountered this issue when working on a PowerBI file where different areas of a data lake were getting imported into PowerBI via PowerQuery. There are different files type in the lake, and each type is stored a unique folder. The developer had created multiple queries, one for each file type and each query resulted in a new data source in the model as it was specific to the particular folder in the lake. This becomes evident when you publish the model to the PowerBI Service and have to set credentials for each connection. You can see the same in the desktop when you select Transform Data -> Data Source Settings.
To get rid of these connections, there are 2 steps
1) Refactor your PowerQuery to all work from a single data lake root folder. This will list all files in all sub folder of the root. Filter the root folder to the sub folder of your choice. (A parameter containing the Sub Folder name is a nice way of doing this)
2) Once you have your queries refactored, remove the old data sources
The rest of this blog looks at how you can do step 2. The obvious steps of delete or clear permissions does not work here. However, you can remove the additional data sources as follows
A - Note the location of your new root data lake folder. (XXX.core.windows.net/zones/Trusted) below. Select one of the data sources you want to remove. Click on Change Source.
B- Paste in the root data lake folder. (XXX.core.windows.net/zones/Trusted) below and click OK
C - PowerBI recognizes that this is duplicate Data Source now, and it just removes it.
D- Continue until all data sources are gone, except the unique ones needed for your solution,
E - Publish to the PowerBI service and enjoy only having to configure the data sources once!
I hope this helps someone out.