Hi Everyone,
I am currently working on a project that will be turned into a blog/user group talk at some point, more on that at a later stage, however, it deals with addresses and my desire to map these addresses on maps.
Now I am not sure if you are too familiar with Irish addresses, but Irish postcodes are relatively new (launched in 2015), and plenty of addresses can be challenging to map on some of the mapping visuals available in PowerBI. In my experience, the best results are achieved when you have the GPS coordinates, so when presented with a bunch of addresses, we need to calculate the GPS.
For years now, I have had a solution running with a customer where I take the address of their customers and run it thru the Google geolocation API to get back the GPS and formatted address. This has been running as part of a typical nightly batch process where Integration Services (SSIS) has been calling the API and storing the returned address. For my new project, it is time to turn that into a DataFlow instead. So, how do you go about doing this?
1) Set up a Google Account. Google has a comprehensive Mapping API platform. You can sign up for the Google Map Platform here. They are currently offering $200 worth of usage per month for free. NB, you do have to provide payment details, so, you might want to consider setting up usage limits in case you get hit by an unexpected usage bill. For their geolocation API call, it costs $5 per 1,000 calls, so that $200 gets you 40,000 API Calls a month. (NB, yes, I know I am a Microsoft MVP, and you would expect that would use Azure and Bing for this. My SSIS code has been running for years, so I decided to port what I know, and I will do a blog for Bing at a later date)
2) As part of setting up your google account, you will end up with an API key that you will need for any calls that you make. Again, please, check your spending limits, etc when doing this. I am not going to step thru all the configuration of Google Maps. It is reasonably straightforward.
3) Having got your API, you will need to load your set of addresses somewhere that is accessible to Data Flows. There are so many options here, and I will leave it up to you. In my case, I just have the addresses in Sharepoint, although I will change that to SQL Server at a later point in my project.
4) So let's look at how the API work via a browser. The address to call is as follows, where you put your address and APIKey into the red parts.
https://maps.googleapis.com/maps/api/geocode/xml?address="AddressToGeoCodeHere"&key=YourAPIKeyHere
or
https://maps.googleapis.com/maps/api/geocode/xml?address="Leinster House, Kildare St, Dublin 2, Ireland"&key=XYZ1234567890***
*** Not my actual API key
If you just paste this into a browser, you should get a formatted address back as follows;
This XML file does not appear to have any style information associated with it. The document tree is shown below.
<GeocodeResponse>
<status>OK</status>
<result>
<type>establishment</type>
<type>point_of_interest</type>
<type>tourist_attraction</type>
<formatted_address>Kildare St, Dublin 2, Co. Dublin, Ireland</formatted_address>
<address_component>
<long_name>Kildare Street</long_name>
<short_name>Kildare St</short_name>
<type>route</type>
</address_component>
<address_component>
...</address_component>
<address_component>
...</address_component>
<address_component>
...</address_component>
<geometry>
<location>
<lat>53.3405901</lat>
<lng>-6.2539816</lng>
</location>
<location_type>GEOMETRIC_CENTER</location_type>
<viewport>
<southwest>
<lat>53.3393577</lat>
<lng>-6.2559943</lng>
</southwest>
<northeast>
<lat>53.3420557</lat>
<lng>-6.2532963</lng>
</northeast>
</viewport>
</geometry>
<partial_match>true</partial_match>
<place_id>ChIJk8HRHJoOZ0gRkE88yCQIObM</place_id>
<plus_code>
...</plus_code>
</result>
</GeocodeResponse>
You can see how amongst the XML, there is a section for the location with Latitude and Longitude coordinates. I am capturing the GEOMETRIC_CENTER, but there are others you can use as well.
5) Building a Data Flow or PowerQuery. Let's do this from scratch.
The easiest way to do this is via PowerBI Desktop, Start up PowerBI Desktop, and close that Yellow get data splash screen.
Create Some Sample Data
Clearly, in a production system, you will source your data from a file or database table. For this, I just need some sample data, so I going to select Enter Data from the Home Ribbon
At the next screen, enter some sample addresses give your column a name, and name the Table. Then Click the Edit button.
You should end up something like this.
Create two parameters as follows.
Set the address to an address of your choice and set the APIKey to the Google API key you got earlier.
Create a function to call the Google API
From the Home ribbon, select New Source -> Blank Query.
With your new query selected, click on the advanced editor
Paste the following code into the advanced editor. (replace all existing code)
let
Source = Table.FromValue(Web.Contents(
"https://maps.googleapis.com",
[
RelativePath="maps/api/geocode/xml",
Query=[
address=Address
,key=APIKey
]
]
)),
ExtractXMLResponseIntoTable = Table.AddColumn(Source, "Transform file", each Xml.Tables([Value])),
Value = ExtractXMLResponseIntoTable{0}[Value],
GetBinaryPayload = Xml.Tables(Value,null,65001),
ExtractFromPayload = GetBinaryPayload{0}[result],
// From here you can select different elements of the payload.
ExtractTheGeometryElement = ExtractFromPayload{0}[geometry],
ExtractGPS = ExtractTheGeometryElement{0}[location]
in
ExtractGPS
Make sure that the names used in the Query section match exactly to the two parameters you created. Then click Done.
Query=[
address=Address
,key=APIKey
]
If all is going well, you should at this stage have a query that returns a GPS coordinate. Rename the query to something appropriate. (CallGoogleAPI)
We will now convert it to a function by right-clicking on the query and selecting "Create Function"
Give your function a name (fctCallGoogleAPI) and click OK. You should now have the following created in PBI.
6) Now let's test our work. Return to your original SourceData query and from the Add Column ribbon, invoke a custom function.
Setup the function as follows and click OK.
In my version of PBI Desktop, I have to edit the resulting line to remove the quotes around APIKey in the line.
so the final line of code is
= Table.AddColumn(#"Changed Type", "fctCallGoogleAPI", each fctCallGoogleAPI([address], APIKey))
Our last step is to expand the function return
and, tada... we have a GPS for my address.
Conclusion
So, we have seen now how to call Google Geolocate API to get an address GPS. With some normal power query work, you should be able to link this function to any source of addresses to get the coordinates. From my experience, if you know the country, make sure you append it to the address when you pass it into the API. You will get more accurate results this way.
How the Function Works
If you just want the code, skip to the end to download it from my GitHub. Returning to the function. There are two main sections. The first bit handles the call to Google. Putting the address and key into the Query avoids scheduled refresh issues when you try to schedule this in the Power BI Service. Check Chris Webb's blog here. The response from this call is an XML payload.
Source = Table.FromValue(Web.Contents(
"https://maps.googleapis.com",
[
RelativePath="maps/api/geocode/xml",
Query=[
address=Address
,key=APIKey
]
]
))
The rest of the code deals with the extraction of elements from the Payload. I only wanted GPS, but there are other items returned from Google that you may be interested in.
ExtractXMLResponseIntoTable = Table.AddColumn(Source, "Transform file", each Xml.Tables([Value])),
Value = ExtractXMLResponseIntoTable{0}[Value],
GetBinaryPayload = Xml.Tables(Value,null,65001),
ExtractFromPayload = GetBinaryPayload{0}[result],
// From here you can select different elements of the payload.
ExtractTheGeometryElement = ExtractFromPayload{0}[geometry],
ExtractGPS = ExtractTheGeometryElement{0}[location]
in
ExtractGPS
if you want to see the different elements returned, duplicate the original CallGoogleAPI query and replace the lines in purple as follows;
ExtractFromPayload = GetBinaryPayload{0}[result]
in
ExtractFromPayload
You can now see the different elements that are returned and select what you need
I hope you find this of use. You can download the PBI file from my git hub here.
Thanks for reading.
Mark
Comments