• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Haversine formula not giving expected results

Sh4dow85

New Member
I hope someone can help, because despite extensive searching and trial and error I am getting nowhere...

My aim is to select a postcode area (eg. BS10, L20, LS9) and radius in miles, and the map visual will show all staff within that radius around the chosen postcode area.

I have two tables of data - rv_staff - this shows all staff members and their addresses, including postcodes (both full and partial) as well as the coordinates of said postcode Area Selection - this is a list of all UK postcode areas (so first half only) and the corresponding coordinates, along with area (Leeds/Manchester/Bristol etc) and other details.

These tables have a many-to-one relationship from the staff table to the area selection table on the partial postcodes (as there are more than one postcode on the staff table, and only one on the area table).

I have a radius slider set up, and I want a slicer/drop-down list to be able to choose a partial postcode from the Area Selection table. This should then populate a table of all staff and the distance from the selected postcode to each staff member (within the specified radius as selected).

Here is where I struggle though. My distance haversine formula just does not work and I don't know why, as I haven't changed anything other than the lat/long references.

The initial code I tried as per several sites/people:

Code:
Distance =

var Lat1 = MIN('Area Selection'[latitude])
var Lng1 = MIN('Area Selection'[longitude])

var Lat2 = MIN(rv_staff[Lat])
var Lng2 = MIN(rv_staff[Long])

var P = DIVIDE( PI(), 180)
var A = 0.5 - COS((Lat2-Lat1) * p)/2 +
    cos(Lat1 * p) * COS(Lat2 * P) * (1-COS((Lng2 - Lng1) * p)) /2
var final = (3958 * ASIN((SQRT(A))))
return final

With this formula (radius set to 100) it shows that people who live in Cornwall/Devon are between 50-100 miles from anywhere, even Glasgow.

So I set up two measures to retrieve the lat and long from the selected postcode area for checking purposes:

Code:
Selected Postcode - Lat = LOOKUPVALUE('Area Selection'[latitude], 'Area Selection'[postcode], SELECTEDVALUE('Area Selection'[postcode]))

Code:
Selected Postcode - Long = LOOKUPVALUE('Area Selection'[longitude], 'Area Selection'[postcode], SELECTEDVALUE('Area Selection'[postcode]))

I have these in a table and every time I select a new postcode area, the coordinates change accordingly. Some spot checking on google maps shows the coordinates are correct for the location, so I know there's no issues with the initial postcode selection.

So I tried adding these into the haversine formula to see if this would work (I don't know why it needs to be a min() value if you're using just one set of coordinates here - maybe someone can enlighten me):

Code:
Linguist Distance (Miles) =

var Lat1 = [Selected Postcode - Lat]
var Lng1 = [Selected Postcode - Long]

var Lat2 = MIN(rv_staff[Lat])
var Lng2 = MIN(rv_staff[Long])

var P = DIVIDE( PI(), 180)
var A = 0.5 - COS((Lat2-Lat1) * p)/2 +
    cos(Lat1 * p) * COS(Lat2 * P) * (1-COS((Lng2 - Lng1) * p)) /2
var final = (3958 * ASIN((SQRT(A))))
return final

This (probably expectedly) shows zero for everything.

I also tried a column approach as suggested
, but this just errors (An argument of function 'ACOS' has the wrong data type or the result is too large or too small).

Code:
Distance = acos(sin([Selected Postcode - Lat])*sin(rv_staff[Lat])+cos([Selected Postcode - Lat])*cos(rv_staff[Lat])*cos(rv_staff[Long]-[Selected Postcode - Long]))*3958

I just don't know what else to try or why I'm not getting the correct results... I have a test file which has the exact same issues, but as it's a .pbix I cannot upload it. I've double checked to ensure nothing is summarised (some were, so I've fixed that) and the Min values look to be correct. Any postcode that falls within the selected partial postcode looks to be working (so if NW1 is selected, all postcodes starting NW1 are showing correct miles, but no other postcodes show - anything that is not NW1 is showing a distance of 1780.25 miles.

Any help or suggestions would be most appreciated!
 
Last edited by a moderator:
Hello

I think there might be an issue with the formula or the way coordinates are being handled. The Haversine formula is sensitive to the units of measurement used for latitude and longitude. Ensure that your coordinates are in the correct format (usually decimal degrees) and that your distance calculation is based on the appropriate radius of the Earth (3958 miles for miles).

Here's a simplified version of the Haversine formula that you can try:

DAX

Code:
Haversine Distance = 

VAR Lat1 = RADIANS([Selected Postcode - Lat])

VAR Lon1 = RADIANS([Selected Postcode - Long])

VAR Lat2 = RADIANS(rv_staff[Lat])

VAR Lon2 = RADIANS(rv_staff[Long])

VAR dLat = Lat2 - Lat1

VAR dLon = Lon2 - Lon1

VAR a = SIN(dLat/2)^2 + COS(Lat1) * COS(Lat2) * SIN(dLon/2)^2

VAR c = 2 * ATAN2(SQRT(a), SQRT(1-a))

RETURN 6371 * c  // 6371 is the Earth's radius in kilometers. Use 3958 for miles.

This formula assumes that your coordinates are in decimal degrees. Adjust the Earth's radius according to your unit preference.

If the issue persists, there might be some data-specific aspects to investigate. Check for outliers or anomalies in your data that could be affecting the distance calculations.
 
You should be able to zip PBIX file to upload. Without looking at sample data, bit hard to pinpoint your issue.

But from what I understand, you probably shouldn't use relationships between the two tables. Just calculate starting coordinate (i.e. address coordinate). Then you need to calculate distance from that coordinate to each of coordinates in the other table and filter out those that fall outside.
 
Apologies for the lengthy delay in replying (other priorities superseded this one)...

I have tried using Montys new DAX solution, but it looks like I need to do this as a power query and I'm not sure what that is or what I need to do differently to be able to use it...

All Lat/Long values within all tables are stored as latitude/longitude in the data categories, set as decimals to 5dp - is this what you meant when you asked if they're stored as decimal degrees?

I tried zipping my tester file and attaching, but it's too large at 3.3MB, and that's after I stripped everything unessential out of it... :rolleyes: Not sure what else I can do to reduce it enough. Have added screenshots of my tester file in the hopes that helps more in the meantime.
 

Attachments

  • 2024-04-02 12_20_31-Area Map - Tester - Area Selection - Max Distance - Open Pubs.png
    2024-04-02 12_20_31-Area Map - Tester - Area Selection - Max Distance - Open Pubs.png
    138.6 KB · Views: 2
  • 2024-04-02 12_20_31-Area Map - Tester - Area Selection.png
    2024-04-02 12_20_31-Area Map - Tester - Area Selection.png
    138.9 KB · Views: 1
  • 2024-04-02 12_20_31-Area Map - Tester - Area Selection - Max Distance.png
    2024-04-02 12_20_31-Area Map - Tester - Area Selection - Max Distance.png
    66.6 KB · Views: 2
  • 2024-04-02 12_20_31-Area Map - Tester.png
    2024-04-02 12_20_31-Area Map - Tester.png
    413.3 KB · Views: 2
You can upload your file to OneDrive or Google file and share link I suppose. Or limit your data to x number of rows.
 
So your main issue here is the relationship between the two tables. If you pick NW10 in postcode slicer. That will already filter out calculation result to just the ones that match NW10 partial.

What's your end goal?
 
The end goal is to select a postcode area from the filter, and set the max radius. All results (pubs, in the case of my tester file) that fall within the max radius of the selected postcode should then be listed in the table and plotted on the map. This tester file is set up in the same way as my real file, I just have staff and their home addresses instead of pub locations.

I tried removing the relationship between the tables, which does help, but still doesn't work as expected. For example if I select LS1 and a 2 mile radius, I end up with a sausage shape of locations, stretching for around 5-6 miles N-W, and 20 miles E-W....
 
If you remove relationship. And add Filter to All within radius visual for "Is Within Threshold" is 1.
That should get you...
1712581529831.png
 
Oh by the way, you should use 7917.5 miles, in your Distance calculation.

The formula you are using uses Earth diameter and not radius.
 
That was literally going to be my next query, as things were much more accurate, but still almost a mile out. Changing the radius to diameter seems to have fixed that!

Thank you so much for your help, I've managed to transfer this into my 'real' report and believe I finally have it working :DD
 
Back
Top