Skip to Main Content

SQL & PL/SQL

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Interested in getting your voice heard by members of the Developer Marketing team at Oracle? Check out this post for AppDev or this post for AI focus group information.

Insert into 2 fields value of one field

423911Mar 13 2007 — edited Apr 4 2007
I am trying to separate this value into 2 different fields.

Select last_name from clients1;

will return

' Lopez Gonzalez'

What I am trying to do is put 'Lopez' in last_name_1 and 'Gonzalez' in last_name_2 of a table.

The query will be returning around 3000 last names.

Any help?
Thanks,

Comments

Tubby
Answer

Only thing that comes to mind is using Oracle Spatial instead of trying to roll your own.

https://blogs.oracle.com/EnterpriseAnalytics/entry/create_the_geometry_columns_in

Cheers,

Marked as Answer by kjc · Sep 27 2020
Frank Kulash

Hi,

One thing you can do is to only call the function when you can't otherwise rule out the row.

Try something like this:

WHERE   CASE

           WHEN latitude   < p_lat  - p_destFromPoint  THEN 1E99

           WHEN  latitude  > p_lat  + p_destFromPoint  THEN 1E99

           WHEN  longitude < p_long - p_destFromPoint  THEN 1E99

           WHEN  longitude > p_long + p_destFromPoint  THEN 1E99

                                                       ELSE round(distance(latitude, longitude, p_lat, p_long), 2)

       END  < p_distFromPoint

I didn't figure the exact calculations, but I hope you get the idea.  If the latitude is way out of bounds, you can discard the row without calculating the exact distance.  The same goes for the longitude.

AndrewSayer

Its possible that Oracle Spacial has what you need (I'm not too familiar)

I think the key here will be reducing the row set before having to apply the function. If the result set is aimed to only be a few rows then indexes could help.

I would add to your query sanity limits e.g. 
lat1 has to be between lat2-distance and lat2+distance

long1 has to be between long2-distance and long2+distance
.

If the data in the table is grown with a serial process then you could achieve the access on this using bitmaps, but a standard btree index like (lat1,long1) will be alright (benchmark!),

This will mean that your function will only be evaluated for these rows rather than the entire table, this might provide a good enough performance boost on its own, the indexes may not necessarily be a good idea only when they would limit the number of blocks visited enough to outweigh a multiblock io full tablescan.

Could you provide some ddl , sample data and the execution plans it is currently running with? That should give us plenty of information to work with.

mathguy

[Sorry, I just noticed I started this as a response to Tubby...  meant it for the OP.]

You can try a few things.

First, I am not sure why using NVL makes sense in this case. If you don't have a place's longitude, why is it more meaningful to assume it is on the Greenwich meridian than on any other? I would do away with the NVL's and just return NULL when either latitude or longitude is not known. In most cases they ARE known and you will save the time of checking for NULL. Perhaps have a MV with just the points with known longitude and latitude (and with the additional fields discussed in the next paragraph) and work only off that MV.

Then, you may want to store the values of sin(lat/degtorad) and cos(lat/degtorad) in your base table (or MV), and lon/degtorad so you save that division when you query. (As a general observation, if you couldn't store the lon/degtorad results, you should at least compute the difference in latitudes first, and only then divide by degtorad - and perhaps compute 1/degtorad and MULTIPLY by this number as needed, as I believe multiplication will take less time than division. Better to store the pre-divided results, i.e. latitudes in radians, though.)

Finally, you may experiment with keeping the normal vectors at each position in the base table or MV, and use the arctan formula for spherical distance:

https://en.wikipedia.org/wiki/Great-circle_distance#Vector_version - this way most of the trig function computations are done ahead of time; vector and scalar products only involve addition and multiplication, length of the vector product requires three squares and a square root, and you only take an arctan.

I would be careful with excluding locations based on longitude alone; at high latitudes, ALL longitudes are close to each other (they all come together at the poles).

Best,   mathguy-ro

ADDED:   Also, do not compare "distance" to "radius TIMES <something>" - instead of 112,00 multiplications you can have one division, by comparing "distance/radius"  to  "<something>"  And same comment about applying NVL to radius. If really makes no sense to assume radius = 0 anyway.

Frank Kulash

Hi,

mathguy-ro wrote:

...

I would be careful with excluding locations based on longitude alone; at high latitudes, ALL longitudes are close to each other (they all come together at the poles).

...

Good point!

The basic point of repies 2 and 3 still stands: try to avoid calling any user-defined function whenever possible.  No matter how efficient the function is, merely calling it thousands of times will make a query slow.

Nimish Garg

I used following formula to calculate distance, if you are not planning to use Spatial Index

Distance = (sqrt((69.1 * (BLATITUDE - ALATITUDE)) * (69.1 * (BLATITUDE - ALATITUDE)) + 53.0 * (BLONGITUDE - ALONGITUDE) * 53.0 * (BLONGITUDE - ALONGITUDE))) * 1.609344

Here

PointA = (ALATITUDE,ALONGITUDE)

PointB = (BLATITUDE,BLONGITUDE)

BluShadow

Of course there's also the option of doing the calculation as part of the SQL query directly rather than calling a PL/SQL function for it, as that is currently introducing context switching which is known to slow down queries.

Chris Hunt

I don't know if it's an option for you, but can you store the location in Grid reference form as well/instead?

For example, if I knew that all the addresses I had to deal with would be in the UK, I'd store the OS Grid reference as well as the Lat/Long. The Grid is measured in Kilometre squares, so it's simple trigonometry to work out the distance between two points.

Even if you can't do this for all your data, it might be worthwhile writing a function that accepts both GRs and Lat/Longs, and works out the distance cheaply with trig if GRs are available for a particular pair of points falling back to Lat/Long only if they aren't.

Chris Hunt

Maybe shave a few microseconds off by not repeating calculations...

CREATE OR REPLACE FUNCTION UCIIS."DISTANCE" (Lat1 IN NUMBER,

                                    Lon1 IN NUMBER,

                                    Lat2 IN NUMBER,

                                    Lon2 IN NUMBER,

                                    Radius IN NUMBER DEFAULT 3963) RETURN NUMBER IS

-- Convert degrees to radians

   DegToRad NUMBER := 57.29577951;


   -- Lat/longs in radians

   Lat1Rad        NUMBER;

   Lon1Rad        NUMBER;

   Lat2Rad        NUMBER;

   Lon2Rad        NUMBER;


   ReturnValue    NUMBER;

   ACOS_Param     NUMBER;

BEGIN

   IF Lat1 IS NULL

   OR Lon1 IS NULL

   OR Lat2 IS NULL

   OR Lon2 IS NULL

   OR Radius IS NULL THEN

      -- We can't work out the distance if a parameter is missing. Return NULL

      ReturnValue := NULL;

   ELSE

      Lat1Rad := Lat1 / DegToRad;   -- Do these divisions once!

      Lon1Rad := Lon1 / DegToRad;

      Lat2Rad := Lat2 / DegToRad;

      Lon2Rad := Lon2 / DegToRad;


      ACOS_Param := (SIN(Lat1Rad) * SIN(Lat2Rad)) +

                    (COS(Lat1Rad) * COS(Lat2Rad) * COS(Lon2Rad - Lon1Rad));


      -- Ensure ACOS_Param is between -1..+1

      ACOS_Param := GREATEST(LEAST(ACOS_Param,1),-1);

     

      ReturnValue := Radius * ACOS(ACOS_Param);

   END IF;

   RETURN ReturnValue;

END;

You might also consider using one of the formulae at https://en.wikipedia.org/wiki/Geographical_distance for calculating the distance, though they may not come back any quicker. Note that if you do, it might be better to write a function that returns the square of the distance and comparing it with the square of p_distfrompoint, instead of doing an expensive square root.

[Deleted User]

I find it a bit of a shame that basically all these solutions are reinventing the wheel. If you create an sdo_geometry column from your lat/long columns (single update statement), you could solve this easily by using

select facility_id as facid

from facility_location f

where sdo_within_distance(f.location_point, point_of_interest, 'distance=10') = 'true';

No need for complicated calculations that do not take proper datums and projections into account, so if the distances are large won't be correct. Set up your spatial stuff properly and this will be a breeze (because Oracle will have done most of the complicated stuff for you).

Cheers,

Stefan

mathguy

@ Nimish Garg:

Your formula has no chance of being right. For two points at the same latitude, the distance depends on longitudes AND the common latitude; close to the poles the distance is close to zero regardless of longitudes. In your formula if latitudes are the same the distance becomes a constant times the difference in longitudes. Where did you come up with this?

John Brady - UK

user13399186 wrote:

I am trying to find all the records that are with in a given distance of a Lat/Long point.  I am successful at doing this but I need the result set to come back faster.  Do you have any suggestions on how I could do this different in order to get the data back with in 2 or 3 seconds?

Currently there are 112,400 records in this table and growing.

I am executing the following sql statement:

        SELECT DISTINCT FACILITY_ID AS facid

        FROM facility_location

        where round(distance(latitude, longitude, p_lat, p_long), 2) < p_distFromPoint

Currently it takes 18 seconds for the result set to be returned.  Any suggestions what I can do to obtain the result set faster.

Thanks!!

Your problem will be that Oracle is doing a full table scan on FACILITY_LOCATION for each query execution, and as the table gets bigger in size so the table scan takes longer and longer.  Every time the table doubles in size the table scan will take twice as long to complete.

The obvious solution is to create some kind of index. which can be used to lookup the matching data quicker.  But that is almost impossible given the way the query is currently phrased.  You need to be indexing data that is only in the table, whereas your function call uses what I assume are 2 PL/SQL variables, so you cannot easily create an index directly on that.

What I'd suggest is adding extra constraints to your WHERE clause that involve the same main columns but are simpler and approximate tests for failing the distance test.  False positives are okay, because we will be keeping the existing distance test as well.  This way the Optimizer can choose to use the indexes on the table for the "simpler, approximate" tests, and then do the final more accurate "distance" test on only the rows it retrieves.  Because you have filtered down the rows being retrieved from the whole table to only a small subset, it should execute much faster.

Assuming that Latitude and Longitude are references to points in a 2-D flat space, then you can also say that if the "real distance" from another point is less than a given value, then the differences in the Latitude and Longitude values are also less than the same distance.  This is a kind of transfer of one condition into a similar condition on different data columns.  The query then becomes:

SELECT DISTINCT FACILITY_ID AS facid

FROM facility_location

where round(distance(latitude, longitude, p_lat, p_long), 2) < p_distFromPoint

AND latitude between (p_lat - p_distFromPoint) and (p_lat + p_distFromPoint)

AND longitude between (p_long - p_distFromPoint) and (p_long + p_distFromPoint)

Then create an index on (latitude, longitude) and the Optimizer should use that instead - one index on both columns together.  In my mind you are rephrasing the query to say that logically first there is a square area with the p_lat, p_long point at its centre, within which any valid point must be.  You then refine this by only checking points within this square area for their absolute distance from the target point - which is really a circular area around the target point.  And you get the points within the square area using an index, which is faster than a full table scan and reduces the row count to be processed significantly.

You may have to adjust the comparisons depending on whether Latitude and Longitude can have negative values and whether an absolute value should be used for comparison purposes or not.  What about wrapping around 180 degrees where it switches from east to west, or vice versa?  I don't know anything about that, but you would need to make sure the simple distance calculations were correct and always worked for whatever input values were provided.

John Brady

Chris Hunt

John Brady - UK wrote:

Assuming that Latitude and Longitude are references to points in a 2-D flat space

That seems like a pretty bold assumption, given that latitude and longitude - by definition - are references to points on a 3D spherical surface.

[Deleted User]

Chris Hunt wrote:

John Brady - UK wrote:

Assuming that Latitude and Longitude are references to points in a 2-D flat space

That seems like a pretty bold assumption, given that latitude and longitude - by definition - are references to points on a 3D spherical surface.

Not by defintion. It depends on which projection you use. There are plenty of equidistant cylindrical projections out there that use degrees as their units of measurement, whill still representing a flat 2D surface: https://en.wikipedia.org/wiki/Equirectangular_projection

So it really depends on what coordinate system/projection the OP is using for his or her data which formula's would be the correct ones. Which is another reason to use spatial, because you do not have to worry as much about those things.

mathguy

@Stefan Jager:     No, there is NO equidistant projection of a sphere / ellipsoid / Earth surface to a cylinder. You don't seem to understand the difference between equidistant and equirectangular. On a cylindrical projection with the parallels shown as straight lines, the distance from the Greenwich meridian to the 180 degree meridian is the same at the Equator as at the poles, which is clearly incorrect in the real life. Since a cylinder surface can be flattened out on a plane, what you are saying is the same as "The Earth is flat."

I am not familiar with sdo, but from my brief and incomplete reading it won't work either. If I am reading it correctly, sdo_within_distance calculates Euclidian distance (straight line - the distance between the poles is through the center of the Earth, not navigating on its surface). I don't see how that is a better approximation than the correct one the OP had.

To the OP:  -  on the other hand, what do you need these distances for? They are ignoring elevations, and they are not road distances. They may help in navigation or flight, but those activities very likely already have their databases and distance formulas precompiled etc.

Best,    mathguy-ro

John Brady - UK

Chris Hunt wrote:

John Brady - UK wrote:

Assuming that Latitude and Longitude are references to points in a 2-D flat space

That seems like a pretty bold assumption, given that latitude and longitude - by definition - are references to points on a 3D spherical surface.

Chris - I fully agree.  That's why I stated it as an assumption.  I also stated later on in my reply that I don't know anything about spatial data and distance calculations.  I was suggesting something that the OP could look into and follow up on, and I wanted to make it clear that they would need to verify everything themselves.  I'm happy for you to point out how questionable that assumption is, as it just reinforces the point that the OP needs to double check everything themselves, and confirm whether such assumptions are valid or not for their circumstances.

[Deleted User]

mathguy-ro wrote:

@Stefan Jager:     No, there is NO equidistant projection of a sphere / ellipsoid / Earth surface to a cylinder. You don't seem to understand the difference between equidistant and equirectangular. On a cylindrical projection with the parallels shown as straight lines, the distance from the Greenwich meridian to the 180 degree meridian is the same at the Equator as at the poles, which is clearly incorrect in the real life. Since a cylinder surface can be flattened out on a plane, what you are saying is the same as "The Earth is flat."

Sorry, but you're the one who doesn't have it right. There are plenty of such projections, in which case the calculation methods used in this thread have some validity. See https://en.wikipedia.org/wiki/Map_projection

And this comment: "which is clearly incorrect in the real life" doesn't make any sense: Google Maps is using a variation of the Mercator projection, and is used very much in real life (whether that was the right thing to do by Google or not is a point of debate....). But EVERY projection will be more or less incorrect, since the Earth isn't even an ellispoid but a geoid - so there will never be a straightforward projection, especially not on a global level. Everything will have distortion in some way or another.

So what do you mean by this? It may be that you coming from a mathematicians point of view (I think, guessing from your nickname) and me coming from a geodesy/surveyor's point of view we're not talking about the same thing :-)

Anyway, whether the distance the OP tries to calculate makes any sense or has any use whatsoever is up to him, and completely depends on what his business case/goal is. I was merely pointing out that there is no need to do everything yourself if you use sdo, and in this case there isn't even the need for the extra license fee for Spatial - the datatype, indexing and the spatial operators are available on every version without having to pay the extra license fee for Spatial.

Cheers,

Stefan

mathguy

Stefan:  There is no equidistant ("isometric" is the technical term) mapping between a sphere and a plane. A sphere has constant Gaussian curvature of 1/R where R is the radius; the plane has Gaussian curvature 0, and Gaussian curvature is INVARIANT in isometric mappings. Therefore no portion of a sphere, no matter how small, can be mapped isometrically (distance-preserving) to the plane. For full disclosure, I literally have a PhD in this nonsense, so I won't discuss this further - it would be unfair. For the opposite point of view:  http://www.theflatearthsociety.org/cms/

The formula in the original post, and the additional one I suggested using normal vectors, is an EXACT formula for a geodesic (navigating on a sphere's surface, not THROUGH it). It is NOT based on ANY projection. On the other hand, as I mentioned, it seems to me sdo measures distance THROUGH the earth, not on its surface.

Best regards,     mathguy-ro

mathguy

Adding to my earlier reply (#4):  Not only compare "distance/radius" to "arc cos ..."  (so you don't multiply by radius 112,000 times); ASLO take cos of both sides. Keep in mind cos is a strictly DECREASING function; so your comparison should be:   cos (distance / radius) >=  ACOS_param.  This way you take cos once, instead of arc cos 112,000 times.

Best,   mathguy-ro

[Deleted User]

mathguy-ro wrote:

Stefan:  There is no equidistant ("isometric" is the technical term) mapping between a sphere and a plane. A sphere has constant Gaussian curvature of 1/R where R is the radius; the plane has Gaussian curvature 0, and Gaussian curvature is INVARIANT in isometric mappings. Therefore no portion of a sphere, no matter how small, can be mapped isometrically (distance-preserving) to the plane. For full disclosure, I literally have a PhD in this nonsense, so I won't discuss this further - it would be unfair. For the opposite point of view:  http://www.theflatearthsociety.org/cms/

The formula in the original post, and the additional one I suggested using normal vectors, is an EXACT formula for a geodesic (navigating on a sphere's surface, not THROUGH it). It is NOT based on ANY projection. On the other hand, as I mentioned, it seems to me sdo measures distance THROUGH the earth, not on its surface.

Best regards,     mathguy-ro

As I said: you are looking at this from amathematical point of view, I am looking from a geo-information geek point of view, and from both our standpoints we are both right. Congratulations on the PhD, I've worked with this kind of stuff for all of my life: surveying and creating maps of (parts of) the world, for customers worldwide. This is an application of such a thing, and for all we know the OP only wants to have a rough indication of the distances between his or her locations to see if they need heavier radio equipment, in which case all discussions of accuracy, through or above the earth, projections etc in any way shape or form are moot. If you'd like to rethink your decision to discuss thiss, let's move this discussion to the forum so others can join in as well. Right now I'll apologise to the OP for sort of hijacking this thread :-)

Regards,

Stefan

Scott Swank

Monotonically decreasing over this range anyway.

So the Euclidean metric gives you the length of the chord between the points, and the spherical distance follows from it deterministically.

If the Euclidean is within your range you do not need to calculate the spherical metric

Otherwise, given the chord, and the radius of the earth, R, the arc is just 2R * asin(chord / 2R).

And then, where d is your distance, you want: 2R * asin(chord/2R) <= d

equivalently: chord <= 2R * sin(d/2R)


For any given value of d you only have to compute the right hand side once, and of course you should square both sides of that inequality to avoid the sqrt() call for the chord distance.

(Double check my math against your current result of course...)

Cheers,

Scott

mathguy

OK. Without debating any further the need to do these computations, just for Oracle/SQL/PLSQL practice (which is my only interest in this), I ran a few tests. First I generated a table with 112,400 rows, with columns location_id, latitude, longitude. location_id was just "level" in a select from dual connect by level <= 112400; latitude and longitude generated randomly between 25 and 50 (degrees N) and between -125 and -70 (degrees W), simulating approximately the United States. Sorry for the bias!

Then I ran the query with the OP's function, looking for all locations within 100 miles of the point at 35 degrees N (latitude), -90 degrees West (longitude), which is somewhere on the Mississippi/Tennessee state line if I am seeing it right. The query returns 654 results (with the random locations generated earlier). Over a few trials, the query took between 14 and 16 seconds.

Then I added columns to the "locations" table, for sin(latitude), cos(latitude) and rad_lon (longitude in radians). I wrote the PL/SQL function to compute the cos of the central angle between "input position" and locations in the table, and in the query I compared cos (dist/radius) (computed just once and input as a magic number) against the calculated values of this function. To return just the locations (same 654 results) the running time is consistently around 1.3 seconds. Adding the distances to the results increases the time to 1.6 seconds, and to also ORDER by distance adds another 0.1 seconds - I get the results with distances and ordered in 1.7 seconds, consistently. In this approach distances are not calculated "in the first pass" (in the WHERE part of query execution), only the cos of the central angle is computed. To compute the actual distance we need to take an arc cos (ACOS) and multiply by the radius. This is done when the SELECT clause is evaluated, which means it's done only for the rows that pass the WHERE condition - I am still computing the arc cosine and multiplying by radius, but only 654 times, not 112,400 times.

Finally, I in-lined everything (wrote the computations directly in the query, no PL/SQL function, so no function call overhead); this way the time to get locations and distances, ordered by distance, is consistently less than 1.5 seconds.

If further improvement is needed: first, I imagine the job can be divided so it can run on more than one processor at once. (Not on my free XE version though!) A faster computer may help too, as well as turning off screen output (I didn't try that, it may reduce execution time, perhaps significantly). As the number of points increases, a subdivision method as suggested by others may help. In its simplest form, say you draw a "vertical line" along a meridian, down the middle of the covered area. For each location, compute and store the distance from that location to this "halfway meridian." Then, given any input position and input distance, test all the locations in the "proper half" and only those locations in the "wrong half" that have distance to the "halfway meridian" less than or equal to the input distance. This only divides the number of computations approximately in half (if the distance is reasonably small, like 100 miles). The process can be iterated, and there are more intelligent ways to subdivide than brute force (like my cutting in half).

Best,    mathguy-ro

Scott Swank

Could you post your scripts? I'd like to poke around at this too, see how things shake out.

Thanks,

Scott

mathguy

As soon as I can, traveling across the US right now. Cheers!

mathguy

Sir: I finally got around to looking at this again. Google may use the Mercator projection to build its maps; it almost certainly  does NOT use the projection to compute distances by simply measuring the planar distance between points on the Mercator projection. I haven't used the Google map tools before, so this exercise was good - I learned a few things. On Google maps, if you right click anywhere on the map, you can select "What's here?" Among the information returned are the longitude and latitude. Then, if you right-click anywhere on the map and select "Measure distance," you can then click somewhere else on the map and you will get the "crow's fly" distance between the two points.

Google does not expose (make public) their calculation methodology, as far as I was able to research it. Please share your information to the contrary if you are aware of any. They just call it "distance." For testing purposes, I used the map tools to display the Google-calculated distance between Dodgers Stadium in Los Angeles and Times Square in New York City. (For those outside the U.S., if you are not familiar with these places - it doesn't matter!) Google maps says the distance is 2,433.53 miles. The direct computation assuming the Earth is spherical, with radius 3,959 miles, yields 2424.43 miles; the difference is 0.37% in relative terms, and it makes me speculate that Google may in fact measure distances on the geoid (which would have been my first and only guess before ever looking at these issues in the last week).

Depending on the OP's practical problem he is trying to solve, considering the Earth a sphere of radius 3,959 miles  (the average radius of the reference ellipsoid) may or may not be acceptable. This is for the OP to answer. On the other hand, the three-dimensional Euclidian distance (through the sphere of radius 3,959 miles) from L.A. to N.Y., same locations, is 2,386.73 miles; an error of 1.96% compared to the Google maps distance (which I assume is accurate), or more than five times greater than the great circle distance. Perhaps an error of less than 2% is acceptable for the practical application too.

Regards,   mathguy-ro

Chris Hunt

Google may use the Mercator projection to build its maps

It uses this: https://en.wikipedia.org/wiki/Web_Mercator

mathguy

Understood, and thank you for the link. You understand, though, that the point in dispute is what Google uses to calculate distances. You are not suggesting they use the modified Mercator projection for THAT purpose, correct?

Best,   mathguy-ro

mathguy

Hi,

In this reply and the next, I will post the scripts I used. I look forward to expert critique of my code - I am still less than two months into learning Oracle, so I know I make plenty of mistakes and I often make things too complicated. My sincere thanks in advance.

To Scott:  In your discussion of using Euclidean distance as a first approximation, you should exclude points that fail the distance test, not include those that pass the test. Euclidean distance is shorter, not longer, than great circle distance. Please make sure you pay attention to that in your experiments.

OK, here goes. First, the script to create the locations table and the geo-constants table. I create the locations table in the format used by the OP first. Then I add columns and populate them; everything is in the script. Then I show what I see on my screen when I run the script. The last two commands in the script are just a sanity check, I want to make sure I didn't populate the tables with junk. Oh, and one more note - to focus on learning Oracle SQL and PL/SQL, I use SQL*Plus; I will learn SQL Developer and Toad later, but I am very much a console type of user for now.

variable c number   -- SQL*Plus command - declares bind variable used below

/

create table locations (location_id varchar(12), latitude number, longitude number)

/

insert into locations (select dbms_random.string('U', 6) || to_char(level, 'FM000009'),

                              dbms_random.value(25, 50),

                              dbms_random.value(-125, -70)

                      from dual

                      connect by level <= 112400

                      )

/

create table geo_constants as

                      (select 3959 earth_radius, -- average radius of reference ellipsoid, in miles

                              atan(1) / 45 degrees_to_radians -- atan(1) = PI/4, so this is PI/180; no "PI" in Oracle?

                      from dual

                      )

/

alter table locations add (cos_lat number, sin_lat number, lon_in_rad number)

/

begin select degrees_to_radians into :c from geo_constants; end;

/

update locations set cos_lat = cos (latitude * :c), sin_lat = sin (latitude * :c), lon_in_rad = longitude * :c

/

select * from geo_constants

/

select * from locations where rownum <= 10

/

commit

/

SQL> start create_locations   -- script is called create_locations.sql

Commit complete.          

Elapsed: 00:00:00.00          -- creation of bind variable complete

Table created.

Elapsed: 00:00:00.00          -- creation of table locations complete

112400 rows created.

Elapsed: 00:00:01.90          -- location_id, latitude and longitude are now populated

Table created.

Elapsed: 00:00:00.01          -- creation of table geo_constants complete

Table altered.

Elapsed: 00:00:00.00          -- added columns cos_lat, sin_lat, lon_in_rad

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.01                      -- value of degrees_to_radians constant factor loaded into bind variable

112400 rows updated.

Elapsed: 00:00:10.94          -- cos_lat, sin_lat, lon_in_rad calculated and saved in locations table; this takes long but is a one-time thing

EARTH_RADIUS DEGREES_TO_RADIANS
------------ ------------------
        3959         .017453293

1 row selected.

Elapsed: 00:00:00.00          -- shows what was saved in geo_constants table

LOCATION_ID    LATITUDE  LONGITUDE    COS_LAT    SIN_LAT LON_IN_RAD
------------ ---------- ---------- ---------- ---------- ----------
XLEWPY000001 29.5966451 -116.37418  .86952385 .493890954 -2.0311126
KYONPE000002 32.0941491 -79.941536 .847176182 .531312071  -1.395243
IJHZFY000003 28.1074674 -79.727984 .882065471 .471126845 -1.3915158
MAEOAF000004 43.4965253 -121.48587 .725416115 .688310584 -2.1203285
XBFLXE000005 48.6473538 -85.039053 .660691688 .750657374 -1.4842115
WOTQLP000006 40.4568868 -70.575654 .760894439 .648875684 -1.2317775
HQZQXI000007 44.8860445 -104.00229 .708511745  .70569902 -1.8151824
LILQEY000008 30.5181562 -109.12586 .861468285 .507811376 -1.9046055
ATZMIF000009 38.6451865 -114.22895 .781028204 .624495752 -1.9936713
YAQQOT000010  45.316528 -81.057205 .703189631 .711002351 -1.4147151

10 rows selected.

Elapsed: 00:00:00.41          -- shows ten rows from locations table

Commit complete.

Elapsed: 00:00:00.05

mathguy

Running the OP's code (adapted to my setup - I am not carrying around city, zip etc., my schema has a different name, ...)

Function code:

CREATE OR REPLACE FUNCTION op_distance (Lat1 IN NUMBER,

                                    Lon1 IN NUMBER,

                                    Lat2 IN NUMBER,

                                    Lon2 IN NUMBER,

                                    Radius IN NUMBER DEFAULT 3963) RETURN NUMBER IS

-- Convert degrees to radians

DegToRad NUMBER := 57.29577951;

ReturnValue NUMBER;

ACOS_Param NUMBER;

BEGIN

  ACOS_Param := (sin(NVL(Lat1,0) / DegToRad) * SIN(NVL(Lat2,0) / DegToRad)) +

        (COS(NVL(Lat1,0) / DegToRad) * COS(NVL(Lat2,0) / DegToRad) *

        COS(NVL(Lon2,0) / DegToRad - NVL(Lon1,0)/ DegToRad));

  -- Check if greater than 1 due to floating point errors

  IF ACOS_Param > 1 THEN

    ACOS_Param := 1;

  END IF;

  -- Check if less than -1 due to floating point errors

  IF ACOS_Param < -1 THEN

    ACOS_Param := -1;

  END IF;

  ReturnValue := NVL(Radius,0) * ACOS(ACOS_Param);

  RETURN ReturnValue;

END;

/

Function compiles fine. Then it is used in this query, with my test values: within 100 mile radius of point at 35 degrees N, 90 degrees W (shown as -90).

select distinct location_id

from locations

where round(op_distance(latitude, longitude, 35, -90), 2) < 100

/

Query runs fine and returns 685 results. The first time (when the optimizer has to work harder) takes longer, about 20 seconds on my machine, but after the execution plan is created and saved, execution takes 14 seconds. Of course, for production the desired distance and coordinates of the desired location should be bind variables.

Screen output (truncated to show only a few last locations that satisfy the condition):

[...]

JBPINO084585
OPBDKB085219
WSFJLM087308
RXWPSG090373
NMADIS094641
VBHHBA095884
BPWDXR096308
OHFFOJ105632
YQSYAP106533
VQEVCH108125

685 rows selected.

Elapsed: 00:00:13.97


Note: the number of results is different from my earlier post; to test my scripts, I re-created the locations table, which includes RANDOM locations, so the new locations are not the same as I used last week.

Then I tested the query again, dropping "distinct" from the select statement (location_id SHOULD be unique!) and also the round(..., 2) (why would that be needed)? - the execution time drops to 13.8 seconds, consistently.

Note that in this approach the distances are not shown, but they ARE all calculated, so they could be easily shown. ORDERING by distance should add the same overhead to the various methods.

mathguy

My function, calculating the cos of the central angle between two locations given by their coordinates:

CREATE OR REPLACE FUNCTION cos_central_angle (cos_lat1 IN NUMBER, sin_lat1 IN NUMBER,

                                    rad_lon1 IN NUMBER,

                                    cos_lat2 IN NUMBER, sin_lat2 IN NUMBER,

                                    rad_lon2 IN NUMBER) RETURN NUMBER IS

-- Convert degrees to radians ;

return_value NUMBER;

ACOS_Param NUMBER;

BEGIN

  return_value := sin_lat1 * sin_lat2 +

        cos_lat1 * cos_lat2 * cos(rad_lon1 - rad_lon2);

  -- Check if greater than 1 due to floating point errors

  IF return_value > 1 THEN

    return_value := 1;

  END IF;

  -- Check if less than -1 due to floating point errors

  IF return_value < -1 THEN

    return_value := -1;

  END IF;

  RETURN return_value;

END;

Compile, then call it like this:

variable p_cos number

/

variable p_sin number

/

variable p_lon_in_rad number

/

variable p_cos_dist_over_radius number

/

begin select cos(35 * degrees_to_radians), sin(35 * degrees_to_radians), -90 * degrees_to_radians,

   cos(100/earth_radius) into :p_cos, :p_sin, :p_lon_in_rad, :p_cos_dist_over_radius from geo_constants;

end;

/

select location_id from locations, geo_constants

where cos_central_angle(cos_lat, sin_lat, lon_in_rad, :p_cos, :p_sin, :p_lon_in_rad) > :p_cos_dist_over_radius

/

Results:

VQEVCH108125
PQWGFJ108166
XBJBHZ108621
NADEEU110841
JERCYH111236
WBXCDJ111761
CJONJM111782
CZATLW112266

688 rows selected.

Elapsed: 00:00:01.59

Note the number of results - 688 vs. 685 with the OP's method. This is almost surely due to rounding in both methods. ALSO (important) - the initial computation of the bind variables also takes some time; 1.6 seconds is the time taken by the query, after the variables are calculated and assigned to the bind variables.

unknown-7404

Seems we haven't heard from OP in a few days - maybe they got lost in the 'math' and can't find their way out again?

I am not familiar with sdo, but from my brief and incomplete reading it won't work either. If I am reading it correctly, sdo_within_distance calculates Euclidian distance (straight line - the distance between the poles is through the center of the Earth, not navigating on its surface). 

That package works with 'geometries' so calculations of distance are between those entities. So it really has NOTHING to do with poles, earth or navigation.

Tom Kyte first showed how the Oracle SDO_GEOMETRY package can be used for OPs problem over a decade ago when 9i was first coming out.

You may be interested to read his solution:

Ask Tom: On Measuring Distance and Aging

As for the second part—making it go faster—I think we can do that. Not only can you make it go faster, but you can do it more easily. Hidden inside the database is a lot of functionality for extended datatypes—XML, text, image, audio, video, and even spatial data.

So I asked our resident spatial guru, Clarke Colombo, to look at this problem, and he suggests that you augment the tables to include a spatial column and then use that spatial column in the query. Here is the code Clarke had to add:

alter table b add (geometry mdsys.sdo_geometry);

There are also numerous other threads in the forum with solutions offered using that same package.

Calculate distance between Latitude and Longitude

I assume you mean you have two points in lat/long, and want to know how far it is between them.

Most editions of Oracle come with no-cost spatial utilities known as Oracle Locator. Consequently, you can probably just do something like this:

select

        sdo_geom.sdo_distance

                (

                sdo_geometry

                        (

                        -- this identifies the object as a two-dimensional point.

                        2001,

                        -- this identifies the object as using the GCS_WGS_1984 geographic coordinate system.

                        4326,

                        null,

                        sdo_elem_info_array(1, 1, 1),

                        -- this is the longitude and latitude of point 1.

                        sdo_ordinate_array(151.20208, -33.883741)

                        ),

                sdo_geometry

                        (

                        -- this identifies the object as a two-dimensional point.

                        2001,

                        -- this identifies the object as using the GCS_WGS_1984 geographic coordinate system.

                        4326,

                        null,

                        sdo_elem_info_array(1, 1, 1),

                        -- this is the longitude and latitude of point 2.

                        sdo_ordinate_array(151.195986, -33.87266)

                        ),

                1,

                'unit=KM'

                ) distance_km

from

        dual

  

mathguy

... and the direct query (with everything inline so there are no function overcalls, and the point-specific data pre-calculated and shown in the query as magic numbers):

select location_id,

      3959 * acos(sin_lat * .573576436351 + cos_lat * .819152044289 * cos(-1.570796326795 - lon_in_rad)) as dist

from  locations

where  sin_lat * .573576436351 + cos_lat * .819152044289 * cos(-1.570796326795 - lon_in_rad)

      > .999681010851

order by dist

/

[...]

JVRATJ041896  99.242979
EQWVMO076822 99.4761816
SOQCKE061070 99.5289891
NWDOCK050545 99.7457305
WWIERH043986 99.7543319
RNKFPY060437 99.9260218
BZHWSZ025528 99.9435002
POIRRB092847 99.9904584

688 rows selected.

Elapsed: 00:00:01.36

unknown-7404

If you really want help you should start responding to the may people that are trying to help you.

I am trying to find all the records that are with in a given distance of a Lat/Long point.

Ok - if you read all of the replies thus far you will find that no one really knows EXACTLY what you mean or how you plan to use the data (e.g. what precision you need, how often queries are run, how often the data changes, etc).

The most common example is how 'distance' is at all useful without taking altitude and geography into account. Two points might be only a half mile apart but they are on opposite sides of the grand canyon (in the U.S.) then it may take you a couple of hours in a car to get from one to the other.

Or if one point is on the rim and the other on the colorado river you're gonna have one heck of a donkey ride!

Do you have any suggestions on how I could do this different in order to get the data back with in 2 or 3 seconds?

I'll let the mathematicians battle it out over which methods are better for what purposes.

My comments ONLY relate to the code you posted and are generic as relates to avoiding unnecessary work. They mostly summarize bits and pieces of what several others have already said.

Which calls the function "distance":

1. don't call functions when you don't have to. As others have said filter the data in the query so the function is ONLY called when needed and ONLY passing data that is VALID and within range

CREATE OR REPLACE FUNCTION UCIIS."DISTANCE" (Lat1 IN NUMBER,

                                     Lon1 IN NUMBER,

                                     Lat2 IN NUMBER,

                                     Lon2 IN NUMBER,

                                     Radius IN NUMBER DEFAULT 3963) RETURN NUMBER IS

-- Convert degrees to radians

DegToRad NUMBER := 57.29577951;

ReturnValue NUMBER;

ACOS_Param NUMBER;

BEGIN

  ACOS_Param := (sin(NVL(Lat1,0) / DegToRad) * SIN(NVL(Lat2,0) / DegToRad)) +

        (COS(NVL(Lat1,0) / DegToRad) * COS(NVL(Lat2,0) / DegToRad) *

         COS(NVL(Lon2,0) / DegToRad - NVL(Lon1,0)/ DegToRad));

2. don't call the function with parameters that are NULL. Filter those rows in the query

3. don't compute SIN, COS or any other function on EVERY ROW for a constant. One pair of parameters (lat2, lon2?) are constants. The SIN or COS of a constant is a constant so compute the value ONE TIME BEFORE calling the function.

Either put the values into a CONTEXT variable so the function can use them or create a wrapper function that returns a REF CURSOR and put the constants in that function.

  ReturnValue := NVL(Radius,0) * ACOS(ACOS_Param);

4. 'Radius' is a CONSTANT - make sure it has a valid value BEFORE you call the function. Set a CONTEXT variable or validate it in the wrapper function. Besides the parameter has a DEFAULT value so how can it possibly be NULL?

5. You get the BEST performance from code that actually does the work when you feed it VALID data so it doesn't have to do any NULL, limit or other types of validation. All validation (or as much as possible) should be done BEFORE calling the heavy-duty code.

6. As others have stated perform ANY AND ALL possible computations (conversions and the like) BEFORE you call the work code.

I suspect that just removing those totally unnecessary computations, without any other changes, will improve performance dramatically.

mathguy

Hi,

Thank you for the pointers and the links. I will spend some time reading to see what I can learn. As I mentioned in other replies, I am not particularly interested in this problem, but it was a very good practical problem to learn on. I probably learned more coming up with different ways to address the problem than by using a pre-packaged function. With that said, if I ever actually need something like this, I will use system functions as much as possible (after I read the documentation and convince myself the functions are done correctly... odd things happen sometimes). And the OP should definitely look at the links you provided.

So - thanks again for the help!    mathguy-ro

kjc

Sorry, for not replying;  Family emergency and was unavailable.

Thank you for all of your suggestions.  I appreciate your insight and advise.

Replies #1 (Tubby) and #10 (Stefan Jager) suggested using Oracle's Geo Spacial functionality.  I was not aware of this functionality.  I would assume that Oracle has tested and worked with with this functionality to make it as fast as possible.  Therefore, I would think it would be smart to use it and capitalize on someone else's work.

Your thoughts?  Has anyone used Oracle's Geo Spacial functionality?  Pros/Cons?

Thanks again for all of your help!!  Look forward to your replies.

PhilHerring

user13399186 wrote:

Your thoughts?  Has anyone used Oracle's Geo Spacial functionality?  Pros/Cons?

Spatial (or in your case, Oracle Locator) has been around for a long time and used heavily by thousands of organisations for critical applications. Aside from the overhead of installing it, it's easy to use. It also gives you Spatial indexes, and it sounds like you could benefit from that.

There's a forum for Spatial users that might be of more help than the standard SQL and PL/SQL forums:

1 - 36
Locked Post
New comments cannot be posted to this locked post.

Post Details

Locked on May 2 2007
Added on Mar 13 2007
7 comments
4,217 views