Creating connectivity data from spatial data
ronnie-m Apr 16, 2013 5:11 AMHi,
I have some spatial data, of which I will try to describe the relevant aspects:
- A LineSegment table, which contains information on cables, including Area ID (approx 4 million rows)
- A Location table, which contains a LineSegmentID (one-to-one), and a Geometry (spatial) column (approx 4 million rows).
What I need to do is create a new table, containing conceptual "nodes", containing the following columns:
- NodeID (number)
- LineSegmentID (number)
- LineSegmentEnd (1 or 2)
So I need to work out, for each cable, which other cables it connects to, by comparing its endpoints with the endpoints of other cables in the same area. An area contains up to 464 cables. There are a total of 160 thousand areas.
I am trying to work out the most efficient way of achieving this, ideally by performing a batch job that will take less than half an hour. Oracle is relatively new to me, but I am guessing the correct approach would be to use a series of intermediate (staging) tables, as I believe nested cursors would be way too slow (I have performed a simple test to confirm this).
I guess I will need to extract, into a temp table, the start point and end point of each cable using SDO_LRS.GEOM_SEGMENT_START_PT and SDO_LRS.GEOM_SEGMENT_END_PT, along with area ID. Then join the table with itself and use SDO_GEOM.SDO_DISTANCE to work out which points are close to each other (e.g. less than one metre). However I am struggling to outline a step by step process.
Does anyone have any thoughts which may help?
(Oracle 11g)
Sample data to illustrate problem:
create table line_location (lineid number,
geometry sdo_geometry);
create table line (id number, areaid number);
-- one cable in area 1, 2 in area 4, etc.
insert into line values (1, 1);
insert into line values (2, 4);
insert into line values (3, 4);
insert into line values (4, 3);
insert into line values (5, 3);
insert into line values (6, 3);
insert into line values (7, 2);
insert into line values (8, 2);
insert into line values (9, 2);
insert into line values (10, 2);
-- in reality the lines aren't necessarily straight and simple like these...
insert into line_location values (1, MDSYS.SDO_GEOMETRY(2002,3785,NULL,MDSYS.SDO_ELEM_INFO_ARRAY(1,2,1),MDSYS.SDO_ORDINATE_ARRAY(10,1,10,4))); -- area 1
insert into line_location values (2, MDSYS.SDO_GEOMETRY(2002,3785,NULL,MDSYS.SDO_ELEM_INFO_ARRAY(1,2,1),MDSYS.SDO_ORDINATE_ARRAY(3,9,5,9))); -- area 4
insert into line_location values (3, MDSYS.SDO_GEOMETRY(2002,3785,NULL,MDSYS.SDO_ELEM_INFO_ARRAY(1,2,1),MDSYS.SDO_ORDINATE_ARRAY(5,9,5,10))); -- area 4
insert into line_location values (4, MDSYS.SDO_GEOMETRY(2002,3785,NULL,MDSYS.SDO_ELEM_INFO_ARRAY(1,2,1),MDSYS.SDO_ORDINATE_ARRAY(1,1,2,1))); -- area 3
insert into line_location values (5, MDSYS.SDO_GEOMETRY(2002,3785,NULL,MDSYS.SDO_ELEM_INFO_ARRAY(1,2,1),MDSYS.SDO_ORDINATE_ARRAY(2,3,2,1))); -- area 3
insert into line_location values (6, MDSYS.SDO_GEOMETRY(2002,3785,NULL,MDSYS.SDO_ELEM_INFO_ARRAY(1,2,1),MDSYS.SDO_ORDINATE_ARRAY(2,3,3,3))); -- area 3
insert into line_location values (7, MDSYS.SDO_GEOMETRY(2002,3785,NULL,MDSYS.SDO_ELEM_INFO_ARRAY(1,2,1),MDSYS.SDO_ORDINATE_ARRAY(5,3,5,4))); -- area 2
insert into line_location values (8, MDSYS.SDO_GEOMETRY(2002,3785,NULL,MDSYS.SDO_ELEM_INFO_ARRAY(1,2,1),MDSYS.SDO_ORDINATE_ARRAY(4,4,5,4))); -- area 2
insert into line_location values (9, MDSYS.SDO_GEOMETRY(2002,3785,NULL,MDSYS.SDO_ELEM_INFO_ARRAY(1,2,1),MDSYS.SDO_ORDINATE_ARRAY(5,9,6,4))); -- area 2
insert into line_location values (10, MDSYS.SDO_GEOMETRY(2002,3785,NULL,MDSYS.SDO_ELEM_INFO_ARRAY(1,2,1),MDSYS.SDO_ORDINATE_ARRAY(5,7,5,9))); -- area 2
create table node_line (node_id number,
line_id number,
endpoint_id number, -- 1 for startpoint, 2 for endpoint,
area_id number
);
-- expected output here. if two lines are less than 0.5 from each other, then they should share a node.
insert into node_line values (1, 1, 1, 1); -- isolated cable in area 1, start point node
insert into node_line values (2, 1, 2, 1); -- isolated cable in area 1, end point node
insert into node_line values (3, 2, 1, 4); -- area 4, cable 2, start point node
insert into node_line values (4, 2, 2, 4); -- area 4, cable 2, end point node
insert into node_line values (4, 3, 1, 4); -- cable 2's end point node = cable 3's start point node, etc
insert into node_line values (5, 3, 2, 4);
insert into node_line values (6, 4, 1, 3); -- node at (1,1)
insert into node_line values (7, 4, 2, 3); -- node at (2,1)
insert into node_line values (8, 5, 1, 3); -- node at (2,3)
insert into node_line values (7, 5, 2, 3); -- node at (2,1)
insert into node_line values (8, 6, 1, 3); -- node at (2,3)
insert into node_line values (9, 6, 2, 3); -- node at (3,3)
insert into node_line values (10, 7, 1, 2); -- node at (5,3)
insert into node_line values (11, 7, 2, 2); -- node at (5,4)
insert into node_line values (12, 7, 1, 2); -- node at (4,4)
insert into node_line values (11, 7, 2, 2); -- node at (5,4)
insert into node_line values (13, 7, 1, 2); -- node at (5,9)
insert into node_line values (14, 7, 2, 2); -- node at (6,4)
insert into node_line values (15, 7, 1, 2); -- node at (5,7)
insert into node_line values (13, 7, 2, 2); -- node at (5,9)
Thanks
I have some spatial data, of which I will try to describe the relevant aspects:
- A LineSegment table, which contains information on cables, including Area ID (approx 4 million rows)
- A Location table, which contains a LineSegmentID (one-to-one), and a Geometry (spatial) column (approx 4 million rows).
What I need to do is create a new table, containing conceptual "nodes", containing the following columns:
- NodeID (number)
- LineSegmentID (number)
- LineSegmentEnd (1 or 2)
So I need to work out, for each cable, which other cables it connects to, by comparing its endpoints with the endpoints of other cables in the same area. An area contains up to 464 cables. There are a total of 160 thousand areas.
I am trying to work out the most efficient way of achieving this, ideally by performing a batch job that will take less than half an hour. Oracle is relatively new to me, but I am guessing the correct approach would be to use a series of intermediate (staging) tables, as I believe nested cursors would be way too slow (I have performed a simple test to confirm this).
I guess I will need to extract, into a temp table, the start point and end point of each cable using SDO_LRS.GEOM_SEGMENT_START_PT and SDO_LRS.GEOM_SEGMENT_END_PT, along with area ID. Then join the table with itself and use SDO_GEOM.SDO_DISTANCE to work out which points are close to each other (e.g. less than one metre). However I am struggling to outline a step by step process.
Does anyone have any thoughts which may help?
(Oracle 11g)
Sample data to illustrate problem:
create table line_location (lineid number,
geometry sdo_geometry);
create table line (id number, areaid number);
-- one cable in area 1, 2 in area 4, etc.
insert into line values (1, 1);
insert into line values (2, 4);
insert into line values (3, 4);
insert into line values (4, 3);
insert into line values (5, 3);
insert into line values (6, 3);
insert into line values (7, 2);
insert into line values (8, 2);
insert into line values (9, 2);
insert into line values (10, 2);
-- in reality the lines aren't necessarily straight and simple like these...
insert into line_location values (1, MDSYS.SDO_GEOMETRY(2002,3785,NULL,MDSYS.SDO_ELEM_INFO_ARRAY(1,2,1),MDSYS.SDO_ORDINATE_ARRAY(10,1,10,4))); -- area 1
insert into line_location values (2, MDSYS.SDO_GEOMETRY(2002,3785,NULL,MDSYS.SDO_ELEM_INFO_ARRAY(1,2,1),MDSYS.SDO_ORDINATE_ARRAY(3,9,5,9))); -- area 4
insert into line_location values (3, MDSYS.SDO_GEOMETRY(2002,3785,NULL,MDSYS.SDO_ELEM_INFO_ARRAY(1,2,1),MDSYS.SDO_ORDINATE_ARRAY(5,9,5,10))); -- area 4
insert into line_location values (4, MDSYS.SDO_GEOMETRY(2002,3785,NULL,MDSYS.SDO_ELEM_INFO_ARRAY(1,2,1),MDSYS.SDO_ORDINATE_ARRAY(1,1,2,1))); -- area 3
insert into line_location values (5, MDSYS.SDO_GEOMETRY(2002,3785,NULL,MDSYS.SDO_ELEM_INFO_ARRAY(1,2,1),MDSYS.SDO_ORDINATE_ARRAY(2,3,2,1))); -- area 3
insert into line_location values (6, MDSYS.SDO_GEOMETRY(2002,3785,NULL,MDSYS.SDO_ELEM_INFO_ARRAY(1,2,1),MDSYS.SDO_ORDINATE_ARRAY(2,3,3,3))); -- area 3
insert into line_location values (7, MDSYS.SDO_GEOMETRY(2002,3785,NULL,MDSYS.SDO_ELEM_INFO_ARRAY(1,2,1),MDSYS.SDO_ORDINATE_ARRAY(5,3,5,4))); -- area 2
insert into line_location values (8, MDSYS.SDO_GEOMETRY(2002,3785,NULL,MDSYS.SDO_ELEM_INFO_ARRAY(1,2,1),MDSYS.SDO_ORDINATE_ARRAY(4,4,5,4))); -- area 2
insert into line_location values (9, MDSYS.SDO_GEOMETRY(2002,3785,NULL,MDSYS.SDO_ELEM_INFO_ARRAY(1,2,1),MDSYS.SDO_ORDINATE_ARRAY(5,9,6,4))); -- area 2
insert into line_location values (10, MDSYS.SDO_GEOMETRY(2002,3785,NULL,MDSYS.SDO_ELEM_INFO_ARRAY(1,2,1),MDSYS.SDO_ORDINATE_ARRAY(5,7,5,9))); -- area 2
create table node_line (node_id number,
line_id number,
endpoint_id number, -- 1 for startpoint, 2 for endpoint,
area_id number
);
-- expected output here. if two lines are less than 0.5 from each other, then they should share a node.
insert into node_line values (1, 1, 1, 1); -- isolated cable in area 1, start point node
insert into node_line values (2, 1, 2, 1); -- isolated cable in area 1, end point node
insert into node_line values (3, 2, 1, 4); -- area 4, cable 2, start point node
insert into node_line values (4, 2, 2, 4); -- area 4, cable 2, end point node
insert into node_line values (4, 3, 1, 4); -- cable 2's end point node = cable 3's start point node, etc
insert into node_line values (5, 3, 2, 4);
insert into node_line values (6, 4, 1, 3); -- node at (1,1)
insert into node_line values (7, 4, 2, 3); -- node at (2,1)
insert into node_line values (8, 5, 1, 3); -- node at (2,3)
insert into node_line values (7, 5, 2, 3); -- node at (2,1)
insert into node_line values (8, 6, 1, 3); -- node at (2,3)
insert into node_line values (9, 6, 2, 3); -- node at (3,3)
insert into node_line values (10, 7, 1, 2); -- node at (5,3)
insert into node_line values (11, 7, 2, 2); -- node at (5,4)
insert into node_line values (12, 7, 1, 2); -- node at (4,4)
insert into node_line values (11, 7, 2, 2); -- node at (5,4)
insert into node_line values (13, 7, 1, 2); -- node at (5,9)
insert into node_line values (14, 7, 2, 2); -- node at (6,4)
insert into node_line values (15, 7, 1, 2); -- node at (5,7)
insert into node_line values (13, 7, 2, 2); -- node at (5,9)
Thanks