Forum Stats

  • 3,873,868 Users
  • 2,266,625 Discussions


Joining columns in a DataFrame with an Oracle table

user6410582 Member Posts: 1 Green Ribbon


Here's our situation:

We have to replace existing pl/sql code which contains dblinks, since the enterprise standards have ruled dblinks to be insecure. Right now we are exploring various options. One option is to use Python but before we do that, need to figure out if it will be able to satisfy one of our use cases.

The use case is as follows:

1) Retrieve data from a driving table into a dataframe based on certain date criteria.

2) Parse pipe delimited data in one of the columns in the dataframe and create new derived columns (in memory)

3) Use the derived columns data in sql query to fetch additional data elements from another table in the source database

4) Finally once all the required data elements are populated in dataframe, update a target table (in a different database than source)

Some things to note:

1) We cannot create a temp table or any pl/sql procedures in source database. It is read only

2) We are trying to avoid looping through every record in dataframe and fetch from another source table, as performance may not be acceptable

3) Most Python examples I have seen online, suggest we get the data from both tables into separate dataframes and then merge (join?) this data in memory. However this may not work well, if the lookup tables (second table in use case above) is huge and has millions of records whereas we may need to retrieve only a few thousand based on criteria from first dataset.

Any thoughts?

Would appreciate suggestions from experts here, on how to implement this kind of logic in Python. Thank you.