Data flow should support AIDP standard catalog access
Overview
Oracle AIDP provides a managed Spark runtime and a standard catalog for organizing enterprise data assets such as tables and volumes. Oracle Data Flow, in contrast, operates as an ephemeral job cluster that spins up Spark resources on demand to execute customer workloads. Today, Data Flow jobs cannot leverage the standard catalog that AIDP exposes, creating friction for teams that want a unified metadata layer and governance surface across both services.
Problem Statement
- AIDP native workflows are scoped to AIDP objects alone, limiting an enterprise’s ability to orchestrate cross-system processes that include external schedulers and operational tooling.
- Enterprises need a consistent way to manage Spark jobs, metadata, and governance controls across heterogeneous environments; the current separation between Data Flow and the catalog makes this difficult.
Proposed Solution
Enable Oracle Data Flow job clusters to authenticate with, read from, and write to the Oracle AIDP standard catalog. This capability would let each Data Flow job treat the AIDP catalog as the system of record for discovering, creating, and managing data objects (tables, volumes, and future asset types) during ETL execution. Configuration would be handled through either the Data Flow console or the SDK, by referencing a catalog profile (credentials, tenancy information, and permissions) that the service can use at runtime.
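As a sketch of what a run-level catalog profile could look like, the snippet below translates a profile into Spark configuration overrides that a Data Flow run might carry. Everything here is an assumption for illustration: the `spark.sql.catalog.aidp.*` keys, the `StandardCatalog` class name, and the profile fields do not exist today; only the general pattern of registering a named Spark catalog via configuration is standard Spark practice.

```python
# Hypothetical sketch: build the Spark configuration a Data Flow run
# could carry to authenticate against the AIDP standard catalog.
# All keys, class names, and profile fields below are assumptions;
# no such interface exists in Data Flow today.

def build_catalog_spark_conf(profile: dict) -> dict:
    """Translate a catalog profile into Spark configuration overrides."""
    required = {"catalog_endpoint", "tenancy_ocid", "auth_principal"}
    missing = required - profile.keys()
    if missing:
        raise ValueError(f"catalog profile missing fields: {sorted(missing)}")
    return {
        # Register the AIDP catalog as a named Spark catalog (assumed names).
        "spark.sql.catalog.aidp": "oracle.aidp.spark.StandardCatalog",
        "spark.sql.catalog.aidp.uri": profile["catalog_endpoint"],
        "spark.sql.catalog.aidp.tenancy": profile["tenancy_ocid"],
        "spark.sql.catalog.aidp.auth": profile["auth_principal"],
    }

profile = {
    "catalog_endpoint": "https://aidp.example.oraclecloud.com/catalog",
    "tenancy_ocid": "ocid1.tenancy.oc1..example",
    "auth_principal": "resource_principal",
}
conf = build_catalog_spark_conf(profile)
```

Keeping the profile as a single named object (rather than loose key/value pairs) would let the console, SDK, and CLI all reference the same governed credential set.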
Key Benefits
- Unified ETL execution surface – Customers can run their ETL pipelines on Data Flow while using the standard catalog as a single metadata layer, eliminating divergent definitions.
- Scheduler-friendly orchestration – Because Data Flow jobs can already be triggered from external schedulers such as Autosys, Control-M, or OCI native schedulers, catalog-aware jobs inherit the same flexibility without being constrained by the AIDP workflow engine.
- SDK-driven automation – Data Flow’s SDK support allows programmatic submission and monitoring of ETL jobs; once catalog access is enabled, those SDK flows can seamlessly manage metadata operations as part of the same job submission.
- Enterprise integration – Large enterprises often need to coordinate AIDP-managed assets with other platforms (data warehouses, operational stores, regulatory systems). Allowing Data Flow to interact with the catalog provides a controllable interface that can participate in broader integration and synchronization patterns.
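The SDK-driven automation benefit above amounts to a submit-and-poll loop with catalog configuration attached at submission time. The sketch below uses a stub client standing in for a real OCI SDK client (wiring up actual credentials is out of scope here), and the catalog configuration parameter is an assumption illustrating the proposed capability.

```python
import time

# Sketch of an SDK-driven submit-and-poll flow. StubDataFlowClient is a
# stand-in for a real OCI SDK client; passing catalog configuration on
# run creation is the proposed (not yet existing) capability.

class StubDataFlowClient:
    def create_run(self, application_id: str, configuration: dict) -> str:
        # A real client would return a run OCID from the service;
        # the stub also fakes a lifecycle-state sequence for polling.
        self._states = iter(["ACCEPTED", "IN_PROGRESS", "SUCCEEDED"])
        return "ocid1.dataflowrun.oc1..example"

    def get_run_state(self, run_id: str) -> str:
        return next(self._states)

def run_etl_job(client, application_id: str, catalog_conf: dict,
                poll_seconds: float = 0.0) -> str:
    """Submit a run with catalog configuration and wait for a terminal state."""
    run_id = client.create_run(application_id, configuration=catalog_conf)
    while True:
        state = client.get_run_state(run_id)
        if state in ("SUCCEEDED", "FAILED", "CANCELED"):
            return state
        time.sleep(poll_seconds)

final_state = run_etl_job(
    StubDataFlowClient(),
    "ocid1.dataflowapplication.oc1..example",
    {"spark.sql.catalog.aidp.uri": "https://aidp.example/catalog"},
)
```

An external scheduler such as Autosys or Control-M would wrap exactly this kind of call, using the terminal state to gate downstream dependencies.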
Example Use Cases
- Core ETL ingestion – Nightly ingestion jobs running in Data Flow can read source systems, land transformed data into AIDP-managed tables, and register them in the catalog automatically.
- External scheduler governance – Autosys or other scheduler-managed workflows can orchestrate Data Flow jobs that update catalog entries while coordinating dependent systems such as downstream analytics or reporting platforms.
- SDK-based DevOps pipelines – Infrastructure-as-code pipelines can submit Data Flow jobs via SDK, ensuring catalog updates are versioned and repeatable alongside application deployments.
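To make the core ETL ingestion case concrete, the sketch below generates the catalog-aware SQL a nightly Data Flow job might issue once the AIDP catalog is registered as a Spark catalog. The catalog name (`aidp`), schema, and table names are illustrative assumptions; the point is that creating the table in the catalog and registering it become the same operation.

```python
# Sketch of the catalog-aware SQL a nightly ingestion job might run
# against an AIDP catalog registered as the Spark catalog `aidp`.
# Catalog, schema, and table names are illustrative assumptions.

def nightly_ingestion_statements(run_date: str) -> list:
    """Return the SQL a Data Flow job could run to land and register data."""
    return [
        # Creating the table in the catalog registers it automatically.
        "CREATE TABLE IF NOT EXISTS aidp.sales.orders "
        "(order_id BIGINT, amount DECIMAL(12,2), order_date DATE)",
        # Land the transformed partition for this run's date.
        "INSERT OVERWRITE aidp.sales.orders "
        "SELECT order_id, amount, order_date FROM staging_orders "
        f"WHERE order_date = DATE '{run_date}'",
    ]

statements = nightly_ingestion_statements("2024-06-01")
```

In a real job these statements would be executed via `spark.sql(...)` inside the Data Flow application, with the run date supplied by the scheduler.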