Current Functionality
The "Auto-populate Catalog" feature currently appears to function as a one-time metadata ingestion process.
As an example, once metadata has been successfully extracted from an Object Storage bucket and catalog records have been created, there is no functionality to re-run the same metadata extraction process.
The ability to re-run an extraction would be particularly helpful when the underlying data structure changes, e.g., when new fields are introduced over time.
The only option in the action menu of an Auto-populate job is "Delete", so the current workaround is to create a new Auto-populate extractor from scratch.
Suggested Enhancement
Introduce a "Re-run" or "Refresh Metadata" capability for existing Auto-populate Catalog extractors.
This would allow users to rescan the same Object Storage source and update the same catalog metadata when schemas evolve, without needing to recreate extractor definitions.
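To illustrate the behaviour I have in mind, here is a minimal Python sketch of what a refresh might do to an existing catalog record. It is not based on the product's actual API: `CatalogRecord`, `refresh_record`, the field names, and the bucket path are all hypothetical, and the choice to keep fields that no longer appear in the source is only one possible policy.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogRecord:
    """Hypothetical catalog entry for one dataset (not a real product type)."""
    source_path: str
    fields: dict[str, str] = field(default_factory=dict)  # field name -> type


def refresh_record(record: CatalogRecord, scanned_fields: dict[str, str]) -> list[str]:
    """Merge a fresh scan into the existing record.

    Returns the names of fields that were not in the record before.
    Fields missing from the new scan are kept (assumed policy).
    """
    added = [name for name in scanned_fields if name not in record.fields]
    record.fields.update(scanned_fields)  # also picks up changed field types
    return added


# First extraction created the record; a later re-run finds one new field.
record = CatalogRecord("bucket/sales/", {"order_id": "string", "amount": "double"})
added = refresh_record(
    record,
    {"order_id": "string", "amount": "double", "currency": "string"},
)
print(added)  # ['currency']
```

The point of the sketch is that the same record is updated in place, so no second extractor or duplicate catalog entry is needed when the schema grows.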
Potential Benefits
This enhancement could help to:
- Keep managed catalog metadata synchronized with evolving datasets
- Prevent duplication and clutter from multiple extractors for the same source
- Better support real-world schema evolution and metadata lifecycle management