Oracle Identity Manager 11gR2 Bulkload Utility - Strategy and Case Studies

Version 12

    To support exponential growth in the volume of identities being managed, Oracle Identity Manager’s (OIM) onboarding solutions and their implementation strategy should make load operations fast and efficient so that system ramp-up requires minimal downtime. This article recommends a few best practices that have evolved over time and been successfully applied to onboarding in the pre-go-live phase of OIM (a.k.a. "Day 0"), covering some of the most common functional use cases.


     

    By Lokesh Gupta

     

    This article is intended to:

     

    • Augment what is already documented in the OIM Developer's Guide
    • Present best practices/recommendations for medium- to high-volume loads
    • Identify standard use cases, based on real-world application of the Bulk Load tool, that can be used as templates
    • Present customer case studies and success stories

     

    Note: The content in this paper assumes use of OIM 11gR2 (11.1.2.x.x) and above, running on WebLogic Server and Oracle Database.

     

    OIM Bulk Load Utility Role in Entity Onboarding

     

    At its core, the tool uses Oracle SQL*Loader and database PL/SQL to push records into OIM’s tables--rather than working through the Java API layer--because this is more scalable and performance-efficient for large volumes. The downside is that processing options are limited.

     

    The OIM Bulk Load utility loads data in pre-defined formats from a source flat file or database source table into the OIM system. (For details on complying with the formats prescribed by the Bulk Load tool, refer to the Bulk Load documentation in the OIM Admin Guide.)

     

    Bulk loading data is one of the few gateways through which entities are created in the OIM system, the others being the user interface, reconciliation operations, etc. At this stage, it is worth mentioning that there are also post-entity creation operations (also known as post-processing jobs), such as audit snapshot generation for loaded entities, email notification, synchronization with Lightweight Directory Access Protocol (LDAP), etc.

     

    The pure bulk-load operation involves a legitimate entity creation in the OIM system. Subsequent post-processing requirements can either be taken care of by the available options in OIM or as templatized custom solution approaches (covered later in this document).

     

    This utility can load the following entities:

     

    • User
    • Accounts (i.e., target representation)
    • Role
    • Role hierarchy
    • Role membership
    • Role category

     

    OIM Bulk Load post-processing (available from OIM 11gR2 PS1 onwards as an OIM Scheduled Job) performs the post-processing operations for users loaded via the Bulk Load utility. This job can perform only the following operations:

     

    • Password generation
    • Email notification
    • LDAP sync

     

     

    The entire bulk load and subsequent post-processing operations involve cross-cutting calls in the different tiers of OIM; for efficient data loading and the desired functionality, the following two approaches to bulk loading are proposed, based upon the criteria of data volume to be loaded.

     

    Bulk Load Strategy Overview

    | Functional Requirement | Low-Volume Load | High-Volume Load (disable bulk load post-processing options) |
    |---|---|---|
    | Pure bulk load of entity data | Load via Bulk Load Utility | Load via Bulk Load Utility |
    | Audit trail of created user | Inherently taken care of by bulk load post-processing job | After the bulk load, use of the "Generate Snapshot" utility is recommended for large-volume audit snapshots. |
    | Password generation | Inherently taken care of by bulk load post-processing job | Recommended custom method listed for password encryption and loading. The OIM Scheduled Task option for this should be disabled if password loading is part of the bulk load. |
    | LDAP sync | | Recommended steps prior to the bulk load operation are detailed. |
    | Role-rule membership evaluation | | Recommended steps to handle high-volume role membership are listed. |

     

     

     

    General Best Practices and Recommendations

     

    A number of best practices and recommendations must be followed for optimal performance in load operations. They cover the following areas:

    • Bulk load batch size
    • Bulk load debug flag: Use the debug flag option only for troubleshooting failed scenarios or for a sample initial load, not for normal loads, because logging diagnostic information adds overhead and can degrade load performance.
    • OIM database schema statistics collection
    • Source of data for bulk load
    • Indexing of key and matching-rule columns in the database. Columns that should be indexed:
      • User keys
      • Account keys
      • Organization keys
      • Matching columns such as usr_login and usr_status
    • DB tablespace and disk space logistics

     

    Recommendations for High-Volume Data Loads

     

    All the general best practices and guidelines above are necessary prerequisites for high-volume data loads. In addition, large volumes need to be split into smaller chunks, in a divide-and-conquer approach, to keep database reads and writes performant.

     

    • Split source load: Large volumes of input data to be loaded via bulk load operations should be split into multiple smaller loads, each with a limited number of source file/table records.
    • For multiple load operations, remember the following:
      • The initial load operation should be for a smaller number of records (for example, 50,000 source records).
      • Gather the OIM database schema stats after the initial load.
      • From the next run onwards, a single load should be restricted to no more than 1 million records in the input source.
      • For large loads (especially for accounts and role memberships), a single load should also be divided into smaller chunks (i.e., split the 1 million records into multiple CSV files and load them using their name references in master.txt).
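
    The chunking step above can be sketched as follows. This is a minimal Python sketch, not part of the Bulk Load utility: the function name, the chunk file naming, and the `header_rows` parameter are illustrative assumptions, and it assumes that master.txt simply lists one CSV file name per line. Check the input format prescribed in the OIM documentation before relying on it.

```python
import csv
from pathlib import Path


def split_csv(source: Path, out_dir: Path, rows_per_file: int,
              header_rows: int = 1) -> list[str]:
    """Split `source` into chunk files and list them in out_dir/master.txt.

    The first `header_rows` rows are repeated at the top of every chunk.
    Returns the chunk file names in load order.
    """
    if rows_per_file <= 0:
        raise ValueError("rows_per_file must be positive")
    out_dir.mkdir(parents=True, exist_ok=True)
    names: list[str] = []
    out = None
    writer = None
    written = 0
    with source.open(newline="", encoding="utf-8") as src:
        reader = csv.reader(src)
        header = [next(reader) for _ in range(header_rows)]
        for row in reader:
            # Start a new chunk file when none is open or the current one is full.
            if out is None or written == rows_per_file:
                if out is not None:
                    out.close()
                name = f"{source.stem}_{len(names) + 1:03d}.csv"
                names.append(name)
                out = (out_dir / name).open("w", newline="", encoding="utf-8")
                writer = csv.writer(out)
                writer.writerows(header)
                written = 0
            writer.writerow(row)
            written += 1
    if out is not None:
        out.close()
    # Assumed master.txt layout: one chunk file name per line, in load order.
    (out_dir / "master.txt").write_text("".join(n + "\n" for n in names),
                                        encoding="utf-8")
    return names
```

    For example, `split_csv(Path("accounts.csv"), Path("chunks"), 100_000)` would write `accounts_001.csv`, `accounts_002.csv`, and so on, plus a master.txt naming them. If the prescribed format needs more than one leading line in each file, pass a larger `header_rows`.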

     

    Processing Activities Post-Bulk Load in OIM

     

    Available Out of the Box in OIM

     

    A few additional post-process functionalities for bulk loaded users are commonly required for deployments and are available via the Bulk Load post-process job.

     

    • Email notification: OIM notifies all users of their credentials.
    • Password generation: By default, the OIM Bulk Load utility sets the password it prompts for during execution on every loaded user, so the entire set of users initially has the same password. A random password can then be generated using the bulk load post-process job.
    • LDAP sync: The bulk load post-process job synchronizes the bulk loaded users to the LDAP directory when OIM is configured in LDAP sync mode.
    • Auditing: All bulk loaded users are recorded for auditing; the first snapshot is available in the OIM system.

     

    Custom Alternate Solutions for Scalability in High Volumes

     

    Here are the suggested alternate custom approaches for scalability in high volumes:

     

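    • Password generation: By default, the OIM Bulk Load utility sets the password it prompts for during execution on every loaded user, so the entire set of users initially has the same password. For high volumes, rather than generating random passwords with the bulk load post-process job, passwords can be encrypted with a custom method and loaded as part of the bulk load input (see Case Study 1); the OIM Scheduled Task option for password generation should then be disabled.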
    • LDAP sync
      1. Load the data to LDAP first.
      2. Export the users’ data from LDAP with Globally Unique Identifier (GUID) into CSV files/DB tables.
      3. Ensure the input source for OIM Bulk Load has Distinguished Name (DN) and GUID in it (taken from step #2), and bulk load into OIM.
    • Auditing: An audit snapshot of loaded users is not generated as part of the OIM bulk load process. If the functional requirement is to capture an initial audit snapshot to keep track of the changes that happen in the system, this snapshot must be generated separately using the Generate Snapshot utility.
    • Role assignment: An implicit function of the bulk load post-process job is to assign roles to the loaded users on the basis of the rules associated with each role; all users who qualify under a defined rule are assigned that role. For high volumes, use the bulk load role membership load option to load the memberships directly instead.

     

    A Standard Functional Use Case of OIM Bulk Load

     

    A primary use case for the Bulk Load Utility is to seed/bootstrap the data in OIM and sync with LDAP.

     

    High-level steps

     

    Loading data into Directory [applicable only in case of LDAP-SYNC deployments]:

     

    1. Load the data to LDAP
    2. Export the users’ data from LDAP with GUID into CSV files/DB tables.
    3. Ensure the input source for OIM Bulk Load has DN and GUID in it (taken from step 2) and bulk load into OIM.
    4. Load data into OIM:
      • Load 50K records in the first load, then gather DB stats using the commands in the next section. (Stats can also be gathered for individual tables to reduce the time if the DB is large.)
      • Load 1M records in the next run, then gather DB stats.
      • Continue with 1M-record loads and collect DB stats after every 4 runs.

    With the above approach, no separate sync between OIM and LDAP is needed to link users.
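
    The load cadence in step 4 can be written down as a small planning helper. The following Python sketch is illustrative only: the names `Run` and `plan_runs` are hypothetical, and it reads "after every 4 runs" as "after every fourth run following the second one", which is an assumption about the intended schedule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    number: int         # 1-based run number
    records: int        # source records loaded in this run
    gather_stats: bool  # whether to collect DB stats after this run


def plan_runs(total: int, first: int = 50_000, cap: int = 1_000_000,
              stats_every: int = 4) -> list[Run]:
    """Plan load runs: a small first run, then runs of at most `cap` records."""
    if min(total, first, cap, stats_every) <= 0:
        raise ValueError("all arguments must be positive")
    runs: list[Run] = []
    remaining = total
    number = 0
    while remaining > 0:
        number += 1
        size = min(first if number == 1 else cap, remaining)
        remaining -= size
        # Stats after runs 1 and 2, then after every `stats_every` further runs.
        gather = number <= 2 or (number - 2) % stats_every == 0
        runs.append(Run(number, size, gather))
    return runs
```

    Calling `plan_runs(6_050_000)` yields seven runs (50K followed by six runs of 1M), with stats collection flagged after runs 1, 2, and 6.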

     

    Below, you'll find sample DB stats collection commands; for more detail, please refer to OIM Performance Tuning and Oracle RDBMS documentation.

     

       

    OIM Database Schema Stats Collection Method

     

    BEGIN
      dbms_stats.gather_schema_stats(
        ownname          => '<OIM Schema Name>',
        estimate_percent => dbms_stats.auto_sample_size,
        options          => 'GATHER AUTO',
        degree           => 8,
        cascade          => TRUE
      );
    END;
    /

     

       

    Table Stats Collection Method

     

    BEGIN
      -- The OPTIONS parameter is omitted here: gather_table_stats accepts it
      -- only in newer database releases (12.2 and later).
      dbms_stats.gather_table_stats(
        ownname          => '<OIM Schema Name>',
        tabname          => '<Table Name>',
        estimate_percent => dbms_stats.auto_sample_size,
        degree           => 8,
        cascade          => TRUE
      );
    END;
    /
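
    When stats are gathered table by table, the same block has to be repeated for each table. As a convenience, the following Python sketch renders one PL/SQL block covering a list of tables, passing ownname, tabname, estimate_percent, degree, and cascade. The function name is hypothetical, and the schema and table names passed in are whatever applies to your deployment; the sketch only generates text and does not connect to a database.

```python
import re

# Conservative check for simple, unquoted Oracle identifiers.
_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_$#]*")


def table_stats_script(schema: str, tables: list[str], degree: int = 8) -> str:
    """Render a PL/SQL block calling gather_table_stats once per table."""
    if not tables:
        raise ValueError("at least one table name is required")
    for name in (schema, *tables):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"not a simple identifier: {name!r}")
    calls = []
    for table in tables:
        calls.append(
            "  dbms_stats.gather_table_stats(\n"
            f"    ownname          => '{schema}',\n"
            f"    tabname          => '{table}',\n"
            "    estimate_percent => dbms_stats.auto_sample_size,\n"
            f"    degree           => {degree},\n"
            "    cascade          => TRUE\n"
            "  );"
        )
    return "BEGIN\n" + "\n".join(calls) + "\nEND;\n/\n"
```

    The returned text can be saved to a file and run from a SQL client after each load run that calls for stats collection.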

     

     

     

     

    Case Studies

     

    To further illustrate the real-world relevance of the approaches described for bulk loading entities into OIM, the following three customer case studies each cover a specific aspect of onboarding:

     

    1. Reusing existing user passwords
    2. Assigning roles to users (i.e., "role memberships")
    3. Access Policy Harvesting

     

    The case studies present approaches that have been developed over time for these more or less standard functional requirements, approaches that have been found useful in certain scenarios and may also prove effective in similar use cases.

     

    Note: If you are new to this utility, please review the OIM product documentation on Bulk Load Utility.

     

    CASE STUDY 1 – A global telecom solution provider

     

    Problem Statement

     

    This use case demanded password handling in OIM for all 25 million users. These passwords were being pulled from other trusted systems as all systems were now to be managed via OIM.

     

    The standard and documented usage of the tool does not make provision for the loading of unique user passwords. The only solution is to generate new random passwords via a bulk load post-process job, but that was not a good idea for such a high volume.

     

    Password sync functionality between the target and OIM must be handled by a different option.

     

    High-Level Solution Approach

     

    A custom solution was designed for the customer to bulk load users while also loading each user's original password from the target system. Downtime had to be minimal because the customer’s OIM deployment was already up and running.

     

    The customer and system integration teams developed a custom utility that uses the OIM public encryption API to encrypt the clear-text passwords and places the encrypted values in the CSV files, which avoids execution of the bulk load post-process job.

     

    (For further insight into the solution approach, refer to Loading unique passwords with OIM bulk load.)

     

    Strategy Summary


    The 25-million-user workload was split into smaller load batches of 250K users each, to achieve load completion in 10 iterations, while meeting the customer's criteria for acceptable downtime.

     

    The strategy for each batch load is depicted in the flow diagram below:

     

     

    Strategy Execution Steps

    | Step # | Bulk Load Strategy Steps |
    |---|---|
    | A | Data loading in target from legacy trusted source |
    | 1. | User data is received from the legacy system |
    | 2. | LDIF file for OID bulk load is generated |
    | 3. | Generated LDIF file is (bulk) loaded into the OID system |
    | B | Encrypting password for loading users in OIM |
    | 4. | Generate CSV file for OIM bulk load from OID |
    | 5. | CSV file includes ORCLGUID and DN for each user |
    | 6. | Add any encrypted field data to the CSV file in addition to the password field (requirement specific to this customer) |
    | 7. | Run custom password encryption utility to encrypt the required fields in the CSV file |
    | 8. | Generate final CSV file for OIM Bulk Load (including user ID, GUID, and other encrypted fields) |
    | 9. | Disable the Issue Audit Message Scheduled Task |
    | 10. | Shut down OIM |
    | 11. | Bulk load CSV file into OIM for 250K users using a batch size of 10K |
    | C | Repeat load for all 250K user batches |
    | 12. | Repeat the remaining iterations for the rest of the load |
    | 13. | Start up OIM |
    | 14. | Verify the functionality |
    | 15. | End users can start using the system |
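
    Steps 6 through 8 amount to rewriting selected CSV columns with encrypted values. The Python sketch below shows only that transformation. The `encrypt` argument is a placeholder for the customer's utility built on the OIM encryption API, which is not reproduced in this article; the function name and the assumption of a single header row are likewise illustrative.

```python
import csv
import io
from typing import Callable, Sequence


def encrypt_columns(src_text: str, columns: Sequence[str],
                    encrypt: Callable[[str], str]) -> str:
    """Return CSV text with the named columns replaced by encrypt(value)."""
    reader = csv.DictReader(io.StringIO(src_text))
    fieldnames = list(reader.fieldnames or [])
    missing = [c for c in columns if c not in fieldnames]
    if missing:
        raise KeyError(f"columns not found in CSV header: {missing}")
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        for col in columns:
            # `encrypt` stands in for the real encryption call (assumption).
            row[col] = encrypt(row[col])
        writer.writerow(row)
    return out.getvalue()
```

    Keeping the encryption call behind a parameter lets the CSV handling be tested with a dummy function before the real utility is plugged in.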

     

     

     

    CASE STUDY 2 - A telecom provider in EMEA

     

     

    Problem Statement

     

    We had to onboard 3 million existing and new users, along with their telecom product subscriptions, using the OIM Bulk Load Utility. This amounted to 5.2 million accounts in OIM. We then had to harvest those accounts and link them with access policies in the system so that the bulk loaded accounts could be considered for any modify/retrofit use cases. We suggested using the access policy harvesting feature of OIM 11gR2 PS2.

     

    A final requirement: we had to complete this onboarding within 1-2 weeks.

     

    High-Level Solution Approach

     

    We had the following architecture challenges:

     

    1. OIM doesn’t provide the capability to address the requirements directly, but it does include the concepts of role/accounts.
    2. Load must be completed within acceptable SLA of customer.
    3. Existing accounts must be linked with the target.

     

     

    Assumptions for Access Policy Harvesting Functionality

     

    The OIM application and database are tuned as per the Oracle Tuning white paper [ID 1539554.1], available from support.oracle.com.

     

    Strategy Execution Steps

     

    The plan was to perform the task using the OIM Bulk Load Utility, leveraging the capabilities of a user post-processing job, which would trigger all the event handlers and achieve the objective in a simplified manner. Post-processing would then be followed by access policies evaluation to perform subsequent provisioning/harvesting.

     

    However, for performance and scalability, the load was to be split into batches of 200K users for bulk loading role memberships, followed by post-processing and access policy harvesting.

     

     

    | Step # | Bulk Load Strategy Steps |
    |---|---|
    | A | 3M user data load in OIM via Bulkload tool |
    | 1. | 3 million users bulk load, split in batches: (1) Load 100K; (2) Collect OIM schema statistics; (3) Load 900K; (4) Collect OIM schema statistics; (5) Load 1M; (6) Load 1M |
    | B | Load 5.2M accounts |
    | 2. | 5.2 million accounts bulk load, split in batches: (1) Load 100K; (2) Collect OIM schema statistics; (3) Load 900K; (4) Collect OIM schema statistics; (5) Load 1M; (6) Load 1M; (7) Collect OIM schema statistics; (8) Load 1M; (9) Load 1.2M |
    | C | Load roles using Bulkload |
    | D | Repeat role membership load for users in batches of 200K |
    | E | Post-processing for the 200K user batch |
    | F | Access policy harvesting for loaded users |
    | 3. | Ran the 'Evaluate user policies' job to harvest the loaded accounts. It harvested the role memberships loaded for the 200K user batch. |
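
    Because post-processing and harvesting run per 200K-user batch (steps D through F), each batch's role membership file should contain only rows for users in that batch. The Python sketch below illustrates one way to route membership rows to their user's batch. The function names and the in-memory tuples are illustrative assumptions, not part of the Bulk Load utility.

```python
from collections import defaultdict


def assign_user_batches(user_logins: list[str], batch_size: int) -> dict[str, int]:
    """Map each user login to a 0-based batch number, in list order."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return {login: i // batch_size for i, login in enumerate(user_logins)}


def group_memberships(memberships: list[tuple[str, str]],
                      user_batch: dict[str, int]) -> dict[int, list[tuple[str, str]]]:
    """Group (login, role) rows by the batch of the user they belong to."""
    grouped: dict[int, list[tuple[str, str]]] = defaultdict(list)
    for login, role in memberships:
        # A KeyError here means a membership refers to a user that was not loaded.
        grouped[user_batch[login]].append((login, role))
    return dict(grouped)
```

    With 3 million users and a batch size of 200,000, `assign_user_batches` produces 15 batches, and each batch's membership rows can then be written to its own CSV file for the corresponding role membership load.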

     

     

    CASE STUDY 3 - Popular fitness and weight loss solutions provider

     

    Problem: Load 6 Million users, 12 Million Accounts, 12M Role Memberships and achieve LDAP synchronization.

     

    Solution: Devise a strategy to enable LDAP synchronization for this high volume of users, accounts, memberships.

     

    Problem Statement

     

    The customer's requirement was to load the above volume into their OIM deployment, which is in LDAP sync mode with OUD.

     

    The requirement was to seed/bootstrap the data in OIM and sync it (i.e., update the GUID to attach the records) with OUD.

     

    Executing the post-processing job was out of scope because of the high volume and limited time.

     

    Total volume:

    • Users - 6M
    • Accounts - 12M
    • Role-memberships - 12M

     

    High-Level Strategy Steps

     

    Load data into Directory

     

    1. Load the data to OUD.
    2. Export the users’ data from OUD with GUID values.
    3. Ensure the input file for OIM bulk load has DN and GUID in it (taken from OUD after Step 1, above, is complete) and bulk load into OIM.
    4. Do not run bulk load post-process job to LDAP Sync the users from OIM to OUD.
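
    Step 3 is essentially a join between the OIM input file and the directory export. The Python sketch below shows the idea under stated assumptions: the export is a CSV with `uid`, `dn`, and `guid` columns (hypothetical names), the OIM input is keyed on a login column, logins are matched case-insensitively, and the target column names `USR_LDAP_DN` and `USR_LDAP_GUID` are assumptions to be checked against the prescribed Bulk Load format.

```python
import csv
import io


def attach_directory_ids(users_csv: str, ldap_csv: str,
                         key: str = "USR_LOGIN",
                         ldap_key: str = "uid") -> tuple[str, list[str]]:
    """Add DN and GUID columns to user rows; return (csv_text, unmatched_logins)."""
    # Index the directory export by lower-cased login for lookup.
    directory = {row[ldap_key].lower(): row
                 for row in csv.DictReader(io.StringIO(ldap_csv))}
    reader = csv.DictReader(io.StringIO(users_csv))
    fields = list(reader.fieldnames or []) + ["USR_LDAP_DN", "USR_LDAP_GUID"]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    unmatched: list[str] = []
    for row in reader:
        entry = directory.get(row[key].lower())
        if entry is None:
            # Hold back users with no directory entry instead of loading them unlinked.
            unmatched.append(row[key])
            continue
        row["USR_LDAP_DN"] = entry["dn"]
        row["USR_LDAP_GUID"] = entry["guid"]
        writer.writerow(row)
    return out.getvalue(), unmatched
```

    Returning the unmatched logins separately makes it easy to investigate users that are missing from the directory before the OIM load starts.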

     

    Load data into OIM

     

    1. Load Users data - Use the above files for the user load (including mapping of the GUID columns to the actual GUID values).

     

    • The customer had already tested the user bulk load and was satisfied with its performance.
    • This load was completed in 7 iterations.
    • The first iteration was for 200K users.
    • The second iteration loaded 800K users.
    • Each of the remaining iterations loaded 1 million users in a single run of the user bulk load.
    • The batch size used was 10K.
    • DB stats were collected after the first iteration, then after every 2 iterations.

    2. Load Accounts data - 12 M

     

    • High-level strategy is to load the data in chunks
    • Load 50K accounts first then gather DB stats
    • Load 500K accounts then gather DB stats
    • Load 1M accounts then gather DB stats
    • Continue with 1M batch further, and collect DB stats after every 4 runs

    3. Load User Role memberships - 12M

     

    • High-level strategy is to load the data in chunks
    • Load 50K memberships, then gather DB stats
    • Load 500K memberships, then gather DB stats
    • Load 1M memberships, then gather DB stats
    • Continue with 1M batch further and collect DB stats after every 4 runs

     

    For DB stats

     

    If the DB is large, stats can instead be gathered for individual tables to reduce the time taken.

     

    For generating initial audit snapshot:

     

    We suggested running the Generate Snapshot utility of OIM in batches.

     

    Review the latest documentation for the Generate Snapshot utility.

     

     

     

    Appendix A - Known Issues and Solutions

     

       

    | Issue Description | Releases Affected | Symptoms | Patch # / Fixed Versions | Workarounds |
    |---|---|---|---|---|
    | Account load performance degrades as load increases | OIM11gR2 PS2 (OIM versions before 11.1.2.2.2) | Account load takes a long time to complete per iteration; slow DB queries; Bulkload utility does not exit | Patch 19245744 - ORACLE IDENTITY MANAGER BUNDLE PATCH 11.1.2.1.9 for R2PS1 | |
    | Role membership load is slow | OIM11gR2 PS2 (OIM versions before 11.1.2.2.5) | Role membership load takes significant time to complete; slow DB queries | Interim Patch 19628607 for INDEXES NOT BEING CREATED ON ALL THE TEMP TABLES DURING ROLE MEMBERSHIP BULK LOAD for 11gR2PS1 BP6. This fix will also be available in R2PS2 BP5. | |
     

     

    DISCLAIMER: Expected performance characteristics are based on laboratory test implementations and can vary based on customer requirements.

     

    Case studies might not follow the exact steps mentioned in defined strategies.

     

    The applicability of a recommended strategy depends on customer requirements and infrastructure.

     

    About the Author

     

    Lokesh Gupta is a Project Lead with the Oracle Server Technology group for Oracle Identity Manager, where he focuses on issues related to database, enterprise performance and sizing, and security.