Oracle Identity Manager 11gR2 Bulkload Utility - Strategy and Case Studies

Version 12

    To support exponential growth in the volume of identities being managed, Oracle Identity Manager’s (OIM) onboarding solutions and their implementation strategy should make operations fast and performance efficient so that system ramp-up results in minimal downtime. This article recommends a few best practices that have evolved over time and been successfully applied in the pre-go-live phase of OIM (a.k.a. "Day 0") for onboarding, catering to some of the most common functional use cases.


    By Lokesh Gupta


    This article is intended to:


    • Augment what is already documented in the OIM Developers Guide
    • Present best practices/recommendations for medium- to high-volume load
    • Identify standard use cases to be referred to as templates based on a real world application of the Bulk Load tool
    • Present customer case studies and success stories


    Note: The content in this paper assumes use of OIM 11gR2 (11.1.2.x.x) and above, running on WebLogic Server and Oracle Database.


    OIM Bulk Load Utility Role in Entity Onboarding


    The tool uses SQL*Loader and database PL/SQL functionality at its core to push records directly into OIM’s tables, rather than working through the Java API layer, because this is more scalable and performs better for large volumes. The downside is that processing options are limited.


    The OIM Bulk Load utility provides for loading data in pre-defined formats from a source flat file or database source table into the OIM system. (For details on complying with the prescribed formats by the Bulk Load tool, refer to the OIM Admin Guide for Documentation of Bulk Load.)


    Bulk loading data is one of the few gateways via which entities are created in the OIM system, the others being the user interface, reconciliation operations, etc. At this stage, it is worth mentioning that there are also post-entity creation operations (also known as post-processing jobs), such as audit snapshot generation for loaded entities, email notification, synchronization with Lightweight Directory Access Protocol (LDAP), etc.


    The pure bulk-load operation involves a legitimate entity creation in the OIM system. Subsequent post-processing requirements can either be taken care of by the available options in OIM or as templatized custom solution approaches (covered later in this document).


    This utility can load the following entities:


    • User
    • Accounts (i.e., target representation)
    • Role
    • Role hierarchy
    • Role membership
    • Role category


    OIM Bulk Load post-processing (available in OIM11g R2PS1 onwards as an OIM Scheduled Job) does the post-processing operations for users loaded via the Bulk Load utility. This job can perform only the following operations:


    • Password generation
    • Email notification
    • LDAP sync



    The entire bulk load and subsequent post-processing operations involve cross-cutting calls in the different tiers of OIM; for efficient data loading and the desired functionality, the following two approaches to bulk loading are proposed, based upon the criteria of data volume to be loaded.


    Bulk Load Strategy Overview


    The high-volume approach assumes that the bulk load post-processing options are disabled.


    • Pure bulk load of entity data
      • Low-volume load: Load via Bulk Load Utility
      • High-volume load: Load via Bulk Load Utility
    • Audit trail of created user
      • Low-volume load: Inherently taken care of by bulk load post-processing job
      • High-volume load: After the bulk load, use of the "Generate Snapshot" utility is recommended for large-volume audit snapshots
    • Password generation
      • Low-volume load: Inherently taken care of by bulk load post-processing job
      • High-volume load: A custom method is recommended for password encryption and loading. The OIM Scheduled Task option for password generation should be disabled if password loading is part of the bulk load.
    • LDAP sync
      • High-volume load: Recommended steps to perform prior to the bulk load operation are detailed later in this document
    • Role-rule membership evaluation
      • High-volume load: Recommended steps for handling high-volume role membership are listed later in this document




    General Best Practices and Recommendations


    There are a number of best practices and recommendations that must be followed for the optimal performance in load operations.


    • Bulk load batch size
    • Bulk load debug flag: Use the debug flag only for troubleshooting failed loads or for a small initial sample load, not for normal loads. The diagnostic logging it enables adds overhead and degrades load performance.
    • OIM database schema statistics collection
    • Source of data for bulk load
    • Indexing of key and matching-rule columns in the database. Columns that should be indexed include:
      • User keys
      • Account keys
      • Organization keys
      • Matching columns such as usr_login and usr_status
    • DB tablespace and disk space logistics


    Recommendations for High-Volume Data Loads


    All the general best practices and guidelines above are necessary prerequisites for high-volume data loads. In addition, large volumes need to be split into smaller chunks, in a divide-and-conquer approach, to keep database reads and writes performing well.


    • Split source load: Large volumes of input data should be split into multiple smaller loads by limiting the number of records in each source file or table.
    • For multiple load operations, remember the following:
      • The initial load operation should be for a smaller number of records (for example, 50,000 source records).
      • Gather the OIM database schema stats after the initial load.
      • From the next run onwards, restrict a single load to no more than 1 million records in the input source.
      • For large loads (especially for accounts and role memberships), a single load should also be divided into smaller chunks (i.e., split the 1 million records into multiple CSV files and load them using their name references in master.txt).
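
    The file-splitting recommendation above can be sketched as follows. This is a minimal illustration, not part of the Bulk Load utility: the function name, the chunk naming scheme, the number of header lines repeated in each chunk, and the one-file-name-per-line layout of master.txt are all assumptions to be checked against the format described in the OIM documentation. It also assumes that no field contains an embedded line break.

```python
from pathlib import Path


def split_load_file(source, out_dir, rows_per_chunk, header_lines=1):
    """Split one large load file into chunk files and list them in master.txt.

    header_lines is the number of leading lines copied to the top of every
    chunk (an assumption; adjust to the format the Bulk Load utility expects).
    """
    source = Path(source)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    chunk_names = []
    with source.open(encoding="utf-8", newline="") as src:
        header = [src.readline() for _ in range(header_lines)]
        out = None
        rows_in_chunk = 0
        for line in src:
            # Start a new chunk file for the first row and whenever the
            # current chunk has reached its row limit.
            if out is None or rows_in_chunk == rows_per_chunk:
                if out is not None:
                    out.close()
                name = f"{source.stem}_{len(chunk_names) + 1:03d}{source.suffix}"
                chunk_names.append(name)
                out = (out_dir / name).open("w", encoding="utf-8", newline="")
                out.writelines(header)
                rows_in_chunk = 0
            out.write(line)
            rows_in_chunk += 1
        if out is not None:
            out.close()

    # Assumption: master.txt lists one data file name per line.
    master = "".join(name + "\n" for name in chunk_names)
    (out_dir / "master.txt").write_text(master, encoding="utf-8")
    return chunk_names
```

    For example, split_load_file("accounts.csv", "chunks", rows_per_chunk=100_000) would write accounts_001.csv, accounts_002.csv, and so on into the chunks directory, together with a master.txt naming them.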


    Processing Activities Post-Bulk Load in OIM


    Available Out of the Box in OIM


    A few additional post-process functionalities for bulk loaded users are commonly required for deployments and are available via the Bulk Load post-process job.


    • Email notification: OIM notifies all users of their credentials.
    • Password generation: By default, the OIM Bulk Load utility assigns the password it prompts for during execution to all loaded users, so the entire set of users initially has the same password. A random password can then be generated for each user by the bulk load post-process job.
    • LDAP sync: The bulk load post-process job synchronizes the bulk loaded users to the LDAP directory when OIM is configured in LDAP sync mode.
    • Auditing: All bulk loaded users are recorded for auditing, so the first audit snapshot is available in the OIM system.


    Custom Alternate Solutions for Scalability in High Volumes


    Here are the suggested alternate custom approaches for scalability in high volumes:


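    • Password generation: By default, the OIM Bulk Load utility assigns the password it prompts for during execution to all loaded users, so the entire set of users initially has the same password. Random passwords can then be generated using the bulk load post-process job. For high volumes, where running that job is impractical, passwords can instead be encrypted with a custom utility and included in the load files (see Case Study 1).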
    • LDAP sync
      1. Load the data to LDAP first.
      2. Export the users’ data from LDAP with Globally Unique Identifier (GUID) into CSV files/DB tables.
      3. Ensure the input source for OIM Bulk Load has Distinguished Name (DN) and GUID in it (taken from step #2), and bulk load into OIM.
    • Auditing: An audit snapshot of loaded users is not generated as part of the OIM bulk load process. If the functional requirement is to capture an initial audit snapshot, so that subsequent changes in the system can be tracked, generate it separately using the Generate Snapshot utility.
    • Role assignment: The bulk load post-process job implicitly assigns roles to the loaded users on the basis of the rules associated with each role; all users who qualify under a defined rule are assigned that role. When the post-process job is not run, load role memberships directly using the Bulk Load role membership load option.
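
    The LDAP sync steps above (export from LDAP, then carry DN and GUID into the OIM input source) can be sketched as a simple CSV join. This is an illustrative sketch only: the function name and the column names login, dn, and guid are hypothetical placeholders, and the real column headers must follow the Bulk Load input format. Users that are missing from the LDAP export are skipped and reported rather than loaded without identifiers.

```python
import csv


def add_ldap_identifiers(users_csv, ldap_export_csv, output_csv,
                         key="login", dn_col="dn", guid_col="guid"):
    """Copy users_csv to output_csv, adding DN and GUID from the LDAP export.

    All column names are hypothetical placeholders. Returns the keys of users
    with no match in the LDAP export so they can be reviewed before loading.
    """
    # Index the LDAP export by the shared key (for example, the login name).
    with open(ldap_export_csv, newline="", encoding="utf-8") as f:
        ldap = {row[key]: row for row in csv.DictReader(f)}

    missing = []
    with open(users_csv, newline="", encoding="utf-8") as src, \
         open(output_csv, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        fieldnames = list(reader.fieldnames) + [dn_col, guid_col]
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        for row in reader:
            match = ldap.get(row[key])
            if match is None:
                missing.append(row[key])
                continue  # skip users that are not present in the directory
            row[dn_col] = match[dn_col]
            row[guid_col] = match[guid_col]
            writer.writerow(row)
    return missing
```

    Holding the LDAP export in a dictionary keeps the join to a single pass over the user file; for exports too large for memory, the same join could be done in a staging database table instead.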


    A Standard Functional Use Case of OIM Bulk Load


    A primary use case for the Bulk Load Utility is to seed/bootstrap the data in OIM and sync with LDAP.


    High-level steps


    Loading data into Directory [applicable only to LDAP sync deployments]:


    1. Load the data to LDAP.
    2. Export the users’ data from LDAP with GUID into CSV files/DB tables.
    3. Ensure the input source for OIM Bulk Load has DN and GUID in it (taken from step 2) and bulk load into OIM.
    4. Load data into OIM:
      • Load 50K records in the first load, then gather DB stats (see the next section). To reduce the time taken when the database is large, stats can instead be gathered for individual tables.
      • Load 1M records in the next run, then gather DB stats.
      • Continue with 1M-record batches, gathering DB stats after every 4 runs.

    With this approach, no separate synchronization between OIM and LDAP is needed to link users.
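
    The load cadence in step 4 can be written down as a small planning helper. This is a sketch under stated assumptions, not a prescribed schedule: the function name is hypothetical, and "after every 4 runs" is interpreted here as every fourth run counted from the first 1M load.

```python
def plan_loads(total, first=50_000, batch=1_000_000, stats_every=4):
    """Return a list of (records_to_load, gather_stats_afterwards) tuples."""
    plan = []
    remaining = total
    run = 0
    while remaining > 0:
        # The first run is deliberately small; later runs use the full batch.
        size = min(first if run == 0 else batch, remaining)
        remaining -= size
        run += 1
        # Gather stats after the small first load, after the first full
        # batch, and then after every `stats_every` further runs.
        gather = run <= 2 or (run - 2) % stats_every == 0
        plan.append((size, gather))
    return plan


for records, gather in plan_loads(6_000_000):
    print(f"load {records:>9,} records" + (", then gather DB stats" if gather else ""))
```

    For 6 million records this yields seven runs: 50K, five runs of 1M, and a final run of 950K, with stats gathered after runs 1, 2, and 6.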


    Below, you'll find sample DB stats collection commands; for more detail, please refer to OIM Performance Tuning and Oracle RDBMS documentation.



    OIM Database Schema Stats Collection Method




    BEGIN
      DBMS_STATS.GATHER_SCHEMA_STATS(
        ownname          => '<OIM Schema Name>',
        estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
        options          => 'GATHER AUTO',
        degree           => 8,
        cascade          => TRUE
      );
    END;
    /



    Table Stats Collection Method




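    BEGIN
      -- Omit options if your database release does not accept it for GATHER_TABLE_STATS.
      DBMS_STATS.GATHER_TABLE_STATS(
        ownname          => '<OIM Schema Name>',
        tabname          => '<Table Name>',
        estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
        options          => 'GATHER AUTO',
        degree           => 8,
        cascade          => TRUE
      );
    END;
    /
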

    Case Studies


    To further illustrate the real-world relevance of the approaches mentioned for bulk loading of entities in OIM, the following three customer case studies each cover a specific aspect around on-boarding:


    1. Reusing existing user passwords
    2. Assigning roles to users (i.e., "role memberships")
    3. Access Policy Harvesting


    The case studies present approaches that have been developed over time for these more or less standard functional requirements, approaches that have been found useful in certain scenarios and may also prove effective in similar use cases.


    Note: If you are new to this utility, please review the OIM product documentation on Bulk Load Utility.


    CASE STUDY 1 – A global telecom solution provider


    Problem Statement


    This use case demanded password handling in OIM for all 25 million users. These passwords were being pulled from other trusted systems as all systems were now to be managed via OIM.


    The standard and documented usage of the tool does not make provision for the loading of unique user passwords. The only solution is to generate new random passwords via a bulk load post-process job, but that was not a good idea for such a high volume.


    Password sync functionality between the target and OIM must be handled by a different option.


    High-Level Solution Approach


    A custom solution was designed for the customer to bulk load users and achieve the functional requirement of the original password loading from the target itself. Downtime was to be minimal as the customer’s deployment was up and running with the OIM system.


    The customer and system integration teams developed a custom utility using the OIM public encryption API to encrypt the clear text password and to use the encrypted field as part of CSV files, to avoid bulk load post-process job execution.


    (For further insight into the solution approach, refer to Loading unique passwords with OIM bulk load.)
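
    The CSV-rewriting part of such a utility might look like the sketch below. It is not the customer's utility and does not call any OIM API: the encrypt argument is a placeholder for whatever function wraps the OIM encryption call, and the function and column names are hypothetical.

```python
import csv
from typing import Callable, Iterable


def encrypt_columns(input_csv, output_csv, columns: Iterable[str],
                    encrypt: Callable[[str], str]) -> int:
    """Rewrite input_csv to output_csv with `encrypt` applied to the given columns.

    `encrypt` is supplied by the caller (for example, a wrapper around the
    OIM encryption API); this sketch contains no encryption logic itself.
    Returns the number of data rows written.
    """
    columns = list(columns)
    with open(input_csv, newline="", encoding="utf-8") as src, \
         open(output_csv, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        unknown = [c for c in columns if c not in (reader.fieldnames or [])]
        if unknown:
            raise ValueError(f"columns not found in input: {unknown}")
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        count = 0
        for row in reader:
            for col in columns:
                if row[col]:  # leave empty values untouched
                    row[col] = encrypt(row[col])
            writer.writerow(row)
            count += 1
    return count
```

    Passing the encryption function in as a parameter keeps the file handling testable with a stand-in function, while the real encryption call stays in one place.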


    Strategy Summary

    The 25-million-user workload was split into smaller load batches of 0.25 million (250K) users each, to achieve load completion in 10 iterations, while meeting the customer's criteria for acceptable downtime.


    The strategy for each batch load is depicted in the flow diagram below:



    Strategy Execution Steps


    1. Data loading in target from legacy trusted source
      • User data is received from the legacy system
      • LDIF file for OID bulk load is generated
      • Generated LDIF file is bulk loaded into the OID system
    2. Encrypting password for loading users in OIM
      • Generate CSV file for OIM bulk load from OID
      • CSV file includes ORCLGUID and DN for each user
      • Add any encrypted field data to the CSV file in addition to the password field (requirement specific to this customer)
      • Run custom password encryption utility to encrypt the required fields in the CSV file
      • Generate final CSV file for OIM Bulk Load (including user id, GUID and other encrypted fields)
    3. Disable Issue Audit Message Scheduled Task
    4. Shut down OIM
    5. Bulk load CSV file into OIM for 250K users using batch size of 10K
    6. Repeat load for all 250K user batches
    7. Repeat the remaining iterations for rest of the load
    8. Start up OIM
    9. Verify the functionality
    10. End users can start using the system




    CASE STUDY 2 - A telecom provider in EMEA



    Problem Statement


    We had to onboard 3 million existing and new users, along with their telecom product subscriptions. This amounted to 5.2 million accounts to be loaded into OIM using the OIM Bulk Load Utility. We then had to harvest those accounts and link them with access policies in the system so that bulk loaded accounts could be considered for any modify/retrofit use cases. We suggested using the access policy harvesting feature of OIM11gR2PS2.


    A final requirement: we had to complete this onboarding within 1-2 weeks.


    High-Level Solution Approach


    We had the following architecture challenges:


    1. OIM doesn’t provide the capability to address the requirements directly, but it does include the concepts of role/accounts.
    2. Load must be completed within acceptable SLA of customer.
    3. Existing accounts must be linked with the target.



    Assumptions for Access Policy Harvesting Functionality


    OIM application and database tuning as per the Oracle Tuning white paper [ID 1539554.1] available from


    Strategy Execution Steps


    The plan was to perform the task using the OIM Bulk Load Utility, leveraging the capabilities of a user post-processing job, which would trigger all the event handlers and achieve the objective in a simplified manner. Post-processing would then be followed by access policies evaluation to perform subsequent provisioning/harvesting.


    However, for performance and scalability, the load was to be split into batches of 200K users for bulk loading role memberships, followed by post-processing and access policy harvesting.



    A. 3M user data load in OIM via Bulk Load tool (3 million users, split into batches)
      1. Load 100K
      2. Collect OIM schema statistics
      3. Load 900K
      4. Collect OIM schema statistics
      5. Load 1M
      6. Load 1M
    B. Load 5.2M accounts
      1. Load 100K accounts
      2. Collect OIM schema statistics
      3. Load 900K
      4. Collect OIM schema statistics
      5. Load 1M
      6. Load 1M
      7. Collect OIM schema statistics
      8. Load 1M
      9. Load 1.2M
    C. Load roles using Bulk Load
    D. Repeat role membership load for users in batches of 200K
    E. Post-processing for the 200K user batch
    F. Access policy harvesting for loaded users
      • Ran the ‘Evaluate user policies’ job to harvest the loaded accounts. It harvested the role memberships loaded for the 200K entities.
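
    As a quick arithmetic check on the plan above, the batch sizes can be summed against the stated totals. The snippet below is only a worked calculation; the count of role membership batches assumes that all 3 million users have memberships to load, which the case study does not state explicitly.

```python
import math

K, M = 1_000, 1_000_000

# Batch sizes taken from steps A and B of the plan.
user_loads = [100 * K, 900 * K, 1 * M, 1 * M]
account_loads = [100 * K, 900 * K, 1 * M, 1 * M, 1 * M, 1_200 * K]

total_users = sum(user_loads)        # matches the 3M users stated
total_accounts = sum(account_loads)  # matches the 5.2M accounts stated

# Steps D to F repeat per 200K-user batch (assumes every user is included).
membership_batches = math.ceil(total_users / (200 * K))

print(total_users, total_accounts, membership_batches)
```

    Under that assumption, steps D through F would be repeated 15 times.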



    CASE STUDY 3 - Popular fitness and weight loss solutions provider


    Problem: Load 6 Million users, 12 Million Accounts, 12M Role Memberships and achieve LDAP synchronization.


    Solution: Devise a strategy to enable LDAP synchronization for this high volume of users, accounts, memberships.


    Problem Statement


    The customer requirement was to load the above volume into their OIM deployment, which runs in LDAP sync mode with OUD.


    The requirement is to seed/bootstrap the data in OIM and sync (i.e., update the GUID to attach the records) with OUD.


    Execution of the post-processing job is out of scope because of the high load and limited time.


    Total volume:

    • Users - 6M
    • Accounts - 12M
    • Role-memberships - 12M


    High-Level Strategy Steps


    Load data into Directory


    1. Load the data to OUD.
    2. Export the users’ data from OUD with GUID values.
    3. Ensure the input file for OIM bulk load has DN and GUID in it (taken from OUD after Step 1, above, is complete) and bulk load into OIM.
    4. Do not run bulk load post-process job to LDAP Sync the users from OIM to OUD.


    Load data into OIM


    1. Load Users data - Use the above files for the user load (including mapping of GUID columns to the actual GUID values)


    • The customer had already tested the user bulk load and was satisfied with its performance
    • This load was completed in 7 iterations
    • First iteration was for 200K users
    • Second iteration loaded 800K users
    • Each of the remaining iterations loaded 1 million users in a single run of user bulk load
    • Batch size used was 10K
    • DB stats were collected after the first iteration, then after every 2 iterations

    2. Load Accounts data - 12 M


    • High-level strategy is to load the data in chunks
    • Load 50K accounts first then gather DB stats
    • Load 500K accounts then gather DB stats
    • Load 1M accounts then gather DB stats
    • Continue with 1M batch further, and collect DB stats after every 4 runs

    3. Load User Role memberships - 12M


    • High-level strategy is to load the data in chunks
    • Load 50K memberships, then gather DB stats
    • Load 500K memberships, then gather DB stats
    • Load 1M memberships, then gather DB stats
    • Continue with 1M batch further and collect DB stats after every 4 runs


    For DB stats


    To reduce the time taken when the database is large, stats can be gathered for individual tables instead of the whole schema.


    For generating initial audit snapshot:


    We suggested running the OIM Generate Snapshot utility in batches.


    Review the latest documentation for Generate Snapshot utility.




    Appendix A - Known Issues and Solutions



    • Issue: Account load performance degrades as load increases
      • Releases affected: OIM11gR2 PS2 (OIM versions before )
      • Details: Account load takes a long time to complete per iteration
        • Slow DB queries
        • Bulk Load utility does not exit
    • Issue: Role membership load is slow
      • Releases affected: OIM11gR2 PS2 (OIM versions before )
      • Details: Role membership load takes significant time to complete
        • Slow DB queries
      • Patch # / fixed versions: This fix will also be available in R2PS2 BP5


    DISCLAIMER: Expected performance characteristics are based on laboratory test implementations and can vary based on customer requirements.


    Case studies might not follow the exact steps mentioned in defined strategies.


    The applicability of a recommended strategy depends on customer requirements and infrastructure.


    About the Author


    Lokesh Gupta is a Project Lead with the Oracle Server Technology group for Oracle Identity Manager, where he focuses on issues related to database, enterprise performance and sizing, and security.