A Guide to Maintaining Your Search Indexes for WebCenter Content


Some time ago, we provided a script that provides a general health output of the OracleTextSearch Index. However, the best management of WCC-native index solutions is to have DBAs proactively optimize search indexes on a regular basis.

WCC, by default, performs a “fast” optimization of the full-text index after every 5000 indexed documents. There are multiple different types of optimization including:

  • Full – Removes garbage data (postings for deleted doc IDs) and compresses fragmented postings. Full optimization is the only type that can be time-limited.
  • Token – Useful if very specific queries need to be tuned, such as an empty search.
  • Fast – Only compresses fragmented postings.
  • Rebuild – Takes a complete copy of the $I table (and $S if present) and removed all garbage and fragmentation when building the new copy.

Note: If an index is highly fragmented, it may not be possible to time-limit a Full optimization. Depending on the amount of data, optimization can take upwards of a day to complete. Please see Doc ID 467215.1 for more details.

For best management of your search index, it is recommended to run ‘Full’ and ‘Rebuild’ optimization on a regular basis.

Prior to Configuring a Scheduled DB Job:

  1. In the config.cfg file, set the following parameters-
  • RebuildOptimizationInterval=50000000
  • FastOptimizationInterval=50000000

2. Restart all WCC managed servers.

3. In WCC, go to Administration → Scheduled Jobs Administration → Active Scheduled Jobs

4. If Opt_FT_IDCText1 or Opt_FT_IdcText2 are listed, select delete in the action drop down list.

5. In WCC, Administration → Configuration. The active indexes are listed.


SearchIndexerEngineNameActive indexDatabase tableDatabase index

The active index values in the table show the table and index are active in the database.

Determining the Level of Index Fragmentation:

To determine the fragmentation state of the index, the following PL/SQL can be run to pull the CTX_REPORT.INDEX_STATS report.  This can be run as the WCC user and creates a one-row table called “output” in the WCC user’s schema.  Notice that FT_IDCTEXT2 is set in this example.


The report may be longer than 4000 characters—

From SQL Developer:

  1. Use the “run statement” (not “run script”)
  2. Double click on the blob
  3. Click on the pencil icon to see the full output

Using SQL Plus see the full report in chunks of 4000 characters using queries like the following:

  • SELECT DBMS_LOB.SUBSTR(result,4000,1) FROM output;
  • SELECT DBMS_LOB.SUBSTR(result,4000,4000) FROM output;
  • SELECT DBMS_LOB.SUBSTR(result,4000,8000) FROM output;

Search in the returned report for the term “FRAGMENTATION STATISTICS”.  For example, the fragmentation statistics on this sample report show the following for the $I table of the index (DR$FT_IDCTEXT2$I is where index tokens are actually stored).


                         FRAGMENTATION STATISTICS


   total size of $I data:              25,975,528 (24.77 MB)

   $I rows:                            567,178

   estimated $I rows if optimal:       22,209

   estimated row fragmentation:        96 %

Note: Fragmentation of the index is high in this example, and therefore the index should be optimized. Index fragmentation can be high even after performing an index rebuild.

Depending on the size of your index, this script can sometimes take several hours to run. If several hours of lowered performance cannot be handled, we recommend running this script instead. This script estimates the level of fragmentation, rather than providing a definitive result.

Optimization of Index:

The following are example statements for running FULL and REBUILD optimization:

Note: the table name FT_IDCTEXT1, should be changed to match the active index as found on the Configuration for Instance page in Content Server UI.

In this example the FULL optimization runs for 240 minutes with a parallel degree of 1.

Full Optimization Script

Occasionally the index should be optimized in REBUILD mode as well. This is especially true if the index remains fragmented even after a FULL optimization

Rebuild Optimization Script

In some cases, running the optimize index for the first time can cause a large amount of redo logs to be created on the database. If the optimize index cannot complete, or the log files on the database are filling up, restart the database in NOARCHIVELOG mode and run the optimize index procedure. Once the optimization completes, restart the database in ARCHIVELOG mode. 

How can the optimize process be monitored to see where the progress is?

There is no direct report that can be run. However, you can add tracing around the procedure and monitor the resulting logs

The following SQL is just an example. The table and usernames will need to be changed to match your parameters.

nohup sqlplus /nolog @opt_ind_p4.sql > opt_ind_p4.log &

Where the opt_ind_p4.sql script contains:

connect / as sysdba
set echo on timing on time on
exec ctx_output.start_log (‘Opt_REBUILD_P4_FT_IDCTEXT2_’||to_char(sysdate,’DD-MON-YY_HH24MISS’)||’.log’);
exec CTX_DDL.optimize_index(‘WCCPRD_OCS.FT_IDCTEXT2’, ‘REBUILD’, parallel_degree =>’1′);
exec ctx_output.end_log

Five CTX logs will be created under $ORACLE_HOME/ctx/log

Occasionally the procedure will run for a VERY long time:  Review OEMM Database Maintenance Task Not Running or Running Very Slow (Doc ID 2370180.1)

It is recommended by Oracle to run an optimization every week (but depends on the level of content) during off-hours. Index optimization is a very I/O and CPU intensive.

Checking Index Fragmentation Post-Optimization:

This PL/SQL is is the same as the one listed in the blog section, ‘Determining the Level of Index Fragmentation’, except that the output table is dropped and re-created so there will only be one row in the table.

Index Fragmentation Post-Optimization Script

Example Screenshots of Scheduled DB Jobs from EM:

Has your organization been in fear of performing an index collection rebuild? Contact Inspired ECM to learn how we can manage and optimize your content repository or offload your indexing capabilities to newer, more efficient products in the WebCenter sphere.