Stopwords and Stoplists

A stopword is a word that is not indexed, usually a low-information word such as a, the, this, or with in English. A stoplist is a list of stopwords. Oracle Text supplies a stoplist for every language. By default, indexing uses the Oracle Text default stoplist for your language.

You can edit the default stoplist CTXSYS.DEFAULT_STOPLIST or create your own with PL/SQL procedures in the CTX_DDL package, such as CTX_DDL.CREATE_STOPLIST and CTX_DDL.ADD_STOPWORD.

For detailed information about these procedures, see the description of CTX_DDL Package in the Oracle Text Reference.
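
Before editing a stoplist, it can help to see what it currently contains. The query below is a minimal sketch, not part of the original procedure. It assumes that the Oracle Text data dictionary view CTX_STOPWORDS, with the columns SPW_OWNER, SPW_STOPLIST, and SPW_WORD, is queryable by your user in your release.

```sql
-- Sketch: list the stopwords in the Oracle Text default stoplist.
-- Assumes the CTX_STOPWORDS view and its SPW_* columns are available.
SELECT spw_word
  FROM ctx_stopwords
 WHERE spw_owner = 'CTXSYS'
   AND spw_stoplist = 'DEFAULT_STOPLIST'
 ORDER BY spw_word;
```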

Suppose that you decide that the word forget should not be indexed. Follow these steps to create a new stoplist consisting of the default English stoplist with the word forget added:

  1. In Oracle Data Miner, select Tools | SQL Worksheet.
  2. To create MY_STOPLIST in the current schema, you must drop the stoplist if it already exists, create a basic stoplist, and add the word forget to the stoplist. To do this, enter the following statements in the Enter SQL Statement window:
    
    BEGIN
      CTX_DDL.DROP_STOPLIST('MY_STOPLIST');
      CTX_DDL.CREATE_STOPLIST('MY_STOPLIST', 'BASIC_STOPLIST');
      CTX_DDL.ADD_STOPWORD('MY_STOPLIST', 'forget');
    END;
    

    Note: If MY_STOPLIST does not already exist, omit the DROP_STOPLIST statement. Dropping a stoplist that does not exist raises an error.

  3. Execute the statement, using File | Execute SQL Statement or by clicking the appropriate icon. After you execute the statement, MY_STOPLIST is created in the current schema.
  4. To use this new stoplist, select it by either clicking Advanced Options on the last step (Finish) of the model build wizard or by clicking Options in the Text step of an Activity.
  5. When you run the activity, Oracle Text uses MY_STOPLIST instead of the default stoplist.
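
The script in step 2 can be made safe to rerun, as in the sketch below. It drops MY_STOPLIST only when the stoplist already exists. It also copies the default stopwords explicitly, because the Oracle Text Reference describes CTX_DDL.CREATE_STOPLIST as creating a new, empty stoplist. The views CTX_USER_STOPLISTS and CTX_STOPWORDS, their column names, and the SPW_TYPE value 'STOP_WORD' are assumptions here; check them against the Oracle Text Reference for your release.

```sql
DECLARE
  n_existing PLS_INTEGER;
BEGIN
  -- Drop MY_STOPLIST only if the current user already owns one
  -- (assumes the CTX_USER_STOPLISTS view with an SPL_NAME column).
  SELECT COUNT(*)
    INTO n_existing
    FROM ctx_user_stoplists
   WHERE spl_name = 'MY_STOPLIST';

  IF n_existing > 0 THEN
    CTX_DDL.DROP_STOPLIST('MY_STOPLIST');
  END IF;

  -- Create the new basic stoplist.
  CTX_DDL.CREATE_STOPLIST('MY_STOPLIST', 'BASIC_STOPLIST');

  -- Copy the stopwords of CTXSYS.DEFAULT_STOPLIST
  -- (assumes the CTX_STOPWORDS view and the SPW_TYPE value 'STOP_WORD').
  FOR r IN (SELECT spw_word
              FROM ctx_stopwords
             WHERE spw_owner = 'CTXSYS'
               AND spw_stoplist = 'DEFAULT_STOPLIST'
               AND spw_type = 'STOP_WORD')
  LOOP
    CTX_DDL.ADD_STOPWORD('MY_STOPLIST', r.spw_word);
  END LOOP;

  -- Add the extra word that should not be indexed.
  CTX_DDL.ADD_STOPWORD('MY_STOPLIST', 'forget');
END;
/
```

The existence check avoids swallowing unrelated errors, which a blanket exception handler around DROP_STOPLIST would risk doing.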