Creating a search tool for ColdFusion applications



The three main tasks in creating a search tool for your ColdFusion application are:

  1. Create a collection.

  2. Index the collection.

  3. Design a search interface.

You can perform each task programmatically—that is, by writing CFML code. Alternatively, you can use the ColdFusion Administrator to create and index a collection.

Creating a collection with the ColdFusion Administrator

Use the following procedure to quickly create a collection with the ColdFusion Administrator:

  1. In the ColdFusion Administrator, select Data & Services > Verity Collections.

  2. Enter a name for the collection; for example, DemoDocs.

  3. Enter a path for the directory location of the new collection; for example, C:\CFusion\verity\collections\.

    By default in the server configuration, ColdFusion stores collections in cf_root\verity\collections\ on Windows and in cf_root/verity/collections on UNIX. In the multiserver configuration, the default location for collections is cf_webapp_root/verity/collections. In the J2EE configuration, the default location for collections is verity_root/verity/collections, where verity_root is the directory in which you installed Verity.

    Note: This is the location for the collection, not for the files that you search.
  4. (Optional) Select a language other than English for the collection from the Language drop-down list.

    For more information on selecting a language, see Specifying a language.

  5. (Optional) Select Enable Category Support to create a Verity Parametric collection.

    For more information on using categories, see Narrowing searches by using categories.

  6. Click Create Collection.

    The name and full path of the new collection appear in the list of Verity Collections.

You have successfully created an empty collection. A collection becomes populated with data when you index it.

About indexing a collection

Information must be indexed before it can be searched. Indexing extracts both meaning and structure from unstructured information. Each document that you specify is indexed into a separate Verity collection, which contains a complete list of all the words used in the document along with metadata about that document. Indexed collections include information such as word proximity, metadata about physical file system addresses, and URLs of documents.

When you index databases and other recordsets that you generated using a query, Verity creates a collection that normalizes both the structured and unstructured data. Search requests then check these collections rather than scanning the actual documents and database fields. This provides a faster search of information, regardless of the file type and whether the source is structured or unstructured.

Just as with creating a collection, you can index a collection programmatically or by using the ColdFusion Administrator. Use the following guidelines to determine which method to use:

Use the Administrator:

  • To index document files

  • When the collection does not require frequent updates

  • To create the collection without writing any CFML code

  • To create a collection once

Use the cfindex tag:

  • To index ColdFusion query results

  • When the collection requires frequent updates

  • To dynamically update a collection from a ColdFusion application page

  • When the collection requires updating by others

If you notice that searches on a collection take longer than they did previously, you can use cfcollection action="optimize" to optimize the collection.
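As a minimal sketch of the optimize action, the following tag could be placed on any ColdFusion page. The collection name CodeColl is only an example and must match an existing collection on your server.

```cfml
<!--- Optimize an existing collection. CodeColl is an example name. --->
<cfcollection action="optimize" collection="CodeColl">
```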

Updating an index

Documents are modified frequently in many user environments. After you index your documents, any changes that you make are not reflected in subsequent Verity searches until you reindex the collection. Depending on your environment, you can create a scheduled task to automatically keep your indexes current. For more information on scheduled tasks, see Configuring and Administering ColdFusion.
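One possible way to set up such a task in CFML is the cfschedule tag, sketched below. The task name, the URL, the start date, and the reindex.cfm page are hypothetical; the sketch assumes that reindex.cfm contains a cfindex tag with action="refresh" for the collection that you want to keep current.

```cfml
<!--- Register (or update) a scheduled task that requests a reindexing page once a day.
      ReindexCodeColl and reindex.cfm are hypothetical names used for illustration. --->
<cfschedule action="update"
    task="ReindexCodeColl"
    operation="HTTPRequest"
    url="http://localhost:8500/myapps/reindex.cfm"
    startDate="01/01/2010"
    startTime="2:00 AM"
    interval="daily">
```

Choose an interval that matches how often the source documents change. A daily run is only a starting point.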

Creating a ColdFusion search tool programmatically

You can create a Verity search tool for your ColdFusion application in CFML. Although writing CFML code can take more development time than using the ColdFusion Administrator, in some situations, writing code is the preferred development method.

Creating a collection with the cfcollection tag

The following are cases in which you might prefer using the cfcollection tag rather than the ColdFusion Administrator to create a collection:

  • You want your ColdFusion application to be able to create, delete, and maintain a collection.

  • You do not want to expose the ColdFusion Administrator to users.

  • You want to create indexes on servers that you cannot access directly; for example, if you use a hosting company.

When using the cfcollection tag, you can specify the same attributes as in the ColdFusion Administrator:

  • action: (Optional) The action to perform on the collection, such as create, delete, or optimize. The default value for the action attribute is list. For more information, see cfcollection in CFML Reference.

  • collection: The name of the new collection, or the name of a collection upon which you perform an action.

  • path: The location for the Verity collection.

  • language: The language of the collection.

  • categories: (Optional) Specifies that cfcollection create a Verity Parametric Index (PI) for this collection. By default, the categories attribute is set to False. To create a collection that uses categories, specify Yes.

You can create a collection by directly assigning a value to the collection attribute of the cfcollection tag, as shown in the following code:

<cfcollection action = "create" 
    collection = "a_new_collection" 
    path = "c:\CFusion\verity\collections\">
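Because the default action is list, the same tag can also report which collections already exist. The following sketch, which is one possible approach rather than a required pattern, lists the registered collections and creates a_new_collection only if that name is not already present. The query name allCollections and the variable collectionExists are arbitrary names chosen for this example.

```cfml
<!--- List the registered collections. The name attribute names the returned query. --->
<cfcollection action="list" name="allCollections">

<!--- Check whether the collection name already appears in the NAME column. --->
<cfset collectionExists = ListFindNoCase(ValueList(allCollections.name), "a_new_collection") GT 0>

<!--- Create the collection only if it does not exist yet. --->
<cfif NOT collectionExists>
    <cfcollection action="create"
        collection="a_new_collection"
        path="c:\CFusion\verity\collections\">
</cfif>
```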

If you want your users to be able to dynamically supply the name and location for a new collection, use the following procedures to create form and action pages.

Create a simple collection form page

  1. Create a ColdFusion page with the following content:

    <html> 
    <head> 
        <title>Collection Creation Input Form</title> 
    </head> 
     
    <body> 
    <h2>Specify a collection</h2> 
    <form action="collection_create_action.cfm" method="POST"> 
         
        <p>Collection name:  
        <input type="text" name="CollectionName" size="25"></p> 
         
        <p>What do you want to do with the collection?</p> 
        <input type="radio"  
            name="CollectionAction"  
            value="Create" checked>Create<br> 
        <input type="radio"  
            name="CollectionAction"  
            value="Optimize">Optimize<br> 
        <input type="submit"  
            name="submit"  
            value="Submit">  
    </form> 
     
    </body> 
    </html>
  2. Save the file as collection_create_form.cfm in the myapps directory under the web root directory.

Note: The form does not work until you write an action page for it, which is the next procedure.

Create a collection action page

  1. Create a ColdFusion page with the following content:

    <html> 
    <head> 
        <title>cfcollection</title> 
    </head> 
     
    <body> 
    <h2>Collection creation</h2> 
     
    <cfoutput> 
     
        <cfswitch expression="#Form.CollectionAction#"> 
            <cfcase value="Create"> 
                <cfcollection action="Create" 
                collection="#Form.CollectionName#" 
                path="c:\CFusion\verity\collections\"> 
                <p>The collection #Form.CollectionName# is created.</p> 
            </cfcase> 
     
            <cfcase value="Optimize"> 
                <cfcollection action="Optimize" 
                collection="#Form.CollectionName#"> 
                <p>The collection #Form.CollectionName# is optimized.</p> 
            </cfcase> 
     
            <cfcase value="Delete"> 
                <cfcollection action="Delete" 
                collection="#Form.CollectionName#"> 
                <p>The collection is deleted.</p> 
            </cfcase> 
        </cfswitch> 
    </cfoutput> 
    </body> 
    </html>
  2. Save the file as collection_create_action.cfm in the myapps directory under the web root directory.

  3. In the web browser, enter the following URL to display the form page:

    http://hostname:portnumber/myapps/collection_create_form.cfm

  4. Enter a collection name; for example, CodeColl.

  5. Verify that Create is selected and submit the form.

  6. (Optional) In the ColdFusion Administrator, reload the ColdFusion Collections page.

The name and full path of the new collection appear in the list of Verity Collections.

You successfully created a collection, named CodeColl, that currently has no data.
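The action page above passes the submitted name directly to cfcollection. If the form is exposed to other users, you might want to check the input and trap errors first. The following fragment is a sketch of how the Create case could be hardened. The rule that limits names to letters, digits, and underscores is an assumption made for this example, not a documented Verity requirement.

```cfml
<!--- Supply defaults so that the page does not fail if a form field is missing. --->
<cfparam name="Form.CollectionName" default="">
<cfparam name="Form.CollectionAction" default="Create">

<!--- Assumed naming rule for this example: letters, digits, and underscores only. --->
<cfif NOT REFind("^[A-Za-z0-9_]+$", Form.CollectionName)>
    <p>Enter a collection name that uses only letters, digits, and underscores.</p>
<cfelse>
    <cftry>
        <cfcollection action="create"
            collection="#Form.CollectionName#"
            path="c:\CFusion\verity\collections\">
        <cfoutput><p>The collection #HTMLEditFormat(Form.CollectionName)# is created.</p></cfoutput>
        <!--- Report any failure, such as a duplicate name, instead of showing a raw error page. --->
        <cfcatch type="any">
            <cfoutput><p>The collection could not be created: #HTMLEditFormat(cfcatch.message)#</p></cfoutput>
        </cfcatch>
    </cftry>
</cfif>
```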

Indexing a collection by using the cfindex tag

You can index a collection in CFML by using the cfindex tag, which eliminates the need to use the ColdFusion Administrator. The cfindex tag populates the collection with metadata that is then used to retrieve search results. You can use the cfindex tag to index either physical files (documents stored within your website’s root folder), or the results of a database query.

Note: Before indexing a collection, create a Verity collection by using the ColdFusion Administrator, or the cfcollection tag. For more information, see Creating a collection with the ColdFusion Administrator, or Creating a collection with the cfcollection tag.

When using the cfindex tag, the following attributes correspond to the values that you would enter by using the ColdFusion Administrator to index a collection:

  • collection: The name of the collection.

  • action: Specifies what the cfindex tag should do to the collection. The default action is to update the collection, which generates a new index. Other actions are to delete, purge, or refresh the collection.

  • type: Specifies the type of files or other data to which the cfindex tag applies the specified action. The value that you assign to the type attribute determines the value that cfindex expects in the key attribute. For example, if you specify type="file", cfindex expects a directory path and filename for the key attribute. The type attribute has the following possible values:

      • file: Specifies a directory path and filename for the file that you are indexing.

      • path: Specifies a directory path that contains the files that you are indexing.

      • custom: Specifies custom data, such as a recordset returned from a query.

  • extensions: (Optional) The comma-delimited list of file extensions that ColdFusion uses to index files if type="path".

  • key: The value that you specify for the key attribute depends on the value set for the type attribute:

      • If type="file", the key is the directory path and filename for the file you are indexing.

      • If type="path", the key is the directory path that contains the files you are indexing.

      • If type="custom", the key is a unique identifier specifying the location of the documents you are indexing; for example, the URL of a specific web page or website whose contents you want to index. If you are indexing data returned by a query (from a database for example), the key is the name of the recordset column that contains the primary key.

  • URLpath: (Optional) The URL path for files if type="file" or type="path". When the collection is searched with the cfsearch tag, ColdFusion works as follows:

      • type="file": The URLpath attribute contains the URL to the file.

      • type="path": The path name is automatically prefixed to filenames and returned as the URLpath attribute.

  • recurse: (Optional) Yes or No. If type="path", Yes specifies that directories below the path specified in the key attribute are included in the indexing operation.

  • language: (Optional) The language of the collection. The default language is English Basic.

To learn more about support for languages, see Specifying a language.
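To illustrate how the key attribute changes with the type attribute, the following sketch indexes a single file and then a query result. The file name readme.htm, the data source exampleDSN, and the table and column names are all hypothetical. Replace them with values from your own application.

```cfml
<!--- type="file": the key is the full path and filename of one document,
      and urlpath is the URL to that file. readme.htm is a hypothetical file. --->
<cfindex collection="CodeColl"
    action="update"
    type="file"
    key="C:\ColdFusion9\wwwroot\vw_files\readme.htm"
    urlpath="http://localhost:8500/vw_files/readme.htm">

<!--- type="custom": the key names the query column that holds the primary key.
      The data source, table, and column names are hypothetical. --->
<cfquery name="getArticles" datasource="exampleDSN">
    SELECT article_id, headline, article_text
    FROM articles
</cfquery>

<cfindex collection="CodeColl"
    action="update"
    type="custom"
    query="getArticles"
    key="article_id"
    title="headline"
    body="article_text">
```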

You can use form and action pages like the following examples to select and index a collection.

Select which collection to index

  1. Create a ColdFusion page with the following content:

    <html> 
    <head> 
        <title>Select the Collection to Index</title> 
    </head> 
    <body> 
     
    <h2>Specify the index you want to build</h2> 
     
    <form method="Post" action="collection_index_action.cfm"> 
        <p>Enter the collection you want to index: 
        <input type="text" name="IndexColl" size="25" maxLength="35"></p> 
        <p>Enter the location of the files in the collection: 
        <input type="text" name="IndexDir" size="50" maxLength="100"></p> 
        <p>Enter a Return URL to prepend to all indexed files: 
        <input type="text" name="urlPrefix" size="80" maxLength="100"></p> 
     
        <input type="submit" name="submit" value="Index">  
     
    </form> 
     
    </body> 
    </html>
  2. Save the file as collection_index_form.cfm in the myapps directory under the web root directory.

Note: The form does not work until you write an action page for it, which you do when you index a collection.

Use cfindex to index a collection

  1. Create a ColdFusion page with the following content:

    <html> 
    <head> 
    <title>Creating Index</title> 
    </head> 
    <body> 
    <h2>Indexing Complete</h2> 
     
    <cfindex collection="#Form.IndexColl#" 
        action="refresh" 
        extensions=".htm, .html, .xls, .txt, .mif, .doc" 
        key="#Form.IndexDir#" 
        type="path" 
        urlpath="#Form.urlPrefix#" 
        recurse="Yes" 
        language="English"> 
     
    <cfoutput> 
        The collection #Form.IndexColl# has been indexed. 
    </cfoutput> 
    </body> 
    </html>
  2. Save the file as collection_index_action.cfm in the myapps directory under the web root directory.

  3. In the web browser, enter the following URL to display the form page:

    http://hostname:portnumber/myapps/collection_index_form.cfm

  4. Enter a collection name; for example, CodeColl.

  5. Enter a file location; for example, C:\ColdFusion9\wwwroot\vw_files.

  6. Enter a URL prefix; for example, http://localhost:8500/vw_files (assuming that you are using the built-in web server).

  7. Click Index.

    A confirmation message appears on successful completion.

Note: For information about using the cfindex tag with a database to index a collection, see Working with data returned from a query.
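To confirm that the collection now contains data, you could run a quick test search with the cfsearch tag. The following sketch assumes the CodeColl collection from the preceding procedure. The query name checkIndex and the search term are arbitrary choices for this example.

```cfml
<!--- Run a test search against the newly indexed collection. --->
<cfsearch collection="CodeColl"
    name="checkIndex"
    criteria="ColdFusion"
    maxrows="10">

<!--- Show how many rows came back, then link to each result by its returned URL. --->
<cfoutput>
    <p>#checkIndex.RecordCount# documents returned.</p>
</cfoutput>
<cfoutput query="checkIndex">
    <p><a href="#checkIndex.url#">#checkIndex.title#</a> (score: #checkIndex.score#)</p>
</cfoutput>
```

If the links do not resolve, check the URL prefix that you supplied when you indexed the collection.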

Indexing a collection with the ColdFusion Administrator

As an alternative to programmatically indexing a collection, use the following procedure to index a collection with the ColdFusion Administrator.

  1. In the list of Verity Collections, select a collection name; for example, CodeColl.

  2. Click Index to open the index page.

  3. For File Extensions, enter the types of files to index. Use a comma to separate multiple file types; for example, .htm, .html, .xls, .txt, .mif, .doc.

  4. Enter (or Browse to) the directory path that contains the files to be indexed; for example, C:\Inetpub\wwwroot\vw_files.

  5. (Optional) To extend the indexing operation to all directories below the selected path, select the Recursively index subdirectories check box.

  6. (Optional) Enter a Return URL to prepend to all indexed files.

    This step lets you create a link to any of the files in the index; for example, http://127.0.0.1/vw_files/.

  7. (Optional) Select a language other than English.

    For more information, see Specifying a language.

  8. Click Submit Changes.

    On completion, the ColdFusion Collections page appears.

Note: The time required to generate the index depends on the number and size of the selected files in the path.

This interface lets you easily build a very specific index based on the filename extension and path information you enter. In most cases, you do not need to change your server file structures to accommodate the generation of indexes.