Enhancing search results



ColdFusion lets you enhance search results with features that help users find the information they need more easily. Verity provides the following search enhancements:

  • Highlighting search terms

  • Providing alternative spelling suggestions

  • Narrowing searches by using categories

Highlighting search terms

Term highlighting lets users quickly scan retrieved documents to determine whether they contain the desired information. This is especially useful for lengthy documents, because users can go straight to the passages that match their search.

To implement term highlighting, use the following cfsearch attributes in the search results page:

  • ContextHighlightBegin: Specifies the HTML tag to prefix to the search term within the returned documents. This attribute must be used in conjunction with ContextHighlightEnd to highlight the resulting search terms. The default HTML tag is <b>, which highlights search terms using bold type.

  • ContextHighlightEnd: Specifies the HTML tag to append to the search term within the returned documents.

  • ContextPassages: The number of passages or sentences that Verity returns in the context summary (the context column of the results). The default value is 0, which disables the context summary.

  • ContextBytes: The total number of bytes that Verity returns in the context summary. The default is 300 bytes.

The following example adds to the previous search results example by highlighting the returned search terms with bold type.

Create a search results page that includes term highlighting

  1. Create a ColdFusion page with the following content:

    <html> 
    <head> 
        <title>Search Results</title> 
    </head> 
    <body> 
    <cfsearch  
        name = "codecoll_results" 
        collection = "CodeColl" 
        criteria = "#Form.Criteria#" 
        ContextHighlightBegin="<b>" 
        ContextHighlightEnd="</b>" 
        ContextPassages="1"  
        ContextBytes="500" 
        maxrows = "100"> 
    <h2>Search Results</h2> 
    <cfoutput> 
    Your search returned #codecoll_results.RecordCount# file(s). 
    </cfoutput> 
     
    <cfoutput query="codecoll_results"> 
        <p> 
        File: <a href="#URL#">#Key#</a><br> 
        Document Title (if any): #Title#<br> 
        Score: #Score#<br> 
        Summary: #Summary#<br> 
        Highlighted Summary: #context#</p> 
    </cfoutput> 
    </body> 
    </html>
  2. Save the file as collection_search_action.cfm.

    Note: This overwrites the previous ColdFusion example page.
  3. View collection_search_form.cfm in the web browser.

  4. Enter target words and click Search.

Providing alternative spelling suggestions

Many unsuccessful searches are the result of incorrectly spelled query terms. Verity can automatically suggest alternative spellings for misspelled queries using a dictionary that is dynamically built from the search index.

To implement alternative spelling suggestions, you use the cfsearch tag’s suggestions attribute with an integer value. If the number of documents returned by the search is less than or equal to the value you specify, Verity provides alternative search term suggestions. The suggestion is returned in the structure named by the cfsearch tag’s status attribute, as its SuggestedQuery key. You can then use the cfif tag to output the suggested terms together with a link that searches on them.

Note: Using alternative spelling suggestions incurs a small performance penalty. This occurs because the cfsearch tag must also look up alternative spellings in addition to the specified search terms.

The following example specifies that if the search returns five or fewer results, an alternative search term is displayed, by using the cfif tag, with a link that the user can click to run the alternative search.

Create a search results page that provides alternative spelling suggestions

  1. Create a ColdFusion page with the following content:

    <html> 
    <head> 
        <title>Search Results</title> 
    </head> 
    <body> 
    <cfsearch  
        name = "codecoll_results" 
        collection = "CodeColl" 
        criteria = "#Form.Criteria#" 
        status = "info" 
        suggestions="5" 
        ContextPassages = "1" 
        ContextBytes = "300" 
        maxrows = "100"> 
    <h2>Search Results</h2> 
    <cfoutput> 
    Your search returned #codecoll_results.RecordCount# file(s). 
    </cfoutput> 
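    <!--- Show the suggestion only when one exists and few results were found. ---> 
    <cfif isDefined("info.SuggestedQuery") AND info.FOUND LTE 5> 
        <cfoutput> 
        Did you mean: 
        <a href="search.cfm?query=#URLEncodedFormat(info.SuggestedQuery)#">#info.SuggestedQuery#</a> 
        </cfoutput> 
    </cfif>  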
    <cfoutput query="codecoll_results"> 
        <p> 
        File: <a href="#URL#">#Key#</a><br> 
        Document Title (if any): #Title#<br> 
        Score: #Score#<br> 
        Summary: #Summary#<br> 
        Highlighted Summary: #context#</p> 
    </cfoutput> 
    </body> 
    </html>
  2. Save the file as collection_search_action.cfm.

    Note: This overwrites the previous ColdFusion example page.
  3. View collection_search_form.cfm in the web browser.

  4. Enter any misspelled target words and click Search.
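
The link in the example above targets a page named search.cfm with a query URL parameter, while the results page in this section reads Form.Criteria. One way to make the suggestion link re-run the search on the same results page is sketched below. It is an illustration under assumptions, not the documented approach: the URL parameter name criteria and the searchTerm variable are hypothetical, and the page is assumed to be saved as collection_search_action.cfm.

```cfml
<!--- Sketch: accept the search term from the form post or from a suggestion link. --->
<cfif isDefined("Form.Criteria")>
    <cfset searchTerm = Form.Criteria>
<cfelseif isDefined("URL.criteria")>
    <!--- URL.criteria is a hypothetical parameter set by the link below. --->
    <cfset searchTerm = URL.criteria>
<cfelse>
    <cfset searchTerm = "">
</cfif>

<cfif len(trim(searchTerm))>
    <cfsearch
        name = "codecoll_results"
        collection = "CodeColl"
        criteria = "#searchTerm#"
        status = "info"
        suggestions = "5"
        maxrows = "100">

    <!--- Offer the suggestion only when Verity returned one. --->
    <cfif isDefined("info.SuggestedQuery") AND len(trim(info.SuggestedQuery))>
        <cfoutput>
        Did you mean:
        <a href="collection_search_action.cfm?criteria=#URLEncodedFormat(info.SuggestedQuery)#">#HTMLEditFormat(info.SuggestedQuery)#</a>
        </cfoutput>
    </cfif>
</cfif>
```

URLEncodedFormat keeps the suggested term safe inside the link address, and HTMLEditFormat escapes it for display.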

Narrowing searches by using categories

Verity lets you organize your searchable documents into categories. Categories are groups of documents (or database tables) that you define and that users can then search within. For example, to create a search tool for a software company, you can create categories such as whitepapers, documentation, release notes, and marketing collateral. Users can then specify one or more categories in which to search for information. Thus, users who want to learn about a conceptual aspect of your company’s technology might restrict their search to the whitepapers and marketing collateral categories.

Typically, you provide users with pop-up menus or check boxes from which they can select categories to narrow their searches. Alternatively, you can create a form that lets users enter both a category name in which to search and search keywords.
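
As a rough illustration of the check box approach, the following sketch posts the selected categories back to the same page. It assumes a collection that was already created and indexed with category support; the collection name CompanyDocs and the category values are placeholders rather than names defined in this document. Check boxes that share a field name arrive in the Form scope as a comma-separated list, which matches the list format that the category attribute accepts.

```cfml
<!--- Sketch: let users pick categories with check boxes (all names are placeholders). --->
<cfoutput>
<form action="#CGI.SCRIPT_NAME#" method="post">
    Search: <input type="text" name="criteria"><br>
    <input type="checkbox" name="category" value="whitepapers"> Whitepapers<br>
    <input type="checkbox" name="category" value="documentation"> Documentation<br>
    <input type="checkbox" name="category" value="marketing"> Marketing<br>
    <input type="submit" value="Search">
</form>
</cfoutput>

<cfif isDefined("Form.criteria")>
    <!--- Unchecked boxes are not submitted, so default to an empty list. --->
    <cfparam name="Form.category" default="">
    <cfsearch
        collection = "CompanyDocs"
        name = "results"
        criteria = "#Form.criteria#"
        category = "#Form.category#"
        maxrows = "100">
    <cfoutput>
    Your search returned #results.RecordCount# file(s).
    </cfoutput>
</cfif>
```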

Create a search application that uses categories

  1. Create a collection with support for categories enabled.

  2. Index the collection, specifying the category and categoryTree attributes appropriate to the collection.

    For more information on indexing Verity collections with support for categories, see Indexing collections that contain categories.

  3. Create a search page that lets users search within the categories that you created.

    Create a search page using the cfsearch tag that lets users more easily search for information by restricting searches to the specified category and, if specified, its hierarchical tree.

    For more information on searching Verity collections with support for categories, see Searching collections that contain categories.

Creating collections with support for categories

You can either select Enable Category Support from the ColdFusion Administrator, or write a cfcollection tag that uses the categories attribute. By enabling category support, you create a collection that contains a Verity Parametric Index (PI).

<cfcollection  
    action = "action" 
    collection = "collectionName" 
    path = "path_to_verity_collection" 
    language = "English" 
    categories = "yes">

For more information on using the cfcollection tag to create Verity collections with support for categories, see cfcollection in the CFML Reference.
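
For a concrete instance of the syntax above, a minimal sketch might look like the following. The collection name and path are placeholders chosen for illustration; substitute values that suit your server.

```cfml
<!--- Sketch: create a collection with category support enabled. --->
<!--- "SensesColl" and the path are hypothetical values. --->
<cfcollection
    action = "create"
    collection = "SensesColl"
    path = "C:\verity\collections\"
    language = "English"
    categories = "yes">
```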

Indexing collections that contain categories

When you index a collection with support for categories enabled, do the following:

  • Specify a category name using the category attribute. The name (or names) that you provide identifies the category so that users can restrict searches to the documents assigned to it. For example, suppose you create five categories named taste, touch, sight, sound, and smell. When performing a search, users can select from a pop-up menu or check boxes to search within one or more of the categories, limiting their search to a given range of topics.

    <cfindex collection="#Form.IndexColl#" 
        action="update" 
        extensions=".htm, .html, .xls, .txt, .mif, .doc, .pdf" 
        key="#Form.IndexDir#" 
        type="path" 
        urlpath="#Form.urlPrefix#" 
        recurse="Yes" 
        language="English" 
        category="taste, touch, sight, sound, smell">
  • Specify a hierarchical document tree (like a file system tree) by using the categoryTree attribute. When a search specifies a categoryTree value, ColdFusion limits the search to documents contained within the specified path.

    To use the categoryTree attribute, you specify a hierarchical document tree by listing each category as a string, and separating them using forward slashes (/). The tree structure that you specify in a search is the root of the document tree from which you want the search to begin. The type=path attribute appends directory names to the end of the returned value (as it does when specifying the urlpath attribute).

    Note: You can specify only a single category tree.
    <cfindex collection="#Form.IndexColl#" 
        action="update" 
        extensions=".htm, .html, .xls, .txt, .mif, .doc, .pdf" 
        key="#Form.IndexDir#" 
        type="path" 
        urlpath="#Form.urlPrefix#" 
        recurse="Yes" 
        language="English" 
        category="taste, touch, sight, sound, smell" 
        categoryTree="human/senses/taste">

For more information on using the cfindex tag to index Verity collections with support for categories, see cfindex in the CFML Reference.

Searching collections that contain categories

When searching data in a collection created with categories, you specify category and categoryTree. The values supplied to these attributes specify the category to be searched for the specified search string (the criteria attribute). The category attribute can contain a comma-separated list of categories to search. Both attributes can be specified at the same time.

<cfsearch collection="collectionName"  
    name="results" 
    maxrows = "100" 
    criteria="search keywords" 
    category="FAQ,Technote" 
    categoryTree="Docs/Tags">
Note: If cfsearch specifies category or categoryTree for a collection that was created without category information, an exception is thrown.

To search collections that contain categories, you use the cfsearch tag, and create an application page that searches within specified categories. The following example lets the user enter and submit, through a form, the name of the collection, the category in which to search, and the document tree associated with the category. Restricting the search in this way helps users retrieve the documents that contain the information they are looking for. In addition to searching within a specified category, this example also uses the contextHighlightBegin and contextHighlightEnd attributes, which highlight the search terms in the returned results.

<cfparam name="collection" default="test-pi"> 
 
<cfoutput> 
<form action="#CGI.SCRIPT_NAME#" method="POST"> 
    Collection Name: <input Type="text" Name="collection" value="#collection#"> 
    <P> 
    Category: <input Type="text" Name="category" value=""><br> 
    CategoryTree: <input Type="text" Name="categoryTree" value=""><br> 
    <P> 
Search: <input Type="text" Name="criteria"> 
    <input Type="submit" Value="Search"> 
</form> 
</cfoutput> 
 
<cfif isdefined("Form.criteria")> 
    <cfoutput>Search results for: <b>#criteria#</b></cfoutput> 
    <br> 
    <cfsearch collection="#form.collection#" 
        category="#form.category#" 
        categoryTree="#form.categoryTree#"  
        name="sr" 
        status="s" 
        criteria="#form.criteria#" 
        contextPassages="3" 
        contextBytes="300" 
        contextHighlightBegin="<i><b>" 
        contextHighlightEnd="</b></i>" 
        maxrows="100"> 
    <cfdump var="#s#"> 
 
    <cfoutput> 
    <p>Number of records in query: #sr.recordcount#</P> 
    </cfoutput> 
 
    <cfdump var="#sr#"> 
 
    <cfoutput Query="sr"> 
    Title: <i>#title#</i><br> 
    URL: #url#<br> 
    Score: #score#<br> 
    <hr> 
    #context#<br> 
    <br> 
    #summary#<br> 
    <hr> 
    </cfoutput> 
</cfif>

For more information on using the cfsearch tag to search Verity collections with support for categories, see cfsearch in the CFML Reference.

Retrieving information about the categories contained in a collection

You can retrieve the category information for a collection by using the cfcollection tag’s categoryList action. The categoryList action returns a structure that contains two keys:

  • categories: The name of the category and its hit count, where hit count is the number of documents in the specified category.

  • categorytrees: The document tree (a/b/c) and hit count, where hit count is the number of documents at or below the branch of the document tree.

Use the information returned by categoryList to show users the number of documents available for searching, as well as the document tree available for searching. You can also create a search interface that lets the user select which category to search within, based on the results returned by categoryList.

<cfcollection  
    action="categoryList"  
    collection="collectionName" 
    name="info">  
 
<cfoutput>  
    <cfset catStruct=info.categories>  
    <cfset catList=StructKeyList(catStruct)>  
    <cfloop list="#catList#" index="cat"> Category: #cat# <br>  
        Documents: #catStruct[cat]#<br>  
    </cfloop> 
</cfoutput> 
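
The same structure can drive a category menu in a search form. The following is a minimal sketch of that idea, assuming the form posts back to the current page and that a separate cfsearch block (not shown) reads Form.criteria and Form.category. The collection name is the same placeholder used above.

```cfml
<!--- Sketch: build a pop-up menu from the categories in a collection. --->
<cfcollection
    action="categoryList"
    collection="collectionName"
    name="info">
<cfset catStruct = info.categories>

<cfoutput>
<form action="#CGI.SCRIPT_NAME#" method="post">
    Search: <input type="text" name="criteria">
    <select name="category">
        <!--- One option per category, labeled with its document count. --->
        <cfloop collection="#catStruct#" item="cat">
            <option value="#HTMLEditFormat(cat)#">#HTMLEditFormat(cat)# (#catStruct[cat]#)</option>
        </cfloop>
    </select>
    <input type="submit" value="Search">
</form>
</cfoutput>
```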

To retrieve information about the categories contained in a collection, you use the cfcollection tag, and create an application page that retrieves category information from the collection and displays the number of documents contained by each category. This example lets the user enter and submit the name of the collection via a form, and then uses the categoryList action to retrieve information about the number of documents contained by the collection, and the hierarchical tree structure into which the category is organized.

<html>  
<head> 
    <title>Category information</title> 
</head> 
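<body> 
<!--- Give the form field a default so that the first page load does not fail. ---> 
<cfparam name="collection" default=""> 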
<cfoutput> 
<form action="#CGI.SCRIPT_NAME#" method="POST"> 
    Enter Collection Name: <input Type="text" Name="collection"  
        value="#collection#"><br> 
    <input Type="submit" Value="GetInfo"> 
</form> 
</cfoutput> 
<cfif isdefined("Form.collection")> 
    <cfoutput> 
    Getting collection info... 
    <br> 
    <cfflush> 
    <cfcollection  
        action="categorylist"  
        collection="#collection#"  
        name="out"> 
    <br> 
    <cfset categories=out.categories> 
    <cfset tree=out.categorytrees> 
    <cfset klist=StructKeyList(categories)> 
    <table border=1> 
    <th>Category</th> <th>Documents</th> 
    <cfloop index="x" list="#klist#"> 
    <tr> 
        <td>#x#</td> <td align="center">#categories[x]#</td> 
    </tr> 
    </cfloop> 
    </table> 
    <cfset klist=StructKeyList(tree)> 
    <table border=1> 
    <th>Category</th> <th>Documents</th> 
    <cfloop index="x" list="#klist#"> 
    <tr> 
     <td>#x#</td> <td align="center">#tree[x]#</td> 
    </tr> 
    </cfloop> 
    </table> 
    </cfoutput> 
</cfif> 
</body> 
</html>

For more information on using the cfcollection tag with Verity collections that support categories, see cfcollection in the CFML Reference.