ColdFusion Verity Thesaurus

Enabling Thesaurus functionality in the ColdFusion MX Verity Search Engine


ColdFusion MX is supplied with the widely used Verity Search Engine. Verity can index and search multiple document types, and it offers a number of search operators that improve the quality of the results.


One of the functions that can dramatically improve the results is the built-in English-language Thesaurus. The supporting documentation for the Verity functionality is poor and, whilst it refers to additional features such as SOUNDEX, TYPO and THESAURUS, it gives no straightforward explanation of how to implement them.

To set up the THESAURUS feature we did the following:

1) Create a Verity collection using your usual method. Navigate to the folder that contains your collection and find the style.prm file. In our case this file appeared twice: once in yourcollection/custom/style/style.prm and once in yourcollection/file/style/style.prm. We edited both, though further testing may prove that it is only necessary to edit one.

2) Find the line that contains the text:
$define WORD-IDXOPTS "Stemdex Casedex"

3) Amend it to:
$define WORD-IDXOPTS "Stemdex Casedex Thesaurus"

4) In your search query page amend your search criteria from:
<cfsearch name="GetResults" collection="yourcollection" criteria="yoursearchterm">
To:
<cfsearch name="GetResults" collection="yourcollection" criteria="<MANY><THESAURUS>yoursearchterm">

5) Re-index your collection and try a search. You should find additional, on-topic results being displayed. For example, a search for 'cash' will include pages that contain the term 'money'.
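
Put together, step 4 might look something like the sketch below on a real search page. This is only an illustration: the form field name "q" is hypothetical, "yourcollection" is the placeholder used above, and the url and title columns will only contain useful values if your collection was indexed with them.

```cfml
<!--- Hypothetical search page: form.q is an assumed field name, not part of the original how-to --->
<cfparam name="form.q" default="">

<cfif Len(Trim(form.q))>
    <!--- Prefix the user's term with the MANY and THESAURUS operators, as in step 4 --->
    <cfsearch name="GetResults"
              collection="yourcollection"
              criteria="<MANY><THESAURUS>#Trim(form.q)#">

    <cfoutput>
        <p>#GetResults.RecordCount# result(s) for "#HTMLEditFormat(form.q)#"</p>
    </cfoutput>

    <!--- List each matching document; the query name is used as a prefix to avoid clashing with the URL scope --->
    <cfoutput query="GetResults">
        <p><a href="#GetResults.url#">#HTMLEditFormat(GetResults.title)#</a></p>
    </cfoutput>
</cfif>
```

Note that a search term containing < or > characters is passed straight through to Verity here, so you may want to strip those characters from user input first.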

NOTE: This page is copyright. It may NOT be reproduced without prior written permission from the author of http://www.website-design-101.co.uk contactable here. The address to use when linking to this page is http://www.website-design-101.co.uk/coldfusion/verity-thesaurus-howto.html.

In our new search query above we have included an additional operator, <MANY>. This instructs Verity to calculate a relevance score for each result. Without it, every page returned is given a score of 100%.
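
To see the effect of <MANY>, you can output the score column that cfsearch returns with each row. The sketch below assumes the score comes back as a decimal between 0 and 1, which is what we would expect from ColdFusion MX; check the raw value on your own version before relying on the percentage conversion.

```cfml
<cfsearch name="GetResults"
          collection="yourcollection"
          criteria="<MANY><THESAURUS>yoursearchterm">

<cfoutput query="GetResults">
    <!--- Assumption: score is a 0..1 decimal, so multiply by 100 to show a percentage --->
    <p>#HTMLEditFormat(GetResults.title)# (#Round(GetResults.score * 100)#%)</p>
</cfoutput>
```

If you remove <MANY> from the criteria and reload the page, every row should show the same score, which makes it easy to confirm the operator is doing its job.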

The same technique can also be used with the SOUNDEX operator, which searches for words that sound similar to the search term.
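
As a rough sketch of what that might look like for SOUNDEX: the style.prm keyword shown in the comment is our assumption, made by analogy with the Thesaurus change in step 3, so verify it against your own style.prm before relying on it.

```cfml
<!---
    Assumption: by analogy with step 3, style.prm would need the Soundex option, e.g.
        $define WORD-IDXOPTS "Stemdex Casedex Soundex"
    Re-index the collection after changing it.
--->
<cfsearch name="GetResults"
          collection="yourcollection"
          criteria="<MANY><SOUNDEX>yoursearchterm">
```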

The Thesaurus only works for the English language in the default configuration.

Some articles suggest that, as an alternative to fixing the <MANY><THESAURUS> operators in your query as above, you could ask your users to enter <THESAURUS> into the search field manually (e.g. on an advanced search page). Assuming you place no restrictions on the <> symbols in search terms, this should also trigger the thesaurus functionality. However, our attempts at this have returned errors. Of course, you could also present the SOUNDEX, TYPO, THESAURUS and other options as check boxes and add the selected option to the criteria of your query.
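
One way the options idea could be sketched is shown below. Since we have not managed to combine operators, this version uses radio buttons so that only one operator is applied at a time. The field names "q" and "op" and the variable names are hypothetical, and each operator may need its own index option in style.prm before it returns anything useful.

```cfml
<!--- Hypothetical field names: form.q is the search term, form.op the chosen operator --->
<cfparam name="form.q" default="">
<cfparam name="form.op" default="">

<form method="post">
    <input type="text" name="q">
    <label><input type="radio" name="op" value="THESAURUS"> Include synonyms</label>
    <label><input type="radio" name="op" value="SOUNDEX"> Include similar-sounding words</label>
    <label><input type="radio" name="op" value="TYPO"> Allow for typing errors</label>
    <input type="submit" value="Search">
</form>

<!--- Only operators on this list are accepted; anything else falls back to a plain search --->
<cfset allowedOps = "THESAURUS,SOUNDEX,TYPO">

<!--- Strip angle brackets so users cannot inject operators of their own --->
<cfset term = Trim(REReplace(form.q, "[<>]", "", "all"))>

<cfif ListFindNoCase(allowedOps, form.op)>
    <cfset searchCriteria = "<MANY><" & UCase(form.op) & ">" & term>
<cfelse>
    <cfset searchCriteria = term>
</cfif>

<cfif Len(term)>
    <cfsearch name="GetResults" collection="yourcollection" criteria="#searchCriteria#">
</cfif>
```

Building the criteria on the server from a fixed list keeps the operator syntax out of the user's hands, which avoids the errors we saw when operators were typed into the search field directly.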

As yet, we have not been able to combine operators. For example, variants of criteria="<MANY><THESAURUS><SOUNDEX><TYPO>yoursearchterm" consistently return errors. If you can provide information on how to achieve this, let us know and we'll post it here and credit you (with a link if required).

One drawback we have found with the Thesaurus is that it sometimes returns off-topic pages. This is not generally the case, but long, content-rich pages can produce erroneous matches. For example, a lengthy 'terms and conditions' document will often feature in the results for more specific terms. Our solution was to filter these pages out of the results. Terms and conditions are usually linked to from every page on a site, so we decided this would not negatively impact the search accuracy.
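
There are several ways to do this kind of filtering. The sketch below is one possibility rather than the exact code we used: it skips any result whose URL contains a fragment from an exclusion list. The list contents and variable names are hypothetical, and it assumes the url column is populated for your collection.

```cfml
<!--- Hypothetical list of URL fragments identifying pages to hide from the results --->
<cfset excludedPages = "terms-and-conditions,privacy-policy">

<cfsearch name="GetResults"
          collection="yourcollection"
          criteria="<MANY><THESAURUS>yoursearchterm">

<cfoutput query="GetResults">
    <!--- Copy the current row's URL before looping over the exclusion list --->
    <cfset pageUrl = GetResults.url>
    <cfset showRow = true>

    <cfloop list="#excludedPages#" index="fragment">
        <cfif FindNoCase(fragment, pageUrl)>
            <cfset showRow = false>
            <cfbreak>
        </cfif>
    </cfloop>

    <cfif showRow>
        <p><a href="#pageUrl#">#HTMLEditFormat(GetResults.title)#</a></p>
    </cfif>
</cfoutput>
```

Because rows are skipped at display time, GetResults.RecordCount will still include the hidden pages, so count the displayed rows yourself if you show a result total.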

© 2006 Mighty Media

Sources

Verity K2 Toolkit Search System Administration Guide (Macromedia)
Mischa Uppelschoten: Practical Verity using CFMX on Windows, PowerPoint (Atlanta CF Users)
Eron Cohen and Michael Smith: The truth about Verity (Maryland CF Users)
Macromedia Live Docs: Using Verity Search Expressions (Macromedia)
Creating a Custom Thesaurus (Sybase)
Verity search engine overview (ETS)