Using the didump utility



Using the didump utility, you can view key components of the word index per partition. The word list is a list of all words indexed by the Verity engine; the zone list is a list of all zones; and the zone attribute list is a list of the zone attributes found by the Verity engine.

The didump executable, which starts the didump application, is located in the platform/bin directory. For more information on the specific location of this directory, see Location of Verity utilities.

For example:

c:\coldfusion9\verity\k2\_nti40\bin\didump /common = c:\coldfusion9\verity\k2\common -pattern llama c:\new\parts\00000001.did

Viewing the word list with the didump utility

You can view the contents of the word list for a partition by using the didump utility with the -words flag. The command-line syntax must include the -words flag and a path to a partition file, as in the following example:

didump -words /z/collbldg/html/parts/00000003.did

An alphabetical listing of the words in the word index displays, as follows:

didump - Verity, Inc. Version 2.5.0 (_nti31, Jul 7 1999)
 
Text             Size    Doc    Word 
A                10      3      4 
a                34      5      24 
abbreviations    4       1      1 
about            4       1      1 
acronym          5       1      2 
acronyms         4       1      1 
actual           4       1      1 
administrator    3       1      1 
advance          3       1      1 
all              8       2      3 
also             9       2      4 
Always           4       1      1 
always           9       2      3 
ampersand        4       1      1

The columns in the display indicate the following:

Size
The number of bytes used by the Verity engine to store information about the word.

Doc
The number of unique documents in which the word appears.

Word
The total number of occurrences of the word in the partition.
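
If you capture the word list in a script, the columns described above can be read into records. The following Python sketch is illustrative only: the names WordEntry and parse_word_list are hypothetical, and it assumes the output has the whitespace-separated layout shown in the listing above.

```python
from typing import List, NamedTuple


class WordEntry(NamedTuple):
    text: str   # the indexed word (Text column)
    size: int   # bytes used to store information about the word (Size)
    docs: int   # unique documents in which the word appears (Doc)
    words: int  # total occurrences of the word in the partition (Word)


def parse_word_list(output: str) -> List[WordEntry]:
    """Parse text laid out like the didump -words listing shown above.

    Assumption: each data row is one word followed by three integers.
    The version banner and the column header do not match and are skipped.
    """
    entries = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) == 4 and all(f.isdigit() for f in fields[1:]):
            entries.append(WordEntry(fields[0], *map(int, fields[1:])))
    return entries


# Sample rows copied from the listing above.
sample = """\
didump - Verity, Inc. Version 2.5.0 (_nti31, Jul 7 1999)

Text             Size    Doc    Word
A                10      3      4
a                34      5      24
acronym          5       1      2
"""

entries = parse_word_list(sample)
for entry in entries:
    print(entry.text, entry.docs, entry.words)
```

Because the word list is case-sensitive (A and a are separate entries), the sketch keeps each row as its own record rather than merging by lowercase text.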

To view the occurrences of a specific word or pattern, enter a command using the -pattern option, as in the following example:

didump -pattern acronym 00000003.did

In this example, the didump utility displays information about the number of occurrences of the word acronym. You can display the individual occurrences of a word using the -verbose option.
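
If you call didump from a script, it can help to build the argument list in one place. The following Python sketch uses a hypothetical helper, didump_args, which is not part of Verity. It assembles only the -words and -pattern command forms shown above, assumes that didump is on the PATH, and does not run the command itself.

```python
from typing import List, Optional


def didump_args(partition: str, pattern: Optional[str] = None) -> List[str]:
    """Return an argument list for one of the two command forms shown above.

    Hypothetical helper: only the -words and -pattern forms from this
    section are produced, and "didump" is assumed to be on the PATH.
    """
    if pattern is None:
        # Full word list for the partition.
        return ["didump", "-words", partition]
    # Occurrence information for a single word or pattern.
    return ["didump", "-pattern", pattern, partition]


word_list_cmd = didump_args("/z/collbldg/html/parts/00000003.did")
pattern_cmd = didump_args("00000003.did", pattern="acronym")
print(" ".join(pattern_cmd))
```

You could pass either list to subprocess.run to run the utility and capture its output.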

Viewing the zone list with the didump utility

The zone list contains a list of the zones identified by the zone filter. You can search the zones listed using the Verity IN operator in a query. To view the contents of the zone list, use the didump utility with the -zones flag plus the path to a partition, like the following:

didump -zones /z/collbldg/html/parts/00000003.did

This partition is for a collection containing the Verity Collection Building Guide in HTML. The Verity universal filter started the HTML filter by default and indexed the documents using the following zones:

didump - Verity, Inc. Version 2.5.0 (_solaris, Jul 07 1999) 
 
ZoneName    Fmt    Size   Doc   Regions 
A           Wct    10239  85    5016 
ADDRESS     Array  34     1     1 
BODY        Array  197    85    85 
CAPTION     Wct    298    31    85 
CODE        Wct    3868   66    1829 
H1          Array  80     83    83 
H2          Wct    646    53    212 
H3          Wct    517    49    171 
H4          Wct    128    8     47 
HEAD        Array  70     85    85 
HTML        Array  165    85    85 
TITLE       Array  70     85    85

The columns in the display indicate the following:

Fmt
The internal data format used to store the zone information.

Size
The number of bytes used by the Verity engine to store information about the zone.

Doc
The number of unique documents in which the zone appears.

Regions
The total number of instances of the zone in the partition.
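
To compare zones in a script, you can read the zone list into a dictionary keyed by zone name. The following Python sketch is illustrative only: the name parse_zone_list is hypothetical, and it assumes the output has the five whitespace-separated columns shown in the listing above.

```python
from typing import Dict


def parse_zone_list(output: str) -> Dict[str, dict]:
    """Parse text laid out like the didump -zones listing shown above.

    Assumption: each data row is ZoneName, Fmt, and three integers.
    The version banner and the column header do not match and are skipped.
    """
    zones = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) == 5 and all(f.isdigit() for f in fields[2:]):
            name, fmt, size, docs, regions = fields
            zones[name] = {
                "fmt": fmt,               # internal storage format
                "size": int(size),        # bytes used for the zone
                "docs": int(docs),        # documents containing the zone
                "regions": int(regions),  # instances of the zone
            }
    return zones


# Sample rows copied from the listing above.
sample = """\
didump - Verity, Inc. Version 2.5.0 (_solaris, Jul 07 1999)

ZoneName    Fmt    Size   Doc   Regions
BODY        Array  197    85    85
H2          Wct    646    53    212
TITLE       Array  70     85    85
"""

zones = parse_zone_list(sample)
# Average number of H2 regions per document that contains an H2 zone.
h2_per_doc = zones["H2"]["regions"] / zones["H2"]["docs"]
print(h2_per_doc)  # 4.0
```

In the sample rows, zones such as BODY and TITLE have equal Doc and Regions values, which means they occur once per document, while H2 occurs several times in the documents that contain it.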

Viewing the zone attribute list with the didump utility

The zone attribute list contains a list of the HTML attributes for the zones identified by the HTML zone filter. You can search the zone attributes listed using the Verity IN operator together with the WHEN operator in a query. To view the contents of the zone attribute list, use the didump utility with the -attributes flag plus the path to a partition, like the following:

didump -attributes /z/collbldg/html/parts/00000003.did

This partition is for a collection containing the Verity Collection Building Guide in HTML.

didump - Verity, Inc. Version 2.5.0 (_solaris, Jul 9 1999) 
 
Text                        Size   Doc   Word 
href 01_cbg.htm             10     2     4 
href 01_cbg.htm#282870      3      1     1 
href 01_cbg.htm#282872      6      2     2 
href 01_cbg1.htm            8      2     3 
href 01_cbg1.htm#286513     7      2     2 
href 01_cbg1.htm#286520     3      1     1 
...

The columns in the display indicate the following:

Size
The number of bytes used by the Verity engine to store information about the zone attribute.

Doc
The number of unique documents in which the zone attribute appears.

Word
The total number of occurrences of the zone attribute in the partition.
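
Unlike the word list, the Text column in this listing contains a space, so splitting each row from the left does not work. The following Python sketch splits from the right instead. It is illustrative only: the name parse_attribute_list is hypothetical, and reading the Text column as an attribute name followed by its value is an assumption based on the sample rows above.

```python
from typing import List, Tuple


def parse_attribute_list(output: str) -> List[Tuple[str, str, int, int, int]]:
    """Parse text laid out like the didump -attributes listing shown above.

    Returns (attribute, value, size, docs, occurrences) tuples.
    Assumption: the last three fields of a data row are integers and
    everything before them is the Text column.
    """
    rows = []
    for line in output.splitlines():
        # Split at most three times from the right so Text stays intact.
        parts = line.rsplit(None, 3)
        if len(parts) == 4 and all(p.isdigit() for p in parts[1:]):
            text = parts[0].strip()
            # Assumed layout of Text: attribute name, a space, then the value.
            attribute, _, value = text.partition(" ")
            rows.append((attribute, value, *map(int, parts[1:])))
    return rows


# Sample rows copied from the listing above.
sample = """\
didump - Verity, Inc. Version 2.5.0 (_solaris, Jul 9 1999)

Text                        Size   Doc   Word
href 01_cbg.htm             10     2     4
href 01_cbg.htm#282870      3      1     1
...
"""

rows = parse_attribute_list(sample)
for attribute, value, size, docs, occurrences in rows:
    print(attribute, value, occurrences)
```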