Form for querying words and word strings

© 2007 Anthony Kroch and Beth Randall

In this demo, each corpus of the PPCHE is represented by ten texts.

  • This form allows users to query files of annotated running text. The path of the file or directory to be queried is entered into the Source File text box.

  • Files are queried in one of two ways: by searching for individual words and sequences of words, or by searching for words together with their associated part-of-speech tags. The choice between the two kinds of search is made in the "CHOOSE QUERY TYPE" pop-up menu.

  • Once a query type is chosen, the appropriate form appears, containing just one word. Each word is entered into its text box; the wild card character asterisk ('*') may be used. Clicking on the "ADD WORD" button adds another word, up to a total of five, and the "REMOVE WORD" button removes words. Where it is available, the "CHOOSE TAG" pop-up menu specifies a tag for each word.

  • Between the words of the form there is a pop-up menu that specifies the ordering relationship between the word to its left and the word to its right. The default value is "Precedes" and the choices are:
    • First word precedes second word (Precedes).
    • First word immediately precedes second word (iPrecedes).
    • The relationship between the words is free or unspecified (Free Order).

  • Below the ordering pop-up menu there is a radio button labelled "Neighborhood." When it is clicked, an "N SIZE" window appears, in which the initial label must be replaced by an integer. The integer specifies the maximum number of words that may appear between the word on the left and the one on the right in an output sentence.

  • Once the web form has been filled out and a POS-tagged file to be searched has been chosen, the query is executed by clicking on the Submit Query button. The result is returned on a new web page.

  • The Annotation Manual for the Penn Parsed Corpora of Historical English is available on the web. A full list of part-of-speech tags is included. A user who wants to use the full list of tags in searching should use the option "Enter Tag" under one or more words. Clicking on this option brings up a text box into which any POS tag may be inserted. The wild card asterisk may be used here.
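
The ordering choices and the Neighborhood setting described above can be illustrated with a short sketch. This is not the code behind the form; it is a minimal Python illustration, assuming that a sentence is a plain list of words, that '*' stands for any run of characters, and that matching is case-sensitive. The function names (wildcard_to_regex, positions, sentence_matches) are hypothetical, and part-of-speech tags are left out.

```python
import re

def wildcard_to_regex(pattern):
    # Escape every literal piece, then let each '*' match any run of characters.
    # Assumption: '*' is the only special character in a query word.
    pieces = [re.escape(piece) for piece in pattern.split("*")]
    return re.compile(".*".join(pieces))

def positions(pattern, tokens):
    # Indices of the words in the sentence that match the query word.
    regex = wildcard_to_regex(pattern)
    return [i for i, token in enumerate(tokens) if regex.fullmatch(token)]

def sentence_matches(tokens, first, second, relation="Precedes", n_size=None):
    # relation is one of "Precedes", "iPrecedes", "Free Order".
    # n_size, if given, is the maximum number of words allowed between the two.
    if relation not in ("Precedes", "iPrecedes", "Free Order"):
        raise ValueError("unknown relation: " + relation)
    for i in positions(first, tokens):
        for j in positions(second, tokens):
            if i == j:
                continue  # the two query words must be different tokens
            if relation == "Precedes" and j <= i:
                continue
            if relation == "iPrecedes" and j != i + 1:
                continue
            words_between = abs(j - i) - 1
            if n_size is not None and words_between > n_size:
                continue
            return True
    return False

sentence = "the king gave the knight a sword".split()
print(sentence_matches(sentence, "king", "sword"))                # Precedes
print(sentence_matches(sentence, "sword", "king", "Free Order"))  # either order
print(sentence_matches(sentence, "the", "k*", "iPrecedes"))       # wild card
print(sentence_matches(sentence, "king", "sword", n_size=3))      # too far apart
```

In this sketch the first three calls succeed and the last one fails, because four words separate "king" and "sword" while the neighborhood size allows at most three.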