archive-com.com » COM » C » CLRES.COM

  • CL Research competition participation
    SENSEVAL-2 Participation; CLR TREC-9 Q&A Participation; CLR TREC-8 Q&A Participation; SENSEVAL-1 Participation. Site Map. This document maintained by Ken Litkowski. Copyright 2001 CL

    Original URL path: http://www.clres.com/competitions.html (2016-02-11)


  • Papers
    K. C. (2001). Use of Machine Readable Dictionaries in Word Sense Disambiguation for Senseval-2. Proceedings of the Association of Computational Linguistics Special Interest Group on the Lexicon, Senseval-2. (ps)

    Litkowski, K. C. (2000). The Synergy of NLP and Computational Lexicography Tasks. Technical Report 00-01. Damascus, MD: CL Research.

    Litkowski, K. C. (2000). SENSEVAL: The CL Research Experience. Computers and the Humanities, 34(1-2), pp. 153-8. (Online expanded version)

    Litkowski, K. C. (1999). Towards a Meaning-Full Comparison of Lexical Resources. Proceedings of the Association for Computational Linguistics Special Interest Group on the Lexicon, June 21-22, College Park, MD. (ps)

    Litkowski, K. C. (1998). Analysis of Subordinating Conjunctions. Technical Report 98-01. Gaithersburg, MD: CL Research.
        This paper describes ongoing analysis of the definitions of subordinating conjunctions in Webster's Third International Dictionary, in development of a lexical knowledge base for these lexical items. We describe (1) procedures used to move toward identification of the primitives of subordinating conjunctions using digraph analysis, and (2) a preliminary analysis of the syntactic and semantic structure of subordinating conjunctions, synthesizing (a) a WordNet analysis of key words used in subordinating conjunction definitions, (b) observations about subordinating conjunctions in Quirk et al., (c) the work of Ken Barker ("Interactive semantic analysis of clause-level relationships"), and (d) the work of Alistair Knott ("A data-driven method for classifying connective phrases", Journal of Language and Speech 39, 1996), and (4) a brief overview of the next stages of the analysis, which are under way. Any comments, criticisms, and suggestions are welcome, as are any offers of assistance for characterizing appropriate features to be assigned to the lexical entries.

    Litkowski, K. C. (1997). Automatic Creation of Lexical Knowledge Bases: New Developments in Computational Lexicology. Technical Report 97-03. Gaithersburg, MD: CL Research. (WordPerfect 6.0a, 88k; Postscript, 202k)

    Litkowski, K. C. (1997). Desiderata for Tagging with WordNet Synsets and MCCA Categories. Proceedings of the 4th SIGLEX Workshop on Tagging Text with Lexical Semantics: Why, What and How, April 1997, Washington, DC. (WordPerfect 6.0a, 35k; Postscript, 167k; Word 97, 74k)

    Litkowski, K. C. and M. D. Harris (1997). Category Development Using Complete Semantic Networks. Technical Report 97-01. Gaithersburg, MD: CL Research. (WordPerfect 6.0a, 43k; Postscript, 194k)

    Primer on Computational Lexicology (1992). (159k)

    Litkowski, K. C. (1980). Requirements of Text Processing Lexicons. Proceedings of the 18th Annual Meeting of the Association for Computational Linguistics, Philadelphia, PA. (Expanded Paper)

    Litkowski, K. C. (1978). Models of the Semantic Structure of Dictionaries. American Journal of Computational Linguistics, Microfiche 81, Frames 25-74.
        This paper describes the philosophy underlying the development of the DIMAP toolkit. The procedures for analyzing dictionary definitions are as valid today as when they were first written. Current developments in WordNet, Microsoft's MindNet, and Caroline Barrière's analysis of a children's dictionary were prefigured here. The paper particularly focuses on the meanings of verbs and provides a framework for using dictionaries to formalize the semantic characteristics described in

    Original URL path: http://www.clres.com/papers.html (2016-02-11)

  • Frame Element Taxonomy
    The information on this page is displayed in frames. If your browser cannot view frames, click here to go to the CL Research Site Map.

    Original URL path: http://www.clres.com/db/feindex.html (2016-02-11)

  • The Preposition Project
    preposition is applicable. Since similar items may be grouped together (i.e., frame name, frame element name, and lexical unit), several instances were tagged at a time. In some instances, the lexicographer found that multiple senses are applicable; in this case, each applicable sense number is included. Overall, fewer than 500 instances have multiple senses. The lexicographer also found that the sense inventory requires splitting of senses; during the first phase of the project, the sense inventory was expanded by approximately 10 percent.

    A major innovation of ODE was the development of a mini-hierarchy in grouping senses, leading to core senses and subsenses that are semantically similar to the core but may represent some type of sense extension, usually a narrowing or broadening of meaning (see below for tables showing examples of how these were coded). The lexicographer annotated each subsense with the type of extension.

    During the course of annotating the FrameNet instances for major prepositions, the lexicographer kept notes and prepared a summary describing the treatment of the preposition. When warranted, a specific comment was attached to a particular sense, frequently with reference to the preposition's summary.

    After the instances for major prepositions had been completed, the lexicographer began a systematic traversal through the dictionary for all prepositions that had not yet been analyzed. When a preposition with instances was reached, the procedures described above were followed. For prepositions without FrameNet instances, the lexicographer made use of other corpora available to him, such as the British National Corpus, to analyze a preposition's senses. Each sense of these prepositions was characterized in the same way as above. The only difference is that for these less prominent prepositions, usually with only one or two senses, no set of tagged sentences is available.

    From a lexicographic perspective, it turns out each source of information about the behavior of a preposition is
    incomplete in itself. All three sources in use on the project are complementary in providing an overall assessment of the meaning and characterization of the preposition. ODE may be found wanting when placed next to the FrameNet instances; this project thus revealed further aspects of the appropriate sense inventory. ODE does not provide a summary picture of a preposition's meanings; the characterization in Quirk et al. provides such a perspective, but it too is incomplete, both in coverage of a particular meaning and in not identifying correspondences with other prepositions. The FrameNet database does not provide instances for all the senses. Litkowski and Hargraves (2005) provides further details on the use of the three different sources; Litkowski and Hargraves (2006) provides further details on the coverage of the three sources.

    Analyzing the Semantic Role for a Sense

    With the tagged instances, a simple sort by sense number of the Excel spreadsheet identifies the Frame.FrameElement pairs for each sense. These pairs are aggregated into one list in the Sense Analysis spreadsheet. Table 3 shows these pairs for the first three senses of "through".

    Table 3. Frame.FrameElement Pairs Identified for Senses of "through"

      Sense   Relation Name   Frame.FrameElement Pairs
      1 (1)   ThingTransited  Arriving.Path, Cause_motion.Path, Cotheme.Path, Departing.Path,
                              Escaping.Location, Escaping.Path, Evading.Path, Fluidic_motion.Path,
                              Mass_motion.Path, Motion.Path, Motion_directional.Path,
                              Motion_noise.Path, Operate_vehicle.Path, Path_shape.Path,
                              Placing.Goal, Placing.Path, Removing.Path, Roadways.Area,
                              Self_motion.Area, Self_motion.Path, Breathing.Path
      2 (1a)  ThingBored      Cause_harm.Body_part, Impact.Impactee,
                              Natural_features.Relative_location, Use_firearm.Path
      3 (1b)  ThingTransited  Emotion_heat.Location, Path_shape.Area, Ride_Vehicle.Path,
                              Roadways.Path, Self_motion.Self_mover, Travel.Path

    As indicated above, the lexicographer identified a semantic role label for each sense based on intuition. These labels are developed somewhat independently of
    computational linguistic theories. These labels are intended to be used in characterizing prepositional phrases, after using criteria laid out in the complement and attachment syntactic and semantic properties for disambiguating the prepositions. Gildea and Jurafsky (2002) developed a mapping of frame elements into 18 higher-level semantic roles. The methodology followed here provides an alternative mapping that is more data-driven and less subjective. In many senses for which FrameNet instances were identified, there is a clear correspondence between the frame element names and the semantic relation assigned by the lexicographer.

    After about half the prepositions had been analyzed, the lexicographer began to group the semantic roles into higher-level categories, i.e., generic classes of prepositions. There are 21 categories at the present time, including common semantic role names such as Agent, Cause, Means, Spatial, and Temporal, but also less common names such as Backdrop, Quantity, Scalar, and Target. The less common names emerged from the data. Each preposition sense and its associated semantic role was examined carefully with respect to these categories and placed in one of them. This examination led to refinement of the categories and to changes in the semantic role names as the full sense inventory was completed. Although the generic classes are still regarded as preliminary, they provide an initial taxonomy for the complete set of preposition senses in the English language.

    The Frame.FrameElement pairs identified in the project show the range and variation of frame elements that have been developed by the FrameNet lexicographers. Frame.FrameElement pairs and lexical units are shown in Table 4 for "through" (sense 3), given the label ThingTransited. Examination of a table like this might indicate that this sense encapsulates a Path semantic role. Since other senses of "through" also have a Path role, the lexicographer's assignment indicates a finer granularity on the type of path. At the same time,
    however, the FrameNet assignment of an Area frame element for hitchhike also indicates a finer granularity on the type of path, suggesting that the path might be through a region. The other Path frame elements might also have such an interpretation.

    Table 4. Analysis of Sense 3 (ThingTransited) for "through"

      Frame.Frame Element     Lexical Units
      Emotion_heat.Location   boil.v, seethe.v, burn.v
      Path_shape.Area         crisscross.v
      Ride_Vehicle.Path       hitchhike.v
      Roadways.Path           bypass.n, highway.n, line.n, motorway.n, path.n, pathway.n,
                              road.n, street.n, track.n, trail.n
      Self_motion.Self_mover  sprint.v
      Travel.Path             journey.n, journey.v, tour.n, travel.v

    This type of analysis demonstrates the richness of the data generated by tagging instances. This type of analysis has only begun. Efforts are currently under way to use this type of analysis to integrate results from TPP into FrameNet.

    Gold Standards for Preposition and Semantic Role Disambiguation

    In addition to identifying the instances for the lexicographer to use in characterizing the different senses of a preposition, an XML file of the sentences themselves was also generated. Each sentence was given an identifier consisting of the preposition name, the sentence number, and the character position of the preposition. The sentences for which the preposition senses have been assigned constitute a suitable corpus for the development of disambiguation routines for semantic role assignment. In this respect, these sentences are essentially equivalent to the lexical sample task followed in Senseval. In addition, since these instances are FrameNet-tagged sentences, they provide a suitable dataset for the Senseval FrameNet semantic role task.

    The sentence instances for 34 of the most common prepositions were used in a SemEval-2007 task on disambiguation of prepositions (Litkowski and Hargraves, 2007). The accompanying table shows the number of sentences that were used in SemEval-2007 for each preposition, broken down into the number used for training and the number used for testing. The
    papers by Ye and Baldwin (2007), Yuret (2007), and Popescu et al. (2007) describe the results of the three teams participating in this task. These results show considerable progress in disambiguating preposition senses, with nearly 70 percent accuracy by the top-performing team.

    Litkowski (2002) described a set of disambiguation tests for the preposition "of", based solely on introspection of its definitions. Those tests are not sufficient. As implied in Table 2, the complement and attachment properties require a richer set of semantic tests, for which suitable lexical resources do not presently exist. Sense 1 of "through" requires that the prepositional phrase be attached to a verb of motion. WordNet has a general motion category for verbs, so in this case a suitable test can be made. However, for sense 2, it is necessary to identify verbs of penetration; no such category is available in WordNet. A Roget-style thesaurus might provide the necessary information, e.g., look up "penetration" in the thesaurus and then examine the verbs in the same thesaurus category. Prepositions, like verbs, may have associated subcategorization patterns, e.g., requiring a gerundial complement, such as a means sense of "by". The Quirk syntax described in Table 2 provides some additional syntactic properties. In general, however, it appears that syntactic properties will not be sufficient. The machine learning algorithms used in Senseval for the semantic roles task may prove to be the most appropriate set of techniques.

    An important question surrounding the use of prepositions is whether the phrases they introduce are arguments or adjuncts. In Merlo and Esteve Ferrer ("The Notion of Argument in Prepositional Phrase Attachment", Computational Linguistics 32(3), pp. 341-78), it was shown that argument-hood could be predicted, frequently in conjunction with lexical classes of their attachment points. A current task under TPP is to examine whether it is possible to predict whether senses in the preposition inventory can be assigned to argument or adjunct
    status based on their characteristics as developed in TPP (see Hargraves (2007) for an initial assessment of this possibility).

    In any event, the corpus instances developed in TPP can serve as an appropriate testbed for the development of disambiguation routines. The properties identified by the lexicographer will be used in the further development of these routines, particularly in the use of various lexical resources, including syntactic dictionaries, WordNet, machine-readable dictionaries, and thesauruses. It is expected that further development of these routines will lead to further refinement of the lexicographer's characterizations, as well as a greater level of specificity about the kinds of information necessary from the lexical resources. Further examination of results from SemEval-2007 is ongoing.

    Identifying Other Prepositions and Other Syntactic Realizations Filling the Same Semantic Roles

    A tagged sentence in the FrameNet database identifies a specific frame element within a specific frame for the prepositional phrase introduced by the preposition. For example, "by" introduces the frame element Mode_of_transportation or Path in the Arriving frame. The FrameNet database can be queried to determine other prepositions and other syntactic realizations in which these frame elements occur. The distinct patterns in which these occur are summarized by identifying all unique occurrences of Frame, Frame Element, Lexical Unit, Grammatical Function, Phrase Type, and Preposition within the database. (Preposition is included only when the Phrase Type is PP.) There may be many sentences that have been tagged similarly, but only unique occurrences need to be identified to examine the distribution of the same frame element.

    In the example in Table 5 below, taken from the file generated for the preposition "by", several combinations are evoked by the seed element. The Mode_of_transportation frame element was seeded by the instances for arrive.v and/or come.v (sense 8 of "by"); the Path element was evoked by the
    instances for enter.v (sense 5 of "by"). It can be seen that, in addition to "by", "in" is also used to indicate the Mode_of_transportation frame element, also as a Complement to the main verb. For the Path frame element, in addition to "by", the prepositions "on", "through", "via", "round", "past", "towards", and "across" are used. The Path frame element is also expressed as the Direct Object for one verb (come).

    Table 5. Variations in Syntactic Realizations of a Frame Element for "by"

      Frame     Frame Element           Lexical Unit  GF    PT     Preposition
      Arriving  Mode_of_transportation  arrive.v      Comp  PP     by
      Arriving  Mode_of_transportation  arrive.v      Comp  PP     in
      Arriving  Mode_of_transportation  come.v        Comp  PP     by
      Arriving  Mode_of_transportation  return.n      Comp  PP     by
      Arriving  Path                    approach.v    Comp  PP     on
      Arriving  Path                    approach.v    Comp  PP     through
      Arriving  Path                    approach.v    Comp  PP     via
      Arriving  Path                    arrive.v      Comp  PP     through
      Arriving  Path                    arrive.v      Comp  PP     via
      Arriving  Path                    come.v        Comp  PP     round
      Arriving  Path                    come.v        Comp  PP     through
      Arriving  Path                    come.v        Comp  PP     via
      Arriving  Path                    come.v        Obj   NP
      Arriving  Path                    enter.v       Comp  PP     at
      Arriving  Path                    enter.v       Comp  PP     by
      Arriving  Path                    enter.v       Comp  PP     through
      Arriving  Path                    enter.v       Comp  PP     via
      Arriving  Path                    get.v         Comp  PP     past
      Arriving  Path                    reach.v       Comp  PP     by
      Arriving  Path                    reach.v       Comp  PP     through
      Arriving  Path                    reach.v       Comp  PPing
      Arriving  Path                    return.n      Comp  PP     towards
      Arriving  Path                    return.v      Comp  PP     across

    In a second example, shown in Table 6, 52 lines were generated for the Cure.Treatment combination from a single instance of "through", via the verb rehabilitate.v (sense 12, labeled Intermediary by the lexicographer, but essentially a means semantic role). The Cure.Treatment pair occurs in a much greater range of lexical items, including not only verbs (alleviate, cure, ease, heal, rehabilitate, resuscitate, and treat), but also nouns (cure, healer, palliation, remedy, therapist, therapy, and treatment) and adjectives (curative, palliative, rehabilitative, and therapeutic). Examining just those with a Phrase Type of PP, we see that "by", "with", "without",
    and "for" are other prepositions, in addition to "through", expressing the Treatment frame element.

    Table 6. Variations in Syntactic Realizations of a Frame Element for "through"

      Frame  Frame Element  Lexical Unit      GF    PT     Preposition
      Cure   Treatment      alleviate.v       Comp  PP     by
      Cure   Treatment      alleviate.v       Comp  PP     with
      Cure   Treatment      alleviate.v       Comp  PPing
      Cure   Treatment      alleviate.v       Ext   NP
      Cure   Treatment      curative.a        Ext   NP
      Cure   Treatment      curative.a        Head  N
      Cure   Treatment      cure.n            Comp  NP
      Cure   Treatment      cure.n            Comp  VPing
      Cure   Treatment      cure.n            Ext   NP
      Cure   Treatment      cure.v            Comp  PP     by
      Cure   Treatment      cure.v            Comp  PP     with
      Cure   Treatment      cure.v            Comp  PP     without
      Cure   Treatment      cure.v            Comp  PPing
      Cure   Treatment      cure.v            Ext   NP
      Cure   Treatment      ease.v            Comp  PP     by
      Cure   Treatment      ease.v            Comp  PP     with
      Cure   Treatment      ease.v                  DNI
      Cure   Treatment      ease.v            Ext   NP
      Cure   Treatment      heal.v            Comp  PP     by
      Cure   Treatment      heal.v            Comp  PP     with
      Cure   Treatment      heal.v            Comp  PPing
      Cure   Treatment      heal.v            Ext   NP
      Cure   Treatment      healer.n          Ext   NP
      Cure   Treatment      palliation.n      Mod   N
      Cure   Treatment      palliative.a      Head  N
      Cure   Treatment      rehabilitate.v    Comp  PP     through
      Cure   Treatment      rehabilitate.v    Comp  PPing
      Cure   Treatment      rehabilitate.v    Ext   NP
      Cure   Treatment      rehabilitative.a  Head  NP
      Cure   Treatment      remedy.n          Comp  AJP
      Cure   Treatment      remedy.n          Comp  NP
      Cure   Treatment      remedy.n          Comp  PP     for
      Cure   Treatment      remedy.n          Comp  PPing
      Cure   Treatment      remedy.n          Mod   AJP
      Cure   Treatment      resuscitate.v     Comp  PP     through
      Cure   Treatment      therapeutic.a     Comp  NP
      Cure   Treatment      therapeutic.a     Ext   NP
      Cure   Treatment      therapeutic.a     Head  N
      Cure   Treatment      therapist.n       Mod   N
      Cure   Treatment      therapy.n         Ext   NP
      Cure   Treatment      therapy.n               INI
      Cure   Treatment      therapy.n         Mod   AJP
      Cure   Treatment      therapy.n         Mod   N
      Cure   Treatment      treat.v           Comp  AVP
      Cure   Treatment      treat.v           Comp  PP     by
      Cure   Treatment      treat.v           Comp  PP     with
      Cure   Treatment      treat.v           Comp  PPing
      Cure   Treatment      treat.v           Ext   NP
      Cure   Treatment      treatment.n       Comp  NP
      Cure   Treatment      treatment.n       Comp  PP     by
      Cure   Treatment      treatment.n       Comp  PP     with
      Cure   Treatment      treatment.n       Ext   NP

    Using the frames and frame elements from all sense-tagged instances as seeds, 9309 lines and 5440 lines similar to those in Tables 5 and 6 were generated for "by" and "through", respectively. These results can then be examined by sense number and lead to an identification of all other prepositions expressing the frame elements as shown in Table 3. These prepositions are shown in Table 7, alongside those the lexicographer listed on the basis of intuition and Quirk assessments of semantic similarity.

    Table 7. Other Similar Prepositions for Senses of "through"

      Columns: Sense; Lexicographer Prepositions; Prepositions Identifiable from FrameNet

      2 (1a)  into into on over about at across in under against between through around with behind off onto towards by down outside along near below beneath above of within underneath beside beyond throughout close up for from
      3 (1b)  among within inside through under within at beneath amongst between on behind among above around over all close across along down towards up past via from of alongside by with to

    The number of other prepositions expressing frame elements encompassed by a single sense was quite surprising. The first explanation for this large number was simply that the lexicographer had overlooked some possibilities. And indeed, upon reviewing the lists, the lexicographer could imagine substituting some of the suggestions in example sentences. However, the large number requires a more systematic explanation.

    To assess the substitutability of other prepositions for a given semantic role, the lexicographer first examined their definitions in ODE for similarity. Many had similar definitions, but many did not. The lexicographer then examined the definitions in the Oxford English Dictionary (OED), which has a much larger number of senses than ODE. Rather than finding similar senses, the lexicographer concluded that, in fact, ODE simply provided a better organization of the many senses, ignoring obsolete and dated senses. Instead of attempting to reach a final
    conclusion on substitutability, this issue will await further data, when the other prepositions undergo their sense tagging. The analysis at that time will examine the semantic role assignments for prepositions deemed substitutable and determine their congruence. In particular, it will be possible to examine the array of frame elements of putative substitutable senses.

    In addition to the other-preposition analysis, the FrameNet data support an in-depth examination of other methods of

    Original URL path: http://www.clres.com/prepositions.html (2016-02-11)
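
The unique-occurrence tabulation described in the Preposition Project excerpt above (distinct Frame, Frame Element, Lexical Unit, Grammatical Function, Phrase Type, and Preposition combinations, with Preposition kept only when the Phrase Type is PP) can be sketched roughly as follows. This is a minimal illustration, not the project's actual tooling: the `Realization` record, both function names, and the handful of sample instances (loosely modeled on a few Table 5 rows) are assumptions made for the example.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class Realization(NamedTuple):
    """Hypothetical record for one tagged instance (field names are assumed)."""
    frame: str
    frame_element: str
    lexical_unit: str
    gf: str                      # grammatical function, e.g. "Comp", "Obj"
    pt: str                      # phrase type, e.g. "PP", "NP"
    preposition: Optional[str]   # meaningful only when pt == "PP"


def unique_realizations(annotations):
    """Collapse tagged instances into the sorted list of distinct patterns."""
    seen = set()
    for a in annotations:
        # Preposition is included only when the phrase type is PP.
        prep = a.preposition if a.pt == "PP" else None
        seen.add(a._replace(preposition=prep))
    return sorted(seen, key=lambda r: tuple(x or "" for x in r))


def prepositions_by_frame_element(realizations):
    """Map each (frame, frame element) pair to the prepositions realizing it."""
    table = defaultdict(set)
    for r in realizations:
        if r.preposition is not None:
            table[(r.frame, r.frame_element)].add(r.preposition)
    return {key: sorted(preps) for key, preps in table.items()}


# Hypothetical sample; the repeated row stands for sentences tagged alike.
SAMPLE = [
    Realization("Arriving", "Mode_of_transportation", "arrive.v", "Comp", "PP", "by"),
    Realization("Arriving", "Mode_of_transportation", "arrive.v", "Comp", "PP", "by"),
    Realization("Arriving", "Mode_of_transportation", "arrive.v", "Comp", "PP", "in"),
    Realization("Arriving", "Path", "enter.v", "Comp", "PP", "by"),
    Realization("Arriving", "Path", "enter.v", "Comp", "PP", "via"),
    Realization("Arriving", "Path", "come.v", "Obj", "NP", None),
]

patterns = unique_realizations(SAMPLE)
others = prepositions_by_frame_element(patterns)
```

Here `patterns` plays the part of the generated lines in Tables 5 and 6, and `others` corresponds to the per-frame-element preposition lists that feed a comparison like Table 7. Grouping by sense number would need an extra sense field on each record, which is left out of this sketch.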

  • The Preposition Project (Online Lookup)
    Enter a preposition. Links: index, data, TPP home, TPP feedback, help. The Preposition Project, CL Research, 2005-2007. Online Oxford Dictionary of English, Oxford University Press, 2005.

    Original URL path: http://www.clres.com/cgi-bin/onlineTPP/find_prep.cgi (2016-02-11)

  • Generic classes of Prepositions
    candlelight; a man all at sea. In all cases, the purpose is to associate the presence of a particular condition or fact with some other element of the sentence.

    Barrier: This is a small category. The complement represents a physical thing that stops action. There is some kinship with Target (below), and this category could possibly be absorbed by it, except that the barrier may not be an intentional stopping place or destination, whereas the target is.

    Cause: This category embraces the many prepositional senses that name the cause for something, sometimes for the POA, but also for other things named in the sentence. The smaller category Purpose has been absorbed into this one.

    Doubles: This is a small category that could possibly be combined with Tandem, though not usefully, I think, because Tandem is already a bit large and unwieldy. Doubles is confined to only two prepositions, "between" and "among", whose complements in the senses included are always dual or plural, since the prepositions essentially stipulate a relationship embracing two or more things. Typically the POA indicates the nature of the relationship.

    Exception: This category of prepositional senses includes mostly subjuncts and disjuncts that indicate something constituting an exception or exclusion to what is predicated in the related clause.

    Means/Medium: This category takes in all prepositional senses where the complement identifies the means by which, or the medium through which, something happens or is done. This category roughly corresponds to a grammatical instrumental case. More granularity could be achieved if the means were separated from the mediums, but it doesn't strike me that this would be terribly useful.

    Membership: This relatively small category is for senses that establish a relationship of membership between POA and complement, wherein either can be a member of the genus that the other represents; the salient thing is that the preposition, very often along with other words in proximity, states that the relationship is
    one of genus and species.

    Party: Relatively small and perhaps provisional category for senses whose complement is a person (though not the main actor, and so not classifiable under Agent) and that don't clearly fall into another category, such as spatial or temporal. In principle, all SRTypes that begin with Party partake of this category; if they are assigned elsewhere, it is because their current home, to me, reflects a more important or useful classification.

    Possession: A relatively small category for complements representing something that is owned, held, or worn by the complement. All such sentences could in theory be written with the complement or POA as subject and some form of the verb "have". By implication, then, some of this category could be absorbed into Agent.

    Quantity: This category holds complements that can be expressed as a number or some other quantity.

    Scalar: This category holds complements that have reference to a scale. Most often they identify a point on a continuum, but I have also included those that establish the existence of

    Original URL path: http://www.clres.com/PrepositionClasses.htm (2016-02-11)

  • Pattern Dictionary of English Prepositions
    indicate that this instance is really a transitive phrasal verb, where the lemma should be tagged as a particle and not a prepositional phrase, and "unk" for unknown, i.e., not yet tagged. The Save option is for registered editors and is used to commit taggings to the database. Steps and aids used in tagging instances are described below.

    In addition to making use of the pattern descriptions, features identified in parsing all instances can be examined and used as the basis for selecting instances automatically. These features characterize the context of a preposition's use and provide links to FrameNet frame elements associated with FrameNet lexical units.

    Preposition Syntagmatic Patterns

    In characterizing preposition behavior, the general semantic content of each element of Governor-preposition-Complement must be specified. We consider each component.

    Complement: Syntactically, the complement is a noun phrase, a nominal wh-clause, or a nominal -ing clause. Considered by itself, the complement has a meaning, i.e., some ontological category. For example, Boston is a city. This category may frequently help in disambiguating the preposition. However, more generally, some additional meaning is given to the complement. For example, Boston may be a destination or a point of reference. The precise meaning will come from the preposition and the governor.

    Preposition: The preposition associated with the complement provides a first step in allowing us to determine what additional meaning should be added to the complement. In general, a given complement can appear after a large number of prepositions. For the example of Boston, we can imagine sentences using the following prepositions: across, against, around, beyond, from, in, into, of, over, through, to, and within. Other prepositions, such as between, by reason of, during, and until, are unlikely to have Boston as a complement. The specific preposition will impart some information on how we want to interpret the complement.

    Governor: The final piece of meaning associated with
    the complement is provided by the governor, or the point of attachment of the prepositional phrase. For the example of Boston, the verb "played" (with "against Boston") will invoke a sports context, while "resided" (with "in Boston") will invoke a locational sense.

    In analyzing preposition behavior, therefore, the objective is to tease apart these various elements. The procedures for doing so are laid out below.

    Steps and Aids in Tagging Instances

    In general, tagging TPP instances is based on considering the pattern descriptions in the pattern manager. Since the pattern-set definitions are based on the Oxford Dictionary of English, the likelihood is that the coverage and accuracy of the sense distinctions are quite high. However, since prepositions have generally not received the close attention given to words in other parts of speech, PDEP is intended to ensure the coverage and accuracy. During the development of the SemEval-2007 tagged instances using FrameNet sentences, the lexicographer found it necessary to increase the number of senses by about 10 percent. Since the lack of coverage in FrameNet is well recognized, the representative sample developed for PDEP should provide the basis for ensuring the coverage and accuracy of the sense inventory.

    As indicated, the first step in tagging instances involves looking at the patterns and seeing whether the TPP instances can be tagged with existing patterns. In addition to the patterns, instances that have been tagged for SemEval-2007 (labeled FN) or the Oxford English Corpus (labeled OEC) can be opened and used as the basis for making judgments on the TPP corpus. We have provided tools to enhance the examination of similarities from the FN or OEC corpora and the application of the results to the TPP instances.

    As indicated, all sentences in the corpora have been fully parsed with a dependency parser. Features characterizing the context of the target preposition have also been developed for each sentence using Tratz's system. There are approximately 1500 features for each
    sentence; these data are almost instantly available for examination.

    When a particular corpus has been opened, whether for a particular sense or for the entire set, the menu bar includes an Examine item and a Select item. Next to the Examine item, there are two drop-down boxes, with the initial options labeled WFRs (word-finding rules) and FERs (feature extraction rules). To use the examine or select capability, a WFR and an FER need to be selected.

    Word-finding rules enable examination of features for words in a certain contextual location with respect to the target preposition. They are divided into two sets: words pertaining to the governor and words pertaining to the complement. Words pertaining to the governor are (1) verb or head to the left (l), (2) head to the left (hl), (3) verb to the left (vl), (4) word to the left (wl), and (5) governor (h). Words pertaining to the complement are (1) syntactic preposition complement (c) and (2) heuristic preposition complement (hr). Thus, selecting one of these options identifies the word whose properties are to be examined.

    Feature extraction rules identify the specific kind of feature to be examined. There are 9 feature kinds: (1) part of speech, using the Penn Treebank categories (pos); (2) word class, the 4 major word classes (wc); (3) lexical name, the WordNet file name category, 27 possibilities for nouns and 15 for verbs (ln); (4) lemma, the base form of a word (l); (5) the word as it appears (w); (6) synonyms, as identified in WordNet (s); (7) hypernyms, the first level in WordNet (h); (8) whether the word is capitalized (c); and (9) affixes present in the word, a set of 27 suffix or prefix characteristics (af). Thus, the feature extraction rules enable examination of specific syntactic or semantic features of the selected word.

    The combination of WFRs and FERs provides 63 features (7 word-finding rules times 9 feature kinds) that can be examined for any corpus that is opened. When a WFR and an FER have been selected, clicking on Examine brings up a new tab with the results for that word-feature combination. The results are presented in a table with the headings
Value, Count, and Description. Value gives the value of the feature. Count indicates the number of instances with this value. Description is given for only two features, the part of speech and the affixes, where the codes given in the value field are not always transparent. For the feature identifying whether a word is capitalized, the value is only "true". For most features, the number of possible values is relatively small, so the table is only several rows deep. For the lemma and the word itself, the number of distinct entries is limited by the number of instances in the particular corpus set being examined. For the synonym and hypernym features, the number of entries may be quite a bit larger.

In addition to the features that have been developed through parsing the sentences in a corpus, an additional capability allows examination of potential semantic role labels, using FrameNet data associated with lexical units as annotated in the FrameNet project. Next to the drop-down boxes for specifying WFRs and FERs, there is a checkbox labeled FN when the given preposition has been used for marking a frame element. When frames are developed and sentences containing lexical units for the frame are annotated, a set of frame element realizations is recorded in summary form. Many of these realizations are in the form PP[prep]. We have created a dictionary of the FrameNet lexical units that contains a list of all frame element realizations associated with each lexical unit. Throughout the FrameNet data, 75 distinct prepositions are recorded along with the frame element. When the FN box is checked for a particular corpus of a preposition, the set of lexical units with that preposition is retrieved. We hypothesize that the governor of a prepositional phrase is the trigger for this phrase. To examine the occurrences of a possible frame element governed by one of these triggers, we need to select the governor WFR (h) and the lemma FER (l). With this combination, and with the FN box checked, clicking on Examine will generate
a table of all governors in the lemma form (i.e., lexical units) in the current corpus that have been tagged in FrameNet. In addition to the count of instances, the results also identify, under the Description heading, the set of frame elements that have been assigned to these prepositional phrases in FrameNet. In many cases, more than one frame element has been tagged with the given lexical unit. For example, some sentences for the lexical unit "dance" have been tagged for the preposition "across" with the Area or the Path frame element.

A similar capability has been added to examine prepositions identified in VerbNet. Throughout the VerbNet data, 31 distinct prepositions have been identified in VerbNet frames. Again, with the selection of the governor WFR (h) and the lemma FER (l), and with the VN box checked, clicking on Examine will generate a table of all governors in the lemma form (i.e., members of VerbNet verb classes) in the current corpus that have been identified in VerbNet frames. In addition to identifying the lemmas, the results also identify the VerbNet classes. In some cases, a lemma may appear as a member of more than one verb class using the given preposition.

The general objective of examining features is to identify those that are diagnostic of specific senses. To do this most effectively, it is best to open the corpus instances that have been tagged with a specific sense in either FN or OEC (see the instructions above for Preposition Corpus Instances). Experience in examining features will identify the most useful combinations.

When an interesting feature has been identified, it can be used to select sentences in the open corpus set. To do this, put the value identified in a feature examination in the box next to Select and then click on Select (or just push the Enter key after entering text in this field). When this is done on an FN or OEC corpus, particularly those for specific senses, the selected instances will generally show the consistency with which these instances have
been tagged. When the same feature combination is used with the TPP corpus, particularly for instances not yet tagged, the selection will identify candidate instances for tagging with a specific sense. For example, opening the full TPP corpus for "over", specifying hr as the WFR and ln as the FER, and then placing noun.time in the selection box will identify 122 instances out of 500 that have this characteristic. Inspection will show how well this combination is diagnostic of sense 14(5) of "over".

Recording Preposition Behavior in the Pattern Box

By examining features, the behavior of a particular sense can be characterized. As indicated above, examining characteristics of the two tagged corpora (OEC and FN) will be useful in formalizing the TPP data in the pattern box. This may begin with an examination of the word classes (wc) and parts of speech (pos) of the complements and governors. These can be used to check the appropriate boxes in the pattern description (NN, NNP, WH, or ING for the complements and Noun, Verb, or Adj for the governors). A next step might be to examine the complement and governor lemmas (l) and words (w). It is likely that several words or lemmas will be identified. Several potential categorizations of these words can be examined, including WordNet lexical names (ln), WordNet synonyms (s), WordNet hypernyms (h), FrameNet frame element realizations (with FN checked), and VerbNet verb classes (with VN checked). When these features are examined, the results show the number of instances in the particular subcorpus and the total number of instances in that corpus, so that some assessment of generality can be made. The WordNet features tend to produce a larger number of total hits, reflecting the polysemy present in WordNet. The number of FrameNet and VerbNet hits is always below the total number of instances; this reflects the coverage of these two resources.

When some features appear to be diagnostic of a sense, the specifications can be applied to the TPP corpus using the Select facility. When the selected
instances appear to have been selected appropriately, they can then be tagged with the particular sense under investigation. In such cases, the selection criteria are entered into the Selector fields of the patterns. For example, for pattern 12(10) of "for" (indicating the length of a period of time), the WordNet lexical name noun.time is found to be quite prevalent in the OEC and FN corpora for this sense. When applied to the TPP corpus, most selected instances appear to be correctly identified. Upon examination, any incorrect selections can be unselected. The sense 12(10) is then applied to the selected instances. Finally, the annotation hr ln noun.time is entered into the Selector field for the complement.

Once instances in TPP have been tagged for a specific sense, these instances can be investigated in further depth the next time this sense is examined. It is much easier to examine the consistency of the tagging when only the instances with these tags are shown. Further shades of meaning can perhaps be identified, possibly with further refinement of all fields in the pattern description.

It is worth noting that examination of WordNet, FrameNet, and VerbNet features may provide additional insights into those resources. The WordNet features frequently reveal unexpected characterizations, such as "school" as a time period. For FrameNet, the FN corpus shows a very high number of hits for FrameNet head lemmas, while the OEC and TPP corpora show a much lower number of hits. VerbNet also has a much smaller number of hits. Thus, presuming that the identification of head lemmas is quite accurate, analysis of the TPP instances may provide an opportunity for expanding the coverage of FrameNet and VerbNet.

Preposition Class Analyses

PDEP enables an in-depth analysis of TPP classes, Tratz clusters, and Srikumar semantic relations. First, we query the database underlying the patterns to identify all senses with a particular class. We then examine each sense on each list in detail. We follow the procedures laid
out above for examining the features to add information about selectors, complement types, and categories. We use this information to tag the TPP instances conservatively, so that the tagging is reliable (e.g., leaving questionable instances untagged). Finally, we carefully place each sense into a preposition class or subclass, grouping senses together and making annotations that attempt to capture any nuance of meaning that distinguishes the sense from other members of the class. To build a description of the class and its subclasses, we make use of the Quirk reference in the pattern box, i.e., the relevant discussions in Quirk et al. (1985). We build the description of a class as a separate web page and make this available as a menu item in the pattern box, labeled Analysis. A class analysis is not yet available for all classes; the current state of class analysis is described in Preposition Class Analyses.

The description provides an overview of the class, making use of the TPP data and the Quirk discussion and indicating the number of senses and the number of prepositions. Next, the description provides a list of the categories within the class, characterizing the complements of the category and then listing each sense in the category, with any nuance of meaning as necessary. Finally, we attempt to summarize the selection criteria that have been used across all the senses in the class. A list of preposition senses in each class and their semantic relation type (Srtype) is also provided, along with a count of the number of instances tagged with each sense, the percentage of instances for the preposition that have been tagged with each sense, and a normalized frequency of the occurrence of each sense in the British National Corpus (per million prepositions).

The process of building a class description reveals inconsistencies in each of the class fields. When we place a preposition sense into the class, we may find it necessary to make changes in the underlying data. At the top level, these class analyses in effect
constitute a coarse-grained sense inventory. As the subclasses are developed, a finer-grained analysis of a particular area is available. We believe these analyses may provide a comprehensive characterization of particular semantic roles that can be used for various NLP applications.

Downloading Data

All data used in PDEP are available for direct download. The full database is available in a set of MySQL files for upload into a MySQL database. Specific data are also available in JavaScript Object Notation (JSON), using a simple format of string-value pairs. This is done through PHP scripts, as provided below. Each script is described with (1) a link to the section above where the relevant portion of PDEP is described, (2) a brief statement of what the script returns, (3) a link to the script (opening in another window), and (4) a detailed list of the field names (the strings) and the values, when these are not obvious.

Pattern Dictionary of English Prepositions Data

All data from PDEP are available as a download in a set of three MySQL files suitable for upload into a MySQL database. These include (1) definitions for all 1040 senses (patterns) of 304 prepositions, (2) properties for each sense in 26 fields, and (3) tagged instances for all sentences in the TPP corpora. As significant changes to PDEP are made, a new version of this data will be made available. (Latest: October 11, 2015)

Preposition Inventory

This script (http://www.clres.com/db/prepstats.php) returns a list of all the prepositions, with summary data about each. The fields are preposition, patterns (the number), status, fn (FrameNet instances), oec (OEC instances), tpptags (TPP instances that have been tagged), tpp (TPP instances), bnc (BNC frequency), create by (creator of the entry), created (date)
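
As a rough illustration of how the inventory output might be consumed, the following Python sketch parses a JSON list of string-value records and reports what share of each preposition's TPP instances has been tagged. The field names preposition, tpptags, and tpp come from the field list above; the list-of-objects layout, the string-typed counts, and every sample value are assumptions made only for this example and are not actual PDEP data.

```python
import json

# Hypothetical sample shaped like the inventory output. The field names
# follow the list above; the values are made-up placeholders, and the
# list-of-objects layout is an assumption.
SAMPLE = """
[
  {"preposition": "over", "patterns": "3", "status": "WIP",
   "fn": "10", "oec": "20", "tpptags": "250", "tpp": "500", "bnc": "1000"},
  {"preposition": "aboard", "patterns": "1", "status": "initial",
   "fn": "0", "oec": "0", "tpptags": "0", "tpp": "0", "bnc": "10"}
]
"""


def tagging_coverage(records):
    """Map each preposition to the fraction of its TPP instances already tagged."""
    coverage = {}
    for rec in records:
        total = int(rec["tpp"])       # counts assumed to arrive as strings
        tagged = int(rec["tpptags"])
        # Avoid dividing by zero for prepositions with no TPP instances.
        coverage[rec["preposition"]] = tagged / total if total else 0.0
    return coverage


records = json.loads(SAMPLE)
coverage = tagging_coverage(records)
for prep, frac in sorted(coverage.items()):
    print(f"{prep}: {frac:.1%} of TPP instances tagged")
```

With live data, the sample string would be replaced by the text returned from the script, provided its structure matches the assumptions stated above.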

    Original URL path: http://www.clres.com/pdep.html (2016-02-11)
    Open archived version from archive

  • Preposition Pattern Editor
    the case of prepositions, the units are the complement (object of the preposition) and the governor (point of attachment of the prepositional phrase). See principles and instructions for more details. Enter a preposition or select a status and push Load to begin exploring this dictionary. Only registered editors can save changes.

    Original URL path: http://www.clres.com/db/TPPEditor.html (2016-02-11)
    Open archived version from archive



  •