University of Minnesota Digital Technology Center, May 2005
Measures of Semantic Similarity and Relatedness in the Medical Domain
Ted Pedersen1 Serguei Pakhomov2 Siddharth Patwardhan3
1 Department of Computer Science, University of Minnesota, Duluth, MN
2 Division of Biomedical Informatics, Mayo Clinic, Rochester, MN
3 School of Computing, University of Utah, Salt Lake City, UT

Abstract

[...] vectors derived from medical corpora. We [...] semantic similarity for general English to [...] methods with a newly created test bed of 30 medical concept pairs that were scored [...] index experts. We find that our context vector measure correlates better with these [...]

Introduction

A measure of semantic similarity takes as input two concepts, and returns a numeric score that quantifies how much they are alike, based on is-a relations. For example, common cold and illness are similar in that a common cold is a kind of illness. However, there are other relations between concepts such as has-part, is-a-way-of-doing, etc., that existing measures of similarity cannot use since they only account for is-a relations.

This suggests that more general measures of semantic relatedness are needed to take advantage of increasingly rich ontologies (particularly in the medical domain) which have a wealth of relations beyond is-a. This is especially relevant in light of progress in automatically identifying a wide range of semantic relations in medical text (e.g., Rosario and Hearst, 2004). It seems likely that measures of relatedness such as we propose here can help to organize discovered relations between concepts, and thereby automatically augment existing ontologies with new relations and concepts.

Measures of semantic similarity and relatedness have also proven useful in a number of NLP tasks. For example, Budanitsky and Hirst (2001) identify malapropisms using various measures of similarity and relatedness. Resnik (1995), Patwardhan, et al. (2003), and McCarthy, et al. (2004) employ measures of similarity in their approaches to word sense disambiguation. However, much of this work has been relative to WordNet (Fellbaum, 1998), which [...]

There are a growing number of ontologies that organize medical concepts into hierarchies and semantic networks, perhaps best exemplified by the Unified Medical Language System (UMLS) of the National Library of Medicine (NLM). The largest and most extensive of the ontologies included in UMLS is SNOMED-CT, which we use in the experiments in this paper. We adapted a number of measures of semantic similarity to SNOMED-CT, and compare those to our own corpus based measure.

In this paper we introduce a measure of semantic relatedness that derives context vectors from medical corpora. It is a robust measure since it does not rely on the structure of an ontology to measure relatedness between concepts. As such it can be used between any two concepts for which we have the necessary corpus based information (which will be described shortly). Here we confine its use to concept pairs that are joined in an is-a hierarchy (SNOMED-CT) since the measures to which we compare require this, but in general our measure is more flexible and does not require this.

This paper proceeds with a review of a number of existing measures of semantic similarity that have been applied to general English and the medical domain. Then we introduce the medical ontology SNOMED-CT and the Mayo Clinic corpus of clinical notes, which are our information sources for the measures we explore in this paper. We introduce a new measure of relatedness based on second order context vectors derived from corpora. We also introduce a new test bed for the evaluation of measures of semantic similarity and relatedness in the medical domain. Finally, we present our experimental results, and suggestions for future work.

Measures of Semantic Similarity

Measures of semantic similarity are often based on information regarding is-a relations found in a concept hierarchy. This information can have to do with path lengths between concepts, or it may augment such structural information with corpus based statistics. We describe existing measures of similarity here, and then in Section 4 we introduce our own method, which is adapted from the word sense discrimination technique of Schütze (1998).

Path Finding Measures

When concepts are organized in a hierarchy, it is convenient to measure similarity according to structural measures that find path lengths between concepts. In fact, there have been a variety of such approaches proposed in both the medical domain [...]

Rada, et al. (1989) developed a measure based on path lengths between concepts in the Medical Subject Headings (MeSH) ontology [1]. They relied on broader-than relations, which provide successively more or less specific concepts as you travel from concept to concept. They used this measure to improve information retrieval by ranking documents retrieved from MEDLINE, a corpus made up of abstracts of biomedical journal articles. More recently, Caviedes and Cimino (2004) developed a measure called CDist which finds the shortest path between two concepts in the UMLS. Their evaluation relative to a small set of concepts and concept clusters drawn from a subset of the UMLS consisting of MeSH, ICD9CM [2] and SNOMED shows that even such relatively simple approaches tend to [...]

[1] MeSH is distributed by the National Library of Medicine.
[2] International Classification of Diseases, 9th revision, Clinical Modification.

Wu and Palmer (1994) present a measure of similarity for general English that relies on finding the most specific concept that subsumes both of the concepts being measured. The path length from this shared concept to the root of the ontology is scaled by the sum of the distances of the concepts to the subsuming concept. Leacock and Chodorow (1998) define a similarity measure that is based on finding the shortest path between two concepts and scaling that value by twice the maximum depth of the hierarchy, and then taking the logarithm of the resulting score. Hirst and St-Onge (1998) introduce a path finding measure of relatedness, which is a more general relation than is similarity. In brief, the relatedness of two concepts is determined by the nature of the paths that join them; ideally this should be a path that is not too long and has relatively few changes in direction.

For the experiments in this paper, we developed a shortest path algorithm, and adapted the measure [...]

Information Content Measures

Resnik (1995) presents an alternative to path finding via the notion of information content. This is a measure of specificity assigned to each concept in a hierarchy based on evidence found in a corpus. A concept with high information content is very specific, while concepts with lower information content are associated with more general concepts.

The information content of a concept is estimated by counting the frequency of that concept in a large corpus, along with the frequency of all the concepts that are subordinate to it in the hierarchy. The probability of a concept is determined via a maximum likelihood estimate, and the information content is the negative log of this probability.

Resnik defines a measure of similarity that holds that two concepts are semantically related in proportion to the amount of information they share. The quantity of shared information is determined by the information content of the lowest concept in the hierarchy that subsumes both the given concepts. However, the Resnik measure may not be able to make fine grained distinctions since many concepts may share the same least common subsumer, and would therefore have identical values of similarity.

Jiang and Conrath (1997) and Lin (1998) developed measures that scale the information content of the subsuming concept by the information content of the individual concepts. Lin does this via a ratio, and Jiang and Conrath with a difference.

Lord, et al. (2003) adapted these three information content measures to the Gene Ontology (GO). They found that these measures can be successfully used for “semantic searching” of the textual resources available to bioinformatics research.

We adapted the measures of Resnik, Lin, and Jiang and Conrath to SNOMED-CT, using the Mayo Clinic clinical notes for information content estimation.

SNOMED-CT

SNOMED-CT (Systematized Nomenclature of Medicine, Clinical Terminology) is an ontological/terminological resource that has a wide and relatively consistent coverage of the clinical domain. It is produced by the College of American Pathologists and is now distributed as part of the UMLS through the National Library of Medicine. SNOMED-CT is used for indexing electronic medical records, ICU monitoring, clinical decision support, medical research studies, clinical trials, computerized physician order entry, disease surveillance, image indexing and consumer health information services, to name a few. The version of SNOMED-CT we use in this paper consists of more than 361,800 unique concepts with over 975,000 descriptions (entry terms) (SNOMED-CT [...]).

The terminology is organized into 13 hierarchies at the top level: clinical findings, procedures, observable entities, body structures, organisms, substances, physical objects, physical forces, events, geographical environments, social contexts, context-dependent categories and staging and scales.

The concepts and their descriptions are linked with approximately 1.47 million semantic relationships such as is-a, assists, treats, prevents, associated etiology, associated morphology, has property, has specimen, associated topography, has object, has manifestation, associated with, classifies, has ingredient, mapped to, mapped from, measures, clinically associated with, used by, anatomic structure is physical part of, to name a few.

One characteristic of SNOMED-CT that presents challenges for calculating semantic similarity is that it allows multiple inheritance where a concept can belong to more than one of the 13 hierarchies. As a preliminary solution, we introduce a “root” node that is the parent of the top 13 hierarchies and the path length between two concepts is calculated by taking the average path lengths for each hierarchy in which the concept was found.

Mayo Clinic Corpus of Clinical Notes

The corpus that was used in this paper consists of ~1,000,000 clinical notes which cover a variety of major medical specialties at the Mayo Clinic. Clinical notes have a number of specific characteristics that are not found in other types of discourse, such as news articles or even scientific medical articles found in MEDLINE. Clinical notes are generated in the process of treating a patient at a clinic and normally represent the dictations every physician practicing in the US is required to file. As a result, these notes represent a kind of quasi-spontaneous discourse (Anonymous) where the dictations are made partly from notes and partly from memory. More often than not, the speech tends to be telegraphic which presents certain challenges for natural language processing.

At the Mayo Clinic, the dictations are transcribed by trained personnel and are stored in the patient’s chart electronically. These transcriptions are then made available for health science research. The notes are semi-structured where each note consists of a number of subsections such as Chief Complaint (CC), History of Present Illness (HPI), Impression/Report/Plan (IP), Final Diagnoses (DX), among others. A typical example of a clinical note is given in Figure 1.

****CC****
Imdur 30 mg q.d. Lisinopril 5 mg q.d. (increased to 10 mg q.d. today)
Her vocal cord examination yesterday was unremarkable. She broke her ankle toward the end of YEAR and is still limping but it is improving. While she was hospitalized for aspiration pneumonia after her vocal cord biopsy in DATE, she developed tachycardia with ECG changes. Echocardiogram showed EF of 30-35% with regional wall motion abnormalities. She was started on Lisinopril and Imdur.
****IP****
#2 ASO Plan: Because of some elevated blood pressure, we will increase her Lisinopril to 10 mg q.d.
#1 Probable CAD
#2 ASO

Figure 1. A short excerpt from a Clinical Note

We are particularly interested in the CC, HPI, IP and DX sections of the clinical notes. The CC section records the reason for visit; the HPI section has information on what other treatments/problems the patient has had in the past; the IP section contains the diagnostic and current treatment information, while the DX section is an abstraction of the IP section: it contains only a list of diagnoses. Other sections such as SI (Special Instructions) and CM (Current Medications) are less interesting from the standpoint of semantic relatedness measures, although if we were to focus on computing semantic relatedness between medications, then we may want to consider the CM section as well.

Context Vector Measure

We have developed a measure of semantic relatedness that represents a concept with a context vector. This is more flexible than measures of similarity since it does not require that the concepts be connected via a path of relations in an ontology.

Our context vector measure is an adaptation of Schütze’s (1998) method of word sense discrimination. First, a co-occurrence matrix is constructed such that each cell contains the log-likelihood score between a term found in the description of a concept, and each of the words it co-occurs with in a given corpus. Thus, the rows of this matrix represent the description terms that are used to define concepts, while the columns are the words with which it occurs in a large corpus of text. The exact nature of the description words for a concept can vary, but could consist of a gloss or definition, or a [...]

We have created context vectors that rely on a rich source of descriptions that has been systematically collected over the past ten years at Mayo and represents a large amount of human coding experience. This resource contains over 16 million unique diagnostic phrases expressed through natural language that correspond to over 21,000 diagnoses and represents an utterance level thesaurus. This database provides entry points to patient information at Mayo Clinic. Each diagnostic statement has been uttered by a practicing physician at the Mayo Clinic as part of the patient’s medical record, and has been manually coded and cataloged for subsequent retrieval using a Mayo Clinic modified Hospital International Classification of Diseases Adaptation (HICDA). The HICDA classification is a hierarchy consisting of four levels. The top level is the most general and has 19 categories such as Neoplasms, Diseases of the Circulatory System, etc. The next 3 levels group diagnoses into more [...]

The Mayo Clinic thesaurus is constructed on the assumption that if several diagnostic phrases have been classified to the same category in the HICDA hierarchy, then these phrases can be considered as synonymous at the level of granularity afforded by HICDA. For example, diagnostic phrases such as “primary localized hilar cholangiocarcinoma” and “cholangiocarcinoma of the Klatskin variety” are linked in a thesaurus-like fashion because these two statements have been manually classified the same way. We consider these two phrases nearly synonymous and use them to generate quasi-definitions for terms found in both SNOMED and this utterance level thesaurus of diagnostic phrases.

There is a fair amount of noise in this collection, which we attempt to reduce by excluding those phrases that occur 5 times or fewer and those phrases that are classified as “Admission, diagnosis not given.” The Mayo Clinic thesaurus is then merged with the UMLS. The merging consists of string-matching each term in the Mayo Clinic thesaurus to a concept in the UMLS and then transferring all the other terms associated with the UMLS (which subsumes SNOMED) concept into the Mayo Clinic thesaurus. The resulting thesaurus contains 3,665,721 diagnostic phrases organized into 594,699 clusters, which is an average of 6 terms in a cluster. The thesaurus represents 344,550 or 95% [...]

Then, to represent the concepts that occurred in both the thesaurus and SNOMED-CT for semantic similarity measures, we take each of the descriptor terms in the thesaurus/cluster and build a co-occurrence matrix for it from the Mayo Clinic clinical notes. After the vectors are created, the concepts represented by the descriptor words are themselves represented by an averaging of all the vectors associated with all the descriptor words. Thus, a SNOMED-CT concept is represented entirely outside of SNOMED-CT by way of an averaged vector of word co-occurrences, where the words that represent a concept are derived from an [...]

Experimental Data
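As a concrete illustration of the context vector construction described in the preceding section, here is a minimal Python sketch. It is not the authors' implementation and it makes several simplifying assumptions: cells hold raw co-occurrence counts rather than log-likelihood scores, co-occurrence is counted within a hypothetical fixed token window, relatedness is taken to be the cosine between averaged vectors, and the notes and descriptor terms are invented toy data.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(notes, window=2):
    """Map each word to a Counter of the words seen within `window` tokens of it.

    Assumption: raw counts stand in for the log-likelihood scores used in the paper.
    """
    vectors = defaultdict(Counter)
    for note in notes:
        tokens = note.lower().split()
        for i, word in enumerate(tokens):
            start, stop = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(start, stop):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def concept_vector(descriptor_terms, vectors):
    """Average the co-occurrence vectors of every word in a concept's descriptor terms."""
    words = [w for term in descriptor_terms
             for w in term.lower().split() if w in vectors]
    if not words:
        return {}
    total = Counter()
    for w in words:
        total.update(vectors[w])
    return {k: v / len(words) for k, v in total.items()}


def cosine(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0


# Invented toy corpus and descriptor terms, for illustration only.
notes = [
    "patient has congestive heart failure with pulmonary edema",
    "heart failure treated with lisinopril",
    "pulmonary edema noted on chest film",
    "ankle fracture after fall",
    "fracture of ankle treated with cast",
]
vectors = cooccurrence_vectors(notes)
chf = concept_vector(["congestive heart failure", "heart failure"], vectors)
edema = concept_vector(["pulmonary edema"], vectors)
fracture = concept_vector(["ankle fracture"], vectors)

# On this toy corpus, heart failure comes out closer to pulmonary edema than to ankle fracture.
print(cosine(chf, edema) > cosine(chf, fracture))
```

Because a concept is represented only by words from its descriptor terms and their corpus neighbours, nothing in the sketch consults the ontology, which is the property the section emphasizes.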
Measures of semantic similarity can be evaluated both directly and indirectly. The direct method compares systems relative to human judgments; common standards for general English are provided by Rubenstein and Goodenough (1965) and Miller and Charles (1991). The indirect methods evaluate similarity and relatedness measures based on the performance of the application that relies on the measures. Spelling correction (Budanitsky and Hirst, 2001) and word sense disambiguation (Patwardhan, Banerjee, and Pedersen, 2003) have been shown to be sensitive to semantic relatedness.

For the medical domain there are no existing sets of related words that have been created by human experts that could be used in our study. As such we created a test bed of pairs of medical terms that were scored by human experts according to their relatedness. We asked a Mayo Clinic physician trained in Medical Informatics to generate a set of 120 term pairs representing a range of relatedness from not related at all to very closely related. The number 120 reflects the fact that we asked the physician to generate pairs in four broad categories, following Rubenstein and Goodenough, with 30 term pairs in each. Subsequently, we had 13 medical index experts annotate each pair with a relatedness value on a scale of 1 to 10. The Medical Index consists of a group of people who are trained to classify clinical diagnoses using the same HICDA classification system used in the Mayo Clinic thesaurus. The classification is done primarily for subsequent identification of patient cohorts that match the criteria requested by health science research investigators who conduct epidemiological studies at the Mayo Clinic. The experts who annotated the test set for this study have had between 5 and 14 years of coding experience. Although they do not have formal training in medicine, by virtue of working with clinical records and terminologies they have had a lot of exposure to medical language and we considered them as good candidates for this annotation [...]

[Figure 2: table of the 30 term pairs with averaged scores. Only part of pair 20, “chronic obstructive pulmonary disease”, survives in this copy.]

Figure 2. Test set of 30 medical term pairs sorted in the order of the averaged physicians’ scores.

As a control, we had 10 of the 13 experts [3] annotate the 30 general English term pairs in the Rubenstein and Goodenough and Miller and Charles test sets. This was done to make sure the experts understood the instructions and the notion of relatedness. The correlation of the medical index experts’ judgments with those of the annotators used by Rubenstein and Goodenough was relatively high (0.84). Similarly, the correlation with the Miller and Charles test set was even better (0.88). Unfortunately, the agreement on the medical test set of 120 concept pairs turned out to be fairly low (0.51). In order to derive a more reliable test set we extracted only those pairs for which the agreement was relatively high. This resulted in a set of 30 concept pairs (displayed in Figure 2) which were then annotated by three physicians and a subset of 9 available medical index experts from the 13 who annotated the original 120 pairs. All three physicians are specialists in the area of rheumatology. The fact that all of them specialize in the same sub-field of medicine can be helpful in getting good inter-rater agreement. Each pair was annotated on a 4 point scale: “practically synonymous, related, marginally related and unrelated.”

[3] Not all of the experts were always available to us at all times, so the number of annotators changed from one set of annotations to the next. No new experts were added, only subtracted based on their availability and work load.

We have listed the term pairs and the averaged scores assigned by the physicians and the experts in Figure 2. The term pair 20 in bold has been excluded from the test bed because the term “lung infiltrates” was not found in the SNOMED-CT terminology. Thus, the resulting test set consists of 29 pairs; however, we were able to calculate the inter-rater correlation using all 30 pairs. The average correlation between physicians is 0.68. The average correlation between experts is 0.78. We also computed the correlation across the two groups after we averaged the scores each member of the respective groups had assigned to each pair in the test set. The correlation across groups is 0.85.

Experimental Results

We implemented two measures of semantic similarity based on the structure of SNOMED-CT: the shortest path algorithm and the Leacock and Chodorow measure. We implemented three measures that are based on a combination of information content statistics derived from the Mayo Clinic clinical notes, and the ontology provided by SNOMED-CT. Finally, we implemented our own context vector measure by finding co-occurrence vectors from the Mayo Clinic clinical notes based on the descriptor terms associated with concepts in the Mayo Clinic thesaurus. As such the context vector measure was the only measure that was not [...]

First, we tested two hypotheses related to the context vector measure. The first was that the correlation between the measure and the experts depends on the size of the training corpus. The second hypothesis was that the section type of the clinical notes also has an effect on the correlation. In the following sections, we test these two hypotheses and show a comparison between all available [...]

Corpus Size

We experimented with training the Context Vector measure on variable amounts of text ranging from a corpus of 100,000 clinical notes to 1M notes.

[Table 3: surviving column headers are “Number of [...]” and “Matrix size”; the values are not recoverable from this copy.]

Table 3. Descriptive training corpus statistics (frequency range = 5-1000).

In order to build the matrices, we set the parameters so that we would only count the co-occurrence of terms that occurred more than 5 times and fewer than 1000 times. Table 3 provides descriptive statistics on the corpus size and the resulting co-occurrence matrix size.

Test results relative to the test set of 29 term pairs are shown in Figure 3. The overall trend suggests that the correlation between relatedness judgments of the context vector measure and those of human experts improves with larger amounts of training data where 300K size appears to be the [...]

[Figure 3: plot of correlation against corpus size (x1000 notes).]

Figure 3. Correlation with human experts for the context vector measure trained on all sections. The log trend line is showing improvement with corpus size.

Figure 3 shows correlation with the scores provided by the group of physicians and the experts separately as well as combined scores averaged across both groups. For these experiments, we used data from 4 sections of the clinical notes: Chief Complaint (CC), History of Present Illness (HPI), Impression/Plan (IP) and Final Diagnosis (DX).
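The evaluation reported here compares a measure's scores with averaged human ratings for the physicians, the experts, and both groups combined. The sketch below shows one way such a comparison could be computed. It rests on assumptions that the text does not settle: it uses Pearson's correlation (the paper says only "correlation"), it forms the combined score as the mean of the two group averages, and every rating and measure score in it is invented toy data rather than a value from the paper.

```python
import math
from statistics import mean


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def average_ratings(ratings_by_rater):
    """Average the per-pair scores across raters (one list of scores per rater)."""
    return [mean(scores) for scores in zip(*ratings_by_rater)]


# Invented ratings for 5 term pairs on a 4 point scale (4 = practically synonymous).
physicians = [[4, 3, 2, 1, 1], [4, 4, 2, 2, 1], [3, 3, 3, 1, 1]]
experts = [[4, 2, 2, 1, 1], [3, 3, 1, 2, 1]]
# Invented relatedness scores from some measure for the same 5 pairs.
measure_scores = [0.91, 0.70, 0.42, 0.30, 0.05]

phys_avg = average_ratings(physicians)
expert_avg = average_ratings(experts)
# Assumption: "both" is the mean of the two group averages.
both_avg = [mean(pair) for pair in zip(phys_avg, expert_avg)]

for label, human in (("physicians", phys_avg),
                     ("experts", expert_avg),
                     ("both", both_avg)):
    print(label, round(pearson(measure_scores, human), 2))
```

A rank correlation could be substituted for `pearson` without changing the rest of the sketch; which statistic the authors used is not stated in this copy.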
We then experimented with the four section types by keeping the corpus size constant at 100K notes and varying the section type while building the context vectors. Table 4 displays the results.

[Table 4: columns are section, physicians, experts, both, N; the values are not recoverable from this copy.]

Table 4. Correlation results for the context vector measure trained on different sections of a 100K notes corpus.

The best correlation is achieved on the corpus compiled from the IP sections, closely followed by DX. This is not surprising as the IP section contains the diagnostic information pertinent to the patient’s condition and intuitively should contain more closely related terms than other sections. The DX section is an abstraction of the IP section in that it only contains the diagnoses without addi[...]

We conclude that the most optimal context vector measure would result from training on the IP sections of the entire 1M notes corpus. In the following section, we compare the most optimal con[...]

Comparison with other measures

Table 5 shows the results of comparing all available [...]

[Table 5: columns are Method, Phys., Expert, Both; the surviving row labels are VECTOR (IP only, 1M notes) and VECTOR (All sect, 1M notes); the values are not recoverable from this copy.]

Table 5. Comparison of correlations across measures with context vector measure trained on the IP sections

The vector based measure appears to produce better results than any other measure (by a wide margin where physicians are concerned), followed by the information content based measures and then the path based measures. This is an interesting result as it suggests that an ontology-independent measure can perform at least as well as, or better than, the best of the ontology-based measures. Another interesting observation is that the context vector measure produces a much closer correlation with physicians than with experts. For all other measures, this is reversed. We hypothesize that this result is due to the nature of the professional training and activities of the two groups: experts are trained in using hierarchical classifications, while physicians are trained to diagnose and treat patients. One possible indication from this observation is that the data contained in the clinical notes may reflect certain kinds of semantic relations between medical concepts in the mind of a physician better than a hand-crafted medical ontology such as SNOMED-CT. By all means, more experimentation is necessary in [...]

It is also worth pointing out that the context vector based measure trained on the IP sections of 1M notes performs considerably better than the measure trained on all sections of 1M notes, especially on physicians’ judgments as shown in the first two [...]

Conclusions and Future Work

The existence of semantic equivalence classes between lexical items in English makes it highly desirable to use thesauri of synonymous or nearly synonymous terms for information (IR) and document retrieval (DR) applications. The issue is particularly acute in the medical domain due to stringent completeness requirements on such IR tasks as patient cohort identification. An epidemiologist performing an incidence study would rather sift through irrelevant patient records than miss any potentially relevant patients. We believe that semantic relatedness can improve the performance of such systems, since it makes it possible to map the user’s search query for “congestive heart failure” to include cardiac decompensation, pulmonary edema, ischemic cardiomyopathy and volume overload as terms related to congestive heart failure. Clearly, pulmonary edema does not denote the same or even a similar disorder as congestive heart failure but under the patient cohort identification conditions it could be considered as an equivalent search [...]

In our experiments we have been able to show the efficacy of adapting WordNet based semantic relatedness measures developed for general English to a specialized subdomain of biomedicine represented by SNOMED-CT. We have also determined that an ontology-independent context vector measure is at least as good as other ontology-dependent measures, provided that there is a large enough corpus of unlabeled training data available. This finding is important because developing specialized ontologies such as WordNet, SNOMED or UMLS is a very labor intensive process. Also, there are some indications that a manually constructed ontology may not fully reflect the reality of semantic relationships in the mind of a practicing physician. The vector based measure can help alleviate these problems in addition to the benefit of rapid adaptation to a new domain.

In the near future, we plan to extend the measures of relatedness to use the UMLS as a source of the ontological relations for path-based measures as well as glosses for vector based measures. A fairly complete list of definitions is provided in the latest version of UMLS 2004AC. We also would like to experiment with applications of semantic relatedness measures to NLP tasks such as word-sense discrimination, information retrieval and [...]

Acknowledgements

This work was supported in part by a grant from the Digital Technology Center (DTC) of the University of Minnesota under their Digital Technology Initiative (DTI) program in 2004-2005.

References

Banerjee, S. and Pedersen, T. (2003) Extended Gloss Overlaps as a Measure of Semantic Relatedness; Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence; pages 805-810; August; Acapulco, Mexico.

Budanitsky, A. and Hirst, G. (2001) Semantic Distance in WordNet: An Experimental Application Oriented Evaluation of Five Measures; Proceedings of the Workshop on WordNet and other Lexical Resources: Applications, Extensions, and Customizations; pages 29-34; June; Pittsburgh, PA.

Caviedes, J. and Cimino, J. (2004) Towards the development of a conceptual distance metric for the UMLS. Journal of Biomedical Informatics 37: 77-85.

Chute, C.G., Crowson, D.L. and Buntrock, J.D. (1995) Medical information retrieval and WWW browsers at Mayo. Proceeding of Annual Symposium on Comput. Appl. [...]

Fellbaum, C. (Ed.) (1998) WordNet: An Electronic Lexical Database. MIT Press. Cambridge, MA.

Hirst, G. and St-Onge, D. (1998) Lexical Chains as Representations of Context for the Detection and Correction of Malapropisms; In: Fellbaum, C. (Ed.) (1998) WordNet: An electronic lexical database; pages 305-332. MIT Press. Cambridge, MA.

Jiang, J. and Conrath, D. (1997) Semantic Similarity Based on Corpus Statistics and Lexical Taxonomy; Proceedings of International Conference on Research in Computational Linguistics; pages 19-33; Taipei, [...]

Leacock, C. and Chodorow, M. (1998) Combining Local Context and WordNet Similarity for Word Sense Identification; In: Fellbaum, C. (Ed.) (1998) WordNet: An electronic lexical database; pages 265-283; [...]

Lin, D. (1998) An Information Theoretic Definition of Similarity; In Proceedings of the 15th International Conference on Machine Learning; pages 296-304; [...]

Lord, P.W., Stevens, R.D., Brass, A. and Goble, C.A. (2003) Investigating Semantic Similarity Measures across the Gene Ontology: the Relationship between Sequence and Annotation. Bioinformatics, 19(10):1275-83.

McCarthy, D., Koeling, R., Weeds, J. and Carroll, J. (2004) Finding predominant senses in untagged text. In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics. Barcelona, Spain.

Miller, G. and Charles, W. (1991) Contextual Correlates of Semantic Similarity; Language and Cognitive Processes; 6(1):1-28.

Patwardhan, S., Banerjee, S., and Pedersen, T. (2003) Using Measures of Semantic Relatedness for Word Sense Disambiguation; Proceedings of the 4th International Conference on Intelligent Text Processing and Computational Linguistics; pages 241-57; February; [...]

Rada, R., Mili, H., Bicknell, E. and Blettner, M. (1989) Development and Application of a Metric on Semantic Nets; IEEE Transactions on Systems, Man and Cybernetics; 19(1):17-30.

Resnik, P. (1995) Using Information Content to Evaluate Semantic Similarity in a Taxonomy; Proceedings of the 14th International Joint Conference on Artificial Intelligence; pages 448-453; August; Montreal.

Rosario, B. and Hearst, M. (2004) Classifying Semantic Relations in Bioscience Texts. Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics; pages 430-437; Barcelona, Spain.

Rubenstein, H. and Goodenough, J.B. (1965) Contextual Correlates of Synonymy; Computational Lin- [...]

Schütze, H. (1998) Automatic Word Sense Discrimination; Computational Linguistics; 24(1):97-123.

[...] http://www.snomed.org/snomedct/documents/July04_CTFactSheet.pdf

Wu, Z. and Palmer, M. (1994) Verb Semantics and Lexical Selection; Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics; pages 133-138; Las Cruces, NM.

UMLS: Unified Medical Language System. Available from http://www.nlm.nih.gov/research/umls/.