Patent content provided by Intellectual Property Publishing House
Patent title: Evaluating distinctiveness of document
Inventor: Takahiko Kawatani
Application number: US10460469
Filing date: 20030613
Publication number: US20040006736A1
Publication date: 20040108
Abstract: Two document sets are compared in natural language processing, and the distinctiveness of each constituent element (such as a sentence, term or phrase) of one document set is evaluated. Both the target and comparison documents are divided into document segments, and a sentence vector is constructed for each document segment, whose components are the occurrence frequencies of the terms in that segment. A projection axis is then found which maximizes a ratio equal to: (squared sum of projected values originating from the target document)/(squared sum of projected values originating from the comparison document). Projected values are obtained by projecting the sentence vectors on the projection axis, and the degrees of distinctiveness of the individual sentences of the target document are calculated on the basis of the projected values.
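The procedure in the abstract can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the patent's own implementation: "squared sum" is read as the sum of squared projections, so the ratio becomes a generalized Rayleigh quotient that can be maximized through a generalized eigenproblem. The function names, the whitespace tokenization, the small ridge term that keeps the denominator matrix invertible, and the use of the squared projected value as the degree of distinctiveness are all hypothetical choices made for this example.

```python
import numpy as np

def term_frequency_vectors(segments, vocabulary):
    """One row per segment; entries are term occurrence counts."""
    index = {term: i for i, term in enumerate(vocabulary)}
    vectors = np.zeros((len(segments), len(vocabulary)))
    for row, segment in enumerate(segments):
        for term in segment.lower().split():  # assumption: whitespace tokens
            if term in index:
                vectors[row, index[term]] += 1
    return vectors

def distinctive_axis(target, comparison, ridge=1e-3):
    """Axis a maximizing (a^T S_t a) / (a^T S_c a).

    S_t and S_c are sums of outer products of the sentence vectors.
    The ridge term is an assumption added so S_c is positive definite.
    """
    s_t = target.T @ target
    s_c = comparison.T @ comparison + ridge * np.eye(target.shape[1])
    # Whiten with the Cholesky factor of S_c, then solve a symmetric
    # eigenproblem for M = L^{-1} S_t L^{-T}.
    chol = np.linalg.cholesky(s_c)
    half = np.linalg.solve(chol, s_t)
    m = np.linalg.solve(chol, half.T).T
    _, eigvecs = np.linalg.eigh(m)  # eigenvalues in ascending order
    axis = np.linalg.solve(chol.T, eigvecs[:, -1])
    return axis / np.linalg.norm(axis)

# Toy data: each string stands for one document segment (a sentence).
target_segments = [
    "battery life improved",
    "screen price unchanged",
    "battery charging faster",
]
comparison_segments = [
    "screen price unchanged",
    "screen size larger",
    "price lower",
]

vocabulary = sorted(
    {t for s in target_segments + comparison_segments for t in s.split()}
)
target = term_frequency_vectors(target_segments, vocabulary)
comparison = term_frequency_vectors(comparison_segments, vocabulary)

axis = distinctive_axis(target, comparison)
# Assumption: degree of distinctiveness = squared projected value.
scores = (target @ axis) ** 2

for segment, score in zip(target_segments, scores):
    print(f"{score:.3f}  {segment}")
```

In this toy data the second target sentence shares every term with the comparison document, so its projected value on the chosen axis is close to zero, while the two sentences about the battery receive the larger scores.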
Applicant: KAWATANI TAKAHIKO