Data mining method and system for generating a decision tree classifier for data records based on a minimum description length (MDL) and presorting of records

United States of America Patent

PATENT NO 5787274
SERIAL NO 08564694

Abstract

A method and apparatus are disclosed for generating a decision tree classifier from a training set of records. The method comprises the steps of: pre-sorting the records based on each numeric record attribute, creating a decision tree breadth-first, and pruning the tree based on the MDL principle. Preferably, the pre-sorting includes generating a class list and attribute lists, and independently sorting the numeric attribute lists. The growing of the tree includes evaluating possible splitting criteria and selecting a splitting test for each leaf node, based on a splitting index, and updating the class list to reflect new leaf nodes. In a preferred embodiment, the splitting index is a gini index. The pruning preferably includes encoding the decision tree and splitting tests in an MDL-based code, and determining whether to convert a node into a leaf node, prune its child nodes, or leave the node intact, based on the code length of the node.
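
As a rough illustration of the splitting step described in the abstract, the sketch below scans one pre-sorted numeric attribute list and scores each candidate split with the standard gini index, gini(S) = 1 - sum of p_j^2 over the classes j, weighted by partition size. The function names, the (value, class label) pair layout, and the use of the midpoint between adjacent values as the threshold are assumptions made for this example; they are not taken from the patent text.

```python
from collections import Counter


def gini(counts, total):
    """Gini index 1 - sum(p_j^2) for a class histogram."""
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def best_numeric_split(attribute_list):
    """Find the best test 'value <= threshold' for one numeric attribute.

    attribute_list: (value, class_label) pairs, already sorted by value
    (hypothetical layout chosen for this sketch).
    Returns (threshold, weighted_gini); threshold is None if no split exists.
    """
    n = len(attribute_list)
    below = Counter()  # class histogram of records at or below the split
    above = Counter(label for _, label in attribute_list)  # the remainder
    best = (None, float("inf"))
    for i in range(n - 1):
        value, label = attribute_list[i]
        below[label] += 1
        above[label] -= 1
        next_value = attribute_list[i + 1][0]
        if value == next_value:
            continue  # cannot split between two equal values
        n_below = i + 1
        n_above = n - n_below
        score = (n_below / n) * gini(below, n_below) \
            + (n_above / n) * gini(above, n_above)
        if score < best[1]:
            # Assumption: use the midpoint of adjacent values as the threshold.
            best = ((value + next_value) / 2, score)
    return best


# Example: sort once up front, then scan the list in a single pass.
records = sorted([(23, "high"), (17, "high"), (43, "high"),
                  (68, "low"), (32, "low"), (20, "high")])
threshold, score = best_numeric_split(records)
```

Because the list is sorted once in advance, each candidate split only needs an incremental update of the two class histograms, so no re-sorting is required during the scan.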

Patent Owner(s)

Patent Owner: INTERNATIONAL BUSINESS MACHINES CORPORATION
Address: NEW ORCHARD ROAD, ARMONK, NY 10504

Inventor(s)

Inventor Name              Address          # of Filed Patents   Total Citations
Agrawal, Rakesh            San Jose, CA     265                  9553
Mehta, Manish              San Jose, CA     30                   707
Rissanen, Jorma Johannes   Los Gatos, CA    3                    267
