US Patent No: 6,230,151

Parallel classification for data mining in a shared-memory multiprocessor system

Abstract

A method and system for generating a decision-tree classifier in parallel in a shared-memory multiprocessor system is disclosed. The processors first generate in the shared memory an attribute list for each record attribute. Each attribute list is assigned to a processor. The processors independently determine the best splits for their respective assigned lists, and cooperatively determine a global best split for all attribute lists. The attribute lists are reassigned to the processors and split according to the global best split into the lists for child nodes. The split attribute lists are again assigned to the processors and the process is repeated for each new child node until each attribute list for the new child nodes includes only tuples of the same record class or a fixed number of tuples.
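
The steps in the abstract can be sketched as a small program. This is only an illustration under stated assumptions, not the patented implementation: the abstract does not name a split criterion, so the Gini index is assumed; attributes are assumed numeric with binary splits of the form value < threshold; and a thread pool stands in for the processors of a shared-memory machine. All function names here (build_attribute_lists, best_split_for_list, grow, predict) are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def gini(labels):
    """Gini impurity of a list of class labels (assumed split criterion)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def build_attribute_lists(records, labels):
    """One list per attribute of (value, class label, record id), sorted by value."""
    return [
        sorted(((rec[a], labels[i], i) for i, rec in enumerate(records)),
               key=lambda t: t[0])
        for a in range(len(records[0]))
    ]


def best_split_for_list(attr_index, attr_list):
    """Local best split for one attribute list: (score, attribute, threshold)."""
    labels = [lab for _, lab, _ in attr_list]
    n = len(attr_list)
    best = (float("inf"), attr_index, None)
    for k in range(1, n):
        if attr_list[k][0] == attr_list[k - 1][0]:
            continue  # cannot split between equal values
        score = (k * gini(labels[:k]) + (n - k) * gini(labels[k:])) / n
        if score < best[0]:
            best = (score, attr_index, attr_list[k][0])
    return best


def grow(attr_lists, pool, min_tuples=1):
    """Grow the tree for one node; stops on a pure node or a fixed tuple count."""
    labels = [lab for _, lab, _ in attr_lists[0]]
    leaf = {"leaf": Counter(labels).most_common(1)[0][0]}
    if len(set(labels)) <= 1 or len(labels) <= min_tuples:
        return leaf
    # Each attribute list goes to a worker, which finds its local best split.
    local = list(pool.map(best_split_for_list, range(len(attr_lists)), attr_lists))
    # The local results are combined into one global best split.
    _, attr, threshold = min(local, key=lambda r: r[0])
    if threshold is None:
        return leaf  # no attribute can separate the remaining tuples
    left_ids = {rid for val, _, rid in attr_lists[attr] if val < threshold}

    def split_one(lst):
        # Filtering preserves the sorted order, so child lists need no re-sort.
        return ([t for t in lst if t[2] in left_ids],
                [t for t in lst if t[2] not in left_ids])

    # The lists are handed out again and split into the child nodes' lists.
    parts = list(pool.map(split_one, attr_lists))
    return {
        "attr": attr,
        "threshold": threshold,
        "left": grow([p[0] for p in parts], pool, min_tuples),
        "right": grow([p[1] for p in parts], pool, min_tuples),
    }


def predict(tree, record):
    """Walk the tree from the root to a leaf for one record."""
    while "leaf" not in tree:
        go_left = record[tree["attr"]] < tree["threshold"]
        tree = tree["left"] if go_left else tree["right"]
    return tree["leaf"]


# Made-up toy data: (age, salary) records with a class label each.
records = [(25, 40), (30, 90), (45, 60), (50, 20)]
labels = ["low", "high", "high", "low"]
with ThreadPoolExecutor(max_workers=2) as pool:
    tree = grow(build_attribute_lists(records, labels), pool)
```

Because CPython threads share one interpreter lock, this sketch shows the structure of the per-list work assignment rather than any real speedup; the child nodes are also processed one after another here, which is a simplification.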

Patent Owner(s)

Patent Owner Address Total Patents
INTERNATIONAL BUSINESS MACHINES CORPORATION ARMONK, NY 68180

Inventor(s)

Inventor Name Address # of filed Patents Total Citations
Agrawal, Rakesh San Jose, CA 252 4601
Ho, Ching-Tien San Jose, CA 21 358
Zaki, Mohammed J Rochester, NY 2 35

Cited Art

Patent Info (Count) # Cites Year
 
INTERNATIONAL BUSINESS MACHINES CORPORATION (7)
5,819,266 System and method for mining sequential patterns in a large database 38 1995
5,668,988 Method for mining path traversal patterns in a web environment by converting an original log sequence into a set of traversal sub-sequences 33 1995
5,742,811 Method and system for mining generalized sequential patterns in a large database 52 1995
5,899,992 Scalable set oriented classifier 53 1997
5,884,305 System and method for data mining from relational data by sieving through iterated relational reinforcement 90 1997
5,884,320 Method and system for performing proximity joins on high-dimensional data points in parallel 41 1997
6,003,029 Automatic subspace clustering of high dimensional data for data mining applications 88 1997
 
FUJITSU LIMITED (1)
5,463,773 Building of a document classification tree by recursive optimization of keyword selection function 73 1993
 
GOOGLE INC. (1)
5,960,446 Parallel file system and method with allocation map 73 1997
 
NCR CORPORATION (1)
4,825,354 Method of file access in a distributed processing computer network 201 1985
 
ORACLE INTERNATIONAL CORPORATION (1)
5,864,839 Parallel system and method for generating classification/regression tree 16 1997
 
SAP AG (1)
5,615,341 System and method for mining generalized association rules in databases 126 1995
 
OTHER [CHECK PATENT PROFILE FOR ASSIGNMENT INFORMATION] (1)
5,875,285 Object-oriented data mining and decision making system 32 1996

Forward Cites

Patent Info (Count) # Cites Year
 
INTERNATIONAL BUSINESS MACHINES CORPORATION (11)
6,687,691 Method and system for reconstructing original distributions from randomized numeric data 4 2000
6,546,389 Method and system for building a decision-tree classifier from privacy-preserving data 24 2000
6,954,756 Method and system for detecting deviations in data tables 9 2001
7,231,638 Memory sharing in a distributed data processing system using modified address space to create extended address space for copying data 4 2002
7,499,908 Method for identifying a workload type for a given workload of database requests 5 2003
8,312,464 Hardware based dynamic load balancing of message passing interface tasks by modifying tasks 0 2007
8,234,652 Performing setup operations for receiving different amounts of data while processors are performing message passing interface tasks 0 2007
8,127,300 Hardware based dynamic load balancing of message passing interface tasks 3 2007
8,108,876 Modifying an operation of one or more processors executing message passing interface tasks 4 2007
8,055,681 Data storage method and data storage structure 0 2008
8,392,398 Query optimization over graph data streams 0 2009
 
INSIGHT BIOPHARMACEUTICALS LTD. (3)
7,666,651 Polypeptide having heparanase activity 1 2001
7,339,038 Heparanase specific molecular probes and their use in research and medical applications 0 2003
8,048,993 Heparanase specific molecular probes and their use in research and medical applications 0 2008
 
ORACLE INTERNATIONAL CORPORATION (3)
6,665,684 Partition pruning with composite partitioning 27 1999
7,020,661 Techniques for pruning a data object during operations that join multiple data objects 41 2002
7,174,344 Orthogonal partitioning clustering 6 2003
 
MICROSOFT CORPORATION (2)
7,219,121 Symmetrical multiprocessing in multiprocessor systems 10 2002
7,765,405 Receive side scaling with cryptographically secure hashing 3 2005
 
ACCELRYS SOFTWARE INC. (1)
7,016,887 Methods and systems of classifying multiple properties simultaneously using a decision tree 10 2001
 
AT&T CORP. (1)
6,456,993 Alternating tree-based classifiers and methods for learning them 12 1999
 
CRAY INC. (1)
7,764,629 Identifying connected components of a graph in parallel 0 2005
 
FRANCE TELECOM (1)
7,584,168 Method and device for the generation of a classification tree to unify the supervised and unsupervised approaches, corresponding computer package and storage means 0 2006
 
HEWLETT-PACKARD DEVELOPMENT COMPANY, L.P. (1)
6,952,701 Simultaneous array configuration and store assignment for a data storage system 1 2001
 
INFORMATICA CORPORATION (1)
7,254,590 Set-oriented real-time data processing based on transaction boundaries 12 2003
 
INTEL CORPORATION (1)
8,126,911 System and method for content-based partitioning and mining 1 2006
 
LAWRENCE LIVERMORE NATIONAL SECURITY, LLC (1)
7,007,035 Parallel object-oriented decision tree system 11 2001
 
LG-ERICSSON CO., LTD. (1)
7,171,520 Cache flush system and method thereof 0 2003
 
SALFORD SYSTEMS (1)
7,328,218 Constrained tree structure method and system 1 2005
 
SYBASE, INC. (1)
8,321,476 Method and system for determining boundary values dynamically defining key value bounds of two or more disjoint subsets of sort run-based parallel processing of data from databases 2010
 
SYMANTEC OPERATING CORPORATION (1)
7,606,800 Systems, methods and apparatus for creating stable disk images 0 2006
 
THE BOEING COMPANY (1)
7,447,666 System and method for analyzing a pattern in a time-stamped event sequence 3 2004
 
THE STAYWELL COMPANY, LLC (1)
7,389,277 Machine learning systems and methods 2 2005
 
THE UNIVERSITY OF NORTH CAROLINA AT CHAPEL HILL (1)
7,095,872 Automated digital watermarking methods using neural networks 2 2002
 
VERSATA DEVELOPMENT GROUP, INC. (1)
7,363,593 System and method for presenting information organized by hierarchical levels 7 2001