Piero P. Bonissone's Research Interests:
Model Ensemble and Fusion
Papers
(2010) [13] P. Bonissone, J.M.
Cadenas, M.C. Garrido, R.A. Diaz, A Fuzzy Random
Forest, The International Journal of Approximate Reasoning, to
appear, doi:10.1016/j.ijar.2010.02.003, 2010. When individual classifiers are combined
appropriately, a statistically significant increase in classification
accuracy is usually obtained. Multiple classifier systems are the result of
combining several individual classifiers. Following Breiman's
methodology, in this paper a multiple classifier system based on a forest
of fuzzy decision trees, i.e., a
(2010) [12] P.
Bonissone, J.M. Cadenas, M.C. Garrido, R.A. Diaz,
Fundamentals for Design and Construction of a Fuzzy Random Forest, in Foundations of Reasoning under
Uncertainty, B. Bouchon-Meunier, Manuel Ojeda, R.R. Yager (eds.),
STUDFUZZ 249,
pp. 23-42, Springer-Verlag Berlin Heidelberg (2010). Following Breiman's
methodology, we propose the fundamentals to design and construct a forest of randomly generated fuzzy decision trees,
i.e., a
(2009) [11] P.
Bonissone, J.M. Cadenas, M.C. Garrido, R.A.
Diaz, R. Martinez, Weighted Decisions in a Fuzzy Random Forest, 2009
IFSA World Congress, Lisbon, Portugal, July 20-24, 2009 - [GE GR Technical Report,
2000, GRC850, Sept. 2009 (pdf)]. A multi-classifier
system - obtained by combining several individual classifiers - usually
exhibits a better performance (precision) than any of the original
classifiers. In this work we use a multi-classifier based on a forest of
randomly generated fuzzy decision trees (
(2008) [10] P.
Bonissone, J.M. Cadenas, M.C. Garrido, R.A.
Diaz, Combination Methods in Fuzzy Random Forest, SMC 2008, Singapore, Oct.
12-15, 2008 [GE
GR Technical Report, 2008, GRC738, Oct. 2008 (pdf)]. Following Breiman's methodology, we propose a multi-classifier
based on a Forest of randomly generated fuzzy decision trees, i.e., a
(2008) [9] P.
Bonissone, J.M. Cadenas, M.C. Garrido, R.A.
Diaz, A Fuzzy Random Forest: Fundamental for Design and Construction, IPMU
2008. Following Breiman's methodology, we propose a multi-classifier
based on a Forest of randomly generated fuzzy decision trees, i.e., a
(2008) [8] P. Bonissone, F. Xue, and R. Subbu, Fast Meta-models for Local
Fusion of Multiple Predictive Models, Applied Soft Computing Journal, 2008,
doi:10.1016/j.asoc.2008.03.006 - [GE GR Technical Report, 2007GRC832]. Fusing the outputs of an ensemble of diverse predictive models usually
boosts overall prediction accuracy. Such fusion is guided by each model's
local performance, i.e., each model's prediction accuracy in the neighborhood
of the probe point. Therefore, for each probe we instantiate a customized
fusion mechanism. The fusion mechanism is a meta-model, i.e. a model that
operates one level above the object-level models whose predictions we want to
fuse. Like these models, such a meta-model is defined by structural and
parametric information. In this paper, we focus on the definition of the
parametric information for a given structure. For each probe point, we either
retrieve or compute the parameters to instantiate the associated meta-model.
The retrieval approach is based on a CART-derived segmentation of the probe's
state space, which contains the meta-model parameters. The computation
approach is based on a runtime evaluation of each model's local performance
in the neighborhood of the probe. We explore various structures for the
meta-model, and for each structure we compare the precompiled (retrieval) or
run-time (computation) approaches. We demonstrate this fusion methodology in
the context of multiple neural network models. However, our methodology is
broadly applicable to other predictive modeling approaches. This fusion
method is illustrated in the development of highly accurate models for emissions,
efficiency, and load prediction in a complex power plant. The locally
weighted fusion method boosts the predictive performance by 30-50% over the
baseline single model approach for the various prediction targets. Relative
to this approach, typical fusion strategies that use averaging or global
weighting schemes produce only a 2-6% performance boost over the same
baseline.
(2006) [7] F. Xue, R. Subbu, P. Bonissone,
Locally Weighted Fusion of Multiple
Predictive Models, IEEE International Joint Conference on Neural Networks
(IJCNN'06), pp. 2137-2143, Vancouver, BC, Canada, July 16-21, 2006 (pdf) - [GE GR Technical
Report, 2006GRC454]. Fusing the outputs from an ensemble of models
in an effective way can often boost overall model accuracy. This paper
presents a novel method, called locally weighted fusion, which aggregates the
results of multiple predictive models based on local accuracy measures of
these models in the neighborhood of the probe point for which we want to make
a prediction. While we demonstrate the method in the context of multiple
neural network models, the concepts may be applied to other predictive
techniques as well. This fusion method is applied to develop highly accurate
models for emissions, efficiency, and load prediction in a complex real-world
power plant. The locally weighted fusion method boosts the predictive
performance by 20-40% over the baseline single model approach for the various
prediction targets. Relative to this approach, fusion strategies that apply
averaging or global weighting produce only a 2-6% performance boost over
the baseline.
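As a rough illustration of the locally weighted fusion idea described in entries [7] and [8], the Python sketch below weights each model by the inverse of its mean absolute error on the k validation points nearest the probe. The function name `locally_weighted_fusion`, the inverse-error weighting, the Euclidean neighborhood, and the two toy models are assumptions made for illustration only; the papers' actual weighting scheme and neural network models are not reproduced here.

```python
import math


def locally_weighted_fusion(models, x_val, y_val, probe, k=3, eps=1e-9):
    """Fuse model predictions at `probe` using local accuracy weights.

    Assumption: local accuracy is the mean absolute error of each model
    on the k validation points closest to the probe (Euclidean distance),
    and weights are proportional to 1 / (error + eps).
    """
    # Indices of the k validation points nearest to the probe.
    order = sorted(range(len(x_val)), key=lambda i: math.dist(x_val[i], probe))
    neighbors = order[:k]

    # Inverse-error weight for each model, computed in the neighborhood only.
    raw_weights = []
    for model in models:
        errors = [abs(model(x_val[i]) - y_val[i]) for i in neighbors]
        local_mae = sum(errors) / len(errors)
        raw_weights.append(1.0 / (local_mae + eps))

    total = sum(raw_weights)
    weights = [w / total for w in raw_weights]

    # Weighted combination of the individual predictions at the probe.
    fused = sum(w * model(probe) for w, model in zip(weights, models))
    return fused, weights


# Hypothetical toy setup: the target is |x|; each model is accurate on one side.
def model_left(x):
    return -x[0]


def model_right(x):
    return x[0]


x_val = [(-3.0,), (-2.0,), (-1.0,), (1.0,), (2.0,), (3.0,)]
y_val = [3.0, 2.0, 1.0, 1.0, 2.0, 3.0]

fused, weights = locally_weighted_fusion(
    [model_left, model_right], x_val, y_val, probe=(2.5,), k=3
)
```

In this toy case the probe lies in the region where `model_right` is accurate, so nearly all of the weight goes to it. A plain average of the two models would instead predict 0 at the same probe, which is the kind of gap between local and global weighting that the abstracts report.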
(2005) [6] K.
Goebel, P. Bonissone, Prognostic Information Fusion for Constant Load
Systems, Proc. 7th Annual Conference on Information Fusion, Vol. 2
pp. 1247-1255, 2005 (pdf) - [GE GR Tech. Report,
2005GRC333]. This paper describes a process for aggregating different information sources to estimate
remaining equipment life. Specifically, the approach presents a rigorous
chain of preprocessing, modeling and post-processing steps that arrive
at the desired prognostic result. The preprocessing steps deal with
data reduction, filtering, and signature amplification. The prediction
model applies ANFIS to the data. The post-processing steps include
recursive trending which implicitly forces the prognostic trend to
be confirmed before updated estimates are reported. Innovative
measures are introduced that help in assessing the performance of the
approach. The method is illustrated using real-life data from
industrial web paper breakage prediction.
(2005) [5] P. Evangelista, M.
Embrechts, P. Bonissone, B. Szymanski, Fuzzy ROC Curves for Unsupervised Nonparametric Ensemble
Techniques, IJCNN 2005, Montreal, Canada, 2005 (pdf) - [GE
GR Tech. Report, 2005GRC254]. This paper explores a novel ensemble
technique for unsupervised classification using nonparametric statistics.
Multiple classification systems (MCS), or ensemble techniques, involve
considering several classification methods or multiple outputs from the same
method and devising techniques to reach a decision. The performance of a
binary classification system can be measured on a receiver operating
characteristic (ROC) curve, and the area under the curve (AUC) is exactly the
Wilcoxon Rank Sum or Mann-Whitney U statistic, both
of which are nonparametric statistics based upon ranked data. Successful
performance of an unsupervised ensemble can be measured through the AUC, and
the performance of different aggregation techniques for the combination of the
multiple classification system decision values, or rankings in this paper, is
illustrated. Aggregation techniques are based upon fuzzy logic theory,
creating the fuzzy ROC curve. The one-class SVM is utilized for the
unsupervised classification
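The link between the AUC and the Mann-Whitney U statistic mentioned in entry [5] can be checked numerically. The sketch below computes the AUC twice: once by counting correctly ordered positive/negative pairs, and once from the rank sum of the positive scores. The equality holds once U is normalized by the number of positive-negative pairs. The function names and the sample scores are hypothetical and are not taken from the paper.

```python
def auc_pairwise(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def auc_rank_sum(pos_scores, neg_scores):
    """AUC from the Wilcoxon rank sum of the positives (midranks for ties)."""
    ordered = sorted(pos_scores + neg_scores)

    # Assign each distinct score the average of the 1-based ranks it occupies.
    midrank = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        midrank[ordered[i]] = (i + 1 + j) / 2.0
        i = j

    n_pos, n_neg = len(pos_scores), len(neg_scores)
    rank_sum_pos = sum(midrank[s] for s in pos_scores)

    # Mann-Whitney U for the positive class, then normalize to [0, 1].
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


# Hypothetical decision values for illustration.
pos = [0.9, 0.8, 0.4]
neg = [0.7, 0.4, 0.1]
auc_a = auc_pairwise(pos, neg)
auc_b = auc_rank_sum(pos, neg)
```

Because both routes depend only on the ordering of the scores, the measure applies equally to rankings, which is how the abstract describes the ensemble's outputs.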
(2005) [4] P. Evangelista, P. Bonissone, M. Embrechts, B. K.
Szymanski, Unsupervised Fuzzy Ensembles Applied to Intrusion Detection, Proc.
13th European Symposium on Artificial Neural Networks 2005, pp. 345-350,
Bruges, Belgium, April 27-29, 2005 (pdf). This paper proposes a novel method for
unsupervised ensembles that specifically addresses unbalanced, unsupervised,
binary classification problems. Unsupervised learning often experiences the
curse of dimensionality; however, subspace modeling can overcome this problem.
For each subspace created, the classifier produces a decision value. The
aggregation of the decision values occurs through the use of fuzzy logic,
creating the fuzzy ROC curve. The one-class SVM is utilized for unsupervised
classification. The primary source of data for this research is a host-based
computer intrusion detection dataset.
(2005) [3] P. Bonissone, N. Eklund, K. Goebel,
Using an Ensemble of Classifiers to Audit a Production Classifier, 6th
International Workshop on Multiple Classifier Systems (MCS 2005), pp.
376-386, Monterey, CA, June 13 -1, 2005 (pdf). After
deploying a classifier in production it is essential to support its
lifecycle. This paper describes the application of an ensemble of classifiers
to support two stages of the lifecycle of an on-line classifier used to
underwrite life insurance applications: the monitoring of its decision quality and the updating of the production
classifier over time. All combinations of five classification methods and seven fusion methods
were assessed from the perspective of accuracy and pairwise diversity of the classifiers, and accuracy,
precision, and coverage of
the fused classifiers. The proposed architecture consists of three offline classifiers and a fusion module.
(2004) [2] P. Bonissone, Automating the Quality
Assurance of an On-line Knowledge-Based Classifier By Fusing Multiple Off-line
Classifiers, Proc. Conference on Information Processing and Management
of Uncertainty (IPMU) 2004, Perugia, Italy, July 2004 (pdf) - [GE GR Tech. Report, 2004GRC134]. We address
two problems in the lifecycle of a production classifier: the monitoring of its decision quality
and the updating of the
classifier over time. The proposed architecture consists of four off-line
classifiers and an associative fusion module. The fusion is a T-norm based
outer-product of the classifiers' normalized outputs. By attaching a
confidence measure to each output of the fusion, we generate a distribution
of the production classifier's quality. The lower tail of this distribution
identifies the least reliable cases, which become candidates for auditing and manual QA. The upper
tail identifies the most reliable cases, which become candidates for updating the standard reference
data set used to design and tune the production classifier. We illustrate
this approach with an insurance underwriting problem.
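The T-norm based outer-product fusion described in entry [2] can be sketched for two classifiers as follows. The three T-norms shown (minimum, product, Lukasiewicz) are standard. Everything else is an assumption for illustration: the function name `fuse_outer_product`, the reading of diagonal cells as agreement and off-diagonal cells as conflict, and the renormalization step are not taken from the paper, which also attaches its own confidence measure.

```python
def t_min(a, b):
    """Minimum T-norm."""
    return min(a, b)


def t_product(a, b):
    """Product T-norm."""
    return a * b


def t_lukasiewicz(a, b):
    """Lukasiewicz T-norm."""
    return max(0.0, a + b - 1.0)


def fuse_outer_product(p, q, tnorm):
    """Fuse two normalized class-score vectors with a T-norm outer product.

    Assumption: cell (i, j) holds tnorm(p[i], q[j]); diagonal cells are
    treated as agreement on class i, off-diagonal cells as conflict.
    Returns (fused scores, conflict fraction); fused is None if the two
    classifiers share no agreement mass at all.
    """
    matrix = [[tnorm(a, b) for b in q] for a in p]
    agreement = [matrix[i][i] for i in range(len(p))]

    total = sum(sum(row) for row in matrix)
    consensus = sum(agreement)
    if consensus == 0.0:
        return None, 1.0

    fused = [a / consensus for a in agreement]
    conflict = (total - consensus) / total
    return fused, conflict


# Hypothetical normalized outputs of two classifiers over two classes.
p = [0.8, 0.2]
q = [0.6, 0.4]
fused_prod, conflict_prod = fuse_outer_product(p, q, t_product)
fused_min, conflict_min = fuse_outer_product(p, q, t_min)
```

Swapping the T-norm changes how strongly agreement is rewarded, which loosely mirrors the abstracts' point that the choice of operator can reflect the correlation between the classifiers.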
(2004) [1] P. Bonissone,
K. Goebel, and W. Yan, Classifier Fusion using Triangular Norms, Proc.
Multiple Classifier Systems (MCS) 2004, pp. 154-163, Cagliari, Italy,
June 2004 (pdf) - [GE GR Tech. Report, 2006GRC143]. This paper
describes a method for fusing a collection of classifiers where the fusion
can compensate for some positive correlation among the classifiers. Specifically,
it does not require the assumption of evidential independence of the
classifiers to be fused (an assumption that Dempster-Shafer's fusion rule does require). The proposed
method is associative, which allows fusing three or more classifiers
irrespective of the order. The fusion is accomplished using a generalized
intersection operator (T-norm) that better represents the possible
correlation between the classifiers. In addition, a confidence measure is
produced that takes advantage of the consensus and conflict between classifiers.
Patents
(2009) System and Method for Equipment Life Estimation, K. Goebel, P. Bonissone, W. Yan, N. Eklund, F. Xue, US Patent 7,548,830 (June 16, 2009). A method to
reduce uncertainty bounds of predicting a remaining life of a probe using a
set of diverse models is disclosed. The method includes generating an
estimated remaining life output by each model of the set of diverse models,
aggregating each of the respective estimated remaining life outputs via a
fusion model, and in response to the aggregating, predicting the remaining
life, the predicting having reduced uncertainty bounds based on the
aggregating. The method further includes generating a signal corresponding to
the predicted remaining life of the probe.
(2008) System and process for a
fusion classification for insurance underwriting suitable for use by an
automated system, P. Bonissone, K. Aggour, R. Subbu, W. Yan, N. Iyer, A.
Chakraborty, US Patent No. 7,383,239 (Jun 8, 2008). A method
and system for fusing a collection of classifiers used for an automated
insurance underwriting system and/or its quality assurance is described. Specifically,
the outputs of a collection of classifiers are fused. The fusion of the data
will typically result in some amount of consensus and some amount of conflict
among the classifiers. The consensus will be measured and used to estimate a
degree of confidence in the fused decisions. Based on the decision and degree
of confidence of the fusion and the decision and degree of confidence of the
production decision engine, a comparison module may then be used to identify
cases for audit, cases for augmenting the training/test sets for re-tuning
the production decision engine, cases for review, or may simply trigger a record
of its occurrence for tracking purposes. The fusion can compensate for the
potential correlation among the classifiers. The reliability of each
classifier can be represented by a static or dynamic discounting factor,
which will reflect the expected accuracy of the classifier. A static
discounting factor is used to represent a prior expectation about the
classifier's reliability, e.g., it might be based on the average past
accuracy of the model, while a dynamic discounting factor is used to represent a
conditional assessment of the classifier's reliability, e.g., whenever a
classifier bases its output on an insufficient number of points it is not reliable.
(2004) Fusion classification for risk categorization in
underwriting a financial risk instrument, R. Messmer, P. Bonissone, K.
Aggour, R. Subbu. A system,
process and computer program product for underwriting a financial risk
instrument application represented by at least one risk attribute is
provided. Decision engines examine the at least one risk attribute associated
with the financial risk instrument application and assign the application to
one of a predetermined set of risk classes. A fusion engine compares the risk
classes assigned by each of the decision engines and fuses the assigned risk
classes into an aggregated result representative of the risk of the financial
risk instrument application. The fusion engine includes a first
multi-classifier fusion module that uses an associative function to fuse the
assigned risk classes into a first aggregated result and a second
multi-classifier fusion module that uses a non-associative function to fuse the
assigned risk classes into a second aggregated result. A comparison engine
selects one of the first aggregated result generated from the first
multi-classifier fusion module and the second aggregated result generated
from the second multi-classifier fusion module and compares it with a
production result generated from the production decision engine. The
comparison engine generates an underwriting decision for the financial risk
instrument application according to the comparison.
(2003) Methods and systems for automated property valuation, P. Khedkar, P. Bonissone, and D. Golibersuch, US Patent No. 6,609,118 (Aug. 19, 2003). The present
invention is a method and system for automating the process for valuing a
property that produces an estimated value of a subject property, and a
quality assessment of the estimated value, that is based on the fusion of
multiple processes for valuing a property. In one embodiment, three processes
for valuing a subject property are fused. The first process, called LOCVAL,
uses the location and living area to provide an estimate of the subject
property's value. The second process, called AIGEN, is a generative
artificial intelligence method that trains a fuzzy-neural network using a
subset of cases from a case-base, and produces a run-time system to provide
an estimate of the subject property's value. The third process, called
AICOMP, uses a case based reasoning process similar to the sales comparison
approach to determine an estimate of the subject property's value.