event
Thesis Defense :: Solving a Mixed-Integer Programming Formulation of a Classification
Primary tabs
Classification, the development of rules for the allocation of
observations to one or more groups, is a fundamental problem in machine
learning and has been applied to many problems in medicine and business.
We consider aspects of a classification model developed by Gallagher, Lee,
and Patterson that is based on a result by Anderson. The model seeks to
maximize the probability of correct G-group classification, subject to
limits on misclassification probabilities. The mixed-integer programming
formulation of the model is an empirical method for estimating the
parameters of an optimal classification rule, which are identified as
coefficients of linear functions by Anderson.
The model is shown to be a consistent method for estimating the parameters
of the optimal solution to the problem of maximizing the probability of
correct classification subject to limits on inter-group misclassification
probabilities. A polynomial time algorithm is described for two-group
instances. The method is NP-complete for a general number of
groups, and an approximation is formulated as a mixed-integer program
(MIP). The MIP is difficult to solve due to the formulation of
constraints wherein certain variables are equal to the maximum of a set of
linear functions. These constraints are conducive to an ill-conditioned
coefficient matrix. Methods for generating edges of the conflict graph
and conflict hypergraphs are discussed. The conflict graph is employed
for finding cuts in a branch-and-bound framework. This technique and
others lead to improvement in solution time over industry-standard
software on instances generated by real-world data. The classification
accuracy of the model in relation to standard classification methods on
real-world and simulated data is also noted.
Status
- Workflow Status: Published
- Created By: Barbara Christopher
- Created: 10/08/2010
- Modified By: Fletcher Moore
- Modified: 10/07/2016
Categories
Keywords
Target Audience