# Thesis Defense :: Solving a Mixed-Integer Programming Formulation of a Classification

Event Details
• Date/Time:
• Tuesday August 23, 2005
10:00 am - 12:00 am
• Location: Room 404, Groseclose
• Phone:
• URL:
• Email:
• Fee(s):
N/A
• Extras:
Contact
Barbara Christopher
Industrial and Systems Engineering
Contact Barbara Christopher
404.385.3102
Summaries

Summary Sentence: Thesis Defense :: Solving a Mixed-Integer Programming Formulation of a Classification

Full Summary: Thesis Defense :: Solving a Mixed-Integer Programming Formulation of a Classification

Classification, the development of rules for the allocation of
observations to one or more groups, is a fundamental problem in machine
learning and has been applied to many problems in medicine and business.
We consider aspects of a classification model developed by Gallagher, Lee,
and Patterson that is based on a result by Anderson. The model seeks to
maximize the probability of correct G-group classification, subject to
limits on misclassification probabilities. The mixed-integer programming
formulation of the model is an empirical method for estimating the
parameters of an optimal classification rule, which are identified as
coefficients of linear functions by Anderson.

The model is shown to be a consistent method for estimating the parameters
of the optimal solution to the problem of maximizing the probability of
correct classification subject to limits on inter-group misclassification
probabilities. A polynomial time algorithm is described for two-group
instances. The method is NP-complete for a general number of
groups, and an approximation is formulated as a mixed-integer program
(MIP). The MIP is difficult to solve due to the formulation of
constraints wherein certain variables are equal to the maximum of a set of
linear functions. These constraints are conducive to an ill-conditioned
coefficient matrix. Methods for generating edges of the conflict graph
and conflict hypergraphs are discussed. The conflict graph is employed
for finding cuts in a branch-and-bound framework. This technique and
others lead to improvement in solution time over industry-standard
software on instances generated by real-world data. The classification
accuracy of the model in relation to standard classification methods on
real-world and simulated data is also noted.

In Campus Calendar
No
Groups

School of Industrial and Systems Engineering (ISYE)

Invited Audience
No audiences were selected.
Categories
Seminar/Lecture/Colloquium
Keywords
No keywords were submitted.
Status
• Created By: Barbara Christopher
• Workflow Status: Published
• Created On: Oct 8, 2010 - 7:38am
• Last Updated: Oct 7, 2016 - 9:52pm