# PhD Defense by Zhibo Dai

Event Details
• Date/Time: Thursday April 16, 2020 - Friday April 17, 2020, 2:00 pm - 2:59 pm
• Location: REMOTE: BLUE JEANS
• Fee(s): N/A
Summaries

Summary Sentence: Spectrum Reconstruction Technique and Improved Naive Bayes Models for Text Classification Problems

I’m Zhibo Dai, a fifth-year PhD student in the School of Mathematics at Georgia Tech. I will defend my thesis on the afternoon of 4/16, between 2 pm and 3 pm ET, in BlueJeans meeting 866242745.

My thesis title is Spectrum Reconstruction Technique and Improved Naive Bayes Models for Text Classification Problems. The abstract and committee information are as follows:

Abstract
This thesis studies two topics. In the first part, we study the spectrum reconstruction technique. Eigenvalues play an important role in many research fields and are the foundation of many practical techniques such as PCA (Principal Component Analysis). We believe that related algorithms should perform better with more accurate spectrum estimation. Prof. Matzinger proposed an approximation formula, but no proof was given. In our research, we show why the formula works. When both the number of features and the dimension of the space go to infinity, we find the order of the error of the approximation formula, which is related to a constant C, the ratio of the dimension of the space to the number of features.
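
To illustrate why spectrum estimation can need correcting at all, here is a minimal sketch. It is not the approximation formula studied in the thesis, which this announcement does not state. It assumes i.i.d. standard Gaussian data, so every true eigenvalue equals 1, and uses a hypothetical helper `top_sample_eigenvalue` to approximate the largest eigenvalue of the sample covariance by power iteration. The parameter names `p` (dimension) and `n` (number of samples) are also assumptions made for this example.

```python
import random

def top_sample_eigenvalue(p, n, seed=0, iters=300):
    """Approximate the largest eigenvalue of the sample covariance
    of n i.i.d. N(0, I_p) draws (hypothetical helper for illustration)."""
    rng = random.Random(seed)
    # n samples in dimension p; the true covariance is the identity,
    # so every true eigenvalue equals 1.
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    # Sample covariance S = X^T X / n (the mean is known to be zero here).
    S = [[sum(X[k][i] * X[k][j] for k in range(n)) / n for j in range(p)]
         for i in range(p)]
    # Power iteration: repeatedly apply S to a unit vector and renormalize.
    v = [p ** -0.5] * p
    lam = 0.0
    for _ in range(iters):
        w = [sum(S[i][j] * v[j] for j in range(p)) for i in range(p)]
        lam = sum(x * x for x in w) ** 0.5  # norm of S v, at most the top eigenvalue
        v = [x / lam for x in w]
    return lam

# Dimension comparable to the sample size: the top sample eigenvalue
# is far above the true value 1.
high_ratio = top_sample_eigenvalue(p=40, n=80)
# Dimension much smaller than the sample size: the estimate is close to 1.
low_ratio = top_sample_eigenvalue(p=10, n=2000)
print(high_ratio, low_ratio)
```

When the dimension is comparable to the sample size, the sample eigenvalues spread out around the true ones, which is the kind of distortion a reconstruction technique aims to undo.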

In the second part, we focus on applications of Naive Bayes models to text classification problems, in particular two special situations: 1) there is insufficient data for model training; 2) the partial label problem. We choose Naive Bayes as our base model and modify it to achieve better performance in these two situations. To improve model performance and to use as much information as possible, we introduce a correlation factor, which somewhat relaxes the conditional independence assumption of Naive Bayes. The new estimates are biased compared with the traditional Naive Bayes estimate but have much smaller variance, which gives better prediction results.
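
For reference, the traditional base model mentioned above can be sketched as follows. This is only a standard multinomial Naive Bayes classifier with additive (Laplace) smoothing. The correlation factor introduced in the thesis is not specified in this announcement and is not shown. The function names `train_nb` and `predict_nb`, the smoothing parameter `alpha`, and the toy documents are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model with additive smoothing."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, y in zip(docs, labels):
        word_counts[y].update(tokens)
        vocab.update(tokens)
    model = {}
    for y, n_docs in class_counts.items():
        total = sum(word_counts[y].values())
        denom = total + alpha * len(vocab)
        model[y] = {
            "log_prior": math.log(n_docs / len(labels)),
            "log_lik": {w: math.log((word_counts[y][w] + alpha) / denom)
                        for w in vocab},
        }
    return model

def predict_nb(model, tokens):
    """Return the class with the highest posterior log-score."""
    def score(y):
        params = model[y]
        # Conditional independence: per-word log-likelihoods simply add up.
        # Words never seen in training are skipped.
        return params["log_prior"] + sum(
            params["log_lik"][w] for w in tokens if w in params["log_lik"])
    return max(model, key=score)

# Toy data, invented for this example.
docs = [["cheap", "pills", "buy"], ["meeting", "agenda", "notes"],
        ["buy", "cheap", "now"], ["project", "meeting", "today"]]
labels = ["spam", "ham", "spam", "ham"]
model = train_nb(docs, labels)
print(predict_nb(model, ["cheap", "buy"]))
print(predict_nb(model, ["meeting", "notes"]))
```

The sum of per-word log-likelihoods in `score` is where the conditional independence assumption enters, so that is the step a correlation-aware variant would have to change.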

Committee

• Prof. Heinrich Matzinger – School of Mathematics (advisor)
• Prof. Federico Bonetto – School of Mathematics
• Prof. Wenjing Liao – School of Mathematics
• Prof. Tuo Zhao – School of Industrial and Systems Engineering
• Prof. Ionel Popescu – School of Mathematics
