ISyE Statistics Seminar - Chao Zhang

Event Details
  • Date/Time:
    • Monday March 11, 2019
      2:00 pm - 3:00 pm
  • Location: Groseclose 402
  • URL: ISyE Building
  • Fee(s):
    N/A
Summaries

Summary Sentence: Multidimensional Text Mining with Limited Supervision

Title:

Multidimensional Text Mining with Limited Supervision

Abstract:

Unstructured text, one of the most important forms of data, plays a crucial role in domains such as cybersecurity, healthcare informatics, and cyber-physical systems. In many emerging applications, people's information needs from text data are becoming multidimensional: they demand useful insights along multiple aspects of a given text corpus. However, acquiring multidimensional knowledge from massive text data challenges existing data mining techniques. In this talk, I will present a structuring-and-mining framework that facilitates the acquisition of multidimensional knowledge from text data. It organizes unstructured text into a multidimensional and multi-granular structure, from which end users can easily select relevant data with declarative queries and then apply any data mining primitives. I will detail two core algorithms in this framework: (1) a weakly supervised text classification algorithm and (2) an abnormal event detection algorithm. All of the algorithms in the framework require little supervision and are thus particularly appealing in scenarios where labeled data are expensive to acquire.

Bio:

Chao Zhang is an Assistant Professor in the College of Computing at the Georgia Institute of Technology. His research areas are data mining and machine learning. He is particularly interested in developing label-efficient and robust learning techniques, with applications in text mining and spatiotemporal data mining. Chao has published more than 40 papers in top-tier conferences and journals such as KDD, WWW, SIGIR, VLDB, and TKDE. He is the recipient of the ECML/PKDD Best Student Paper Runner-up Award (2015) and the Chiang Chen Overseas Graduate Fellowship (2013). Before joining Georgia Tech, he obtained his Ph.D. in Computer Science from the University of Illinois at Urbana-Champaign in 2018.

 

Additional Information

In Campus Calendar
Yes
Groups

H. Milton Stewart School of Industrial and Systems Engineering (ISYE)

Invited Audience
Faculty/Staff, Postdoc, Public, Graduate students, Undergraduate students
Categories
Seminar/Lecture/Colloquium
Status
  • Created By: Julie Smith
  • Workflow Status: Published
  • Created On: Mar 5, 2019 - 4:16pm
  • Last Updated: Mar 6, 2019 - 8:59am