Juang Shares Insights on Big Data Research at Georgia Tech

B.H. (Fred) Juang’s research interests focus on digital signal processing; multi-channel signal processing; signal coding and recognition; multimedia communications; natural human-machine communication and interaction; signal modeling and stochastic processes; and intelligent informatics. He is the Motorola Foundation Chair Professor in the School of Electrical and Computer Engineering and a Georgia Research Alliance Eminent Scholar. Juang holds nearly 20 patents, and has published extensively, including co-authoring the textbook Fundamentals of Speech Recognition.

What does “big data” mean to you from the perspective of the research in your center (or in your research group)?

In my view, big data means letting real data, particularly in large quantities, speak for the truth and help us solve problems directly without over-relying on so-called expert surrogates. Big data research, therefore, involves at least the following two inquiries: “How do we find truth in real data?” and “How do we use data to solve problems directly?”

What are the greatest opportunities you see for Georgia Tech in big data?

Georgia Tech has an impressive collection of research groups in various areas that can take advantage of this paradigm shift from expert surrogates to big data. For example, people in ISyE are well recognized for their expertise in coupling data with decision-making. In ECE, many of us in the Center for Signal and Information Processing (CSIP) have long been engaged in data analysis, modeling, representation, identification, processing, and search, particularly related to media data.

The greatest opportunity for Georgia Tech lies in our ability to identify substantial problems and to integrate the aforementioned expertise to formulate overarching problem-solving approaches that can produce impactful solutions directly from big data with measurable performance objectives. Again, the operative words are “problem-solving” and “measurable objective.”

A problem must have a clearly defined and measurable objective. As an illustrative example, enabling a machine to make a decision is not a problem; enabling a machine to make the fewest errors in its decisions is. How to use big data to achieve this in useful applications is where research opportunities exist.

What are some of your—or your colleagues’—current research projects or major activities related to big data?

We have a large number of big data-related projects ongoing in our group. Rather than enumerate them all, I’ll highlight a few.

When the quantity of data is vast, organizing it can be a challenge: we need methods for easy retrieval, reliable identification, and accurate reconstruction, to name a few. My colleagues are working out ways to identify succinct features of the data for these purposes.

In another, telecom-related project, we have access to a large set of wireless calling data, from which one can infer the best resource allocation scheme for wireless services as judged by users’ own sense of satisfaction.

We also have a long tradition in speech- and language-related research. We ask questions like “Can a machine identify the most important words in a conversation for the sake of understanding the semantic transactions most accurately?” Or in a similar vein, given a lengthy article, “What is the shortest condensed message that retains the meaning of the text?” Even more provocatively, “Can a machine translate a sentence or a paragraph in one language into another without understanding its meaning but just by searching for the most relevant expressions from a large set of cross-reference sentences?”

Another project we are working on, in geo-signal processing, aims to infer seismic structure from a vast amount of past reference data, in order to answer questions such as “Where is the oil and how much is present?” or “Will the foundation of my house change due to nearby fracking?”

How do you think a campus-wide initiative in big data could help you enhance your research or develop new collaborations and funding opportunities?

As I alluded to above, if we share a common interest in developing problem-solving capabilities directly from big data, and we are able to integrate the wide and deep expertise currently available at Georgia Tech, we’ll succeed in bringing about a huge impact on society. A campus-wide big data initiative can certainly help with this integration and serve as a catalyst to make this potential a reality.

Biing-Hwang (Fred) Juang