{"690911":{"#nid":"690911","#data":{"type":"event","title":"PhD Defense by Huayi Wang","body":[{"value":"\u003Cp\u003E\u003Cstrong\u003ETitle\u003C\/strong\u003E: Approximate Nearest Neighbor Search for Multiple Distance Metrics: Algorithms, Index Structures, and Applications\u003C\/p\u003E\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\u003Cp\u003E\u003Cstrong\u003EDate\u003C\/strong\u003E: Wednesday July 8th\u003C\/p\u003E\u003Cp\u003E\u003Cstrong\u003ETime\u003C\/strong\u003E: 7:30PM \u2013 9PM Eastern Time (US)\u003C\/p\u003E\u003Cp\u003E\u003Cstrong\u003ELocation\u003C\/strong\u003E: Remote\u003C\/p\u003E\u003Cp\u003E\u003Cstrong\u003ETeams link\u003C\/strong\u003E: \u003Ca href=\u0022https:\/\/teams.microsoft.com\/l\/meetup-join\/19%3ameeting_ZWI5MTViODEtMDQ5Yi00MGVmLTg1MTMtYmI2YTA0NjJhZjQ2%40thread.v2\/0?context=%7b%22Tid%22%3a%22482198bb-ae7b-4b25-8b7a-6d7f32faa083%22%2c%22Oid%22%3a%222464cae1-fe5f-452b-a77d-1df702edeab7%22%7d\u0022 title=\u0022https:\/\/teams.microsoft.com\/l\/meetup-join\/19%3ameeting_ZWI5MTViODEtMDQ5Yi00MGVmLTg1MTMtYmI2YTA0NjJhZjQ2%40thread.v2\/0?context=%7b%22Tid%22%3a%22482198bb-ae7b-4b25-8b7a-6d7f32faa083%22%2c%22Oid%22%3a%222464cae1-fe5f-452b-a77d-1df702edeab7%22%7d\u0022\u003EHuayi\u0027s Thesis Defense\u003C\/a\u003E\u003C\/p\u003E\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\u003Cp\u003E\u003Cstrong\u003EThesis Advisory Committee\u003C\/strong\u003E\u003C\/p\u003E\u003Cp\u003E\u003Cstrong\u003EDr. Jun (Jim) Xu\u003C\/strong\u003E\u0026nbsp;- Advisor, Georgia Tech, School of\u0026nbsp;Computational Science\u003C\/p\u003E\u003Cp\u003E\u003Cstrong\u003EDr. Kexin Rong\u0026nbsp;\u003C\/strong\u003E- Georgia Tech, School of Computational Science\u003C\/p\u003E\u003Cp\u003E\u003Cstrong\u003EDr. 
Joy Arulraj\u0026nbsp;\u003C\/strong\u003E- Georgia Tech, School of Computational Science\u003C\/p\u003E\u003Cp\u003E\u003Cstrong\u003EDr.\u0026nbsp;Mitsunori Ogihara -\u0026nbsp;\u003C\/strong\u003EUniversity of Miami, Department of Computer Science\u003C\/p\u003E\u003Cp\u003E\u003Cstrong\u003EDr. Gromit Yeuk-Yin Chan -\u0026nbsp;\u003C\/strong\u003EAdobe Research, Research Scientist\u003C\/p\u003E\u003Cp\u003E\u0026nbsp;\u003C\/p\u003E\u003Cp\u003E\u003Cstrong\u003EAbstract:\u003C\/strong\u003E\u003C\/p\u003E\u003Cp\u003EApproximate nearest neighbor search (ANNS) is a fundamental algorithmic problem with numerous applications in many areas of computer science, especially databases and machine learning. An intriguing question is how to build index data structures that support efficient ANNS under various useful distance metrics. Euclidean distance has been studied extensively, but many metrics that are equally important in real applications (Hamming distance, edit distance, Manhattan distance, and universal Lp distances) still lack effective index structures for efficient ANNS.\u003C\/p\u003E\u003Cp\u003EThis thesis proposes several ANNS solutions for different distance metrics. For Manhattan distance, we propose MP-RW-LSH, the first multi-probe Locality Sensitive Hashing (LSH) scheme tailored for L1 distance. Compared with the state-of-the-art LSH solution, MP-RW-LSH significantly reduces the number of hash tables required while maintaining similar query accuracy. For Hamming and edit distance, we propose indexable distance estimating codes called iDEC, which extend the error estimating coding (EEC) technique from networking to the ANNS problem by treating EEC as a new distance estimation method. We also propose U-HNSW for universal Lp distance metrics. 
U-HNSW can efficiently answer ANNS queries under the Lp distance without building multiple graph indices for different p values.\u003C\/p\u003E\u003Cp\u003EBeyond ANNS algorithms, this thesis also studies applications where the core principles underlying efficient ANNS\u2014sketching and the avoidance of unnecessary expensive computations\u2014bring broader benefits. We propose OddEEC, a new EEC for wireless networking. OddEEC extends Odd Sketch for symmetric difference cardinality estimation to estimate the bit error rate of the communication channel, achieving much faster estimation time than existing EEC schemes while maintaining similar accuracy. Finally, this thesis studies how these principles can be applied to optimize modern AI workflows. Many AI workflows\u2014ranging from LLM post-training pipelines to agentic reasoning tasks\u2014can be expressed as declarative queries whose expensive predicate is evaluated by a large model or reward function. We propose a query-centric formulation of these workflows and show that classical database techniques, namely approximate query processing (AQP) and proxy-model-based filtering, can substantially reduce the number of expensive model invocations without modifying the underlying models or pipelines.\u003C\/p\u003E","summary":"","format":"limited_html"}],"field_subtitle":"","field_summary":[{"value":"\u003Cp\u003EApproximate Nearest Neighbor Search for Multiple Distance Metrics: Algorithms, Index Structures, and Applications\u003C\/p\u003E","format":"limited_html"}],"field_summary_sentence":[{"value":"Approximate Nearest Neighbor Search for Multiple Distance Metrics: Algorithms, Index Structures, and Applications"}],"uid":"27707","created_gmt":"2026-06-25 12:52:39","changed_gmt":"2026-06-25 12:53:15","author":"Tatianna 
Richardson","boilerplate_text":"","field_publication":"","field_article_url":"","field_event_time":{"event_time_start":"2026-07-08T19:30:00-04:00","event_time_end":"2026-07-08T21:00:00-04:00","event_time_end_last":"2026-07-08T21:00:00-04:00","gmt_time_start":"2026-07-08 23:30:00","gmt_time_end":"2026-07-09 01:00:00","gmt_time_end_last":"2026-07-09 01:00:00","rrule":null,"timezone":"America\/New_York"},"location":"REMOTE","extras":[],"groups":[{"id":"221981","name":"Graduate Studies"}],"categories":[],"keywords":[{"id":"100811","name":"Phd Defense"}],"core_research_areas":[],"news_room_topics":[],"event_categories":[{"id":"1788","name":"Other\/Miscellaneous"}],"invited_audience":[{"id":"78771","name":"Public"}],"affiliations":[],"classification":[],"areas_of_expertise":[],"news_and_recent_appearances":[],"phone":[],"contact":[],"email":[],"slides":[],"orientation":[],"userdata":""}}}