{"675916":{"#nid":"675916","#data":{"type":"event","title":"ISYE Statistic Seminar - Alon Kipnis","body":[{"value":"\u003Cp\u003EAbstract\u003Cbr\u003ESuppose we have two tables of counts, each indexed by the same set of categories, and we wish to determine whether the underlying generating mechanism behind each table might be different. Furthermore, if the mechanism is different, we suspect it likely varies only in a small fraction of the observed categories. This question arises in several applications, including attributing authorship based on word frequencies.\u003C\/p\u003E\u003Cp\u003EWe propose a tool for this problem built on P-values derived from a Binomial Allocation model and the notion of Higher Criticism (Donoho \u0026amp; Jin 2004). Our proposal offers an interpretable and easy-to-apply tool, and our theoretical analysis shows that it is powerful against the aforementioned changes in the generating mechanism. Specifically, under a calibration of the number of categories to rarity and signal intensity parameters, the power of our test experiences a phase transition that matches the phase transition of the likelihood ratio test.\u003C\/p\u003E\u003Cp\u003EOur analytic framework goes beyond contingency tables, encompassing a wide range of rare and weak signal models experiencing departures on a moderate scale. We discuss several interesting new models falling under this category, including the detection of a few edits within text written by a generative language model.\u003C\/p\u003E","summary":"","format":"limited_html"}],"field_subtitle":"","field_summary":[{"value":"\u003Cp\u003EAbstract\u003Cbr\u003ESuppose we have two tables of counts, each indexed by the same set of categories, and we wish to determine whether the underlying generating mechanism behind each table might be different. Furthermore, if the mechanism is different, we suspect it likely varies only in a small fraction of the observed categories. 
This question arises in several applications, including attributing authorship based on word frequencies.\u003C\/p\u003E","format":"limited_html"}],"field_summary_sentence":[{"value":"Rare and Weak Signal Detection and Authorship Challenges: From the Federalist Papers to ChatGPT"}],"uid":"36433","created_gmt":"2024-08-12 18:18:11","changed_gmt":"2024-08-13 13:00:44","author":"mrussell89","boilerplate_text":"","field_publication":"","field_article_url":"","field_event_time":{"event_time_start":"2024-08-15T13:30:00-04:00","event_time_end":"2024-08-15T14:30:00-04:00","event_time_end_last":"2024-08-15T14:30:00-04:00","gmt_time_start":"2024-08-15 17:30:00","gmt_time_end":"2024-08-15 18:30:00","gmt_time_end_last":"2024-08-15 18:30:00","rrule":null,"timezone":"America\/New_York"},"location":"ISYE Main 228","extras":["free_food"],"groups":[{"id":"1242","name":"School of Industrial and Systems Engineering (ISYE)"}],"categories":[],"keywords":[],"core_research_areas":[],"news_room_topics":[],"event_categories":[{"id":"1795","name":"Seminar\/Lecture\/Colloquium"}],"invited_audience":[{"id":"78761","name":"Faculty\/Staff"},{"id":"174045","name":"Graduate students"},{"id":"177814","name":"Postdoc"},{"id":"78751","name":"Undergraduate students"}],"affiliations":[],"classification":[],"areas_of_expertise":[],"news_and_recent_appearances":[],"phone":[],"contact":[],"email":[],"slides":[],"orientation":[],"userdata":""}}}