{"686422":{"#nid":"686422","#data":{"type":"news","title":"Ph.D. Student\u2019s Framework Used to Bolster Nvidia\u2019s Cosmos Predict-2 Model","body":[{"value":"\u003Cp\u003EA new deep learning architectural framework could boost the development and deployment efficiency of autonomous vehicles and humanoid robots. The framework is designed to lower training costs and reduce the amount of real-world data needed for training.\u003C\/p\u003E\u003Cp\u003EWorld foundation models (WFMs) enable physical AI systems to learn and operate within\u0026nbsp;synthetic worlds created by generative artificial intelligence (genAI). For example, these models use predictive capabilities to generate up to 30 seconds of video that accurately reflects the real world.\u003C\/p\u003E\u003Cp\u003EThe new framework, developed by a Georgia Tech researcher, enhances the processing speed of the neural networks that simulate these real-world environments from text, images, or video inputs.\u003C\/p\u003E\u003Cp\u003EThe neural networks that make up the architectures of large language models like ChatGPT and visual models like Sora process contextual information using the \u201cattention mechanism.\u201d\u003C\/p\u003E\u003Cp\u003EAttention refers to a model\u2019s ability to focus on the most relevant parts of its input.\u003C\/p\u003E\u003Cp\u003EThe Neighborhood Attention Extension (NATTEN) localizes attention so that each part of the input attends only to its nearest neighbors, allowing models that require GPUs or high-performance computing systems to process information and generate outputs more efficiently.\u003C\/p\u003E\u003Cp\u003EProcessing speeds can increase by up to 2.6 times, said \u003Ca href=\u0022https:\/\/alihassanijr.com\/\u0022\u003E\u003Cstrong\u003EAli Hassani\u003C\/strong\u003E\u003C\/a\u003E, a Ph.D. student in the School of Interactive Computing and the creator of NATTEN. Hassani is advised by Associate Professor \u003Ca href=\u0022https:\/\/www.humphreyshi.com\/\u0022\u003E\u003Cstrong\u003EHumphrey Shi\u003C\/strong\u003E\u003C\/a\u003E.\u003C\/p\u003E\u003Cp\u003EHassani is also a research scientist at Nvidia, where he introduced NATTEN to \u003Ca href=\u0022https:\/\/www.nvidia.com\/en-us\/ai\/cosmos\/\u0022\u003E\u003Cstrong\u003ECosmos\u003C\/strong\u003E\u003C\/a\u003E \u2014 a family of WFMs the company uses to train robots, autonomous vehicles, and other physical AI applications.\u003C\/p\u003E\u003Cp\u003E\u201cYou can map just about anything from a prompt or an image or any combination of frames from an existing video to predict future videos,\u201d Hassani said. \u201cInstead of generating words with an LLM, you\u2019re generating a world.\u003C\/p\u003E\u003Cp\u003E\u201cUnlike LLMs that generate a single token at a time, these models are compute-heavy. They generate many images \u2014 often hundreds of frames at a time \u2014 so the models put a lot of work on the GPU. NATTEN lets us decrease some of that work and proportionately accelerate the model.\u201d\u003C\/p\u003E","summary":"","format":"limited_html"}],"field_subtitle":"","field_summary":[{"value":"\u003Cp\u003EGeorgia Tech Ph.D. student Ali Hassani developed the Neighborhood Attention Extension (NATTEN), a deep learning architectural framework that is being integrated into Nvidia\u0027s Cosmos Predict-2 world foundation model. 
NATTEN enhances the processing speed of neural networks that simulate real-world environments for physical AI systems, which are used to train autonomous vehicles and humanoid robots.\u003C\/p\u003E","format":"limited_html"}],"field_summary_sentence":[{"value":"A new deep learning architectural framework, the Neighborhood Attention Extension (NATTEN), is being used by Nvidia to increase the processing speed of its Cosmos Predict-2 model for training autonomous vehicles and humanoid robots."}],"uid":"36530","created_gmt":"2025-11-13 21:13:58","changed_gmt":"2025-11-13 21:14:58","author":"Nathan Deen","boilerplate_text":"","field_publication":"","field_article_url":"","location":"Atlanta, GA","dateline":{"date":"2025-11-03T00:00:00-05:00","iso_date":"2025-11-03T00:00:00-05:00","tz":"America\/New_York"},"extras":[],"hg_media":{"678621":{"id":"678621","type":"image","title":"2X6A3487.jpg","body":null,"created":"1763068473","gmt_created":"2025-11-13 21:14:33","changed":"1763068473","gmt_changed":"2025-11-13 21:14:33","alt":"Humphrey Shi and Ali Hassani","file":{"fid":"262676","name":"2X6A3487.jpg","image_path":"\/sites\/default\/files\/2025\/11\/13\/2X6A3487.jpg","image_full_path":"http:\/\/hg.gatech.edu\/\/sites\/default\/files\/2025\/11\/13\/2X6A3487.jpg","mime":"image\/jpeg","size":93105,"path_740":"http:\/\/hg.gatech.edu\/sites\/default\/files\/styles\/740xx_scale\/public\/2025\/11\/13\/2X6A3487.jpg?itok=axfoqv8i"}}},"media_ids":["678621"],"groups":[{"id":"47223","name":"College of Computing"},{"id":"1188","name":"Research Horizons"},{"id":"50876","name":"School of Interactive Computing"}],"categories":[{"id":"153","name":"Computer Science\/Information Technology and Security"},{"id":"194609","name":"Industry"},{"id":"152","name":"Robotics"}],"keywords":[{"id":"192863","name":"go-ai"},{"id":"193860","name":"Artificial Intelligence"},{"id":"194701","name":"go-resarchnews"},{"id":"9153","name":"Research Horizons"},{"id":"14549","name":"nvidia"},{"id":"191138","name":"artificial neural networks"},{"id":"97281","name":"autonomous vehicles"}],"core_research_areas":[{"id":"193655","name":"Artificial Intelligence at Georgia Tech"}],"news_room_topics":[],"event_categories":[],"invited_audience":[],"affiliations":[],"classification":[],"areas_of_expertise":[],"news_and_recent_appearances":[],"phone":[],"contact":[],"email":[],"slides":[],"orientation":[],"userdata":""}}}