{"id":1368,"date":"2025-10-22T17:59:23","date_gmt":"2025-10-22T17:59:23","guid":{"rendered":"https:\/\/nanocad.ee.ucla.edu\/?page_id=1368"},"modified":"2025-10-22T17:59:59","modified_gmt":"2025-10-22T17:59:59","slug":"architecture-aware-performance-model-compression","status":"publish","type":"page","link":"https:\/\/nanocad.ee.ucla.edu\/?page_id=1368","title":{"rendered":"Architecture-aware Performance &amp; Model Compression"},"content":{"rendered":"\n<p><strong>GPU Performance and Memory Modeling<\/strong><\/p>\n\n\n\n<p>Student: Lime Yao<br><\/p>\n\n\n\n<p>Research in this area focuses on enhancing the predictive accuracy and practical utility of system-technology co-optimization (STCO) for large language model (LLM) workloads through abstract performance modeling of GPU-based and heterogeneous compute systems. STCO co-optimizes architectural features and manufacturing technologies, letting designers explore how decisions made at the architecture and technology levels can have non-obvious effects on system-level performance. This requires abstractions that balance modeling accuracy against computational tractability, so that STCO-driven design exploration remains practical. 
Key topics include GPU memory hierarchy and performance bottlenecks, AI accelerator microarchitecture, KV cache behavior, and NUMA effects of chiplet-based or multi-GPU systems.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>GPU Performance and Memory Modeling Student: Lime Yao Research in this area focuses on enhancing the predictive accuracy and practical utility of STCO for large&hellip;<\/p>\n","protected":false},"author":1,"featured_media":0,"parent":259,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-1368","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/nanocad.ee.ucla.edu\/index.php?rest_route=\/wp\/v2\/pages\/1368","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/nanocad.ee.ucla.edu\/index.php?rest_route=\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/nanocad.ee.ucla.edu\/index.php?rest_route=\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/nanocad.ee.ucla.edu\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/nanocad.ee.ucla.edu\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1368"}],"version-history":[{"count":1,"href":"https:\/\/nanocad.ee.ucla.edu\/index.php?rest_route=\/wp\/v2\/pages\/1368\/revisions"}],"predecessor-version":[{"id":1369,"href":"https:\/\/nanocad.ee.ucla.edu\/index.php?rest_route=\/wp\/v2\/pages\/1368\/revisions\/1369"}],"up":[{"embeddable":true,"href":"https:\/\/nanocad.ee.ucla.edu\/index.php?rest_route=\/wp\/v2\/pages\/259"}],"wp:attachment":[{"href":"https:\/\/nanocad.ee.ucla.edu\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1368"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}