Neural network research has largely focused on understanding a single model or training on a single dataset. Little is known, however, about the characteristics of and relationships between different models, especially those trained or tested on different datasets. To address this gap, this talk delves into the weight and loss spaces and how the mapping between models' weights and performance interconnects them.
Comparing different fine-tuned language models by their weights and performance shows that models trained in similar ways have similar weights, which fall in the same region of the weight space: models trained on the same dataset form a tight cluster, and models trained on the same task form larger clusters.
The inverse also holds: weights in certain regions represent models with high performance. It is therefore possible to traverse from one model to another within the same region and reach new models that perform comparably or even better, even on tasks the original models were not fine-tuned for.
These findings illuminate the relationships between models, showing that a model located between two similar models can gain the knowledge of both. This suggests a route to more efficient fine-tuning: start from a model at the center of a region.
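The idea of "traversing from one model to another" in the same weight-space region can be illustrated by linearly interpolating two models' parameters. This is a minimal, hypothetical sketch (not the talk's actual method or models): weights are represented as plain dicts mapping parameter names to lists of floats, standing in for the framework tensors a real model would use.

```python
def interpolate_weights(w_a, w_b, alpha=0.5):
    """Return the weights at point `alpha` along the line from w_a to w_b.

    alpha=0 gives w_a, alpha=1 gives w_b, alpha=0.5 the midpoint.
    """
    return {
        name: [(1 - alpha) * a + alpha * b
               for a, b in zip(w_a[name], w_b[name])]
        for name in w_a
    }

# Two toy "fine-tuned models" with matching parameter shapes
model_a = {"layer.weight": [0.2, 0.4], "layer.bias": [0.0]}
model_b = {"layer.weight": [0.6, 0.8], "layer.bias": [0.2]}

# A new model halfway between them in weight space
midpoint = interpolate_weights(model_a, model_b, alpha=0.5)
print(midpoint)
```

Evaluating such interpolated points along the line between two similar fine-tuned models is one way to probe whether the region between them indeed contains well-performing models.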
| Time | Session |
|---|---|
| 8:45 | Reception |
| 9:30 | Opening words by WiDS TLV ambassador Nitzan Gado and by Lily Ben Ami, CEO of the Michal Sela Forum |
| 9:50 | Prof. Bracha Shapira – Data Challenges in Recommender Systems Research: Insights from Bundle Recommendation |
| 10:20 | Juan Liu – Accounting Automation: Making Accounting Easier So That People Can Forget About It |
| 10:50 | Break |
| 11:00 | Lightning talks |
| 12:20 | Lunch & poster session |
| 13:20 | Roundtable session & poster session |
| 14:05 | Roundtable closure |
| 14:20 | Break |
| 14:30 | Merav Mofaz – "Every Breath You Take and Every Move You Make…I'll Be Watching You:" The Sensitive Side of Smartwatches |
| 14:50 | Reut Yaniv – Ad Serving in the Online Geo Space Along Routes |
| 15:10 | Rachel Wities – It's Not Just the Doctor's Handwriting: Challenges and Opportunities in Healthcare NLP |
| 15:30 | Closing remarks |
| 15:40 | End |
WiDS Tel Aviv is an independent event organized by Intuit's WiDS TLV ambassadors as part of the annual WiDS Worldwide conference, the WiDS Datathon, and an estimated 200 WiDS regional events worldwide. Everyone is invited to attend all WiDS conference and WiDS Datathon Workshop events, which feature outstanding women doing outstanding work.
© 2018-2023 WiDS TLV – Intuit. All rights reserved.
Scotty – By Nir Azoulay
Design: Sharon Geva