BEGIN:VCALENDAR
VERSION:2.0
PRODID:ILLC Website
X-WR-TIMEZONE:Europe/Amsterdam
BEGIN:VTIMEZONE
TZID:Europe/Amsterdam
X-LIC-LOCATION:Europe/Amsterdam
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:19700329T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:19701025T030000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
UID:/NewsandEvents/Archives/2022/newsitem/13762/28
 -June-2022-Computational-Linguistics-Seminar-Hila-
 Chefer
DTSTAMP:20220623T133551
SUMMARY:Computational Linguistics Seminar, Hila Ch
 efer
ATTENDEE;ROLE=Speaker:Hila Chefer (Tel Aviv Univer
 sity)
DTSTART;VALUE=DATE:20220628
LOCATION:ILLC seminar room F1.15, Science Park 107
 , Amsterdam / online via Zoom
DESCRIPTION:Transformers have revolutionized deep 
 learning research across many disciplines, startin
 g from NLP and expanding to vision, speech, and mo
 re. In my talk, I will explore several milestones 
 toward interpreting all families of Transformers, 
 including unimodal, bi-modal, and encoder-decoder 
 Transformers. I will present working examples and 
 results that cover some of the most prominent mode
 ls, including CLIP, BERT, LXMERT, and ViT. I will 
 then present our recent explainability-driven fine
 -tuning technique that significantly improves the 
 robustness of Vision Transformers (ViTs). The loss
  we employ ensures that the model bases its predic
 tion on the relevant parts of the input, rather th
 an on supportive cues (e.g., background). This can be
  done with very little added supervision in the fo
 rm of foreground masks, or without any such superv
 ision.
X-ALT-DESC;FMTTYPE=text/html:\n  <p>Transformers h
 ave revolutionized deep learning research across m
 any disciplines, starting from NLP and expanding t
 o vision, speech, and more. In my talk, I will exp
 lore several milestones toward interpreting all fa
 milies of Transformers, including unimodal, bi-mod
 al, and encoder-decoder Transformers. I will prese
 nt working examples and results that cover some of
  the most prominent models, including CLIP, BERT, 
 LXMERT, and ViT. I will then present our recent ex
 plainability-driven fine-tuning technique that sig
 nificantly improves the robustness of Vision Trans
 formers (ViTs). The loss we employ ensures that th
 e model bases its prediction on the relevant parts
  of the input, rather than on supportive cues (e.g.,
  background). This can be done with very little added
  supervision in the form of foreground masks, or
  without any such supervision.</p>\n
URL:https://projects.illc.uva.nl/LaCo/CLS/
END:VEVENT
END:VCALENDAR
