"GQKVA: Efficient Pre-training of Transformers by Grouping Queries, Keys, ..."
Farnoosh Javadi et al. (2023)
- Farnoosh Javadi, Walid Ahmed, Habib Hajimolahoseini, Foozhan Ataiefard, Mohammad Hassanpour, Saina Asani, Austin Wen, Omar Mohamed Awad, Kangling Liu, Yang Liu:
GQKVA: Efficient Pre-training of Transformers by Grouping Queries, Keys, and Values. CoRR abs/2311.03426 (2023)