Google proposes RG-LRU, a new recurrent layer for RNNs

Researchers at Google have proposed a new gated linear recurrent layer, the RG-LRU layer, and designed a new recurrent block around it as an alternative to multi-query attention (MQA). They used this recurrent block to build two new models: Hawk, which combines MLPs with the recurrent block, and Griffin, which combines MLPs, the recurrent block, and local attention. After training Hawk and Griffin on 300B tokens across a range of model sizes, they found that Hawk-3B outperforms Mamba-3B on downstream tasks despite being trained on half as many tokens. In addition, Griffin-7B and Griffin-14B achieve performance comparable to Llama-2 while being trained on roughly 1/7 as many tokens.
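The core of the recurrent block is the RG-LRU update rule. The sketch below is a minimal, unoptimized NumPy rendering of that recurrence as described in the paper (a learnable decay modulated by a recurrence gate, plus an input gate); the function and parameter names (rg_lru, W_a, W_x, lam, c) are illustrative assumptions, not the authors' JAX implementation.

```python
# Minimal sketch of the RG-LRU recurrence (assumed from arXiv:2402.19427).
# Names and the plain-Python time loop are illustrative only.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def rg_lru(x, W_a, b_a, W_x, b_x, lam, c=8.0):
    """Run the RG-LRU recurrence over a sequence x of shape (T, d)."""
    T, d = x.shape
    h = np.zeros(d)
    outputs = np.empty_like(x)
    a = sigmoid(lam)                      # learnable per-channel decay in (0, 1)
    for t in range(T):
        r_t = sigmoid(x[t] @ W_a + b_a)   # recurrence gate
        i_t = sigmoid(x[t] @ W_x + b_x)   # input gate
        a_t = a ** (c * r_t)              # gated decay
        # Linear recurrence with a normalizing factor on the gated input
        h = a_t * h + np.sqrt(1.0 - a_t ** 2) * (i_t * x[t])
        outputs[t] = h
    return outputs

# Usage: 16 timesteps, 4 channels, randomly initialised parameters.
rng = np.random.default_rng(0)
d = 4
y = rg_lru(rng.normal(size=(16, d)),
           rng.normal(size=(d, d)) * 0.1, np.zeros(d),
           rng.normal(size=(d, d)) * 0.1, np.zeros(d),
           rng.normal(size=d))
print(y.shape)  # (16, 4)
```

Because the state update is purely elementwise and linear in h, the recurrence can be computed with a small, fixed-size state at inference time, which is what makes the Hawk and Griffin blocks attractive as alternatives to attention on long sequences.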

Link to paper:
https://arxiv.org/pdf/2402.19427.pdf
