

The strongest model Llama 3.1 405B is officially released, Zuckerberg: Open source leads a new era

Jul 24, 2024, 08:23 PM

Just now, the long-awaited Llama 3.1 has been officially released!

Meta officially declared that "open source leads a new era."
In the official blog, Meta said: "Until today, open source large language models have mostly lagged behind closed models in functionality and performance. Now, we are ushering in a new era led by open source. We are publicly releasing Meta Llama 3.1 405B, which we believe is the largest and most powerful open source foundation model in the world. With more than 300 million total downloads of all Llama versions to date, we are just getting started."

Meta founder and CEO Mark Zuckerberg also personally wrote a long article, "Open Source AI Is the Path Forward," explaining why open source is a good thing for all developers, for Meta, and for the world.
Highlights from this release include:

  • The latest series of models extends context length to 128K, adds support for eight languages, and includes the top open source model Llama 3.1 405B;
  • Llama 3.1 405B is in a league of its own, and Meta officially says it is comparable to the best closed source models;
  • This release also provides more components (including a reference system) to be used with the model, making Llama a complete system;
  • Users can experience Llama 3.1 405B through WhatsApp and meta.ai.
Address: https://llama.meta.com/

You can download it and try it out.

Llama 3.1 Introduction

Llama 3.1 405B is the first publicly available model that rivals top AI models in general knowledge, steerability, math, tool use, and multilingual translation.

Meta says the latest generation of Llama will inspire new applications and modeling paradigms, including leveraging synthetic data generation to bootstrap and train smaller models, as well as model distillation, a capability never before achieved at this scale in open source.

At the same time, Meta has also launched upgraded versions of the 8B and 70B models, supporting multiple languages, with a context length of 128K and stronger reasoning capabilities. The latest models support advanced use cases such as long-form text summarization, multilingual conversational agents, and coding assistants.

For example, Llama 3.1 can translate stories into Spanish:


When the user asks, "There are 3 shirts, 5 pairs of shorts, and 1 dress; suppose you want to travel for 10 days. Are these clothes enough?", the model can reason out the answer quickly.


Long context: Llama 3.1 can analyze and summarize uploaded documents of up to 8K tokens.
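Documents longer than the model's effective window can still be summarized by splitting them into overlapping chunks, summarizing each, and then summarizing the partial summaries. A minimal word-based chunker sketch (a crude stand-in: real deployments would count tokens with the model's own tokenizer, not words):

```python
def chunk_words(text, max_words=6000, overlap=200):
    """Split text into overlapping word-based chunks.

    Illustrative only: word counts approximate token counts; use the
    model's tokenizer for precise chunk sizing in production.
    """
    words = text.split()
    if not words:
        return []
    chunks = []
    step = max_words - overlap  # advance by chunk size minus overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break
    return chunks

# Each chunk is summarized separately, then the partial summaries are
# concatenated and summarized once more ("map-reduce" style).
```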


Coding assistant: given a user's requirements, it can quickly write code:


In addition, a developer of Llama 3.1 405B tweeted a "spoiler": a model that integrates voice and visual capabilities, similar to GPT-4o, is still in development.
Meta has also made changes to the open source license to allow developers to use the output of Llama models (including 405B) to improve other models. Additionally, in keeping with its open source commitment, starting today, Meta is making these models available to the community for download at llama.meta.com and Hugging Face.

Download address:

  • https://huggingface.co/meta-llama
  • https://llama.meta.com/
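Once downloaded, prompts sent to the instruct models must follow the Llama 3 chat format. A minimal formatter sketch based on the special tokens published in Meta's Llama 3 model card (in practice, prefer `tokenizer.apply_chat_template` from the Hugging Face tokenizer shipped with the model, which is authoritative):

```python
def format_llama3_prompt(messages):
    """Render a chat into the Llama 3 instruct prompt layout.

    `messages` is a list of {"role": ..., "content": ...} dicts.
    Token names follow Meta's model card; use the model's own chat
    template in production rather than this hand-rolled version.
    """
    parts = ["<|begin_of_text|>"]
    for msg in messages:
        parts.append(
            f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n"
            f"{msg['content']}<|eot_id|>"
        )
    # Trailing assistant header cues the model to generate its turn.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)
```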

Model evaluation

Meta evaluated the models on more than 150 benchmark datasets and also conducted extensive human evaluations.

Experimental results show that the flagship Llama 3.1 405B is competitive with leading foundation models, including GPT-4, GPT-4o, and Claude 3.5 Sonnet, across a range of tasks. Furthermore, the smaller 8B and 70B models are competitive with closed- and open-source models of similar parameter counts.
Model Architecture

As Meta’s largest model to date, training Llama 3.1 405B using more than 15 trillion tokens is a major challenge. To enable training at this scale, Meta optimized the entire training stack and trained on over 16,000 H100 GPUs, making this model the first Llama model to be trained at this scale.
To solve this problem, Meta has made the following design choices, focusing on keeping the model development process scalable and simple.

  • A standard decoder-only Transformer architecture with only minor adjustments was chosen, rather than a mixture-of-experts model, to maximize training stability.
  • An iterative post-training procedure was adopted, with supervised fine-tuning and direct preference optimization at each round. This lets Meta create the highest quality synthetic data for each round and improve the performance of each capability.

Compared with previous versions of Llama, Meta improved both the quantity and quality of the data used for pre-training and post-training, for example by developing more careful preprocessing and curation pipelines for pre-training data, and more stringent quality assurance and filtering methods for post-training data.

As expected from language model scaling laws, Meta’s new flagship model outperforms smaller models trained using the same procedure. Meta also uses 405B parameter models to improve the post-training quality of smaller models.

To support large-scale inference with the 405B model, Meta quantized it from 16-bit (BF16) to 8-bit (FP8) precision, effectively reducing compute requirements and allowing the model to run on a single server node.
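The idea behind the cast is to store each weight in 8 bits plus a shared scale, roughly halving memory versus 16-bit. A pure-Python sketch using symmetric int8 quantization as a stand-in (real FP8 keeps a floating-point layout such as e4m3, which is harder to emulate concisely):

```python
def quantize_8bit(weights):
    """Symmetric per-tensor 8-bit quantization (int8 stand-in).

    Illustrates the memory trade behind BF16 -> FP8: 8-bit codes plus
    one scale per tensor. Not Meta's actual FP8 scheme.
    """
    scale = max(abs(w) for w in weights) / 127.0 or 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate weights from 8-bit codes and the scale."""
    return [v * scale for v in q]
```

The round trip is lossy, but the per-weight error is bounded by the scale, which is why weight distributions with small dynamic range quantize well.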

Instruction and Chat Fine-Tuning

Llama 3.1 405B strives to improve the usefulness, quality and detailed instruction following of models in responding to user instructions, while ensuring a high level of security.

In the post-training phase, the research team built the final chat model by performing several rounds of alignment on the basis of the pre-trained model. Each round involves supervised fine-tuning (SFT), rejection sampling (RS), and direct preference optimization (DPO).
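The DPO step optimizes the policy directly on preference pairs, without a separate reward model. A minimal sketch of the per-pair DPO objective (log-probabilities here are placeholder scalars; in training they come from summing token log-probs of the chosen and rejected responses under the policy and a frozen reference model):

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.

    Computes -log sigmoid(beta * (policy log-ratio of the chosen
    response minus that of the rejected response)), with both ratios
    taken relative to a frozen reference model.
    """
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log sigmoid
```

At zero margin the loss is ln 2; as the policy raises the chosen response relative to the rejected one (versus the reference), the loss falls toward zero.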

The research team used synthetic data generation to produce the vast majority of SFT examples, iterating multiple times to generate increasingly high-quality synthetic data across all capabilities. Additionally, the team employed multiple data processing techniques to filter this synthetic data down to the highest quality and to scale the amount of fine-tuning data across capabilities.

Llama System

The Llama model has always existed as part of a larger AI system that can coordinate multiple components, including calling external tools. Meta aims to go beyond the base model and give developers the flexibility to design and create custom products that fit their vision.

To responsibly develop AI beyond the model layer, Meta has released a complete reference system that includes multiple example applications as well as new components such as Llama Guard 3 (a multilingual safety model) and Prompt Guard (a prompt injection filter). These sample applications are open source and can be built upon by the open source community.

To collaborate more broadly with industry, startups, and the open source community, and to help better define component interfaces, Meta has published a request for comment on "Llama Stack" on GitHub. Llama Stack is a set of standardized interfaces for building canonical toolchain components (fine-tuning, synthetic data generation) and agentic applications, making interoperability easier to achieve.
Unlike closed models, Llama model weights are available for download. Developers can fully customize the model to their needs and applications, train on new datasets, and perform additional fine-tuning.

Developed using Llama 3.1 405B

For ordinary developers, deploying a model at the scale of 405B is undoubtedly a challenge, requiring significant computing resources and expertise. In communicating with the developer community, Meta realized that generative AI development is more than just feeding prompts to a model. Meta expects all developers to exploit the full potential of Llama 3.1 405B in the following areas:

  • Real-time and batch inference
  • Supervised fine-tuning
  • Testing and evaluating model performance in specific applications
  • Continuous pre-training
  • Retrieval Augmented Generation (RAG)
  • Function calling
  • Synthetic data generation
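Function calling typically works by having the model emit a structured call that the application parses and executes, feeding the result back to the model. A minimal, illustrative application-side dispatcher (the tool names and JSON shape here are assumptions for illustration, not Llama's fixed wire format, which varies by serving stack):

```python
import json

# Hypothetical tool registry; names and signatures are illustrative.
TOOLS = {
    "get_weather": lambda city: f"22C and sunny in {city}",
    "add": lambda a, b: a + b,
}

def dispatch_tool_call(model_output):
    """Parse a model-emitted JSON tool call and execute it.

    Expects output like: {"name": "add", "arguments": {"a": 1, "b": 2}}.
    The result would normally be appended to the conversation as a
    tool message for the model's next turn.
    """
    call = json.loads(model_output)
    fn = TOOLS.get(call["name"])
    if fn is None:
        raise ValueError(f"unknown tool: {call['name']}")
    return fn(**call["arguments"])
```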

As of this release, all advanced features of the Llama 3.1 405B model are open, and developers can get started immediately. Developers can also explore higher-order workflows, such as synthetic data generation based on model distillation. In this upgrade, Meta has also seamlessly integrated solutions from partners AWS, NVIDIA, and Databricks for more efficient retrieval-augmented generation (RAG). In addition, Groq has optimized low-latency inference for cloud deployments, with similar performance improvements for local systems.

Meta has also bundled a "toolkit" for Llama 3.1 405B, including key projects such as vLLM, TensorRT, and PyTorch, covering everything from model development to out-of-the-box deployment in one step.

Reference link: https://ai.meta.com/blog/meta-llama-3-1/
