Almost exactly a month after the debut of Gemini 3 Pro in November, Google has begun rolling out the more efficient Flash ...
MLCommons today released AILuminate, a new benchmark test for evaluating the safety of large language models. Launched in 2020, MLCommons is an industry consortium backed by several dozen tech firms.
Instead of a single, massive LLM, Nvidia's new 'orchestration' paradigm uses a small model to intelligently delegate tasks to a team of tools and specialized models.
AUSTIN, Texas & OSLO, Norway--(BUSINESS WIRE)--Cognite, the global leader in AI for industry, today announced the launch of the Cognite Atlas AI™ LLM & SLM Benchmark Report for Industrial Agents. The ...
The development of DeepSeek v2.5 involved the fusion of two highly capable models: DeepSeek version 2 0628 and DeepSeek Coder version 2 0724. By combining the strengths of these models, DeepSeek v2.5 ...
After reviewing thousands of benchmarks used in AI development, a Stanford team found that 5% could have serious flaws with ...
Dec. 4, 2024 — MLCommons today released AILuminate, a safety test for large language models. The v1.0 benchmark – which provides a series of safety grades for the most widely-used LLMs – is the first ...
When Meta, the parent company of Facebook, announced its latest open-source large language model (LLM) on July 23rd, it claimed that the most powerful version of Llama 3.1 had “state-of-the-art ...
Google LLC today introduced a new large language model, Gemini 2.5 Flash-Lite, that can process prompts faster and more cost-efficiently than its predecessor. The algorithm is rolling out as part of a ...
A team of researchers in Japan released Fugaku-LLM, a large language model with enhanced Japanese language capability, using the RIKEN supercomputer Fugaku. A team of researchers in Japan released ...