China NEW Open Source AI Models BREAK the Industry (AI WAR With OpenAI and Google)

Tencent’s Hunyuan-A13B: A Sparse Giant With a Memory Like an Elephant

Alright, here’s what we’ve got today. Tencent just dropped a new AI model that’s way faster and smarter than it should be, with a crazy long memory and even a fast and slow thinking mode. Then Baidu went wild building a search engine with actual reasoning, using four different AI agents working together.

And just when things couldn’t get hotter, Baidu and Huawei both open sourced their biggest models yet, hundreds of billions of parameters, basically starting an open source war with OpenAI and the rest. Let’s get into it. So let’s talk about Tencent first.

They just dropped a bombshell with Hunyuan-A13B, and honestly, the specs read like somebody hammered together a Formula One car and a hybrid SUV. The model lives inside an 80 billion parameter shell, yet at runtime, only 13 billion parameters actually wake up. That’s the sparse mixture-of-experts trick: one shared expert, 64 non-shared, 8 of them lighting up per forward pass.

It means you get the punch of a heavyweight without paying a heavyweight’s electricity bill. Under the hood are 32 transformer layers, SwiGLU activations, a chunky 128,000-entry vocab, and grouped-query attention, so it chews through long prompts without the KV cache ballooning into oblivion. And yes, the context window stretches all the way to 256,000 tokens.
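
To make the sparse-activation idea concrete, here’s a toy sketch of top-k expert routing in plain NumPy. The headline numbers (64 routed experts, 8 active, plus one always-on shared expert) mirror the specs above, but the router, the expert functions, and the tiny hidden size are made-up stand-ins, not Tencent’s code.

```python
import numpy as np

def moe_forward(x, router_w, experts, shared_expert, k=8):
    """Sparse MoE for one token: the shared expert always runs, plus the
    top-k experts picked by the router; everything else stays asleep."""
    logits = x @ router_w                      # one score per routed expert
    top_k = np.argsort(logits)[-k:]            # indices of the k best experts
    weights = np.exp(logits[top_k])
    weights /= weights.sum()                   # softmax over the winners only
    out = shared_expert(x)                     # the always-on shared expert
    for w, idx in zip(weights, top_k):
        out = out + w * experts[idx](x)        # weighted sum of expert outputs
    return out, top_k

rng = np.random.default_rng(0)
d = 16  # toy hidden size; the real model is vastly wider
experts = [lambda x, W=rng.normal(size=(d, d)) / d: x @ W for _ in range(64)]
shared_expert = lambda x: x                    # identity stand-in
router_w = rng.normal(size=(d, 64))

x = rng.normal(size=d)
y, chosen = moe_forward(x, router_w, experts, shared_expert)
print(len(chosen))  # → 8: only 8 of the 64 routed experts fired
```

Only the selected experts’ weights touch the compute path, which is how an 80 billion parameter shell can bill you for roughly 13 billion per token.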

Basically, its contextual memory holds more than most people process in a lifetime. The training process behind this model was massive.

Tencent fed it a jaw-dropping 20 trillion tokens. That’s more data than most models ever see. After that, they fine-tuned it with a method called fast annealing, then gradually expanded its memory, first to 32,000 tokens and finally to that insane 256,000-token range.

Hunyuan’s Dual-Mode Thinking and Tool Mastery Set a New Benchmark

To make sure everything runs smoothly at that scale, they used something called NTK-aware positional encoding, which basically helps the model keep track of where things are in super long inputs without the output getting shaky or confused. But the real twist? They added a clever switch to control how the model thinks. Add /no_think to your prompt, and it flies through the answer in high-speed mode.
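
The usual NTK-aware trick is a one-line rescaling of RoPE’s frequency base, so far-apart positions stay distinguishable after the window is stretched. Here’s a generic sketch of that rescaling, not Tencent’s exact recipe; the 128-dim head and the 8x scale factor (32k stretched to 256k) are illustrative numbers.

```python
def ntk_scaled_rope_base(base, scale, head_dim):
    """NTK-aware scaling: grow the RoPE base so low-frequency dimensions
    interpolate across the longer window while high-frequency ones,
    which encode local word order, are left nearly untouched."""
    return base * scale ** (head_dim / (head_dim - 2))

def rope_frequencies(head_dim, base):
    """Per-dimension rotation frequencies used by rotary embeddings."""
    return [base ** (-2 * i / head_dim) for i in range(head_dim // 2)]

orig = rope_frequencies(128, 10000.0)
scaled = rope_frequencies(128, ntk_scaled_rope_base(10000.0, 8.0, 128))
print(scaled[0] / orig[0])    # → 1.0: the fastest frequency is untouched
print(scaled[-1] / orig[-1])  # ≈ 0.125: the slowest is stretched 8x
```

The slowest frequency ends up rotating 8 times slower, which is exactly what lets positions 256,000 tokens apart still land on distinct angles.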

Use /think, and it slows down to reason step-by-step, perfect for complex tasks where shortcuts just don’t cut it. Once that core training was done, they added layers of supervised fine-tuning and reinforcement learning to sharpen the model’s skills. But the standout part is what they did with tools.
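
As a trivial illustration of that dual-mode switch, here’s how a client might prepend the tags. Check the exact tag spelling and placement against the model card; this is a sketch of the idea, not the official API.

```python
def build_prompt(user_msg: str, fast: bool = False) -> str:
    """Prepend the reasoning switch: '/no_think' answers in high-speed
    mode, '/think' forces slow, step-by-step reasoning."""
    tag = "/no_think" if fast else "/think"
    return f"{tag} {user_msg}"

print(build_prompt("Prove there are infinitely many primes."))
# → /think Prove there are infinitely many primes.
print(build_prompt("What's the capital of France?", fast=True))
# → /no_think What's the capital of France?
```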

They trained the model using over 20,000 different tool-use scenarios. Everything from editing spreadsheets and writing code to running searches and checking rules mid-conversation. It even learns from its mistakes.

Bad answers get penalized, especially if it messes up something technical like a SQL query. The results speak for themselves. It performs right up there with much larger models on tough benchmarks like MATH, CMATH, and GPQA.

It scored 89.1 on BBH for logical reasoning, 84.7 on ZebraLogic, and did great in coding too, with scores like 83.9 on MBPP and 69.3 on MultiPL-E. When it comes to agent tasks where models have to act and reason with tools, it leads the pack, hitting 78.3 on BFCL v3 and 61.2 on ComplexFuncBench. And in long-context stress tests like PenguinScrolls and RULER, it holds up impressively well even at 64,000 and 128,000 token lengths, while some bigger models start to fall apart.

From Raw Speed to Smart Search: Tencent Deploys Hanyun While Baidu Reinvents Reasoning

And they didn’t just stop at building the model. They made it deployable too. It works out of the box with popular AI-serving platforms like vLLM, SGLang, and TensorRT-LLM.

You can run it in different formats, including W16A16 for balance, W8A8 for lightweight devices, and even an FP8 KV cache to keep things fast and efficient without using up all your GPU memory. In real-world terms, that means you can run 32 conversations at once, each starting with over 2,048 tokens and continuing with up to 14,336 more, all while pushing nearly 2,000 tokens per second. That’s fast enough for real-time summarization or live applications.
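
Some quick arithmetic on those serving numbers (all figures quoted from above; this is a back-of-envelope check, not a benchmark):

```python
concurrent = 32            # simultaneous conversations
prompt_tokens = 2_048      # input tokens each conversation starts with
output_tokens = 14_336     # continuation tokens each can generate
throughput = 2_000         # aggregate tokens per second

tokens_in_flight = concurrent * (prompt_tokens + output_tokens)
seconds_to_drain = concurrent * output_tokens / throughput

print(tokens_in_flight)          # → 524288 tokens resident at peak
print(round(seconds_to_drain))   # → 229 seconds to finish every stream
```

Half a million tokens resident at once is exactly the kind of load the FP8 KV cache and grouped-query attention are there to shrink.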

And yes, it’s completely open source. You can grab it on Hugging Face or GitHub, with a license that makes it usable in everything from school projects to startup tools without worrying about legal red tape. While Tencent was showing off raw horsepower, Baidu’s research squad rolled out something that tackles a different pain point: search that actually reasons.

Traditional retrieval-augmented generation pipelines look neat until a query needs chained deductions, or worse, conflicting sources. Most setups snag a few docs, glue them together, and call it a day. Ask who outlived whom between Julius Caesar and Emperor Wu of Han, and they serve you two birthdates and shrug.

Baidu’s new AI search paradigm slices that problem into a four-agent relay race. A master agent watches the incoming query, sniffs out the complexity, then hands off to a planner when it smells multi-step logic. The planner explodes the request into a directed acyclic graph of subtasks, chooses tools from an internal marketplace (Baidu calls them MCP servers), and passes everything to an executor.

Baidu’s Reasoning Engine and Surprise Open Source Pivot Shake Up the AI Game

The executor fires the calls, retries if a tool hiccups, patches gaps in retrieved data, and streams partial answers back. Finally, a writer agent filters contradictions, stitches sentences, and spits out a coherent answer. In practice, that Emperor-versus-Caesar demo unfolds in three hops: fetch both sets of dates, calculate lifespans, compare the numbers. The system reports Wu of Han lived 69 years, Caesar 56, so there’s a 13-year spread.

No manual calculator, no hallucinated dates. Depending on question difficulty, the framework scales down to just the writer, or adds executor, or brings in the full planner layer. The result is a search flow that plans, replans, and keeps rolling even when a retrieval tool returns garbage.
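
Here’s a minimal, self-contained sketch of that relay. The agent names match Baidu’s description, but the internals (a dict standing in for the retrieval tools, hardcoded BC dates, keyword-matched complexity) are illustrative stand-ins, not Baidu’s implementation.

```python
KNOWLEDGE = {  # stand-in for the MCP retrieval tools; years are BC (negative)
    "Emperor Wu of Han": (-156, -87),
    "Julius Caesar": (-100, -44),
}

def master(query):
    """Gauge complexity; only multi-step queries get the full pipeline."""
    return "multi_step" if "outlived" in query else "direct"

def planner(people):
    """Explode the request into ordered subtasks (a tiny DAG: the
    compare step depends on both fetches)."""
    return [("fetch", p) for p in people] + [("compare", people)]

def executor(plan):
    """Run each subtask, carrying partial results forward."""
    facts = {}
    for op, arg in plan:
        if op == "fetch":
            born, died = KNOWLEDGE[arg]
            facts[arg] = died - born          # lifespan in years
        elif op == "compare":
            a, b = arg
            facts["spread"] = abs(facts[a] - facts[b])
    return facts

def writer(facts, people):
    """Stitch the computed facts into one coherent sentence."""
    a, b = people
    longer = a if facts[a] > facts[b] else b
    return (f"{a} lived {facts[a]} years, {b} lived {facts[b]}; "
            f"{longer} came out {facts['spread']} years ahead.")

people = ("Emperor Wu of Han", "Julius Caesar")
assert master("who outlived whom between them?") == "multi_step"
print(writer(executor(planner(people)), people))
```

The point of the structure is that each stage can retry or replan independently; the writer never sees raw tool failures, only reconciled facts.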

That dynamic behavior is exactly what early RAG adopters have been begging for. But hold on, the fireworks really went off on June 30th. The same company that once swore its Ernie line would stay proprietary went full open source: they pushed 10 Ernie 4.5 variants to Hugging Face, starting at a featherweight 300 million parameter model and topping out at a monster 424 billion parameter multimodal beast.

Last year, Robin Li declared Ernie would outpace open models precisely because it was closed. This year, he’s handing out the weights, SDKs, and inference tricks. Industry watchers are split on how much that rattles Western providers, but the pricing angle is impossible to ignore.

Open Source Shockwave: Baidu Slashes Costs, Spurs Global AI Pricing Crisis

Analysts already peg open weights as slashing deployment costs by 60-80%, and Baidu is bragging that its latest Ernie X1 matches DeepSeek R1 performance at half the price. If you’re an enterprise CTO counting GPU hours, that’s a very loud sales pitch. Sam Altman has been hinting that OpenAI needs a fresher open source strategy, yet GPT-4 remains locked behind the API tollbooth.

Now Baidu’s move, plus Tencent’s, plus DeepSeek’s earlier stunt of releasing one codebase a day for an entire week, cranks up the pressure. Sean Ren from USC put it bluntly: every time a heavyweight lab open sources a top-tier model, the bar for the whole field shifts up. Alex Strasmore went even spicier, calling Baidu’s release a Molotov cocktail for pricing.

His argument is that if Chinese labs flood the market with free or dirt-cheap generative engines, everyone else has to justify champagne pricing in a world suddenly stocked with decent table wine. Of course, not everyone is lining up for a sip. United States enterprises remember the kerfuffle when some governments banned DeepSeek, and they worry that an API powered by a Chinese model could be a surveillance backdoor.

Cliff Jurkiewicz, who wrangles strategy at Phenom, said most folks stateside still struggle to spell Baidu, let alone trust it with customer data. Ren also flagged that open weights do not equal open training data. Transparency about scraped sources, consent, and compensation still lags.

Still, cost savings talk pretty loudly when GPU budgets keep ballooning, and let’s be real, Cliff’s trust argument sounds more like a lazy stereotype than an actual technical critique. Meanwhile, Baidu is shipping and open sourcing models at a pace companies like Phenom can’t even dream of matching. While Baidu lit up GitHub, Huawei clearly didn’t want to play second fiddle.

Huawei Joins the Fray as China’s Open Source Blitz Redraws the AI Power Map

On the same day, they open sourced two Pangu models: a 7-billion-parameter vanilla version and a 72-billion-parameter Pangu Pro sporting its own mixture-of-experts core. Both ship with inference tweaks custom fit for Huawei’s Ascend chips, basically giving anyone with that hardware stack something approaching plug and play. Commercial Times reports that between Baidu, Huawei, MiniMax, Alibaba, Moonshot AI, and DeepSeek, Chinese vendors have carved out enough efficiencies that running large models now costs between three-fifths and one-fifth of last year’s sticker price.
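
Restating that Commercial Times figure as a discount (pure arithmetic on the quoted fractions):

```python
# "Three-fifths to one-fifth of last year's sticker price" means a
# 40% to 80% cost reduction, depending on the vendor and workload.
for fraction in (3 / 5, 1 / 5):
    cut = 1 - fraction
    print(f"{fraction:.1f}x last year's price -> {cut:.0%} cheaper")
```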

That open source wave is clearly strategic. Beijing’s New Generation AI Development Plan is gunning for global leadership by 2030, and on compute self-sufficiency, analysts already forecast China could hit 90% within five years. Every weight drop on Hugging Face chips away at Western proprietary dominance, nudges cloud costs lower, and builds a local talent pipeline.

Meanwhile, on the United States side, OpenAI, Google, and Anthropic stay closed, at least for their flagship systems. That proprietary stance saves them from some of the supply chain and security skepticism Baidu faces, yet it also invites the argument that open models adapt faster, localize better, and unlock new research simply because anyone can poke holes in them. So yeah, the Chinese open source surge didn’t just copy Western LLM playbooks, it shuffled the deck.

DeepSeek launched its week-long codebase giveaway back in February, MiniMax, Alibaba, and Moonshot keep throwing smaller models onto GitHub, and now Tencent, Baidu, and Huawei are swinging giant mixture-of-experts hammers. If the estimates are right and 85% of enterprises fold open models into production by the end of next year, the licensing map we’ve been used to is about to look very different, very fast. So now the question is, how long can closed models keep charging premium prices when the open source world is moving this fast?

Drop your thoughts in the comments, make sure to subscribe for more updates like this, and if you found this useful, give it a like. Thanks for reading. Catch you in the next one.


Also Read: Unitree Just Gave Its ROBOT a BRAIN and It’s Already Acting HUMAN + More AI & Robotics News

Hi 👋, I'm Gauravzack. I'm a security information analyst with experience in web, mobile, and API pentesting. I also develop mobile and web applications and tools for pentesting, mostly for the sole purpose of fun. I created this blog to talk about subjects that are interesting to me and a few other things.
