This Week in AI: The Eastern Video Flood, DeepSeek’s Return, and The Death of the Browser

If you thought last week was loud, you haven’t been paying attention.
We are living through a timeline where "weeks" feel like "decades." Just days ago, we broke down the internal panic at OpenAI—the Code Red triggered by Google’s dominance and the frantic rush to ship Project Garlic. (If you missed our deep dive on why OpenAI is scrambling and what it means for your business, read the full breakdown here).
But while the US giants are busy fighting over "reasoning" benchmarks and internal leaks, a different kind of war has erupted. The floodgates have opened in China. The "Video Generation" moat—once held securely by Western darlings like Runway and Sora—hasn't just been breached; it’s been flooded.
From infinite-streaming avatars to "thinking" models that cost pennies, here is everything you need to know about the power shifts in AI this week.
The Eastern Video Flood (While the West Teases, the East Ships)
For months, we’ve been waiting for the "ChatGPT moment" of video. This week, it arrived—not from San Francisco, but from Beijing and Shenzhen. While US companies are still teasing waitlists, Chinese labs just dropped an entire arsenal of tools that you can use right now.
Kling AI didn't just drop an update; they dropped a five-day event. The standout is Kling O1, an "omnimodal" model that is essentially the "Nano Banana" of video. It allows for complex editing, background replacements, and mixing text/image/video inputs in a way we haven't seen before. They followed up with Kling 2.6, which finally brings native audio to the table. No more generating a video and slapping a sound-effect file on top: the model understands the physics of the scene—crashes, dialogue, ambience—and generates the audio in sync.
Meanwhile, Tencent changed the math on rendering. High-quality video usually requires a render farm, but their new Hunyuan Video 1.5 (Distilled) cuts generation steps from 50 down to 8. You can now generate 720p, high-motion video on a consumer-grade RTX 4090 in about 75 seconds. When the community can run state-of-the-art video generation at home, the "moat" for closed-source video models evaporates.
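Those numbers are worth a quick sanity check. Diffusion samplers spend essentially all their time in denoising steps, so generation cost scales roughly linearly with step count. A back-of-the-envelope sketch (the 8-step and 75-second figures are from the release; the linear-scaling assumption is ours):

```python
# Back-of-the-envelope math on step distillation (linear scaling assumed).
full_steps = 50        # steps the undistilled model reportedly needs
distilled_steps = 8    # steps after distillation
distilled_time_s = 75  # reported 720p generation time on an RTX 4090

speedup = full_steps / distilled_steps
estimated_full_time_s = distilled_time_s * speedup

print(f"Speedup: {speedup:.2f}x")                             # 6.25x
print(f"Undistilled estimate: {estimated_full_time_s:.0f}s")  # ~469s
```

In other words, distillation turns a nearly eight-minute render into a coffee-sip wait—which is exactly why the at-home RTX 4090 crowd matters.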
And then there is Alibaba. If you’ve ever tried to generate a long AI video, you know the "decay" problem—after 10 seconds, the face melts. Alibaba’s new Live Avatar solves this using "rolling row positional encoding." It can stream infinite-length video with native audio at 20fps. The era of the 24/7 AI streamer is no longer theoretical; the tech is here.
Not to be outdone, Runway teased Gen 4.5. The demos look incredible—better physics, better "slice of life" coherence. But there’s a catch: It’s a tease. While Kling and Tencent are putting tools in users' hands, Runway is still playing the "coming soon" game. In 2025, the speed of shipping matters as much as the quality of output.
The Intelligence Wars: The Whale vs. The Giant
While video is flashy, the battle for "intelligence" is where the real money is. And this week, the hierarchy got messy.
DeepSeek is the "Robin Hood" of the LLM wars. They just dropped DeepSeek V3.2, and it is terrifyingly good. It rivals GPT-5.1 and Gemini 3 Pro on reasoning, math, and coding, but does it at a fraction of the cost. DeepSeek has achieved "Gold Medal" status in international math and coding Olympiads. For enterprise users, this is a wake-up call: You no longer need to pay the "OpenAI Tax" to get state-of-the-art reasoning.
Google isn't taking this lying down. They finally released Gemini 3 Deep Think to users on the Ultra plan. This is currently the "smartest" model you can access, crushing the "Humanity’s Last Exam" benchmark. But it costs $250/month. We are seeing a bifurcation in the market: "Good Enough" intelligence is becoming free (DeepSeek), while "God Tier" reasoning is becoming a luxury product.
(Note: For the open-source purists wondering about the new Mistral 3 family—we see you. We are testing the models right now and will have a dedicated breakdown dropping tomorrow. Watch this space.)
The Agentic Shift: The Browser is Dead
We talk a lot about "Agents"—AI that does things, rather than just talking about them. This week, the infrastructure for that future arrived.
Flowith OS is a glimpse into the post-browser future. It’s an operating system designed for agents. It doesn't just "search the web"; it autonomously executes workflows. Need to download a GitHub repo, install dependencies, and verify the code runs? Flowith does it without you touching the terminal. This is the difference between a "Chatbot" and a "Digital Employee."
Simultaneously, Google Workspace Studio is quietly building an agent army inside your email. It allows business users to build no-code agents using natural language. You prompt it: "If I get an email with a question, draft a reply and ping me on Chat," and it builds the workflow. This threatens the entire ecosystem of third-party automation tools. Why pay for Zapier when your email can program itself?
And to make these agents feel human, Microsoft VibeVoice has cut latency to 300 milliseconds by streaming audio while the LLM is still thinking. This is the tech that makes AI agents feel like real people on the phone, rather than robotic assistants.
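Microsoft hasn't published the exact pipeline, but the core idea behind that kind of latency is overlap: start synthesizing speech for the first tokens while the rest of the reply is still being generated, so time-to-first-audio stops depending on reply length. A toy model—every timing here is an illustrative assumption, not a measurement:

```python
# Toy latency model for batch vs. streaming speech synthesis.
# All timings are illustrative assumptions, not measurements.

def llm_tokens():
    """Simulated LLM reply, streamed chunk by chunk."""
    yield from ["Sure,", " I can", " help", " with that."]

def batch_first_audio_ms(tokens, token_ms=120, tts_ms=200):
    """Wait for the whole reply, then synthesize: latency grows with length."""
    return len(tokens) * token_ms + tts_ms

def streaming_first_audio_ms(token_ms=120, tts_ms=200):
    """Synthesize as soon as the first chunk lands: latency stays constant."""
    return token_ms + tts_ms

tokens = list(llm_tokens())
print(batch_first_audio_ms(tokens))   # 680 ms to first audio
print(streaming_first_audio_ms())     # 320 ms to first audio
```

The streaming number stays flat no matter how long the reply gets; the batch number grows with every token. That flat curve is what makes a phone agent feel conversational.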
The Creative Toolkit: Animation Democratized
The "Starving Artist" narrative is dead; the "Augmented Creator" is the new reality. This week, three tools dropped that fundamentally change how we create.
The absolute standout is Steady Dancer. If you have ever tried to animate a character, you know the pain of keyframes and rigging. Steady Dancer deletes that workflow. You feed it a single static image of a character and a reference video of someone dancing (or sweeping, or kicking a ball), and it transfers the motion instantly.
Unlike previous attempts (like Animate Anyone), this doesn't break when things get weird. It handles complex hand gestures, 3D characters like Wukong, and even characters with irregular body proportions without glitching. It is open-source, it runs locally, and it makes high-end character animation accessible to anyone with a GPU.
Then we have ByteDance Seedream 4.5. While it's proprietary, it tackles the two biggest headaches in AI art: text and editing. It renders clean typography on posters and lets you edit specific elements—removing a person, changing the lighting—without hallucinating an entirely new image.
Finally, Poster Copilot is effectively replacing the design intern. It’s an agentic workflow where you drag and drop assets, and the AI handles the composition, background generation, and layout. It even handles "multi-round refinement," meaning you can tweak the aspect ratio or color scheme instantly without starting from scratch.
The Wildcards: Memory Tokens & Ad Creep
Two wildcards round out the week. Apple quietly dropped research on Clara, a system that compresses massive documents into tiny "memory tokens," hinting at how they plan to run powerful AI on your iPhone without melting the battery.
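We haven't seen full technical details on Clara, but the arithmetic behind memory tokens is easy to appreciate: per-step attention work and KV-cache memory both scale with the number of tokens the model attends over, so compressing a long document into a few hundred learned tokens shrinks both. Purely illustrative numbers:

```python
# Why compressed "memory tokens" matter on-device (illustrative numbers only).
doc_tokens = 50_000   # hypothetical long document
memory_tokens = 256   # hypothetical compressed representation

compression_ratio = doc_tokens / memory_tokens
print(f"{compression_ratio:.0f}x fewer tokens to attend over per step")  # ~195x
```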
And finally, the bubble might be bursting on the "ad-free" AI dream. Code snippets in the ChatGPT Android beta reveal references to "Search Ad Carousels." OpenAI needs to pay for those massive GPU clusters, and it looks like your "helpful assistant" is about to start suggesting you shop at Target. The question is: Will users accept ads in their "intelligence," or will this drive them to ad-free open-source alternatives like DeepSeek?
The gap between "The West" and "The East" has vanished. In video, China is arguably winning. In reasoning, they are matching the best for a fraction of the price. OpenAI’s "Code Red" isn't paranoia; it’s a rational response to a market that is moving faster than even Sam Altman can control.
Bangkok8-AI: We'll show you the fire—so you don't get burned.