Seven developments hit the feed today. Read them in order and a thread emerges at the end.
1. DeepSeek V4 is multimodal and arriving this week
Reporting from TechNode and the Financial Times confirms that DeepSeek plans to release V4 this week -- its first major model drop since January 2025. The significant detail: V4 is a full multimodal model that generates images and video as well as text. The timing is deliberate, landing just before China's annual "Two Sessions" parliamentary meetings begin March 4.
More important than the modality jump is the chip story underneath it. DeepSeek worked with Huawei and Cambricon to optimize V4 for their latest hardware -- a direct signal that the Ascend ecosystem is mature enough to train and serve frontier-class models. If you are an agency tech lead who dismissed Chinese AI infrastructure as a geopolitical footnote, this is the week to revisit that assumption.
What to watch: Benchmark comparisons against GPT-5.2 and Claude on multimodal tasks. If V4 matches on vision and undercuts on cost-per-token, the routing decision becomes obvious.
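If that scenario lands, the routing decision is mechanical enough to write down. A minimal sketch, assuming hypothetical prices and model IDs (nothing below is published pricing) and a quality flag fed by your own evals:

```python
# Minimal cost-aware model router -- a sketch, not production code.
# Prices and model names are illustrative placeholders, not published rates.
from dataclasses import dataclass

@dataclass
class ModelOption:
    name: str
    usd_per_1m_input_tokens: float
    usd_per_1m_output_tokens: float
    passes_quality_bar: bool  # set from your own multimodal evals

def route(tokens_in: int, tokens_out: int,
          options: list[ModelOption]) -> ModelOption:
    """Pick the cheapest model that clears your quality bar."""
    qualified = [m for m in options if m.passes_quality_bar]
    if not qualified:
        raise ValueError("No model passed the quality bar; re-run evals.")
    def cost(m: ModelOption) -> float:
        return (tokens_in * m.usd_per_1m_input_tokens
                + tokens_out * m.usd_per_1m_output_tokens) / 1_000_000
    return min(qualified, key=cost)

# Example: if V4 matches on vision evals at a lower price, it wins the route.
options = [
    ModelOption("gpt-5.2", 5.00, 15.00, passes_quality_bar=True),      # hypothetical pricing
    ModelOption("deepseek-v4", 0.50, 2.00, passes_quality_bar=True),   # hypothetical pricing
]
print(route(4_000, 1_000, options).name)  # -> deepseek-v4
```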
2. GLM-5 shipped last month -- and you probably missed it
Released February 11, Zhipu AI's GLM-5 has been getting real analysis this week. The architecture is notable: 744B total parameters in a mixture-of-experts structure with only 44B active parameters per forward pass, a 200K context window, and 77.8% on SWE-bench Verified. It is MIT-licensed and trained on Huawei Ascend chips.
That SWE-bench number puts it in the same neighborhood as the top Western coding models. The MIT license means you can run it commercially without a usage agreement. The Ascend training story adds another data point to item 1 above.
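The total-versus-active split is what makes that architecture commercially interesting: the memory bill scales with the 744B total, while per-token compute scales with the 44B active. A back-of-envelope sketch (the quantization options are standard sizes used for illustration, not Zhipu's published serving config):

```python
# Back-of-envelope MoE serving math for GLM-5's published shape.
# Quantization choices below are assumptions, not Zhipu's serving config.
TOTAL_PARAMS = 744e9   # every expert must sit in memory
ACTIVE_PARAMS = 44e9   # parameters touched per forward pass

for label, bytes_per_param in [("FP16", 2), ("FP8", 1), ("INT4", 0.5)]:
    weight_gb = TOTAL_PARAMS * bytes_per_param / 1e9
    print(f"{label}: ~{weight_gb:,.0f} GB of weights just to load the model")

# Rough rule of thumb: ~2 FLOPs per active parameter per generated token,
# so decode cost tracks the 44B active, not the 744B total.
print(f"~{2 * ACTIVE_PARAMS / 1e9:.0f} GFLOPs per generated token (rough)")
```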
For teams self-hosting LLMs for code review, documentation generation, or internal tooling, GLM-5 is worth a serious benchmark pass this week.
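Since the license puts self-hosting on the table, the cheapest first step is a smoke test against your own deployment. A minimal sketch, assuming GLM-5 is served behind an OpenAI-compatible endpoint such as vLLM provides -- the URL and model ID below are placeholders for your setup:

```python
# Smoke test against a self-hosted GLM-5. Assumes an OpenAI-compatible
# server (e.g. vLLM); the endpoint URL and model ID are placeholders.
import requests

ENDPOINT = "http://localhost:8000/v1/chat/completions"  # your deployment
payload = {
    "model": "glm-5",  # whatever name your server registered
    "messages": [
        {"role": "user",
         "content": "Write a Python function that reverses a linked list."}
    ],
    "max_tokens": 512,
    "temperature": 0.0,  # deterministic-ish output for benchmarking
}
resp = requests.post(ENDPOINT, json=payload, timeout=120)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```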
3. OpenAI quietly retired SWE-bench Verified
The benchmark OpenAI created in 2024 to measure AI performance on real-world software engineering tasks? OpenAI stopped reporting scores against it as of February 23, per a note in TechByJohan's Week 10 recap. No public announcement, just a policy change buried in a report.
This is worth tracking carefully. When a lab stops publishing scores on a benchmark it created, two interpretations are plausible: the benchmark no longer differentiates frontier models, or the scores stopped flattering. Either way, the standardized evaluation landscape just got thinner. If you are selecting AI coding tools for a client engagement, you now have one fewer reliable shared reference point.
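The practical hedge is a small private eval suite that never leaves your repo, so your comparisons survive any lab's reporting policy. A sketch of the shape -- the task, the check, and ask_model are all stand-ins for your own suite and client:

```python
# A tiny private eval harness: run each candidate model over your own
# tasks and score with your own checks. Everything here is a stand-in
# shape; ask_model() would wrap whichever providers you are comparing.
from typing import Callable

def run_eval(ask_model: Callable[[str], str], tasks: list[dict]) -> float:
    """tasks: [{'prompt': ..., 'check': callable(output) -> bool}, ...]"""
    passed = 0
    for task in tasks:
        output = ask_model(task["prompt"])
        if task["check"](output):
            passed += 1
    return passed / len(tasks)

# Example task: the check encodes what "correct" means for *your* work,
# which is exactly what a retired public benchmark can no longer do.
tasks = [{
    "prompt": "Write a Python one-liner that deduplicates a list, keeping order.",
    "check": lambda out: "dict.fromkeys" in out or "seen" in out,
}]
# score = run_eval(my_client.ask, tasks)  # compare scores across models
```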
4. Nvidia puts $4B into photonics infrastructure
Nvidia announced investments of $2B each in two photonics companies -- Coherent and Lumentum -- to support US manufacturing and R&D on optical interconnects. The market reaction was immediate: Coherent jumped ~8%, Lumentum ~7%, and Nvidia dipped 1.2%.
Photonics matters because the next bottleneck in AI infrastructure is not raw GPU compute -- it is how fast chips talk to each other and to memory. Optical interconnects move far more bits per watt than copper and hold signal integrity over rack-scale distances. Nvidia is buying its way into the substrate layer before the constraint becomes visible at the application layer.
For agency operators, this move does not change anything you do today. It does confirm that Nvidia is playing a 5-10 year game, not a 12-month one. The companies betting on photonics being irrelevant to AI infrastructure are taking the wrong side.
5. AWS commits another EUR 18B to Spain
Amazon announced an additional 18 billion euros in Spanish data center investment through 2035, pushing total committed spend in the country to 33.7 billion euros. The stated purpose: scaling AWS capacity for cloud computing and AI workloads.
The European angle matters more than the raw number. Spain joins a short list of EU countries with enough grid capacity, permitting speed, and political appetite to host hyperscale AI compute. For developers running multi-region architectures or clients with EU data residency requirements, lower-latency GPU access in southern Europe is a practical near-term benefit -- not a distant infrastructure promise.
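In AWS terms the capacity question is already concrete: the Europe (Spain) region, eu-south-2, exists today, so residency pinning is a configuration decision rather than a roadmap item. A minimal sketch using boto3 -- the bucket name is a placeholder, and region pinning is one input to a compliance story, not the whole answer:

```python
# Pin storage and clients to AWS's Spain region (eu-south-2) for EU
# data residency. The bucket name is a placeholder; confirm your own
# compliance requirements before relying on region pinning alone.
import boto3

REGION = "eu-south-2"  # AWS Europe (Spain)

s3 = boto3.client("s3", region_name=REGION)
s3.create_bucket(
    Bucket="example-client-eu-data",  # placeholder name
    CreateBucketConfiguration={"LocationConstraint": REGION},
)

# Any other client for this workload gets the same pin, so compute,
# storage, and logs all stay inside the region.
ec2 = boto3.client("ec2", region_name=REGION)
```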
6. Apple ships the iPhone 17E at $599 with a full A19 chip
Apple announced the iPhone 17E today, starting at $599 for 256GB. The A19 chip inside it is the same generation powering the higher-end iPhone 17 line, which means on-device Apple Intelligence features run without the cutbacks that defined previous budget iPhone tiers.
The Personalized Siri integration -- which keeps conversational context across sessions using Google's Gemini models as the backend -- is expected to arrive later this year. That combination of mass-market price point and full AI capability matters for anyone building iOS apps that depend on on-device inference or Apple's AI APIs.
The practical upshot: the install base of capable AI hardware just got a lot larger. If you have been deprioritizing AI feature development because of device fragmentation concerns, that calculus shifted today.
7. MWC 2026 opens with telecoms pitching AI-native networks
Mobile World Congress kicked off in Barcelona today, with carrier and device announcements centering on AI embedded at the network level rather than sitting on top of it. MediaTek outlined a roadmap spanning 6G, edge AI, automotive connectivity, and data center infrastructure -- a signal that the company is positioning itself as an AI silicon play, not just a smartphone chip vendor.
The operator-level AI push from telecoms is commercially meaningful: carriers are actively shopping for AI that cuts operating costs and creates new enterprise revenue streams. For software teams building in the edge AI or telecom space, this is a procurement signal.
The thread
Today's seven stories pull in two directions at once. At the application layer, costs are dropping: GLM-5 is MIT-licensed and cheap to run, DeepSeek V4 is about to add multimodal capability at competitive inference prices, and Apple just put a full AI chip in a $599 device that 80 million more people will buy.
At the substrate layer, costs are rising and consolidating: Nvidia is locking in photonic supply chains with $4B bets, AWS is committing tens of billions to data center real estate, and DeepSeek's chip story is a reminder that GPU access is still a geopolitical asset.
The agencies and operators who navigate this moment well are the ones who use the cheap application layer to build revenue now, while understanding that the infrastructure underneath it is not cheap, not stable, and not finished.
One question to hold: if DeepSeek V4 matches GPT-5.2 on multimodal benchmarks at lower cost-per-token, which provider contract renewals does that affect in your stack?
