
4/4/2026

How to Build Your First On-Device AI Feature with Google's Gemma 4 AICore Developer Preview

[Image: Google Gemma 4 AICore on Android - neural network on Android robot]

Google just made on-device AI accessible to every Android developer. As of April 3rd, the AICore Developer Preview is live with Gemma 4, Google's most capable open model yet. And here's the kicker: any code you write today will automatically work when Gemini Nano 4 ships on flagship devices later this year.

No cloud API costs. No network dependency. No per-request billing. The AI runs entirely on the user's device.

As someone who's been building mobile apps with React Native, this changes the game for what's possible without a backend AI service.

What Is Gemma 4 and Why Should You Care?

Gemma 4 is Google's latest open-source AI model family. It's not a cloud service — it's a model that runs directly on the phone. Think of it as the engine behind Gemini Nano, but available to developers right now for prototyping.

It comes in two sizes:

  • E4B — higher reasoning power, better for complex tasks (content moderation, multi-step logic)
  • E2B — 3x faster than E4B, optimized for speed-critical use cases (autocomplete, quick classifications)

Both models support:

  • 140+ languages — real multilingual support out of the box
  • Multimodal understanding — text, images, and audio processing
  • Efficiency: 4x faster than previous versions, with 60% less battery drain
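The E2B/E4B split maps naturally onto a small routing decision in app code: send speed-critical work to E2B and reasoning-heavy work to E4B. Here's a minimal sketch of that idea; the `Variant` and `TaskKind` types are my own illustration, not part of the API:

```kotlin
// Hypothetical helper: pick a Gemma 4 variant per task type.
// E2B for latency-sensitive work, E4B for heavier reasoning.
enum class Variant { E2B, E4B }
enum class TaskKind { AUTOCOMPLETE, CLASSIFICATION, MODERATION, MULTI_STEP_REASONING }

fun pickVariant(task: TaskKind): Variant = when (task) {
    TaskKind.AUTOCOMPLETE, TaskKind.CLASSIFICATION -> Variant.E2B      // speed-critical
    TaskKind.MODERATION, TaskKind.MULTI_STEP_REASONING -> Variant.E4B  // reasoning-heavy
}
```

Centralizing this choice in one function means you can retune the routing later without touching feature code.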

Getting Started: The Quick Setup

Step 1: Sign up for the AICore Developer Preview. This gets you access to download the Gemma 4 models directly to your test device.

Step 2: Open Android Studio. You can now refine your prompts and build using the ML Kit Prompt API with the familiar IDE experience.

Step 3: Use the ML Kit Prompt API to integrate. Here's the core pattern — you specify which model variant you want (E2B fast or E4B full), check availability, and generate:

```kotlin
// Configure the preview release track and the full-size (E4B) variant.
val config = generationConfig {
    modelConfig = ModelConfig {
        releaseTrack = ModelReleaseTrack.PREVIEW
        preference = ModelPreference.FULL
    }
}

// Get a client for the configured model and run on-device inference.
val model = GenerativeModel.getClient(config)
val response = model.generateContent("Your prompt here")
```

That's it: three statements to get AI inference running on-device. No API key, no billing, no network required.
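Since the whole promise here is that preview code carries forward to Gemini Nano 4, one pragmatic habit is to hide the client behind your own small interface so only one file touches the preview API. This is my own wrapper pattern, not an official one; `OnDeviceModel` and `SummarizeFeature` are illustrative names:

```kotlin
// A thin seam between app code and the on-device model client.
// Only the implementation of this interface touches the preview API,
// so swapping in the production Gemini Nano 4 client is a one-file change.
fun interface OnDeviceModel {
    fun generate(prompt: String): String
}

class SummarizeFeature(private val model: OnDeviceModel) {
    fun summarize(text: String): String =
        model.generate("Summarize in one sentence: $text")
}
```

A side benefit: in unit tests you can inject a fake `OnDeviceModel` and exercise your prompt logic without a device at all.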

What Can You Actually Build?

Google highlighted four areas where Gemma 4 shows the biggest improvements:

  • Reasoning: Chain-of-thought prompts work reliably now. Content moderation, rule-based classification, conditional logic — all doable on-device.
  • Math: Budget calculators, expense splitters, financial projections — the model handles practical math accurately now.
  • Time understanding: Calendar-aware features, smart reminders, event scheduling — the model can reason about dates and durations reliably.
  • Image understanding: OCR, chart analysis, handwriting recognition, receipt scanning — all processed locally without sending images to a server.
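For a task like content moderation, the preview gives you only plain prompt → response, so the classification has to be asked for and then parsed out of free text. A sketch of how I'd structure that; the prompt wording and the parsing rules are my own, not from Google's docs:

```kotlin
// Ask for a one-word verdict so the reply is easy to parse.
fun moderationPrompt(comment: String): String = """
    You are a content moderator. Answer with exactly one word,
    ALLOW or BLOCK, for the following comment:
    "$comment"
""".trimIndent()

// Models sometimes pad their answer, so scan for a recognizable verdict.
// Returns true to block, false to allow, null if the reply is unparseable.
fun shouldBlock(response: String): Boolean? = when {
    response.contains("BLOCK", ignoreCase = true) -> true
    response.contains("ALLOW", ignoreCase = true) -> false
    else -> null
}
```

The `null` branch matters: on an unparseable reply you decide explicitly whether to fail open, fail closed, or retry, rather than guessing.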

Real Use Cases I'm Thinking About

As a developer who builds apps for the Indian market, the privacy angle alone is massive. Users in India are increasingly privacy-conscious, and the ability to say "your data never leaves your phone" is a serious competitive advantage.

Here are some concrete ideas:

  1. Smart prescription scanner — scan handwritten prescriptions with OCR, extract medication names and dosages. Perfect for health apps.
  2. Offline content moderation — moderate user comments in community apps without sending every message to a cloud API.
  3. Smart expense tracker — photograph receipts, extract amounts and categories, auto-categorize spending. All on-device.
  4. Multilingual chat assistant — with 140+ languages, build assistants that work in Hindi, Telugu, Tamil, or any regional language without translation APIs.
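For the expense-tracker idea, the receipt photo goes to the model and the reply comes back as plain text, so the extraction step is ordinary string parsing. A hedged sketch, assuming you prompt the model to reply in `key: value` lines; the `Expense` type and the line format are my own conventions:

```kotlin
data class Expense(val amount: Double, val category: String)

// Parse a model reply of the (prompted-for) form:
//   amount: 450.00
//   category: groceries
// Returns null if either field is missing or malformed.
fun parseExpense(reply: String): Expense? {
    val fields = reply.lines().mapNotNull { line ->
        val parts = line.split(":", limit = 2)
        if (parts.size != 2) return@mapNotNull null
        parts[0].trim().lowercase() to parts[1].trim()
    }.toMap()
    val amount = fields["amount"]?.toDoubleOrNull() ?: return null
    val category = fields["category"] ?: return null
    return Expense(amount, category)
}
```

Defensive parsing like this is the price of not having structured output yet; the upside is it keeps working no matter how chatty the model gets, as long as the two lines appear.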

The Catch (There's Always One)

A few things to know before you dive in:

  • Device support is limited right now. The preview runs on AICore-enabled devices with Google, MediaTek, and Qualcomm AI accelerators. On other devices, it falls back to CPU — which won't represent final performance.
  • Tool calling, structured output, and system prompts aren't available yet. They're coming during the preview period, but right now you're working with basic prompt → response.
  • This is a preview, not production. Don't ship apps relying on it yet. Use it to prototype and refine prompts. Production support comes with the official Gemini Nano 4 launch.
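Given those caveats, it's worth wrapping every model call in a fallback from day one. A minimal sketch of that pattern (my own, not a Google recommendation): attempt the on-device call, and if the model is unavailable or throws, degrade to a non-AI default instead of crashing the feature:

```kotlin
// Run an on-device generation with a graceful fallback.
// `generate` stands in for the real client call, which can fail on
// unsupported devices or mid-preview; `fallback` is the non-AI default
// (e.g. a rule-based classifier or a static string).
fun generateOrFallback(
    prompt: String,
    generate: (String) -> String,
    fallback: () -> String,
): String = try {
    generate(prompt)
} catch (e: Exception) {
    fallback()
}
```

On CPU-only devices you might also use the fallback path proactively, since preview performance there won't represent the final experience anyway.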

Why This Matters for the Industry

We're watching a fundamental shift. Both Apple (Foundation Models in iOS 26) and Google (Gemma 4 AICore) are betting hard on on-device AI. The cloud-first era of AI is giving way to a hybrid model where the phone does the heavy lifting and the cloud handles only what it must.

For developers, this means:

  • Lower infrastructure costs (no per-request AI API billing)
  • Better user privacy (data stays on device)
  • Offline capability (works without internet — huge for emerging markets)
  • Faster response times (no network round trip)

Get Started Now

Sign up for the AICore Developer Preview, download the models to your device, and start experimenting. If you don't have an AICore-enabled device, you can also test via the AI Edge Gallery app.

I'll be building a prototype with this over the next few weeks and sharing what I learn. Follow along on my blog or check out my projects.