⚖️ Launching Q3 2026 · Private Beta

Legal AI That
Stays Confidential.

A multiplication-free, 1.58-bit inference engine running 10B-parameter models on mid-range phones — under 1.3 GB RAM, weather-app energy, zero cloud dependency.

2,400+ legal professionals on the waitlist · Launching Q3 2026

Built for legal professionals across

Litigation · Contract Law · Criminal Defence · Corporate Law · Intellectual Property · Family Law

AI Built Around the Way Lawyers Work

Every feature was designed with legal confidentiality, accuracy, and workflow in mind — not adapted from a general-purpose chatbot.

Completely Offline

Vexis runs entirely on your device — laptop, tablet, or phone. No internet connection required. Client communications, case strategy, and confidential documents never touch a server.

See use cases

Smart Legal Drafting

Draft petitions, plaints, writ petitions, legal notices, and replies in minutes. Vexis understands jurisdiction-specific formats and suggests precise legal language tailored to your facts.

Case Law & Statute Research

Ask Vexis to find relevant precedents, summarise landmark judgements, or explain statutory provisions in plain language. Answers arrive instantly, with no database subscription and no research time to bill.

Privilege Preserved by Design

Attorney-client privilege cannot be waived by disclosure to a third party that never receives your data. Vexis processes everything locally: no transmission means no disclosure risk and no third-party terms to worry about.

From Brief to Draft in Three Steps

Vexis fits naturally into the way you already work — no new workflow, no learning curve, no cloud account.

Load Your Matter

Open Vexis on your device and load your case files, client instructions, or statutory references. Everything stays local — nothing is uploaded, synced, or stored externally.

Multiplication-Free Inference

ExecuTorch dispatches POPCNT/XOR kernels to the NPU or a SIMD-capable CPU. Vulkan compute shaders accelerate Adreno GPUs by 2–3×. KeyDiff evicts stale KV entries after every token.

Review, Refine & Export

Vexis returns a structured draft or analysis for your review. Iterate with follow-up instructions, then export to Word or PDF. Your edits stay on your device — always under your control.

What Lawyers Use Vexis For

From solo practitioners to large chambers — here are the tasks legal professionals do with Vexis every day.

01

Drafting Petitions & Plaints

Generate a full first draft of a writ petition, civil plaint, or criminal complaint based on your facts. Vexis structures the document, inserts the correct legal provisions, and uses jurisdiction-appropriate language.

"Draft a writ petition under Article 226 for wrongful termination of a government employee — include grounds of natural justice violation."

02

Contract Review & Red-Flagging

Paste a contract and ask Vexis to identify unfair clauses, missing standard protections, limitation issues, or jurisdiction problems. Get a structured risk summary in seconds, fully offline.

"Review this non-disclosure agreement and flag any clauses that are overly broad or unenforceable under Indian contract law."

03

Legal Notice & Reply Drafting

Draft Section 80 notices, demand letters, cease-and-desist letters, and formal replies without starting from a blank page. Vexis populates the statutory language and formats it correctly for service.

"Draft a legal notice under Section 138 of the Negotiable Instruments Act for a dishonoured cheque of ₹4,50,000."

04

Bail Application Preparation

Build a detailed bail application with tailored grounds — antecedents, custodial necessity, flight risk arguments — formatted for the relevant court. Prepared in minutes, reviewed in seconds.

"Prepare a bail application for a first-time offender charged under IPC 420, emphasising roots in the community and cooperation with investigation."

05

Case Law Summarisation

Paste a judgement — however long — and ask Vexis to extract the ratio decidendi, obiter dicta, key holdings, and relevant facts. Ideal for quickly getting across an unfamiliar area of law before a hearing.

"Summarise the ratio in Maneka Gandhi v. Union of India and explain its relevance to Article 21 jurisprudence."

06

Brief & Argument Preparation

Structure your arguments, organise precedents, anticipate counter-arguments, and draft a written brief — all grounded in the facts and law you provide. Vexis never invents citations it can't verify.

"Prepare arguments for a Section 9 Arbitration application to obtain an interim injunction restraining sale of disputed property."

07

Affidavit & Statutory Declaration Drafting

Generate affidavits for court filings, statutory declarations for regulatory submissions, or sworn statements — in the correct format, with appropriate jurat language.

"Draft an affidavit of service confirming personal service of summons on the defendant on a given date and location."

08

Client Advice Notes

Turn complex legal analysis into a clear, plain-language advice note your client will actually understand. Vexis can shift register from technical legal writing to plain English and back in the same session.

"Explain the consequences of this penalty clause to a client with no legal background, in under 200 words."

09

Statute & Regulatory Interpretation

Paste a section of legislation and ask Vexis to interpret its scope, identify ambiguities, map its interaction with related provisions, or explain its practical effect in a specific factual scenario.

"Explain how Section 29A of the Insolvency and Bankruptcy Code applies to a promoter who is also a creditor of the corporate debtor."

The Minds Behind Vexis AI

Privacy-first AI demands uncompromising leadership. Meet the team architecting the future of confidential intelligence.

Dighvijay
Chief Executive Officer

Sets the product vision: a 10B-parameter model running on a phone with weather-app energy. Leads strategy, investor engagement, and the cross-functional push from MVP to SDK launch in eight months.

Mayank Sharma
Chief Technology Officer

Architect of the BitNet-KAN backbone and the SAR verification loop. Leads research across ternary quantization, KAN integration, NanoQuant compression, and KeyDiff-based KV eviction strategies.

Mayank Sangwan
Chief Software & Hardware Officer

Owns the silicon-to-software seam. Drives ExecuTorch delegate work, bitnet.cpp SIMD kernel tuning across NEON and AVX2, Vulkan compute shaders for Adreno, and HW-NAS co-optimization with target chipset families.

Early Access for Legal Professionals

Your Clients' Secrets
Stay Secrets.

Vexis is currently in private beta for legal professionals. Join the waitlist to be among the first lawyers, barristers, and law firms with access to fully private, on-device legal AI — launching Q3 2026.

No spam, ever · No credit card required · Unsubscribe anytime

Questions from Legal Professionals

What does "multiplication-free" actually mean?

Every weight in the BitNet b1.58 backbone is ternary: −1, 0, or +1. A weight of +1 is a copy, 0 is a skip, −1 is a negation. The CPU's multiply unit idles entirely; matrix multiplication becomes POPCNT (population count of set bits) and XOR — operations that draw a fraction of the energy of a standard FMA pipeline. This is why Vexis hits a 1.5–3 W active power envelope on mid-range silicon.
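
The copy/skip/negate rule can be illustrated with a minimal scalar sketch. This is not the Vexis kernel: production kernels would work on packed bit-planes with POPCNT and XOR, while this version only shows that a ternary matrix-vector product needs no multiplications. The function name `ternary_matvec` and the sample values are hypothetical.

```python
def ternary_matvec(weights, x):
    """Ternary matrix times vector using only add, subtract, and skip."""
    out = []
    for row in weights:
        acc = 0
        for w, v in zip(row, x):
            if w == 1:
                acc += v   # +1: copy the activation into the sum
            elif w == -1:
                acc -= v   # -1: add the negated activation
            # 0: skip the activation entirely
        out.append(acc)
    return out


# Hypothetical 2x3 ternary weight matrix and integer activations.
W = [[1, 0, -1],
     [-1, 1, 1]]
x = [3, 5, 2]

result = ternary_matvec(W, x)

# Reference result computed with ordinary multiplication, for comparison.
reference = [sum(w * v for w, v in zip(row, x)) for row in W]
```

Both `result` and `reference` hold the same values, which is the point: the multiply-free path is an exact replacement for the standard product when every weight is −1, 0, or +1.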

How does a 10B-parameter model fit in 1.3 GB of RAM?

Two compression layers. NanoQuant decomposes weight matrices as W ≈ α·(B₁ · B₂ᵀ) — binary factors with a learned scale vector — bringing 10B ternary weights from ~1.975 GB down to ~820 MB at 0.082 bytes per parameter. TurboQuant then applies 3-bit per-head quantization to the KV cache, and KeyDiff evicts low-relevance entries token-by-token. Total budget: ~1.17 GB with 130 MB headroom.
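
The figures above can be checked with a few lines of arithmetic. The sketch assumes decimal units (1 GB = 1e9 bytes, 1 MB = 1e6 bytes) and treats 1.3 GB as the RAM ceiling quoted elsewhere on this page. The split of the remaining budget is inferred by subtraction, not a published number.

```python
PARAMS = 10_000_000_000            # 10B parameters
BITS_PER_TERNARY_WEIGHT = 1.58     # BitNet b1.58
NANOQUANT_BYTES_PER_PARAM = 0.082  # after NanoQuant factorisation
TOTAL_BUDGET_MB = 1170             # ~1.17 GB total budget
RAM_CEILING_MB = 1300              # 1.3 GB ceiling (assumed from the headline claim)

# Uncompressed ternary weights: 1.58 bits each, 8 bits per byte (about 1.975 GB).
ternary_gb = PARAMS * BITS_PER_TERNARY_WEIGHT / 8 / 1e9

# NanoQuant-compressed weights (about 820 MB).
nanoquant_mb = PARAMS * NANOQUANT_BYTES_PER_PARAM / 1e6

# Inferred: what is left of the budget for the KV cache and everything else.
remaining_mb = TOTAL_BUDGET_MB - nanoquant_mb

# Headroom between the total budget and the RAM ceiling.
headroom_mb = RAM_CEILING_MB - TOTAL_BUDGET_MB
```

On these numbers the compressed weights take roughly 70% of the budget, leaving about 350 MB for the quantized KV cache and runtime state.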

Which devices does Vexis run on?

Mid-range Android targets are Litigation (5–7 tok/s), Contract Law (7–10 tok/s), and Criminal Defence. Flagship devices like Corporate Law reach 12–18 tok/s with 8K context. ExecuTorch dispatches to NPU via QNN, ANE, or APU delegates; SIMD kernels cover ARM NEON, x86 AVX2, and RISC-V Vector. Vulkan compute shaders accelerate Family Law GPUs by 2–3×.

What is SAR verification and why does it add latency?

Self-Assessment & Retry. A distilled 300M-parameter verifier classifies each response into one of five domains (Code, Math, Legal, Financial, or General), then generates 3–5 rubric questions and grades the candidate output against them. Failed checks trigger a retry at temperature 0.2 (max 3 attempts). Verification costs 120–200 ms per response but lifts TruthfulQA by 3.8% over the FP16 baseline. Trustworthiness over raw speed is a deliberate product decision.
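
The control flow described above might look like the following sketch. It is an illustration under stated assumptions, not the shipped verifier: the callables `generate`, `classify`, `make_rubric`, and `grade` are hypothetical stand-ins, "max 3 attempts" is read as three attempts in total, and the first-attempt temperature of 0.7 is invented for the example.

```python
DOMAINS = ("Code", "Math", "Legal", "Financial", "General")
MAX_ATTEMPTS = 3          # assumption: three attempts in total
RETRY_TEMPERATURE = 0.2   # from the description above


def sar_respond(prompt, generate, classify, make_rubric, grade,
                first_temperature=0.7):  # hypothetical starting temperature
    """Generate, self-assess against a rubric, and retry on failure."""
    temperature = first_temperature
    candidate = ""
    for attempt in range(1, MAX_ATTEMPTS + 1):
        candidate = generate(prompt, temperature)
        domain = classify(candidate)           # one of DOMAINS
        rubric = make_rubric(domain, prompt)   # 3-5 rubric questions
        if all(grade(candidate, question) for question in rubric):
            return candidate, attempt, True
        temperature = RETRY_TEMPERATURE        # retries run at 0.2
    return candidate, MAX_ATTEMPTS, False      # last candidate, checks failed


# Toy stand-ins so the loop can be exercised without a model.
def generate(prompt, temperature):
    if temperature == RETRY_TEMPERATURE:
        return "Notice citing Section 138"
    return "Notice with no statutory reference"

def classify(candidate):
    return "Legal"

def make_rubric(domain, prompt):
    # Keywords stand in for rubric questions in this toy example.
    return ["Notice", "Section", "138"]

def grade(candidate, question):
    return question in candidate


text, attempts, passed = sar_respond(
    "Draft a legal notice for a dishonoured cheque.",
    generate, classify, make_rubric, grade,
)
```

In this toy run the first draft fails the rubric and the retry at temperature 0.2 passes, so the loop returns on the second attempt.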

How does Vexis guarantee privacy if it runs on my device?

Three sovereignty levels. L0 (Air-Gap) ships zero network traffic — TC-38 verified via Wireshark, suitable for healthcare and enterprise. L1 (Signal-Only) transmits binary quality signals only, no text. L2 (Opt-in Snippet) sends NER PII-scrubbed snippets through a manual review dashboard. Default consumer mode is L1. Memory is local to your device across three tiers: profile.txt, memory.txt, and daily_log.txt.
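
One way to picture the three levels is as a gate on what may leave the device. The sketch below is hypothetical: the enum, the function `outbound_payload`, and the payload field names are illustrative and not part of the SDK, and it assumes L2 sends the quality signal alongside the scrubbed snippet.

```python
from enum import Enum


class SovereigntyLevel(Enum):
    L0_AIR_GAP = 0         # zero network traffic
    L1_SIGNAL_ONLY = 1     # binary quality signals only, no text
    L2_OPT_IN_SNIPPET = 2  # PII-scrubbed snippets for manual review


def outbound_payload(level, quality_ok, scrubbed_snippet):
    """Return what may leave the device at a given level, or None."""
    if level is SovereigntyLevel.L0_AIR_GAP:
        return None  # nothing is ever transmitted
    if level is SovereigntyLevel.L1_SIGNAL_ONLY:
        return {"quality_ok": bool(quality_ok)}  # one bit, no text
    # L2: the caller must pass a snippet that has already been PII-scrubbed.
    return {"quality_ok": bool(quality_ok), "snippet": scrubbed_snippet}


air_gap = outbound_payload(SovereigntyLevel.L0_AIR_GAP, True, "[REDACTED] clause 4")
signal_only = outbound_payload(SovereigntyLevel.L1_SIGNAL_ONLY, True, "[REDACTED] clause 4")
opt_in = outbound_payload(SovereigntyLevel.L2_OPT_IN_SNIPPET, True, "[REDACTED] clause 4")
```

Keeping the decision in a single function makes the privacy boundary easy to audit: at L0 there is no payload to inspect, and at L1 the payload cannot carry text.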

Why BitNet + KAN instead of standard quantization?

BitNet b1.58 alone holds ~96.2% of FP16 accuracy. Kolmogorov-Arnold Network layers, which replace the transformer's MLP feed-forward blocks, close most of the remaining gap by placing learnable univariate spline activations on edges rather than fixed activations on nodes. The combination hits ~98.5% of FP16 accuracy. We use the KAT variant with grouped KANs and rational bases for NPU efficiency, with Wav-KAN (wavelet) as a 1.2–1.8× faster alternative.

Can I fine-tune Vexis on my own data?

Yes. Sub-layer rank-8 LoRA adapters with Straight-Through Estimator gradients let you fine-tune the ternary backbone on-device or server-side. Each transformer block keeps a frozen ternary base plus a trainable FP16 adapter, about 120 MB in total for the full 10B model. Rank-8 training reaches 98% of full fine-tune accuracy and completes in under 30 minutes on-device.
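
The frozen-base-plus-adapter idea can be sketched as a standard LoRA forward pass, y = W·x + scale·B·(A·x), where W stays ternary and only A and B are trained. This is a toy illustration with a rank-1 adapter and made-up values (the description above uses rank 8). It does not show Straight-Through Estimator training, and the function names are hypothetical.

```python
def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def lora_forward(w_ternary, a, b, x, scale=1.0):
    """Frozen ternary base output plus a low-rank adapter correction."""
    base = matvec(w_ternary, x)        # frozen path, never updated
    delta = matvec(b, matvec(a, x))    # trainable low-rank path: B (A x)
    return [y + scale * d for y, d in zip(base, delta)]


W = [[1, 0, -1],
     [-1, 1, 1]]            # frozen ternary base, 2x3
A = [[0.5, 0.0, 0.5]]       # adapter down-projection, rank 1 (1x3)
B_zero = [[0.0], [0.0]]     # B starts at zero, the usual LoRA initialisation
B_trained = [[1.0], [-2.0]] # made-up values standing in for a trained B
x = [3.0, 5.0, 2.0]

before = lora_forward(W, A, B_zero, x)     # adapter contributes nothing yet
after = lora_forward(W, A, B_trained, x)   # adapter shifts the output
```

Because B starts at zero, the adapted model initially reproduces the base model exactly, and fine-tuning only ever changes the small A and B matrices.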

What's in the EcoLLM SDK?

The public API exposes VexisEngine.init(), chat(), setMemoryTier(), setSovereigntyLevel(), and registerAdapter(). Internal primitives (inference dispatch, the SAR verifier, KeyDiff eviction) are encapsulated. The SDK ships with developer documentation and integration guides as part of the Phase 6 launch package.

Your clients told you in confidence.
Keep it that way.

Join 2,400+ legal professionals waiting for the only AI that genuinely cannot expose your client data.

Join the Waitlist