08 Journal

We write what we learn.

Product · Security · June 2026

Red-teaming Arabic LLMs before production — red cards, not satisfaction polls.

Post-launch satisfaction surveys surface pain too late. Red-teaming forces adversarial prompts, your corpora, and a numeric acceptance gate before Compliance signs any path touching citizens or contracts.

June 2026 · 8 min read

Security · Operations · June 2026

AI model supply chain — where weights came from and who stops the CVE.

A model is not an abstract file; it is a product flowing through mirrors, builds, signatures, and security updates. This article gives GCC security and compliance teams an operational checklist before a path is labelled "approved production".

June 2026 · 8 min read

Data policy · Product · June 2026

Synthetic data and LLM training — when PDPL risk drops and Arabic quality dies.

Generated corpora are not automatically "clean" legally or linguistically. This article separates safe synthetic use for pipeline testing from the fantasy of training without real-data governance under Gulf frameworks.

June 2026 · 7 min read

New series · Open-source projects

Twenty open-source AI tools — GitHub links before the star count.

Same batch order, roughly: one repo link per name, then one sentence on fit and one on risk. Stars do not satisfy PDPL — but the link starts the clock on a real eval [1].

12 min read

Open-source dev tools on an Omani engineer's desk: measured by compliance, not trends.

Five tools dissected in depth — plus a fifteen-project radar — all mapped to one question GitHub's trending page never asks: does this repo answer the auditor?

12 min read

All articles

Infrastructure · Sovereignty · June 2026
Where to run LLM inference in the GCC — latency, residency, one invoice.
The decision is not only GPU versus API; it is round-trip time, processor-data coupling, and whether contracts permit log inspection. This matrix helps teams spanning Oman, UAE, and Saudi in one chain.
June 2026 · 8 min read
Operations · Security · June 2026
After an LLM incident — a 48-hour GCC playbook spanning logs and notice.
Prompt leakage, toxic outputs, or brittle integrations are not "pure tech" incidents; they are compliance timing decisions. This timeline gives Ops, IT, and Legal shared checkpoints inside forty-eight hours.
June 2026 · 7 min read
Product · RAG Ops
The weekly RAG scorecard before blaming the frontier model.
Four KPIs (recall@k, citation accuracy, p95 latency, drift), stamped every Monday, keep retrieval honest.
June 2026 · 9 min read
Infrastructure · Power
GPU power budgets in Gulf data centers.
PUE, kWh tariffs, and summer peaks belong in the capex memo next to NVIDIA list price.
June 2026 · 10 min read
Compliance · Banking
GenAI and AML case handling in Oman — assistant lane only.
Summaries shave minutes; signatures still sit with humans — AML standards plus PDPL require auditable RACI lanes before live alerts ingest model text.
June 2026 · 12 min read
Operations · Change
Rolling out an enterprise AI assistant inside Omani firms.
Champions beat launch parties — five disciplined weeks tying real workloads to RACI lanes build habit before compliance pays the fallout bill.
June 2026 · 10 min read
Governance · Procurement
Arabic LLM evaluation before you sign implementation.
Three tasks, two hundred rows, one numeric acceptance line — before a clean leaderboard convinces procurement the wrong corpus is safe.
June 2026 · 11 min read
VISION · Special issue · Taste & algorithms · May 2026
Algorithms and taste when similarity becomes acceptance.
Taste and algorithms collide when personalization is scored as screen time rather than layered meaning. This essay is not a crusade against models — it insists your judgment arrives before dashboards rename it engagement [1].
May 2026 · 7 min read
PRODUCT · Open source
Twenty open-source AI tools — GitHub links before the star count.
Same batch order, roughly: one repo link per name, then one sentence on fit and one on risk. Stars do not satisfy PDPL — but the link starts the clock on a real eval [1].
May 2026 · 12 min read
VISION · GCC Policy
Why the Gulf still does not ship one federated Arabic ChatGPT — honestly.
It is sovereignty seams, sovereign wealth magnetism toward US hyperscalers, GPU scarcity politics, procurement theatre—before the brand halo consolidates.
May 2026 · 7 min read
SECURITY · Threats
What prompt injection actually is — before you flip on tools.
A blocklist stops neither an adversary nor a clever employee's paste. Strings merge in one stream; attackers hide instructions inside email your assistant ingests quietly.
May 2026 · 7 min read
INFRASTRUCTURE · Facilities
Tier III facilities for inference in the GCC — plain language.
The badge is not latency magic; it is path diversity for failover before your first SLA conversation with CFO and regulator alike.
May 2026 · 7 min read
ECONOMICS · Oman
Running an LLM in Oman — year-one economics without the theater.
Hardware, colocation, industrial power, three operator roles, GPU failure—then compare with an API line that still respects PDPL and cross-border reality.
May 2026 · 12 min read
OPERATIONS · Observability
Grafana for LLM stacks — what you must chart before you blame the GPU.
HTTP 200 is not cognition. Separate edge latency from inference backlog, KV pressure, retrieval lag, then token-dollar math on one executive wall.
May 2026 · 7 min read
Opinion · Project Management
Why AI projects fail in the Middle East.
Repeated failure patterns across MENA AI procurement, and an execution path that stops the bleeding before the budget does.
May 2026 · 13 min read
Analysis · GCC
The Falcon lesson — UAE, OpenAI, and building a Gulf model.
From TII’s open Falcon line to G42’s OpenAI alliance: why research-scale LLMs and commercial distribution are different games.
May 2026 · 13 min read
Product · ML design
RAG vs Fine-Tuning: Which Wins in 2026?
Choosing knowledge refresh vs weight refresh is not a brainstorm — it is a table that respects data cost, change cadence, and compliance.
May 2026 · 7 min read
Comparison · Models
Qwen2.5-72B vs GPT-4o — which wins for Arabic.
Internal benchmark snapshot on Arabic office reality: GPT-4o strength on fusḥā and numerics, open-weight upside on sovereignty and throughput — with one chart to align execs.
May 2026 · 14 min read
PRODUCT · Open source
Open-source dev tools on an Omani engineer's desk: measured by compliance, not trends.
Five tools dissected in depth — plus a fifteen-project radar — all mapped to one question GitHub's trending page never asks: does this repo answer the auditor?
May 2026 · 12 min read
Opinion · Security
Your Omani data on a US server — what actually happens.
CLOUD Act legal reach plus Oman PDPL realities: why pretty region pins do not replace custody maps.
May 2026 · 13 min read
VISION · Encyclopedia
Nuqta AI encyclopedia: Oman ecosystem directory hub.
This is not a promise to ship a dashboard next Friday. It is an editorial scaffold that holds two inseparable layers of our work: maps of startups, government bodies, and labs, plus deep essays on cost, compliance, and operations. The journal becomes a deliberate entry point into AI engineering, not a scattered headline feed.
May 2026 · 7 min read
Operations · MLOps
MLOps vs DevOps for LLM Production: Where the Difference Starts.
Shipping a container once is not operating AI — real ops means model versions, data roll-forward, drift monitoring, and rollbacks unlike a classic API redeploy.
May 2026 · 7 min read
Economics · Infrastructure
The full calculation: LLM year-one cost of ownership.
$365K: the complete breakdown of what you pay in year one to run a large language model on-premises in Oman.
May 2026 · 14 min read
Infrastructure · Inference
What Is KV Cache in LLM Inference and How Does It Eat VRAM?
The GPU is not the whole truth: part of inference speed comes from reusing the keys and values already computed for earlier tokens instead of recomputing them at every decoding step.
May 2026 · 6 min read
Operations · Government procurement · May 2026
Government AI procurement in the GCC — Terms of Reference that stop POC theater.
A thick technical annex does not prevent year-one failure; a TOR that binds data scope, compliance evidence, and acceptance metrics before commercial opening does. This article gives a TOR gate a technical committee can defend to vendors and external auditors alike.
May 2026 · 8 min read
SECURITY · LLM threats
Enterprise Prompt Injection: Defence Layers Beyond Word Blocklists.
A word list won’t stop instructions hidden in innocent sentences — real defence separates privileges, judges retrieval, and logs manipulation like classic intrusions.
May 2026 · 7 min read
VISION · Data sovereignty
CLOUD Act and AI Data in Oman: A Data Controller’s Decision Map.
This is not fear-mongering about US cloud. It is a map of where contractual warranties end and jurisdictional compulsion may begin, and of how that intersects with Oman’s PDPL when you store model conversations.
May 2026 · 7 min read
VISION · Sovereign Investment
Oman's OIA bets on Neuralink: sovereign capital inside the human skull.
On May 6, 2026, the Oman Investment Authority officially backed Neuralink — Elon Musk's company building direct interfaces between the human brain and electronic devices. This is not a diversification trade. It is a declaration that Oman intends to be inside the room where the next civilisational technology is decided.
May 2026 · 6 min read
AI · Infrastructure
Oman's Special AI Zone: From COMEX Stage to Royal Decree.
On April 29, 2026, Sultan Haitham bin Tarik signed Royal Decree 50/2026 — formally establishing the Special AI Zone in Muscat Governorate. In one signature, a COMEX announcement became enforceable law: approximately 104,000 square metres, three defined sectors, and a binding economic framework. This is what the decree means for companies that want to build now.
May 2026 · 8 min read
AI · Infrastructure
What is vLLM — and why production teams use it.
vLLM is an open inference engine for LLMs: scheduling, continuous batching, and KV memory designs such as [PagedAttention](/en/journal/what-is-pagedattention-llm-serving-2026). The point is not a thin API wrapper — it is raising useful throughput under real traffic [1].
April 2026 · 7 min read
AI · Infrastructure
L40S vs A100 vs H100 — which GPU for which job.
The question is not the fastest SKU on a slide. It is workload fit: heavy training, broad inference, or cost-per-watt chat serving? One matrix places L40S, A100, and the [H100 reference](/en/journal/nvidia-h100-gpu-ai-standard-2026) on the same decision axis — without hand-waving in procurement [1].
April 2026 · 8 min read
AI · Models
Inference vs training for LLMs — who pays for what.
Training might run once (or for many hours) and you pay a cluster bill. Inference runs forever and turns a model into a per-token Opex line. This article separates the two checkbooks so pilot budgets are not mixed with product bills [1].
April 2026 · 7 min read
AI · Security
Who owns your embeddings? Fine-tuning and PDPL reality.
Embeddings and fine-tuned weights are not ordinary files. They are processing outputs that can redefine what your data means — and contracts often discuss the base model while ignoring what was generated for you.
April 2026 · 10 min read
AI · Models
What is LoRA — and how it cuts fine-tuning cost.
When people say fine-tuning, many still picture updating billions of weights in an expensive full pass. LoRA freezes the base and injects a low-rank delta into selected linear paths — often enough to shift behavior on a narrow task without shipping a full weight copy. This article explains the idea without hype, and when savings move from slides to investment [1].
April 2026 · 7 min read
AI · Infrastructure
When a small on-prem model beats a cloud API subscription.
This is not anti-cloud. It is a spreadsheet: when an open small or medium model on your own GPU wins on three-year TCO and compliance — and year-one math lies if you ignore context and labor.
April 2026 · 11 min read
Security · Governance · April 2026
Shadow AI — governing unsanctioned use in GCC enterprises.
This is not a lecture aimed at employees. It is what happens when the consumer assistant becomes the default way to work — with no processing record, no approved alternative, and no checkpoint linking IT to compliance.
April 2026 · 7 min read
Security · Retrieval · April 2026
Prompt injection and corpus poisoning — the RAG gap vendors smooth over.
A normal-looking document hides instructions that derail policy or leak index content. This is not sci-fi — it is a realistic attack pattern that needs operational defense, not a marketing disclaimer.
April 2026 · 8 min read
AI · Quality · April 2026
Hallucinated citations — auditing RAG source links before you trust the UI.
The UI shows a "source" while the paragraph is missing, truncated, or from the wrong page. This article gives a practical audit path before you ship the assistant to staff or customers.
April 2026 · 7 min read
AI · Integration
Model Context Protocol at work: the bridge is not the border.
MCP explains how tools plug into an LLM — it does not replace decisions on where data is processed, who owns logs, or whether inference leaves your network.
April 2026 · 9 min read
AI · Operations
Five RAG metrics to check before you blame the LLM.
Before you raise model spend or switch vendors, measure retrieval, chunks, and escalation. Most production hallucination starts in documents and indexes — not parameter count.
April 2026 · 8 min read
Product · Retrieval · April 2026
Enterprise AI agents vs a RAG-first pipeline — when orchestration is theater.
Most "agents" in production are solid retrieval + a few tools + policies — not a self-driving orchestrator making unsupervised decisions. This article gives a blunt product decision before you multiply complexity.
April 2026 · 7 min read
Procurement · Operations · April 2026
POC theater — how vendor AI demos are designed never to fail.
Proofs are staged: clean data, rehearsed questions, and none of the governance you will run in production. This article unpacks the polite trap and gives a measurement frame that fails early — before the signature.
April 2026 · 7 min read
AI · Procurement
AI contract clauses you cannot leave blank in Oman.
A procurement pack without data and liability clauses is buying a promise. This framework ties contracts to Oman PDPL — it is not a substitute for legal review.
April 2026 · 10 min read
AI · Infrastructure
What is RAG — and why your company bot answers like a stranger.
A practical guide to Retrieval-Augmented Generation: how your bot reads documents before answering, and why it costs 10× less than fine-tuning.
April 2026 · 7 min read
AI · Search
The end of traditional search — what happens to Google in 2026.
This is not a funeral for Google. It is an operating description of a market shift: who owns the click, who owns the answer, and why keyword budgets alone no longer explain what changed in 2026.
April 2026 · 7 min read
AI · Retrieval
Hybrid search — combining lexical and vector retrieval.
This is not a vendor badge. It is an architecture decision: when token overlap saves you, when embedding similarity saves you, and how to fuse both without doubling cost with nothing to measure.
April 2026 · 8 min read
AI · Language
Why most Arabic AI bots fail.
It is not the model. It is that we train it on Arabic no one actually speaks, then act surprised when no one understands it back.
April 2026 · 8 min read
AI · Infrastructure
What is PagedAttention — and what it changed in LLM serving.
Serving bottlenecks were not always raw GPU speed; they were often KV cache waste. PagedAttention changed the equation by treating KV memory as pageable blocks instead of large contiguous reservations, cutting waste and lifting throughput on the same hardware.
April 2026 · 9 min read
AI · Models
How the Transformer works — a plain-language guide.
"Attention Is All You Need" changed the industry, but it does not belong in a product review meeting. This is the version for builders: one mechanism called attention, reweighting importance between tokens based on context — without a single equation.
April 2026 · 10 min read
AI · Policy
Oman Vision 2040 and AI — what changed in 2026.
For years, AI in Oman was mostly discussed as part of digital-transformation rhetoric. In 2026, the frame shifted toward executable programs: economic targets, national platforms, and governance tied to delivery. The question is no longer "should we adopt AI?" but "where does AI create measurable value now?"
April 2026 · 9 min read
AI · Policy
Oman's Personal Data Protection Law (2022) and its impact on AI.
AI does not run in a legal vacuum. Oman's PDPL (Royal Decree 6/2022) changed how teams collect data, train models, and move personal data across borders. The key question is no longer only "is the model accurate?" but also "is its data lifecycle lawful?"
April 2026 · 9 min read
AI · Infrastructure
What is the H100 GPU — and why it became AI's reference hardware.
It is not a gaming card in a tower PC. It is the unit cloud bills and SLAs often anchor to when they say "GPU hour." H100 is not magic — it became a shared reference because hardware, software, and hyperscaler catalogs aligned on it for a full training era.
April 2026 · 10 min read
AI · Startups
AI startups in Muscat — who is building what.
Muscat’s AI startup scene is no longer a loose set of demos. It is becoming a clearer market map: vertical product builders, model-language teams, integration players, and AI operations tools. The core question is no longer "who has AI" but "who ships measurable value."
April 2026 · 9 min read
AI · Models
GPT-4 vs Claude vs Gemini — an objective comparison.
This is not a popularity vote. It is a decision frame: what differentiates each family, where each leads, where each weakens, and how to choose without buying the myth of a single "best" model.
April 2026 · 9 min read
AI · Models
What is fine-tuning — and how it differs from prompting.
Half the meetings say "we will tune the model" while they mean "we will rewrite the prompt." The two complement each other — but one changes the text going in, and the other can change the model's weights. That distinction clarifies the decision and saves you from training costs you did not need.
April 2026 · 9 min read
AI · Logistics
AI in Omani ports — Port of Salalah as a case.
Port competitiveness is no longer won by geography alone. It is won by operational decision speed: berth allocation, yard flow, and maintenance before breakdown. In that context, Salalah illustrates how AI turns delayed reports into live operating decisions.
April 2026 · 9 min read
AI · Digital Government
AI in Omani e-government services.
Government AI is no longer a tech slogan. In Oman, the practical question is now: can AI make services faster, clearer, and cheaper while preserving trust and privacy? Success is measured by real transaction performance, not initiative count.
April 2026 · 9 min read
AI · Policy
AI in Middle East healthcare — regulatory challenges.
Health AI is accelerating technically, but regulation remains the harder gate: sensitive data, medical-software classification, cross-border transfer constraints, and clinical accountability. In the Middle East, successful health AI starts with compliance architecture, not model demos.
April 2026 · 10 min read
AI · Tourism
AI and tourism in Oman — smart recommendation or marketing noise.
Almost every tourism platform now claims to be "AI-powered." The real test is simple: did recommendations lift conversion, improve visitor experience, and respect data boundaries? In Oman, the difference between value and hype is now measurable.
April 2026 · 9 min read
AI · Models
What is a large language model — complete guide for 2026.
This is not a glossary entry. It is the operating calculation behind LLM decisions in 2026: how the model works, where it fails, and how to choose the right deployment path.
April 2026 · 12 min read
Sovereignty · Infrastructure
Digital sovereignty: why your data should stay in Oman.
When you send your customers' data to a server in Frankfurt or Virginia, you are not hosting it. You are handing it over. The difference is not technical.
March 2026 · 7 min read
Founders · Muscat
Building a startup from Muscat — what we learned in year one.
There is no playbook for building a tech company from Oman. This is not a playbook. These are notes from one year, honest, including what we regret.
February 2026 · 9 min read