Google DeepMind has released TxGemma, a set of open-weight AI models designed for therapeutic development. These models, based on the Gemma architecture, are trained to analyze and predict characteristics of therapeutic entities during drug discovery. 💊
The release includes ‘chat’ variants (9B and 27B) that can engage in dialogue and provide explanations for their predictions. Additionally, Agentic-Tx demonstrates the integration of TxGemma into an agentic system for multi-step research questions. 🤖
A fine-tuning notebook is available for custom task adaptation:
It can be run on a free T4 GPU after accepting the license and providing a Hugging Face token:
If you encounter issues with the provided fine-tuning notebook, you can check my pre-configured Colab notebook:
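For orientation, here is a minimal sketch of loading a TxGemma chat variant with Hugging Face transformers (this is not the official notebook); the model id and the prompt format below are assumptions, so check the model card and the fine-tuning notebook for the exact names.

```python
# Minimal sketch (not the official notebook): load a TxGemma chat variant via transformers.
# "google/txgemma-9b-chat" is an assumed model id; the weights are gated behind a license.
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="google/txgemma-9b-chat",  # assumption -- verify on the model card
    device_map="auto",
)

# Illustrative drug-discovery style question; the real prompt template may differ.
prompt = (
    "Question: Is aspirin (SMILES: CC(=O)OC1=CC=CC=C1C(=O)O) likely to cross "
    "the blood-brain barrier? Explain briefly."
)
print(pipe(prompt, max_new_tokens=128)[0]["generated_text"])
```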
Further resources:
Credit for this release: Shekoofeh Azizi and other contributors. 🎉
🚨 Gemma 3 is out! It’s a family of open AI models (1B-27B parameters) featuring a 128k token context window (can work with very long documents and conversations), multilingual support (35+ languages, trained on 140+), and single GPU/TPU compatibility. I’m excited about its potential to increase accessibility to advanced AI models, especially in resource-constrained settings, and the multimodal capabilities that can enable diverse applications.
Blog: https://blog.google/technology/developers/gemma-3/
Technical report: https://storage.googleapis.com/deepmind-media/gemma/Gemma3Report.pdf
Developer guide: https://developers.googleblog.com/en/introducing-gemma3/
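If you want a quick feel for it, here is a minimal sketch using transformers; the checkpoint name is an assumption (the 1B variant is the smallest, text-only one), so check the developer guide for the released ids and note that the weights are license-gated.

```python
# A hedged sketch of running a small Gemma 3 instruction-tuned checkpoint.
# "google/gemma-3-1b-it" is an assumed id -- see the developer guide for exact names.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="google/gemma-3-1b-it",  # assumption; requires accepting the license
    device_map="auto",
)
prompt = "In two sentences, why is a 128k-token context window useful for long documents?"
print(generator(prompt, max_new_tokens=96)[0]["generated_text"])
```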

I tested the Gemini Data Science agent with the Hout Bay (Cape Town, South Africa) building footprint data, asking simple spatial questions like “show me small houses”, “identify crowded areas”, and “what about large houses with few neighbors”. The agent generates interesting visualizations and can select various algorithms; for example, it picked k-Nearest Neighbors (k-NN) to detect houses with adjacent neighbors. I spent wayyy too much time on this, but I really liked the interactive aspect of making refinements iteratively by just making suggestions and asking for alternatives, kind of like chatting with a data science expert :). I guess you would call this conversational geospatial data analysis?
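For the curious, here is a rough sketch of that k-NN step as I understood it, not the agent’s actual code; the file name, column names, CRS, and the 100 m threshold are all my own assumptions.

```python
# Hedged sketch: flag "crowded" buildings via k-nearest neighbors on footprint centroids.
# File name, column names, and the 100 m threshold are illustrative assumptions.
import geopandas as gpd
from sklearn.neighbors import NearestNeighbors

buildings = gpd.read_file("hout_bay_buildings.geojson").to_crs(epsg=32734)  # UTM 34S for Cape Town
centroids = buildings.geometry.centroid
coords = list(zip(centroids.x, centroids.y))

# Distance to the 5 nearest neighbors; a small mean distance suggests a crowded area.
nn = NearestNeighbors(n_neighbors=6).fit(coords)  # 6 = the point itself + 5 neighbors
dists, _ = nn.kneighbors(coords)
buildings["mean_nn_dist_m"] = dists[:, 1:].mean(axis=1)

crowded = buildings[buildings["mean_nn_dist_m"] < 100]
print(f"{len(crowded)} buildings flagged as crowded")
```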



Google Colab has been updated with interesting new features. Julia is now supported natively, so no more need for workarounds! Plus, the Gemini Data Science agent is now more widely accessible. This agent lets you query data through simple prompts, like asking for trend visualizations or model comparisons. It aims to reduce the time spent on tasks like data loading and library imports. This can, for example, contribute to faster prototyping and more efficient data exploration.


Blog on the Gemini Data Science Agent: https://developers.googleblog.com/en/data-science-agent-in-colab-with-gemini/
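To make it concrete, this is the kind of boilerplate the agent writes for you from a one-line prompt such as “plot the monthly sales trend”; the file and column names below are purely illustrative.

```python
# Illustrative manual equivalent of what the Data Science agent generates from a prompt;
# "sales.csv" and its columns are made-up examples.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("sales.csv", parse_dates=["date"])
monthly = df.set_index("date")["sales"].resample("M").sum()  # aggregate to monthly totals

monthly.plot(marker="o", title="Monthly sales trend")
plt.ylabel("sales")
plt.tight_layout()
plt.show()
```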
I find this simply incredible. This new Taara chip is smaller than a fingernail, yet it can transmit data at 10 gigabits per second over a 1KM DISTANCE! 🤯🤯🤯
“In tests at the Moonshot Factory labs, our team has successfully transmitted data at 10 Gbps (gigabits per second) over distances of 1 kilometer outdoors using two Taara chips. We believe this is the first time silicon photonics chips have transmitted such high-capacity data outdoors at this distance. And this is just the beginning. We plan to extend both the chip’s range and capacity by creating an iteration with thousands of emitters.”
The previous version of Taara, the Lightbridge, steered light beams mechanically using mirrors and sensors. Now they’ve shrunk that system to the size of a coin, replacing much of the hardware with software.
I’ve been a huge fan of this project for many years, and it’s exciting to see this ‘moonshot’ turning into reality. It can bring high-speed internet to underserved regions, change how data centers operate and so much more. Huge congrats to the Taara team!
How do you manage ML projects? 🤔 A question I hear often!
Working in research over the years, I often got asked about the day-to-day of managing machine learning projects. That’s why I’m excited about Google’s new, FREE “Managing ML Projects” guide, which I can now point to going forward. It’s only about 90 minutes, but it’s a good start!
It can be useful for:
* Those entering the ML field 🚀: Providing a clear, structured approach.
* Professionals seeking to refine their ML project management skills.
* Individuals preparing for ML-related interviews: Offering practical insights and frameworks.
This guide covers:
* ML project lifecycle management.
* Applying established project management principles to ML.
* Navigating traditional and generative AI projects.
* Effective stakeholder collaboration.
If you’re curious about ML project management, or want to level up your skills, take a look!
https://developers.google.com/machine-learning/managing-ml-projects
Google DeepMind has released SigLIP 2, a family of open-weight (Apache 2.0) vision-language encoders trained on data covering 109 languages, including Swahili. The released models are available in four sizes: ViT-B (86M), L (303M), So400m (400M), and g (1B).

Why is this important?
This release offers improved multilingual capabilities, covering 109 languages, which can contribute to more inclusive and accurate AI systems. It also features better image recognition and document understanding. The four model sizes offer flexibility and potentially increased accessibility for resource-constrained environments.
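As a minimal sketch of what these encoders can do, here is zero-shot image classification via transformers; the checkpoint id below is an assumption, so check the model README and the Hugging Face blog for the exact released names.

```python
# Hedged sketch: zero-shot image classification with a SigLIP 2 checkpoint.
# "google/siglip2-base-patch16-224" is an assumed id -- verify on the model README.
from transformers import pipeline

classifier = pipeline(
    "zero-shot-image-classification",
    model="google/siglip2-base-patch16-224",  # assumption
)
result = classifier(
    "street_scene.jpg",  # any local path or URL to an image
    candidate_labels=["a minibus taxi", "a market stall", "a fishing boat"],
)
print(result)  # list of {label, score} dicts, highest score first
```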

Models: https://github.com/google-research/big_vision/blob/main/big_vision/configs/proj/image_text/README_siglip2.md
Paper: SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features
https://arxiv.org/pdf/2502.14786
HuggingFace Blog and Demo: https://huggingface.co/blog/siglip2

Credits: "SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features" by Michael Tschannen, Alexey Gritsenko, Xiao Wang, Muhammad Ferjad Naeem, Ibrahim Alabdulmohsin, Nikhil Parthasarathy, Talfan Evans, Lucas Beyer, Ye Xia, Basil Mustafa, Olivier Hénaff, Jeremiah Harmsen, Andreas Steiner, and Xiaohua Zhai (2025).
🎉 My colleagues and members of the language community have released SMOL, a new open-source dataset (CC-BY-4.0) designed for machine translation research. SMOL includes professionally translated parallel text for over 115 low-resource languages, with significant representation of over 50 African languages. This dataset is intended to provide a valuable resource for researchers working on machine translation for under-represented languages.
Kindly check the paper for more details including limitations of this dataset.
Paper: https://arxiv.org/pdf/2502.12301
Dataset: https://huggingface.co/datasets/google/smol
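Here is a minimal sketch of pulling a subset with the 🤗 datasets library; the configuration name below is a guess, so check the dataset card for the actual subset names and splits.

```python
# Hedged sketch of loading SMOL; the config name "smolsent__en_yo" is an assumption --
# the dataset card lists the real per-language-pair subset names and splits.
from datasets import load_dataset

ds = load_dataset("google/smol", "smolsent__en_yo", split="train")  # assumed English->Yoruba subset
print(ds[0])  # one professionally translated parallel example
```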
List of languages:
Afar
Acoli
Afrikaans
Alur
Amharic
Bambara
Baoulé
Bemba (Zambia)
Berber
Chiga
Dinka
Dombe
Dyula
Efik
Ewe
Fon
Fulfulde
Ga
Hausa
Igbo
Kikuyu
Kongo
Kanuri
Krio
Kituba (DRC)
Lingala
Luo
Kiluba (Luba-Katanga)
Malagasy
Mossi
North Ndebele
Ndau
Nigerian Pidgin
Oromo
Rundi
Kinyarwanda
Sepedi
Shona
Somali
South Ndebele
Susu
Swati
Swahili
Tamazight
Tigrinya
Tiv
Tsonga
Tumbuka
Tswana
Twi
Venda
Wolof
Xhosa
Yoruba
Zulu
Credits:
Isaac Caswell, Elizabeth Nielsen, Jiaming Luo, Colin Cherry, Geza Kovacs, Hadar Shemtov, Partha Talukdar, Dinesh Tewari, Baba Mamadi Diane, Koulako Moussa Doumbouya, Djibrila Diane, and Solo Farabado Cissé. SMOL: Professionally translated parallel data for 115 under-represented languages.
The past few days have brought interesting developments in small language models that could expand applications in mobile computing and low-resource environments.
Here’s what caught my attention:
• Microsoft’s Phi-4 was made fully open source (MIT license) and has been improved by Unsloth AI. 🚀🔓 Blog: https://unsloth.ai/blog/phi4
• Kyutai Labs based in Paris 🇫🇷 introduced Helium-1 Preview, a 2B-parameter multilingual base LLM designed for edge and mobile devices.
Model: https://huggingface.co/kyutai/helium-1-preview-2b
Blog: https://kyutai.org/2025/01/13/helium.html
• OpenBMB from China 🇨🇳 released MiniCPM-o 2.6, an 8B-parameter multimodal model that matches the capabilities of several larger models. Model: https://huggingface.co/openbmb/MiniCPM-o-2_6
• Moondream2 added gaze 👀 detection functionality, with interesting applications for human-computer interaction and market research.
Blog: https://moondream.ai/blog/announcing-gaze-detection
• OuteTTS, a series of small text-to-speech models, expanded to support 6 languages and punctuation handling for more natural-sounding speech synthesis. 🗣️
Model: https://huggingface.co/OuteAI/OuteTTS-0.3-1B
These developments suggest continued progress in making language models more efficient and accessible, and we’re likely to see more of this in 2025.
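If you want to poke at one of these locally, here is a minimal sketch for the Helium-1 Preview checkpoint with transformers; the generation settings are assumptions, so check the model card for the recommended usage.

```python
# Hedged sketch: trying the Helium-1 preview base model locally with transformers.
# It is a base (non-instruct) model, so plain text continuation is the natural use.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="kyutai/helium-1-preview-2b",
    device_map="auto",
)
print(generator("The three largest cities in Europe are", max_new_tokens=40)[0]["generated_text"])
```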
Note: Views on this post are my own opinion.