Abdoulaye Diack

Category: Google & AI

Colab Updates: Julia Support and Gemini Data Science

Google Colab has been updated with interesting new features. Julia is now supported natively, so no more need for workarounds! Plus, the Gemini Data Science agent is now more widely accessible. This agent lets you query data through simple prompts, like asking for trend visualizations or model comparisons. It aims to reduce the time spent on tasks like data loading and library imports. This can, for example, contribute to faster prototyping and more efficient data exploration.

Blog on the Gemini Data Science Agent: https://developers.googleblog.com/en/data-science-agent-in-colab-with-gemini/

March 9, 2025
Managing ML Projects: A Guide for Beginners and Professionals

How do you manage ML projects? 🤔 A question I hear often!
Working in research over the years, I have often been asked about the day-to-day of managing machine learning projects. That’s why I’m excited about Google’s new, FREE “Managing ML Projects” guide, which I can now point people to. It’s only 90 minutes, but it’s a good start!

It can be useful for:

* Those entering the ML field 🚀: Providing a clear, structured approach.
* Professionals seeking to refine their ML project management skills.
* Individuals preparing for ML-related interviews: Offering practical insights and frameworks.

This guide covers:

* ML project lifecycle management.
* Applying established project management principles to ML.
* Navigating traditional and generative AI projects.
* Effective stakeholder collaboration.

If you’re curious about ML project management, or want to level up your skills, take a look!

https://developers.google.com/machine-learning/managing-ml-projects

March 1, 2025
SigLIP 2: Multilingual Vision-Language Encoders Released
Google DeepMind has released SigLIP 2, a family of open-weight (Apache 2.0) vision-language encoders trained on data covering 109 languages, including Swahili. The released models are available in four sizes: ViT-B (86M), L (303M), So400m (400M), and g (1B).

Why is this important?

This release offers improved multilingual capabilities, covering 109 languages, which can contribute to more inclusive and accurate AI systems. It also features better image recognition and document understanding. The four model sizes offer flexibility and potentially increased accessibility for resource-constrained environments.

Models: https://github.com/google-research/big_vision/blob/main/big_vision/configs/proj/image_text/README_siglip2.md

Paper: SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

https://arxiv.org/pdf/2502.14786

HuggingFace Blog and Demo: https://huggingface.co/blog/siglip2

Google Colab: https://colab.research.google.com/github/google-research/big_vision/blob/main/big_vision/configs/proj/image_text/SigLIP2_demo.ipynb
```
Credits:  "SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features" by Michael Tschannen, Alexey Gritsenko, Xiao Wang, Muhammad Ferjad Naeem, Ibrahim Alabdulmohsin, Nikhil Parthasarathy, Talfan Evans, Lucas Beyer, Ye Xia, Basil Mustafa, Olivier Hénaff, Jeremiah Harmsen, Andreas Steiner, and Xiaohua Zhai (2025).
```
February 22, 2025
SMOL: New Open-Source Dataset for Low-Resource Language Machine Translation
🎉 My colleagues and members of the language community have released SMOL, a new open-source dataset (CC BY 4.0) designed for machine translation research. SMOL includes professionally translated parallel text for over 115 low-resource languages, including more than 50 African languages. The dataset is intended as a resource for researchers working on machine translation for under-represented languages.

Kindly check the paper for more details including limitations of this dataset.

Paper: https://arxiv.org/pdf/2502.12301
Dataset: https://huggingface.co/datasets/google/smol

List of languages:

Afar
Acoli
Afrikaans
Alur
Amharic
Bambara
Baoulé
Bemba (Zambia)
Berber
Chiga
Dinka
Dombe
Dyula
Efik
Ewe
Fon
Fulfulde
Ga
Hausa
Igbo
Kikuyu
Kongo
Kanuri
Krio
Kituba (DRC)
Lingala
Luo
Kiluba (Luba-Katanga)
Malagasy
Mossi
North Ndebele
Ndau
Nigerian Pidgin
Oromo
Rundi
Kinyarwanda
Sepedi
Shona
Somali
South Ndebele
Susu
Swati
Swahili
Tamazight
Tigrinya
Tiv
Tsonga
Tumbuka
Tswana
Twi
Venda
Wolof
Xhosa
Yoruba
Zulu

Credits:
```
Isaac Caswell and Elizabeth Nielsen and Jiaming Luo and Colin Cherry and Geza Kovacs and Hadar Shemtov and Partha Talukdar and Dinesh Tewari and Baba Mamadi Diane and Koulako Moussa Doumbouya and Djibrila Diane and Solo Farabado Cissé. SMOL: Professionally translated parallel data for 115 under-represented languages.
```
February 22, 2025
Getting Started with Gemini 2.0 on Linux and macOS 💻
This guide provides instructions for setting up the Gemini 2.0 web console on Linux and macOS, including solutions to common issues.

Prerequisites ✅
- Node.js and npm: Ensure you have supported versions installed. Use node -v and npm -v to check.
- Gemini AI Key: Obtain your API key from https://aistudio.google.com/apikey. 🗝️
- Google Chrome: Recommended for optimal performance. 🌐
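
The version check above can also be scripted. The following is a minimal sketch, assuming Python 3 is available; the helper names tool_version and check_prerequisites are hypothetical and not part of the web console project. It only reports what node -v and npm -v print and does not judge whether a version is supported.

```python
import shutil
import subprocess


def tool_version(name):
    """Return what `<name> -v` prints, or None if the tool is missing or fails."""
    path = shutil.which(name)  # look the tool up on PATH
    if path is None:
        return None
    try:
        result = subprocess.run(
            [path, "-v"], capture_output=True, text=True, timeout=10, check=True
        )
    except (subprocess.SubprocessError, OSError):
        return None
    return result.stdout.strip() or None


def check_prerequisites(tools=("node", "npm")):
    """Map each tool name to its reported version string (or None)."""
    return {tool: tool_version(tool) for tool in tools}


if __name__ == "__main__":
    for tool, version in check_prerequisites().items():
        print(f"{tool}: {version or 'not found'}")
```

If either tool shows up as not found, the installation sections that follow apply.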

Installing Node.js and npm on Linux (e.g., Ubuntu) 🐧

You can install Node.js and npm on Ubuntu using the following methods:

Using apt:
1. Update package lists: sudo apt update 🔄
2. Install Node.js and npm: sudo apt install nodejs npm

Installing Node.js and npm on macOS 🍎

If necessary, install Node.js and npm using one of these methods:

Official Installer:
1. Download the macOS installer from https://nodejs.org. ⬇️
2. Run the installer.
3. Verify installation with node -v and npm -v. ✅

nvm:
1. Install Node.js: nvm install 22 (restart your terminal if needed). 🔄
2. Verify versions: node -v and npm -v. ✅

Homebrew:
1. Install Node.js: brew install node@22 🍺
2. Verify versions: node -v and npm -v. ✅

Setting Up the Gemini Web Console 🕹️
1. Clone the repository and enter the project directory:
   git clone https://github.com/google-gemini/multimodal-live-api-web-console.git
   cd multimodal-live-api-web-console
2. Update environment variables:
  - In the .env file, add your API key:
    # create your own API KEY at https://aistudio.google.com/apikey
    REACT_APP_GEMINI_API_KEY='<YOUR_GEMINI_API_KEY>'
  - Activate the key variable:
    source .env
3. Update public/index.html:
  - Add the following meta tag within the <head> section to enable the web console scripts to run:
  <meta http-equiv="Content-Security-Policy" content="script-src 'self' 'wasm-unsafe-eval' 'inline-speculation-rules' http://localhost:3000 chrome-extension://* blob:;">
  - Important: This modifies the Content Security Policy (CSP) to allow the web console to function. To understand the security implications and best practices for CSP, it’s highly recommended to read more about it here: https://developer.chrome.com/docs/privacy-security/csp
  - Note: This CSP configuration is suitable for local development and testing. When deploying a production application, you should carefully review and adjust the CSP to ensure proper security and privacy measures are in place.
4. Install dependencies:
   npm install
   npm audit --force
5. Start the app: npm start
6. Access the console: Open http://localhost:3000 in your browser. 💻

You can now use the Gemini 2.0 API web console. See the documentation for examples and further information: https://github.com/google-gemini/multimodal-live-api-web-console
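
Two of the setup steps are easy to get wrong: leaving the placeholder key in .env and forgetting the CSP meta tag in public/index.html. The sketch below is one way to check both before running npm start. It assumes Python 3 and the file layout described in the steps; the function names (read_env_value, find_csp, preflight) are hypothetical helpers, not part of the web console repository, and the .env parsing is deliberately simple (one KEY=value per line).

```python
from html.parser import HTMLParser
from pathlib import Path

KEY_NAME = "REACT_APP_GEMINI_API_KEY"
PLACEHOLDER = "<YOUR_GEMINI_API_KEY>"


def read_env_value(env_text, name):
    """Return the value assigned to `name` in .env-style text, or None."""
    for raw in env_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and lines without an assignment
        key, _, value = line.partition("=")
        if key.strip() == name:
            return value.strip().strip("'\"")  # drop surrounding quotes
    return None


class _CspFinder(HTMLParser):
    """Remembers the content of a Content-Security-Policy meta tag, if any."""

    def __init__(self):
        super().__init__()
        self.policy = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        http_equiv = (attr.get("http-equiv") or "").lower()
        if tag == "meta" and http_equiv == "content-security-policy":
            self.policy = attr.get("content")


def find_csp(html_text):
    """Return the CSP string declared via a meta tag, or None."""
    finder = _CspFinder()
    finder.feed(html_text)
    return finder.policy


def preflight(project_dir):
    """Return a list of problems; an empty list means both checks passed."""
    root = Path(project_dir)
    problems = []

    env_file = root / ".env"
    key = None
    if env_file.is_file():
        key = read_env_value(env_file.read_text(encoding="utf-8"), KEY_NAME)
    if not key or key == PLACEHOLDER:
        problems.append(f"{KEY_NAME} is missing or still the placeholder in .env")

    index_file = root / "public" / "index.html"
    policy = None
    if index_file.is_file():
        policy = find_csp(index_file.read_text(encoding="utf-8"))
    if policy is None:
        problems.append("no Content-Security-Policy meta tag in public/index.html")

    return problems


if __name__ == "__main__":
    for problem in preflight("."):
        print("Problem:", problem)
```

Run it from the cloned project directory; no output means both checks passed. It only confirms that a key and a policy are present, not that the key is valid or that the policy is appropriate for production.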
January 3, 2025
Google Translate Expands to 110 New Languages, Including 31 from Africa
Google Translate has taken a significant step towards greater inclusivity by adding 110 new languages to its service, including 31 from Africa. This expansion means that millions of people who previously lacked access to translation services now have a tool to communicate and connect with a wider world.

This achievement is the result of a concerted multi-year effort by Isaac Caswell, the Google Translate Research team, and numerous community collaborators. They faced unique challenges, as building high-quality translations for languages with limited digital resources is complex. To address this, they developed an approach that relies on “monolingual” data (text in a single language) instead of relying solely on translated text. This method, called “zero-shot learning,” allows translation for languages the model was not explicitly trained to translate, though it’s important to remember that this is still a developing technology.

What does this mean?
- More Accessibility: People speaking these newly included languages now have a tool to access information, communicate, and break down language barriers.
- Language Preservation: This expansion helps preserve and promote less commonly used languages, which is crucial for maintaining linguistic diversity.
- New Opportunities: The inclusion of languages like Punjabi and Romani opens doors for those communities, aiding in digital navigation and accessing information.

While this is a significant step, it’s important to note that Google Translate, although it uses powerful AI technology, is not a replacement for the expertise of professional translators. The app is nonetheless useful for millions of people. This new approach, powered by the PaLM 2 language model, will require ongoing refinement and feedback to improve accuracy.

The languages added are significant:

A few notable ones:
- Punjabi (Shahmukhi): This language, written in the Shahmukhi script, is spoken by millions in Pakistan and India. Its inclusion expands communication and access to information for a large community.
- Romani: Romani, the language of Romani communities present in many European countries, has historically been underrepresented in technology. Its inclusion in Google Translate is a step towards recognizing and supporting this community.
- N’Ko: Created in the 1940s, N’Ko uses a unique script to unify Manding languages in West Africa. This addition supports literacy and cultural preservation efforts.
- Tamazight (with Tifinagh script): Tamazight is spoken by millions of Berber people in North Africa. Its inclusion acknowledges their cultural diversity and language heritage.

This expansion is a positive step towards a more inclusive digital world, but it’s important to be aware of its limitations. While AI is an exciting tool, it requires continual development and feedback to refine its capabilities. The addition of these new languages is a testament to the potential of technology to foster communication and understanding, but it’s crucial to remember that it’s a journey, not a destination.

Languages by Region

APAC
- Southern Asia
  - Bhutan: Dzongkha
  - India: Awadhi, Bodo, Khasi, Kokborok, Marwadi, Santali, Tulu
  - Nepal: Nepalbhasa (Newari)
  - Pakistan: Baluchi, Punjabi (Shahmukhi)
- Eastern Asia
  - China: Cantonese, Tibetan
  - Hong Kong: Cantonese
  - Tibet: Tibetan
- Southeast Asia
  - East Timor: Tetum
  - Indonesia: Acehnese, Balinese, Batak Karo, Batak Simalungun, Batak Toba, Betawi, Iban, Madurese, Makassar, Minang
  - Malaysia: Malay (Jawi)
  - Myanmar: Hakha Chin, Jingpo, Shan
  - Philippines: Bikol, Hiligaynon, Kapampangan, Pangasinan, Waray
- Melanesia
  - Fiji: Fijian
  - Papua New Guinea: Tok Pisin
- Micronesia
  - Guam: Chamorro
  - Micronesia: Chuukese
  - Marshall Islands: Marshallese
- Central Asia
  - Mongolia: Buryat
- Polynesia
  - Tahiti: Tahitian
  - Tonga Islands: Tongan

EMEA
- Western Asia
  - Afghanistan: Dari
- Northern Africa
  - Algeria: Tamazight
  - Morocco: Tamazight, Tamazight (Tifinagh)
  - Sudan: Acholi, Dinka, Luo, Nuer
- Eastern Europe
  - Austria: Romani
  - Bosnia and Herzegovina: Romani
  - Denmark: Romani
  - Finland: Romani
  - Germany: Romani
  - Hungary: Romani
  - Kosovo: Romani
  - Montenegro: Romani
  - North Macedonia: Romani
  - Poland: Romani, Silesian
  - Romania: Romani
  - Russia: Avar, Bashkir, Buryat, Chechen, Chuvash, Crimean Tatar, Komi, Meadow Mari, Ossetian, Tuvan, Udmurt, Yakut
  - Serbia: Romani
  - Slovakia: Romani
  - Sweden: Romani
  - Ukraine: Crimean Tatar
- Western Africa
  - Benin: Fon
  - Burkina Faso: Dyula
  - Côte d’Ivoire: Baoulé, Dyula
  - Gabon: Fon
  - Gambia: Wolof
  - Ghana: Dyula, Fon, Ga
  - Guinea: N’Ko, Susu
  - Guinea-Bissau: N’Ko, Susu
  - Mali: Dyula, N’Ko
  - Mauritania: Wolof
  - Nigeria: Fon, Tiv
  - Senegal: Wolof
  - Sierra Leone: Susu
  - Togo: Fon
- Southern Africa
  - Botswana: Tswana
  - Eswatini: Swati
  - Lesotho: Swati
  - South Africa: Ndebele (South), Swati, Tswana, Venda
- Eastern Africa
  - Burundi: Rundi
  - Ethiopia: Afar, Luo, Nuer
  - Kenya: Luo
  - Malawi: Tumbuka
  - Mauritius: Mauritian Creole
  - Mozambique: Ndau, Swati, Venda
  - Rwanda: Kiga
  - Seychelles: Seychellois Creole
  - South Sudan: Acholi, Dinka, Luo, Nuer
  - Tanzania: Bemba, Luo, Tumbuka
  - Uganda: Acholi, Alur, Kiga, Luo
  - Zambia: Bemba, Dombe, Tumbuka
  - Zimbabwe: Dombe, Ndau, Venda
- Middle Africa
  - Central African Republic: Sango
  - Chad: Sango
  - Congo: Kituba, Kikongo
  - DRC: Alur, Bemba, Kituba, Kikongo, Luba, Sango
- Northern Europe
  - Faroe Islands: Faroese
  - Isle of Man: Manx
  - Latvia: Latgalian
  - Norway: Sami (North), Romani
- Western Europe
  - France: Breton, Occitan
  - Netherlands: Limburgish, Papiamento
- Southern Europe
  - Italy: Friulian, Ligurian, Lombard, Sicilian, Venetian
  - Portugal: Portuguese (Portugal)

LATAM
- Central America
  - Guatemala: Q’eqchi’, Mam
  - Mexico: Nahuatl (Eastern Huasteca), Q’eqchi’, Mam, Yucatec Maya, Zapotec
  - Belize: Q’eqchi’
- South America
  - Brazil: Hunsrik
- Caribbean
  - Jamaica: Jamaican Patois
  - Caribbean Netherlands: Papiamento
June 27, 2024