Zoltan's FLOPs – GPU mini-grant, 1st iteration

Zoltan's FLOPs is a mini-grant program for GPU computing projects. The goal is to provide small-scale funding to help researchers and developers access GPU compute time on modern hardware.

Rules

Apply here.
Applications closed. Watch this site for results and future iterations.

Why?

A kid in a poor, post-communist country with a used 486 could pick up coding and create programs just like anyone with better means. Today, computing is becoming less accessible to the masses. AI and machine learning require expensive hardware and significant energy resources.

Even top universities struggle to keep up, falling behind for-profit organizations. Students and researchers find it difficult to get GPU time for their projects. Talented kids interested in AI may never get the compute they need.

Carbon footprint

To offset the impact of the GPU usage, 280kg of CO₂ was ordered to be permanently removed using Climeworks (2025-09-15-345359). My estimate of the computational carbon footprint is based on [1].

Results

This iteration received 29 applications. The following projects were awarded GPU credit.

Skin‑lesion CNN by Vansh Malani

Sophisticated CNNs will classify 14 000 dermoscopic images into the seven standard skin‑cancer diagnoses. GPU credits let the team train, tune and deploy a high‑accuracy model that could accelerate melanoma screening and aid clinicians worldwide.

Author’s statement of outcomes: With the credits, we trained two different models. The first was a lightweight MobileNet model trained on about 12,000 dermoscopic images. It reached around 83 percent validation accuracy while staying under 8 MB in size, which makes it practical for mobile use.

The second was a more comprehensive multimodal model that combined image features with patient metadata such as age, lesion location, and clinical notes. On a dataset of about 4,500 cases, this model achieved roughly 89 percent validation accuracy and showed strong potential for server-side deployment. We are now in the process of drafting a research paper based on these results.

SAR Image Coloring by Shamba Chowdhury

Builds a custom diffusion model to transform grayscale Synthetic Aperture Radar data into realistic optical‑style images, giving scientists an intuitive view where photography is impossible. High‑VRAM GPUs power diffusion training and possible custom CLIP work, culminating in an open‑source tool.

Author’s statement of outcomes: A core contribution of this project was the development of a new, large-scale dataset specifically for colorizing Synthetic Aperture Radar (SAR) images. This dataset contains over 280,000 paired images from the Sentinel-1 (SAR) and Sentinel-2 (optical) missions. The data was meticulously curated, selecting Sentinel-1 images captured in the Interferometric Wide (IW) swath mode with dual VV and VH polarization to ensure a balance of coverage and detail. Each image pair is enriched with metadata detailing its geographical temperature zone and the specific season, which was determined using a custom algorithm based on historical weather data. This careful preparation, including preprocessing steps like orthorectification and speckle filtering, created a high-quality foundation for training text-conditioned models.

This dataset was then used in a comparative study of three different Latent Diffusion Models (LDMs), chosen for their ability to use text for contextual guidance. The models included a fine-tuned Stable Diffusion v1.5 (using the InstructPix2Pix method), an 859M parameter LDM, and a 117M parameter LDM with a VQVAE, with the latter two being trained from scratch. Despite significant time constraints, the results were promising, with the fine-tuned Stable Diffusion model showing the best performance, achieving an average Structural Similarity Index (SSIM) of 0.3, compared to 0.1 and 0.2 for the other models. The dataset is openly available for browsing at its persistent DOI: https://doi.org/10.34740/kaggle/dsv/12113126. The complete source code for the models will be made available on GitHub after additional cleanup and documentation are completed.

Medical ML Workbench by Rounak Kumar Manjhi

A student team is tackling Alzheimer’s MRI, ECG‑based heart‑disease detection and stroke prediction on million‑record datasets. Deep networks run 50–500 epochs; daily experiments outstrip local hardware. GPU credits will cut each training cycle from hours to minutes and speed three forthcoming papers.

Author’s statement of outcomes: The grant provided essential computational resources for training and evaluating my machine learning models on different dataset variants. Using this support, I developed and tested hybrid deep learning architectures for heart disease prediction. The models achieved around 95% classification accuracy, which is a very promising outcome. Since the work is part of an upcoming research paper, I cannot disclose full details yet due to publication requirements, but I can confirm that the FLOPs grant played an important role in reaching this stage.

Prime‑Number Soundscape by Suryansh Shekhawat

GPUs sieve primes up to 10¹², spot gap “constellations” (twins, quadruplets, …) and map each pattern to notes or chords, generating a 30‑second audio piece plus statistics on constellation frequencies. The project turns number theory into music you can hear.

Author’s statement of outcomes: With the FLOPs grant I explored two directions: first, running experiments with Ulam polynomials in the search for prime-splitting families, and second, creating a geometrical description (using bouncing diagonals) of classical sieving methods such as the Sieve of Eratosthenes. The computational resources allowed me to generate and store data for further post-processing, which I hope to develop into publishable work. Beyond the math, I also sought to express this numerical jungle through visuals and music. https://www.instagram.com/p/DNdRdsdpbKY/

RNA Structure Prediction by Soltani Abdellatif

Aims to bring AlphaFold‑style advances to RNA by assembling augmented secondary‑structure data and testing new deep architectures that work despite sparse alignments. GPU power enables large‑scale experiments toward accurate 2‑D and 3‑D RNA folding, unlocking insights into gene regulation and therapeutics.

Author’s statement of outcomes: The credits were used to advance my study on RNA structure prediction. In the case of RNA secondary structure prediction, the grant allowed for the development of new methods that achieved state-of-the-art results compared to all existing approaches. For RNA 3D structure prediction, we were not able to scale training to the same level of data size and model size as other models in the field due to the high computational demands. However, within comparable settings, we trained a baseline without the new method additions and demonstrated clear improvements over it, which gave us valuable insights into the potential of our approach for 3D modeling as well.

Fast Gravitational-Wave detection through CNN by Yuvraj Sharma

Optimizes a separable‑CNN to spot eccentric binary‑black‑hole merger signatures in LIGO images, reducing detection latency for future alerts. Training on 450 k simulated‑waveform images with CUDA‑accelerated TensorFlow demands significant GPU capacity.

Author’s statement of outcomes: The CNN-based approaches showed strong performance in both binary detection and three-class classification tasks. While non-eccentric signals were easier to identify, eccentric signals resembling noise required more sophisticated architectures. Incorporating residual connections, attention mechanisms, and careful learning rate scheduling improved robustness. GPU acceleration and distributed training played a key role in enabling these large-scale experiments.

CLIP-Powered Design Search by Naman Singhal

Creates an intelligent image-search engine for a large Google Drive design library. CLIP embeddings are stored in a vector database and served via web UI, enabling queries like “red abstract pattern” without manual tags. GPUs accelerate bulk embedding generation and future collection updates.

Author’s statement of outcomes: This project built an intelligent image-search engine for large design libraries powered by CLIP embeddings. Instead of relying on manual tagging, the system stores embeddings in a vector database and serves them through a web interface, allowing users to run natural language queries as well as image-to-image similarity searches and instantly retrieve relevant designs.

With the help of the FLOPs grant, the model was successfully trained on two large datasets. This significantly improved its performance, leading to more accurate and wide-ranging search outcomes.

Deep‑Learning Intrusion Detection by Madhi Madhan Libonce

Builds a hybrid supervised‑plus‑unsupervised deep‑learning engine that learns normal network behaviour and flags both known and novel cyber‑attacks in real time. Designed for cloud, enterprise and IoT networks, it adapts continuously to new threat patterns, beating static rule‑based IDS by cutting response times and preventing breaches.

The author decided to withdraw their grant application.

Hierarchical Text Generation by Hermann Nyuykonge Kumbong

Develops a novel learning paradigm and architecture for generating text hierarchically, mirroring human writing by starting with abstract concepts and refining them into detailed expressions. Utilizes multi-resolution language modelling to predict text from low to high levels of representation, and optimizes GPU implementations for efficient training and inference.

Awarded due to residual funding in the 1st iteration.

Zoltan's FLOPs GPU mini-grant

1st iteration

What is this?