Suchir Agarwal

suchir_agarwal.png

I’m a master’s student at Stanford University studying computer science with a focus on artificial intelligence. I previously completed my undergraduate degree in computer science and pure mathematics at the University of California, Berkeley.

I’m part of the Stanford Vision and Learning Lab (SVL) under Fei-Fei Li. At Berkeley, I worked in Jennifer Listgarten’s lab on machine learning for protein engineering.

research

  1. preprint
    gpic.jpeg
    GPIC: A Giant Permissive Image Corpus for Visual Generation
    arXiv preprint, 2026

    Studying scalable methods for visual generative modeling requires large, accessible, and stable datasets. We introduce GPIC, a Giant Permissive Image Corpus of approximately 28 trillion pixels. GPIC comprises diverse internet images captioned by a state-of-the-art vision-language model, including 100M training, 200K validation, and 1M test examples. Moreover, all GPIC images are permissively licensed for both research and commercial use. GPIC is safety-filtered, deduplicated, and centrally hosted on Hugging Face. We provide a benchmarking protocol for generative modeling on GPIC. Finally, we provide a reference baseline for pixel-space flow matching on GPIC.