Technion – Israel Institute of Technology
Text-to-Image (TTI) models generate images based on text prompts, which often leave certain aspects of the desired image ambiguous. When faced with these ambiguities, TTI models have been shown to exhibit biases in their interpretations. These biases can have societal impacts, e.g., depicting only a single race for a stated occupation. They can also degrade user experience by creating redundancy within a set of generated images instead of spanning diverse possibilities. Here, we introduce MineTheGap – a method for automatically mining prompts that cause a TTI model to generate biased outputs. Our method goes beyond merely detecting bias for a given prompt. Rather, it leverages a genetic algorithm to iteratively refine a pool of prompts, seeking those that expose biases. This optimization process is driven by a novel bias score, which ranks biases according to their severity, as we validate on a dataset with known biases. For a given prompt, this score is obtained by comparing the distribution of generated images to the distribution of LLM-generated texts that constitute variations on the prompt.
MineTheGap is a genetic-algorithm-based approach that optimizes over the high-dimensional discrete space of valid prompts for a TTI model. Since the space of prompts is non-differentiable, we formulate this as a gradient-free optimization problem, refining a population of candidate prompts through iterative selection and mutation, while injecting random candidates to introduce genetic diversity into the population and allow the algorithm to explore more areas of the solution space. Our general framework for optimizing over prompts, illustrated in the animation below, is applicable to any objective; for our goal, we pair it with the novel bias score described next.
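The selection-mutation-injection loop described above can be sketched as follows. This is a minimal toy sketch, not the paper's implementation: `score_fn` and `mutate_fn` are hypothetical stand-ins for the bias score and the LLM-based prompt mutation, and all hyperparameter names are our own.

```python
import random

def mine_prompts(seed_prompts, score_fn, mutate_fn,
                 pop_size=8, n_generations=5, n_keep=4, n_random=2):
    """Toy sketch of gradient-free prompt optimization.

    score_fn ranks a prompt (higher = more bias exposed); mutate_fn
    produces a textual variant of a prompt. Both are placeholders.
    """
    population = list(seed_prompts)
    for _ in range(n_generations):
        # Selection: keep the top-scoring prompts.
        ranked = sorted(population, key=score_fn, reverse=True)
        survivors = ranked[:n_keep]
        # Mutation: refill the pool with variants of the survivors.
        children = [mutate_fn(random.choice(survivors))
                    for _ in range(pop_size - n_keep - n_random)]
        # Random injection: fresh candidates preserve genetic diversity
        # and let the search explore new areas of the prompt space.
        randoms = [mutate_fn(random.choice(seed_prompts))
                   for _ in range(n_random)]
        population = survivors + children + randoms
    return max(population, key=score_fn)

# Toy demo: "bias" is just prompt length; mutation appends a word.
best = mine_prompts(["a doctor", "a nurse at work"],
                    score_fn=len,
                    mutate_fn=lambda p: p + " photo")
```

In the real system, evaluating `score_fn` requires generating images with the TTI model, so keeping the population small and reusing survivors across generations matters for efficiency.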
To guide the selection process in our mining framework, we require a bias ranking function that assigns each prompt a score reflecting the extent to which the TTI model exhibits bias when generating images for that prompt. Here, we propose an efficient, fully automatic approach that does not require specifying the nature of the bias in advance and provides an interpretable ranking of prompts. We overcome a key challenge in quantifying bias, which is determining the expected image distribution for a given prompt, by leveraging an LLM to generate a set of diverse textual variations of the prompt. This set of variations explicitly models the different plausible meanings embedded within a single prompt, and serves as a reference for evaluating the TTI model's outputs.
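One simple way to compare the generated-image distribution against the reference distribution of prompt variations is to work in a shared embedding space (e.g., CLIP) and measure how well the images cover the variations. The sketch below is an illustrative stand-in, not the paper's actual bias score: the coverage-style distance is our own assumption, and `image_embs` / `variation_embs` are assumed to be precomputed embeddings.

```python
import numpy as np

def bias_score(image_embs, variation_embs):
    """Hypothetical coverage-style bias proxy.

    For each text-variation embedding, take the distance to its
    nearest generated-image embedding. A large average distance
    means the images fail to cover the plausible meanings of the
    prompt, i.e., the output distribution collapsed onto a subset.
    """
    image_embs = np.asarray(image_embs, dtype=float)
    variation_embs = np.asarray(variation_embs, dtype=float)
    # Pairwise Euclidean distances: (n_variations, n_images).
    d = np.linalg.norm(variation_embs[:, None, :] - image_embs[None, :, :],
                       axis=-1)
    # Nearest image per variation, averaged over variations.
    return d.min(axis=1).mean()

# Images collapsed onto one mode score higher (more biased) than
# images spread across all four variation modes.
modes = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
collapsed = bias_score([[0.0, 0.0]] * 4, modes)
diverse = bias_score(modes, modes)
```

Under this proxy, `diverse` is 0 (every variation has an exactly matching image) while `collapsed` is positive, so ranking prompts by the score favors those whose generations ignore plausible interpretations.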
@article{cohen2025minethegap,
  title={MineTheGap: Automatic Mining of Biases in Text-to-Image Models},
  author={Cohen, Noa and Spingarn-Eliezer, Nurit and Huberman-Spiegelglas, Inbar and Michaeli, Tomer},
  journal={arXiv preprint arXiv:2512.13427},
  year={2025}
}
This webpage is based on the template made by Matan Kleiner with the help of Hila Manor.
Icons are taken from Font Awesome and Academicons.