Technion – Israel Institute of Technology
Text-to-Image (TTI) models generate images based on text prompts, which often leave certain aspects of the desired image ambiguous. When faced with these ambiguities, TTI models have been shown to exhibit biases in their interpretations. These biases can have societal impacts, e.g., depicting only a single race for a stated occupation. They can also degrade user experience by creating redundancy within a set of generated images instead of spanning diverse possibilities. Here, we introduce MineTheGap – a method for automatically mining prompts that cause a TTI model to generate biased outputs. Our method goes beyond merely detecting bias for a given prompt. Rather, it leverages a genetic algorithm to iteratively refine a pool of prompts, seeking those that expose biases. This optimization process is driven by a novel bias score, which ranks biases according to their severity, as we validate on a dataset with known biases. For a given prompt, this score is obtained by comparing the distribution of generated images to the distribution of LLM-generated texts that constitute variations on the prompt.
MineTheGap is a genetic-algorithm-based approach that optimizes over the high-dimensional discrete space of valid prompts for a TTI model. Since the space of prompts is non-differentiable, we formulate this as a gradient-free optimization problem, refining a population of candidate prompts through iterative selection and mutation, while injecting random candidates to introduce genetic diversity into the population and allow the algorithm to explore more areas of the solution space. Our general framework for optimizing over prompts, illustrated in the animation below, is applicable to any objective; for our goal, we pair it with the novel bias score described next.
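The selection-mutation-injection loop described above can be sketched as follows. This is a minimal toy sketch, not the paper's implementation: `score_fn` and `mutate_fn` are hypothetical stand-ins for the bias score and the LLM-based prompt mutation, and all hyperparameter names are our own.

```python
import random

def mine_prompts(seed_prompts, score_fn, mutate_fn,
                 pop_size=8, n_generations=5, n_keep=4, n_random=2):
    """Toy sketch of gradient-free prompt optimization.

    score_fn ranks a prompt (higher = more bias exposed); mutate_fn
    produces a textual variant of a prompt. Both are placeholders.
    """
    population = list(seed_prompts)
    for _ in range(n_generations):
        # Selection: keep the top-scoring prompts.
        ranked = sorted(population, key=score_fn, reverse=True)
        survivors = ranked[:n_keep]
        # Mutation: refill the pool with variants of the survivors.
        children = [mutate_fn(random.choice(survivors))
                    for _ in range(pop_size - n_keep - n_random)]
        # Random injection: fresh candidates preserve genetic diversity
        # and let the search explore new areas of the prompt space.
        randoms = [mutate_fn(random.choice(seed_prompts))
                   for _ in range(n_random)]
        population = survivors + children + randoms
    return max(population, key=score_fn)

# Toy demo: "bias" is just prompt length; mutation appends a word.
best = mine_prompts(["a doctor", "a nurse at work"],
                    score_fn=len,
                    mutate_fn=lambda p: p + " photo")
```

In the real system, evaluating `score_fn` requires generating images with the TTI model, so keeping the population small and reusing survivors across generations matters for efficiency.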
To guide the selection process in our mining framework, we require a bias ranking function that assigns each prompt a score reflecting the extent to which the TTI model exhibits bias when generating images for that prompt. Here, we propose an efficient, fully automatic approach that does not require specifying the nature of the bias in advance and provides an interpretable ranking of prompts. We overcome a key challenge in quantifying bias, which is determining the expected image distribution for a given prompt, by leveraging an LLM to generate a set of diverse textual variations of the prompt. This set of variations explicitly models the different plausible meanings embedded within a single prompt, and serves as a reference for evaluating the TTI model's outputs.
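One simple way to compare the generated-image distribution against the reference distribution of prompt variations is to work in a shared embedding space (e.g., CLIP) and measure how well the images cover the variations. The sketch below is an illustrative stand-in, not the paper's actual bias score: the coverage-style distance is our own assumption, and `image_embs` / `variation_embs` are assumed to be precomputed embeddings.

```python
import numpy as np

def bias_score(image_embs, variation_embs):
    """Hypothetical coverage-style bias proxy.

    For each text-variation embedding, take the distance to its
    nearest generated-image embedding. A large average distance
    means the images fail to cover the plausible meanings of the
    prompt, i.e., the output distribution collapsed onto a subset.
    """
    image_embs = np.asarray(image_embs, dtype=float)
    variation_embs = np.asarray(variation_embs, dtype=float)
    # Pairwise Euclidean distances: (n_variations, n_images).
    d = np.linalg.norm(variation_embs[:, None, :] - image_embs[None, :, :],
                       axis=-1)
    # Nearest image per variation, averaged over variations.
    return d.min(axis=1).mean()

# Images collapsed onto one mode score higher (more biased) than
# images spread across all four variation modes.
modes = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
collapsed = bias_score([[0.0, 0.0]] * 4, modes)
diverse = bias_score(modes, modes)
```

Under this proxy, `diverse` is 0 (every variation has an exactly matching image) while `collapsed` is positive, so ranking prompts by the score favors those whose generations ignore plausible interpretations.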
@article{cohen2025minethegap,
  title={MineTheGap: Automatic Mining of Biases in Text-to-Image Models},
  author={Cohen, Noa and Spingarn-Eliezer, Nurit and Huberman-Spiegelglas, Inbar and Michaeli, Tomer},
  journal={arXiv preprint arXiv:2512.13427},
  year={2025}
}
This webpage is based on the template made by Matan Kleiner with the help of Hila Manor.
Icons are taken from Font Awesome and Academicons.