https://arxiv.org/pdf/2302.01308.pdf
Title: Large language models predict human sensory judgments across six modalities
Authors: Raja Marjieh, Ilia Sucholutsky, Pol van Rijn, Nori Jacoby, and Thomas L. Griffiths
2366 words
Estimated read time: 7 minutes 58 seconds
Source code: https://tinyurl.com/fudaby5p
Supporting link: https://computational-audition.github.io/LLM-psychophysics/all-modalities.html
Summary: This research investigates whether large language models can predict human sensory and perceptual judgments across six modalities: pitch, loudness, colors, consonants, taste and timbre.
The researchers prompt GPT models such as GPT-3 and GPT-4 to rate the similarity of pairs of sensory stimuli in each modality. The model ratings correlate significantly with human data and recover known perceptual representations such as the color wheel and the pitch spiral.
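The evaluation described above can be sketched roughly as follows: build a text prompt for each pair of stimuli, collect the model's ratings into a similarity matrix, and correlate its off-diagonal entries with the human matrix. This is a minimal illustration, not the authors' code. The prompt wording, the function names (`similarity_prompt`, `upper_triangle`, `pearson`), and the toy matrices are all assumptions made for this example, and the call to the language model itself is omitted.

```python
from itertools import combinations
from math import sqrt


def similarity_prompt(modality, a, b):
    """Build a pairwise-similarity prompt (hypothetical wording, not the paper's)."""
    return (
        f"On a scale from 0 (completely dissimilar) to 1 (completely similar), "
        f"how similar are the two {modality} stimuli '{a}' and '{b}'? "
        "Answer with a single number."
    )


def upper_triangle(matrix):
    """Return the entries above the diagonal of a square matrix, row by row."""
    n = len(matrix)
    return [matrix[i][j] for i, j in combinations(range(n), 2)]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Toy 3x3 similarity matrices, invented for illustration only.
human = [[1.0, 0.8, 0.2], [0.8, 1.0, 0.4], [0.2, 0.4, 1.0]]
model = [[1.0, 0.7, 0.1], [0.7, 1.0, 0.5], [0.1, 0.5, 1.0]]

# Compare only the unique stimulus pairs (the diagonal is trivially 1).
r = pearson(upper_triangle(human), upper_triangle(model))
prompt = similarity_prompt("pitch", "440 Hz", "880 Hz")
```

Only the upper triangle is compared because the matrices are symmetric and the diagonal would inflate the correlation.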
The researchers also use a color naming task to show that language models can exhibit language-dependent representations, replicating cross-linguistic differences found in humans between English and Russian color naming.
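One way to picture the cross-linguistic comparison is to ask for a color term per stimulus in each language and then check which terms in one language are split into several terms in the other. The sketch below is illustrative only. It assumes colors are presented as hex codes, the prompt wording and helper names (`naming_prompt`, `group_by_term`, `split_terms`) are hypothetical, and the responses are made up rather than taken from the paper. The example reflects the well-known Russian distinction between "синий" (darker blue) and "голубой" (lighter blue).

```python
from collections import defaultdict


def naming_prompt(hex_code, language):
    """Build a color-naming prompt (hypothetical wording, not the paper's)."""
    return (
        f"In {language}, which single basic color term best describes "
        f"the color {hex_code}? Answer with one word."
    )


def group_by_term(names):
    """Map each color term to the list of hex codes that received it."""
    groups = defaultdict(list)
    for hex_code, term in names.items():
        groups[term].append(hex_code)
    return dict(groups)


def split_terms(coarse, fine):
    """Terms in `coarse` whose colors receive more than one term in `fine`."""
    result = {}
    for term, hex_codes in group_by_term(coarse).items():
        fine_terms = sorted({fine[h] for h in hex_codes})
        if len(fine_terms) > 1:
            result[term] = fine_terms
    return result


# Made-up responses for three colors, for illustration only.
english = {"#0000CD": "blue", "#87CEEB": "blue", "#228B22": "green"}
russian = {"#0000CD": "синий", "#87CEEB": "голубой", "#228B22": "зелёный"}

# English "blue" covers colors that receive two different Russian terms.
splits = split_terms(english, russian)
```

Running the comparison in both directions shows whether the finer distinction exists in only one of the two languages.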
Applicability to LLM development: This research demonstrates that large language models capture surprisingly rich perceptual and sensory information from textual data alone. This has important implications for applications involving:
• Interpreting and debugging model behavior: analyzing the representations models have learned can provide insights into potential biases and limitations.
• Multi-modal modeling: combining textual and visual or auditory data may further improve model performance for perceptual tasks.
• Cross-cultural modeling: modeling human representations across languages can help make applications more inclusive and robust.
• Prompt engineering: refining the wording of prompts can increase the sensitivity of language models to particular psychological phenomena.
In summary, understanding how language models represent perceptual concepts can inform the development of more human-like and reliable artificial intelligence.