Machine Learning - Theory | Research

"LLM responses tend to be more similar to the opinions of certain populations, such as the USA, some European and South American countries, highlighting the potential for biases."

https://twitter.com/cauchyfriend/status/1672110742127153153

When wading through Twitter you run across things that look great, and then you find there is debate. The question is: is it proper academic disagreement, or are we re-posting a nutter?

This one looks academic, thankfully.


cross-posted from: https://sh.itjust.works/post/225391

See also: the phenomenon of double descent.
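
Double descent, briefly: test error falls, then rises as the model approaches the interpolation threshold (just enough parameters to fit the training set exactly), then falls again as the model keeps growing. A minimal sketch of the effect (my own, not from the linked post), assuming random Fourier features and minimum-norm least squares:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n, noise=0.1):
    x = rng.uniform(-1, 1, size=n)
    y = np.sin(2 * np.pi * x) + noise * rng.standard_normal(n)
    return x, y

n_train, max_p = 40, 400
# Fixed random frequencies/phases, shared across all feature counts.
w = 5.0 * rng.standard_normal(max_p)
b = rng.uniform(0, 2 * np.pi, size=max_p)

def features(x, p):
    return np.cos(np.outer(x, w[:p]) + b[:p])

x_tr, y_tr = make_data(n_train)
x_te, y_te = make_data(1000)

for p in [5, 10, 20, 39, 40, 41, 80, 200, 400]:
    # lstsq returns the minimum-norm solution when p > n_train;
    # that implicit regularization drives the second descent.
    coef, *_ = np.linalg.lstsq(features(x_tr, p), y_tr, rcond=None)
    mse = np.mean((features(x_te, p) @ coef - y_te) ** 2)
    print(f"p={p:3d}  test MSE={mse:8.3f}")
```

Test MSE should spike near p = n_train = 40 and then fall again as p grows past it, which is the "double" in double descent.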


cross-posted from: https://sh.itjust.works/post/223997

A nice visualization/example of the kernel trick. A more mathematical explanation can be found here.
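
As a small numeric illustration of the trick (my own sketch, not from the linked post): for the feature map φ(x) = (x₁², √2·x₁x₂, x₂²), the inner product ⟨φ(x), φ(y)⟩ equals the polynomial kernel (x·y)², so any algorithm that only needs inner products never has to build φ explicitly:

```python
import numpy as np

def phi(x):
    # Explicit lift from R^2 to R^3.
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

def poly_kernel(x, y):
    # The same inner product, computed entirely in the original 2-D space.
    return np.dot(x, y) ** 2

rng = np.random.default_rng(0)
x, y = rng.standard_normal(2), rng.standard_normal(2)

print(np.dot(phi(x), phi(y)))  # explicit feature space
print(poly_kernel(x, y))       # kernel trick: identical value
```

This is also why two concentric circles become linearly separable: writing z = φ(x), the squared radius x₁² + x₂² equals z₁ + z₃, so a circular boundary in the input space is a hyperplane in feature space.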


https://arxiv.org/pdf/2302.01308.pdf

Title: Large language models predict human sensory judgments across six modalities

Authors: Raja Marjieh, Ilia Sucholutsky, Pol van Rijn, Nori Jacoby, and Thomas L. Griffiths

2366 words

Estimated read time: 7 minutes 58 seconds

Source code: https://tinyurl.com/fudaby5p

Supporting link: https://computational-audition.github.io/LLM-psychophysics/all-modalities.html

Summary: This research investigates whether large language models can predict human sensory and perceptual judgments across six modalities: pitch, loudness, colors, consonants, taste and timbre.

The researchers prompt language models like GPT-3 and GPT-4 to predict similarity judgments between sensory stimuli in each modality. They find that the language models are able to provide judgments that correlate significantly with human data, recovering known perceptual representations like the color wheel and pitch spiral.
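
The recipe is easy to sketch. The exact prompts live in the paper's source code linked above; the version below is a hypothetical reconstruction using the OpenAI Python client and scikit-learn's MDS, with the prompt wording, model name, and 0-to-1 rating scale all being my assumptions:

```python
from itertools import combinations

import numpy as np
from openai import OpenAI           # pip install openai
from sklearn.manifold import MDS    # pip install scikit-learn

client = OpenAI()  # reads OPENAI_API_KEY from the environment

COLORS = ["red", "orange", "yellow", "green", "blue", "purple"]

def rate_similarity(a: str, b: str) -> float:
    # Illustrative prompt, not the paper's actual wording.
    prompt = (f"On a scale from 0 (completely dissimilar) to 1 (identical), "
              f"how similar are the colors {a} and {b}? "
              f"Answer with a single number.")
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return float(resp.choices[0].message.content.strip())

# Fill a symmetric similarity matrix from pairwise judgments.
n = len(COLORS)
sim = np.eye(n)
for i, j in combinations(range(n), 2):
    sim[i, j] = sim[j, i] = rate_similarity(COLORS[i], COLORS[j])

# Embed the dissimilarities in 2-D; hue-like ratings should trace out
# an arrangement close to the familiar color wheel.
coords = MDS(n_components=2, dissimilarity="precomputed",
             random_state=0).fit_transform(1.0 - sim)
for name, (cx, cy) in zip(COLORS, coords):
    print(f"{name:>7s}: ({cx:+.2f}, {cy:+.2f})")
```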

The researchers also use a color naming task to show that language models can exhibit language-dependent representations, replicating cross-linguistic differences found in humans between English and Russian color naming.

Applicability to LLM development: This research demonstrates that large language models capture surprisingly rich perceptual and sensory information from textual data alone. This has important implications for applications involving:

• Interpreting and debugging model behavior: analyzing the representations models have learned can provide insights into potential biases and limitations.

• Multi-modal modeling: combining textual and visual or auditory data may further improve model performance for perceptual tasks.

• Cross-cultural modeling: modeling human representations across languages can help make applications more inclusive and robust.

• Prompt engineering: refining prompts can make language models more sensitive to particular psychological phenomena.

In summary, understanding how language models represent perceptual concepts can inform the development of more human-like and reliable artificial intelligence.
