Singularity


The technological singularity—or simply the singularity—is a hypothetical future point in time at which technological growth becomes uncontrollable and irreversible, resulting in unforeseeable changes to human civilization. According to the most popular version of the singularity hypothesis, I. J. Good's intelligence explosion model, an upgradable intelligent agent will eventually enter a "runaway reaction" of self-improvement cycles, each new and more intelligent generation appearing more and more rapidly, causing an "explosion" in intelligence and resulting in a powerful superintelligence that qualitatively far surpasses all human intelligence.

— Wikipedia

This is a community for discussing the theoretical and practical consequences of the singularity, or of any other machine learning innovation with the potential to disrupt our society.

You can share news, research papers, discussions and opinions. This community is mainly meant for information and discussion, so entertainment (such as memes) should generally be avoided, unless the content is thought-provoking or has some other qualities.

Rules:

Be nice to everyone, even if you disagree.
No spam. No ads.
No NSFW.
Self-promotion is acceptable if not excessive (i.e. no spam).

founded 2 years ago


Researchers from Microsoft and UC Santa Barbara Propose LONGMEM: An AI Framework that Enables LLMs to Memorize Long History (www.marktechpost.com)

submitted 2 years ago by megaman1970 to c/singularity


In this paper, authors from UCSB and Microsoft Research propose the LONGMEM framework, which lets language models cache long-form prior context or knowledge in a non-differentiable memory bank and use it through a decoupled memory module, addressing the memory staleness problem. To decouple the memory, they introduce a novel residual side network (SideNet). A frozen backbone LLM extracts the paired attention keys and values from the previous context and stores them in the memory bank. The attention query of the current input is then used in the SideNet’s memory-augmented layer to retrieve the cached keys and values of earlier contexts. The retrieved memory augmentations are fused into the hidden states via a joint attention process.
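To make the cache-then-attend idea concrete, here is a rough sketch in plain Python. It is not the authors' code: the names `MemoryBank`, `retrieve`, and `joint_attention` are made up for illustration, retrieval is a simple dot-product top-k, and the fusion step is simplified to a single softmax over local and retrieved keys together, which may differ from how the paper actually combines the two.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class MemoryBank:
    """Hypothetical fixed-size cache of (key, value) pairs.

    In the setup described above, the pairs would come from a frozen
    backbone LLM; here they are just lists of floats.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.keys = []
        self.values = []

    def add(self, keys, values):
        """Append new pairs and evict the oldest ones past capacity."""
        self.keys.extend(keys)
        self.values.extend(values)
        overflow = len(self.keys) - self.capacity
        if overflow > 0:
            del self.keys[:overflow]
            del self.values[:overflow]

    def retrieve(self, query, top_k):
        """Return the top_k cached pairs ranked by dot product with query."""
        ranked = sorted(
            range(len(self.keys)),
            key=lambda i: dot(query, self.keys[i]),
            reverse=True,
        )[:top_k]
        return [self.keys[i] for i in ranked], [self.values[i] for i in ranked]


def joint_attention(query, local_keys, local_values, mem_keys, mem_values):
    """Attend over local and retrieved memory pairs in one softmax.

    Assumption: a single shared softmax stands in for the paper's
    joint attention process.
    """
    keys = local_keys + mem_keys
    values = local_values + mem_values
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
```

A typical call would add the keys and values of an earlier segment to the bank, call `retrieve` with the current query, and pass the result to `joint_attention` alongside the current segment's own keys and values. Because the cached pairs are plain data, nothing is backpropagated through the bank, which is the sense in which the memory is non-differentiable.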

Paper:

Augmenting Language Models with Long-Term Memory
