Machine Learning - Theory | Research

https://arxiv.org/abs/2306.09200

Title: Historical Replay for Reinforcement Learning

Author: Lilian Weng

Word Count: 575 words

Average Read Time: 3 minutes

Source Code: Link

Summary: Historical replay is a technique used in reinforcement learning where the agent replays and re-learns from past experiences. This is in contrast to online learning where the agent learns sequentially from each new experience. Historical replay provides two main benefits:

  1. Breaks correlation between consecutive experiences. By sampling experiences from the past at random, the agent avoids overfitting to recent experiences. This leads to more robust learning.

  2. Allows for off-policy learning. The agent can learn from experiences generated by other behavior policies, not just the current target policy. This exposes the agent to a more diverse set of experiences, enabling better exploration.

There are two common ways to implement historical replay:

  1. Experience replay - Store experiences in a buffer and sample uniformly from the buffer. This breaks correlation and enables off-policy learning.

  2. Prioritized experience replay - Weight sampling so that important, rare experiences have a higher chance of being selected. This can accelerate learning.

Historical replay is a core technique used in modern deep reinforcement learning to achieve good performance, especially in complex environments. When combined with a target network, it enables stable deep Q-learning.
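As a rough illustration (not taken from the post itself), a minimal uniform experience replay buffer might look like the sketch below; the transition layout and buffer capacity are assumptions:

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size buffer that stores past transitions and samples them
    uniformly at random, breaking the correlation between consecutive steps."""

    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)  # oldest experiences are evicted first

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling: every stored transition is equally likely,
        # regardless of which behavior policy generated it (off-policy).
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)

buf = ReplayBuffer(capacity=100)
for t in range(50):
    buf.push(state=t, action=t % 4, reward=1.0, next_state=t + 1, done=False)

batch = buf.sample(8)
```

Prioritized replay would replace the uniform `random.sample` call with sampling weighted by each transition's priority (e.g. its TD error), which is where the accelerated learning comes from.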

This content provides a good overview of how historical replay works in the context of reinforcement learning. The main concepts around breaking correlation, enabling off-policy learning, experience replay buffers, and prioritized experience replay are clearly explained.

The content would be highly applicable to developing reinforcement learning systems using neural networks as function approximators. Experience replay is crucial for training deep Q-networks and actor-critic algorithms. The concepts would extend to future applications of large language models and GANs for reinforcement learning as well. Overall this is a helpful resource for understanding a foundational RL technique.


STAR-ML is a generalized tool that can quickly and consistently assess the quality of Machine Learning (ML) reporting in research articles. The tool allows ML-related papers to be filtered for inclusion in a systematic or scoping review by ensuring transparent, reproducible, and correct screening. It can also be used as a guideline when drafting a new manuscript to improve the quality of reported ML techniques.

We are currently conducting a study to validate STAR-ML in a larger population and are looking for individuals with varying degrees of experience.

To be able to participate in the study:

  • You should be 18 years or older
  • Have a basic idea of ML
  • Have a Bachelor's degree or higher, or be currently enrolled at an undergraduate degree-granting institution, and
  • Be able to read and understand English

Please note that participation is voluntary. This web-based survey is anonymous and confidential. No one will be able to match identities to individual survey responses.

The survey will take approximately 50 to 60 minutes to complete. To thank you for your time and acknowledge your contribution to this collective effort, you can enter a draw for one of ten $50 gift cards at the end of the study.

Note: I am not personally involved with this study but am trying to help them get more participants


https://arxiv.org/ftp/arxiv/papers/2306/2306.07377.pdf

Title: Lost in Translation: Large Language Models in Non-English Content Analysis

Authors: Gabriel Nicholas and Aliya Bhatia

Word Count: Approximately 4,000

Estimated Read Time: 14-16 minutes

Summary:

The paper discusses the potential limitations of using large language models (LLMs), specifically multilingual language models, for analyzing non-English language content online. It provides an overview of how LLMs work, especially multilingual models that are trained on data from multiple languages. The authors note that LLMs tend to be trained mostly on English text and perform inconsistently across languages. They identify several challenges with using LLMs for non-English content analysis:

  1. Reliance on machine-translated text, which introduces errors

  2. Problems are difficult to identify and fix due to unintuitive cross-language connections

  3. Performance varies widely across languages

  4. Failure to account for local language contexts

The paper provides recommendations for companies, researchers, and governments on improving the use of multilingual LLMs. This includes making models more transparent, deploying them with caution, and investing in building capacity in low-resource languages.

Overall, the paper argues that while multilingual LLMs show promise, their current limitations pose risks, especially when deployed for high-stakes tasks like content moderation. More research and improved data are needed to enable equitable use of LLMs across languages.

For applications development, the paper suggests that multilingual LLMs should be used with caution for content analysis tasks. Due to the limitations noted, they are unlikely to be effective as stand-alone models for complex tasks like sentiment analysis or hate speech detection. Domain-specific training and human oversight would likely be needed. However, the more general representations learned by LLMs could potentially be incorporated into hybrid models for specific domains and languages.


https://arxiv.org/pdf/2306.05524.pdf

Title: Check Me If You Can: Detecting ChatGPT-Generated Academic Writing using CheckGPT

Authors: Zeyan Liu, Zijun Yao, Fengjun Li, Bo Luo

Word Count: Approximately 7,600

Estimated Read Time: 26-28 minutes

Github: https://github.com/progressionnetwork/CheckGPT_RestAPI

Summary:

The paper aims to investigate the use and misuse of ChatGPT in academic writing as well as the difficulty of detecting ChatGPT-generated text.

First, the authors collect a dataset of 600,000 human-written and ChatGPT-generated research paper abstracts in three disciplines. They identify three scenarios for ChatGPT usage: writing from scratch, completing partial text, and polishing existing text.

Second, the authors evaluate state-of-the-art detectors on the dataset and find that they provide unsatisfactory results, especially for polished text. A user study with 150+ participants shows that humans, including experienced researchers, are unable to accurately identify ChatGPT-generated abstracts.

The authors then propose CheckGPT, a novel detector that uses a pre-trained transformer model for representation and an attentive LSTM for classification. It achieves high accuracy (>98%) on the dataset and demonstrates transferability to new domains and models.

The key strengths of CheckGPT are:

  1. Affordability: It reuses pre-trained transformer models and requires less computation to deploy compared to fine-tuning the full transformer.

  2. Transferability: By learning generalized features, CheckGPT can be quickly adapted to new domains and tasks with minimum data.

  3. Interpretability: The authors conduct analyses to reveal how CheckGPT detects ChatGPT-generated writing.
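The paper's exact architecture is not reproduced here, but the "attentive" part of such a classifier can be illustrated with a hedged sketch: attention pooling over frozen per-token representations, followed by a binary human/AI decision. Random vectors stand in for the pre-trained transformer features, and the weights `w_att` and `w_out` are made up:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attentive_pool_classify(token_reprs, w_att, w_out):
    """Score each token, softmax into attention weights, pool, classify.

    token_reprs: (seq_len, dim) frozen representations from a pre-trained model
    w_att:       (dim,) attention scoring vector
    w_out:       (dim,) output weights for the binary human/AI decision
    Returns (p_generated, attention_weights).
    """
    scores = token_reprs @ w_att      # one relevance score per token
    alpha = softmax(scores)           # attention weights sum to 1
    pooled = alpha @ token_reprs      # attention-weighted average of token vectors
    logit = pooled @ w_out
    p = 1.0 / (1.0 + np.exp(-logit))  # sigmoid -> P(ChatGPT-generated)
    return p, alpha

dim = 16
tokens = rng.normal(size=(20, dim))   # stand-in for transformer token features
p, alpha = attentive_pool_classify(tokens, rng.normal(size=dim), rng.normal(size=dim))
```

Because the transformer stays frozen, only the small pooling/classification head needs training, which is the source of the affordability claim above.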

In summary, CheckGPT provides an effective solution to the challenge of detecting ChatGPT-generated academic writing. The dataset, code and tool will be shared publicly for further research.

CheckGPT can be used to help monitor and enforce policies on the use of AI tools in academic publications. The detailed investigation and insights in this study can inspire future research on combating the misuse of ChatGPT and similar large language models.


Inference-Time Intervention: Eliciting Truthful Answers from a Language Model

https://arxiv.org/pdf/2306.03341.pdf

Authors: Kenneth Li, Oam Patel, Fernanda Viégas, Hanspeter Pfister, Martin Wattenberg, Harvard University

Word Count: Approximately 3000 words

Estimated Read Time: Around 10-12 minutes

Code Repo: https://github.com/likenneth/honest_llama

The paper proposes a technique called Inference-Time Intervention (ITI) to enhance the truthfulness of large language models (LLMs). The idea is to shift model activations during inference along directions that are known to produce truthful answers.

The authors experiment with the LLaMA model on the TruthfulQA benchmark, which tests for truthful behavior. They find that ITI significantly improves LLaMA's performance, increasing its true*informative score from 32.5% to 65.1%.

ITI contrasts with existing approaches like RLHF that require huge resources. ITI is computationally inexpensive and data efficient, requiring only a few hundred examples to locate truthful directions.

However, the authors note that ITI by itself is not sufficient to ensure truthful answers from LLMs; with additional testing and development, it could become part of a more comprehensive approach.

In summary, ITI shows promise as a minimally invasive technique for improving the truthfulness of LLMs. The results suggest that LLMs may have an internal representation of the likelihood of something being true, even if they produce falsehoods on the surface.
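A toy sketch of the core idea (not the authors' implementation): estimate a "truthful" direction as the difference of mean activations on truthful versus untruthful examples, then shift activations along it at inference time. The synthetic activations and the intervention strength `alpha` are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

def truthful_direction(truthful_acts, untruthful_acts):
    """Mass-mean direction: difference of the two class means, normalized."""
    d = truthful_acts.mean(axis=0) - untruthful_acts.mean(axis=0)
    return d / np.linalg.norm(d)

def intervene(activation, direction, alpha=5.0):
    """Shift an activation along the truthful direction at inference time."""
    return activation + alpha * direction

dim = 32
truthful = rng.normal(loc=1.0, size=(200, dim))     # toy 'truthful' activations
untruthful = rng.normal(loc=-1.0, size=(200, dim))  # toy 'untruthful' activations
d = truthful_direction(truthful, untruthful)

x = rng.normal(size=dim)          # an activation seen at inference time
x_shifted = intervene(x, d, alpha=5.0)
```

Note how little is required: a few hundred labeled examples to estimate `d`, and one vector addition per forward pass, which matches the data-efficiency claim above.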

In terms of applicability, ITI could potentially be used as one component in developing applications based on LLMs or GANs that require truthful or fact-checked responses. However, more research is needed to better understand ITI's limitations and trade-offs.


Title: AdANNS: A Framework for Adaptive Semantic Search

https://arxiv.org/pdf/2305.19435.pdf

https://twitter.com/adityakusupati/status/1668295320445517824

Authors: Aniket Rege, Aditya Kusupati, Sharan Ranjit S, Alan Fan, Qingqing Cao, Sham Kakade, Prateek Jain and Ali Farhadi

Word Count: Approximately 5690

Estimated Read Time: 18-20 minutes

Source Code: The source code is available at https://github.com/RAIVNLab/AdANNS

Summary: The paper proposes AdANNS, a framework that leverages adaptive representations to improve the accuracy-compute tradeoff for Approximate Nearest Neighbor Search (ANNS) systems. Traditional ANNS systems use rigid representations at each stage of construction and inference, which can be sub-optimal.

AdANNS utilizes Matryoshka Representations (MRs) which have nested representations of varying dimensionalities. This allows AdANNS to use lower-dimensional representations for clustering and quantization to optimize accuracy and compute, while using higher-dimensional representations for precise re-ranking when feasible.

AdANNS improves existing ANNS building blocks like inverted file index (AdANNS-IVF) and quantization (AdANNS-OPQ). It also combines the two to create better composite ANNS indices (AdANNS-IVFOPQ and AdANNS-DiskANN) that achieve higher accuracy with lower compute cost compared to baselines.

Overall, AdANNS shows gains of up to 1.5% in accuracy for the same compute cost over existing techniques, and matches accuracy while being up to 90x faster in deployment. It also generalizes across search structures, modalities and encoders.
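As a hedged toy sketch of the adaptive idea (not the AdANNS implementation itself): shortlist candidates using a cheap low-dimensional prefix of each embedding, then re-rank the shortlist with the full vectors. With true Matryoshka Representations the prefix is trained to be meaningful on its own, which the plain random vectors below only approximate:

```python
import numpy as np

rng = np.random.default_rng(2)

def adaptive_search(database, query, d_low=8, shortlist=20, k=5):
    """Two-stage search: cheap low-dim filtering, exact high-dim re-ranking."""
    # Stage 1: distances using only the first d_low coordinates (cheap).
    coarse = np.linalg.norm(database[:, :d_low] - query[:d_low], axis=1)
    candidates = np.argsort(coarse)[:shortlist]
    # Stage 2: exact distances on the full vectors, for the shortlist only.
    fine = np.linalg.norm(database[candidates] - query, axis=1)
    return candidates[np.argsort(fine)[:k]]

n, dim = 1000, 64
db = rng.normal(size=(n, dim))
q = db[42] + 0.01 * rng.normal(size=dim)   # query very close to item 42
top = adaptive_search(db, q)
```

Stage 1 touches only `d_low / dim` of each vector, which is where the compute savings come from; stage 2 restores accuracy by re-ranking a small shortlist at full precision.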

Applicability to Large Language Models: AdANNS's techniques of using adaptive representations to optimize accuracy and compute could potentially be applied to Large Language Models. Some applications could be:

  • Using lower-dimensional word/token embeddings for clustering or quantization during training, and higher-dimensional embeddings for precise fine-tuning or inference.
  • Dynamically switching between dimensionalities during inference based on available compute, trading off latency against accuracy.
  • Improving the efficiency of nearest neighbor searches within the model, which is useful for tasks like entity linking and knowledge retrieval.

Overall, the techniques presented in the paper are likely to be broadly applicable across machine learning models that rely on accurate but efficient semantic search and retrieval.


Title: Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

https://arxiv.org/pdf/2306.02858.pdf

Authors: Hang Zhang, Xin Li and Lidong Bing from DAMO Academy, Alibaba Group

Word Count: Approximately 2200

Read Time: Around 5-7 minutes

Source Code: The authors have open-sourced the entire codebase for pre-training and fine-tuning as well as the model weights at https://github.com/DAMO-NLP-SG/Video-LLaMA

Video-LLaMA is an audio-visual language model that aims to empower large language models with the ability to understand visual and auditory content in videos. It has two branches:

Vision-Language Branch: Uses a pre-trained image encoder for video frames and a Video Q-Former to generate visual query tokens that are compatible with the LLM's text embeddings.

Audio-Language Branch: Uses a pre-trained ImageBind audio encoder and an Audio Q-Former to generate audio query tokens that align with the LLM's embeddings.
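Both branches follow the same pattern: a frozen encoder produces per-frame (or per-segment) features, and a small query module compresses them into a fixed number of tokens in the LLM's embedding space. A stripped-down numpy sketch of that compression step, single-head cross-attention with learnable queries; all dimensions and weights here are made up, and the real Q-Former is considerably richer:

```python
import numpy as np

rng = np.random.default_rng(3)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def compress_to_query_tokens(frame_feats, queries, w_proj):
    """Cross-attend learnable queries over frame features, then project
    into the LLM's embedding dimension (Q-Former-like, heavily simplified)."""
    scale = np.sqrt(queries.shape[1])
    attn = softmax(queries @ frame_feats.T / scale, axis=-1)  # (n_queries, n_frames)
    attended = attn @ frame_feats                             # (n_queries, feat_dim)
    return attended @ w_proj                                  # (n_queries, llm_dim)

n_frames, feat_dim, n_queries, llm_dim = 8, 32, 4, 64
frames = rng.normal(size=(n_frames, feat_dim))     # frozen encoder output per frame
queries = rng.normal(size=(n_queries, feat_dim))   # learnable query tokens
w_proj = rng.normal(size=(feat_dim, llm_dim))      # projection into LLM space
soft_tokens = compress_to_query_tokens(frames, queries, w_proj)
```

The fixed number of output tokens is what makes the design modular: however long the video, the LLM always receives the same small number of soft prompt tokens per branch.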

Video-LLaMA is trained in a multi-branch fashion:

The vision components are first pre-trained on video caption datasets to learn video-text correspondence.

They are then fine-tuned on instruction-following datasets to gain vision-instruction ability.

The audio components leverage the shared embedding space from ImageBind and are trained on visual data due to the scarcity of audio-text training data.

The model demonstrates the ability to perceive and comprehend video content, generating meaningful responses grounded in visual and audio information.

In summary, Video-LLaMA shows potential as a prototype for audio-visual AI assistants, though it has limitations such as restricted perception capacity and difficulty handling long videos.

This model demonstrates how large language models can be extended with multimodal capabilities through a modular approach, leveraging pre-trained vision and audio encoders. With further improvements, such text-to-video understanding models could enable various applications like video summarization, visual dialogue systems, etc.
