this post was submitted on 22 Aug 2023



Abstract

The robustness of legged locomotion is crucial for quadrupedal robots in challenging terrains. Recently, Reinforcement Learning (RL) has shown promising results in legged locomotion, and various methods integrate privileged distillation, scene modeling, and external sensors to improve the generalization and robustness of locomotion policies. However, these methods struggle to handle uncertain scenarios such as abrupt terrain changes or unexpected external forces. In this paper, we consider a novel risk-sensitive perspective to enhance the robustness of legged locomotion. Specifically, we employ a distributional value function learned by quantile regression to model the aleatoric uncertainty of environments, and perform risk-averse policy learning by optimizing the worst-case scenarios via a risk distortion measure. Extensive experiments in both simulation environments and on a real Aliengo robot demonstrate that our method is efficient in handling various external disturbances, and that the resulting policy exhibits improved robustness in harsh and uncertain legged-locomotion situations.
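To make the two ingredients named in the abstract more concrete, here is a minimal NumPy sketch of a quantile regression (pinball) loss and of a risk-averse value computed from the learned quantiles. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not say which risk distortion measure is used, so CVaR is swapped in here as one common choice, and the function names `quantile_midpoints`, `pinball_loss`, and `cvar_from_quantiles` are hypothetical.

```python
import numpy as np


def quantile_midpoints(n):
    """Quantile levels tau_i = (i + 0.5) / n for i = 0..n-1 (an assumed, common choice)."""
    return (np.arange(n) + 0.5) / n


def pinball_loss(pred_quantiles, target_samples, taus):
    """Quantile regression loss averaged over all (quantile, target) pairs.

    pred_quantiles: shape (N,), predicted return at each quantile level.
    target_samples: shape (M,), sampled target returns.
    taus:           shape (N,), quantile levels in (0, 1).
    """
    pred_quantiles = np.asarray(pred_quantiles, dtype=float)
    target_samples = np.asarray(target_samples, dtype=float)
    taus = np.asarray(taus, dtype=float)
    # diff[i, j] = target_j - prediction_i
    diff = target_samples[None, :] - pred_quantiles[:, None]
    # Over-estimates (diff < 0) are weighted by (1 - tau), under-estimates by tau.
    weight = np.where(diff < 0, 1.0 - taus[:, None], taus[:, None])
    return float(np.mean(weight * np.abs(diff)))


def cvar_from_quantiles(pred_quantiles, alpha):
    """CVaR at level alpha: the mean of the lowest alpha-fraction of quantile values.

    Assumption: CVaR stands in for the unspecified risk distortion measure.
    alpha = 1.0 recovers the ordinary (risk-neutral) mean.
    """
    q = np.sort(np.asarray(pred_quantiles, dtype=float))
    k = max(1, int(np.ceil(alpha * len(q))))
    return float(np.mean(q[:k]))


# Usage: four quantile estimates of the return distribution for one state.
quantiles = np.array([1.0, 2.0, 3.0, 4.0])
risk_neutral_value = cvar_from_quantiles(quantiles, alpha=1.0)  # mean of all quantiles
risk_averse_value = cvar_from_quantiles(quantiles, alpha=0.5)   # mean of the worst half
```

A risk-averse policy update would then use the lower-tail value in place of the mean when estimating advantages, which is what makes the policy weigh bad outcomes more heavily.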
