inspxtr

joined 1 year ago
[–] inspxtr 6 points 1 year ago* (last edited 1 year ago)

inspired by iasip: pay the troll toll so u don’t get ur soul sucked, eat from its big bowl for free u might get ur hole ****

[–] inspxtr 2 points 1 year ago

This is actually an interesting question. First thing to note is that any estimate counts accounts, not actual people (one person can have multiple alts on both platforms). Honestly I don’t think it’s possible to get a meaningful estimate.

That said, I think the first task is to figure out if we can estimate the number of accounts deleted on Reddit from the controversial period (let’s say April, when the API change was starting) up until now.

I’m not aware of any public daily data on this from Reddit, but there have been attempts at archiving Reddit during this time, and of course before. So one can theoretically use the archives to find “all” existing users, then check their profile links now via browser (or curl) to see if they still exist, and treat that as a good-enough proxy for deleted accounts.
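A rough sketch of that check, in Python rather than curl. Everything specific here is an assumption for illustration: the profile URL pattern, the idea that a 404 means the account is gone, and the names `http_status`, `classify` and `check_accounts`. Suspended or shadowbanned accounts may respond differently, and unauthenticated requests may be rate-limited or blocked.

```python
import urllib.error
import urllib.request

# Assumed URL pattern for a public profile; adjust to whatever the archive links use.
PROFILE_URL = "https://www.reddit.com/user/{name}/about.json"


def http_status(url):
    """Return the HTTP status code for a GET to url, or None on a network failure."""
    req = urllib.request.Request(url, headers={"User-Agent": "account-check-sketch/0.1"})
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses arrive as exceptions; the code is what we want.
        return err.code
    except urllib.error.URLError:
        return None


def classify(status):
    """Map a status code to a coarse label (assumption: 404 means the account is gone)."""
    if status == 200:
        return "exists"
    if status == 404:
        return "missing"
    return "unknown"  # e.g. 403, 429 or no response: draw no conclusion


def check_accounts(usernames, fetch=http_status):
    """Label each archived username; fetch is injectable so this can be tried offline."""
    return {name: classify(fetch(PROFILE_URL.format(name=name))) for name in usernames}
```

Keeping an explicit “unknown” bucket matters here: counting every non-200 response as a deletion would inflate the estimate whenever requests get throttled.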

One may get an estimate of when they were deleted by checking the links in the archives, if possible. If not, there’s also the Wayback Machine, which we could use to get a sense, though it has its limitations.

Lemmy tracks account registration daily, I believe. I don’t know what stats one needs to run but maybe if we can line up the time series of account creation on Lemmy and account deletion on Reddit, we might have some sense of what a lower bound is for those who jumped ship forever.
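One candidate for “what stats to run” is a lagged correlation between the two daily series. This is only a sketch under assumptions: `deletions` and `registrations` are hypothetical lists of daily counts starting on the same date, and `pearson` and `best_lag` are names made up here. A high correlation would only suggest the two series move together; it would not by itself give a count or a lower bound of people who switched.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; None if either is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / math.sqrt(var_x * var_y)


def best_lag(deletions, registrations, max_lag=7):
    """Return (lag, r) for the lag in days where registrations best track earlier deletions."""
    n = min(len(deletions), len(registrations))
    best = None
    for lag in range(max_lag + 1):
        # Pair deletions on day t with registrations on day t + lag.
        xs = deletions[: n - lag]
        ys = registrations[lag:n]
        if len(xs) < 3:
            continue  # too few overlapping days to say anything
        r = pearson(xs, ys)
        if r is not None and (best is None or r > best[1]):
            best = (lag, r)
    return best
```

With real data one would probably want to detrend both series first, since both platforms have their own weekly cycles and news-driven spikes.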

[–] inspxtr 2 points 1 year ago

Let me see if I get your point. Are you saying most questions on Lemmy ask for opinions, which makes them look like they’re being asked in order to collect training data for AI models?

If so, I’m not entirely sure I agree. There’s tons of info online about any given topic, which can be very overwhelming. Maybe that causes people to seek out personal experience and opinions from others on such topics, rather than just cold, hard facts.

It may also depend on which communities the questions you’re sampling were asked in.

[–] inspxtr 12 points 1 year ago (2 children)

how does this work with universities and companies that use GMail/Outlook for their emails?

[–] inspxtr 1 points 1 year ago (1 children)

thanks for your answer! Is this the same as, or different from, indexing to provide context? I saw some people ingesting large corpora of documents/structured data, like with LlamaIndex. Is that an alternative way to provide context, or something similar?

[–] inspxtr 2 points 1 year ago (3 children)

I know nothing about “in context learning” or legal stuff, but intuitively, don’t legal documents tend to reference each other, especially the more complicated ones? If so, how would you apply in context learning if you’re not aware which ones may be relevant?

[–] inspxtr 1 points 1 year ago* (last edited 1 year ago) (1 children)

When you find one or successfully train one, I’d love to know as well. Maybe you can crosspost this on?

I saw this dataset on HuggingFace, does it fit your use case? https://huggingface.co/datasets/lexlms/lex_files

[–] inspxtr 3 points 1 year ago

similarly to people from different races/countries … it’s not only that their conditions might vary and require more data, it is also that some communities don’t visit/trust hospitals to even have their data collected to be in the training set. Or they can’t afford to visit.

Sometimes, people from more vulnerable communities (eg LGBT) might prefer not to have such data collected in the first place, making data sparser.

[–] inspxtr 1 points 1 year ago

Does defining in a “loop” work in C? Like

#define main A
#define A B
#define B C
…
#define Z main

[–] inspxtr 2 points 1 year ago

great! thanks for the tip! looks like I got a new series to follow.

[–] inspxtr 3 points 1 year ago (3 children)

can I skip season 1? if not, which episodes of s1 should I watch before I begin season 2?

[–] inspxtr 4 points 1 year ago

Not related to warp, but just out of curiosity, which protocols have you tried? At one or two universities I visited, I had to switch to TCP instead of UDP for it to work. Not sure why.
