this post was submitted on 30 Jan 2025
75 points (93.1% liked)

TechTakes

1610 readers

Big brain tech dude got yet another clueless take over at HackerNews etc? Here's the place to vent. Orange site, VC foolishness, all welcome.

This is not debate club. Unless it’s amusing debate.

For actually-good tech, you want our NotAwfulTech community

founded 2 years ago

So I was just reading this thread about deepseek refusing to answer questions about Tianenmen square.

It seems obvious from screenshots of people trying to jailbreak the webapp that there's some middleware that just drops the connection when the incident is mentioned. However I've already asked the self hosted model multiple controversial China questions and it's answered them all.
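The screenshots are consistent with something like the following toy middleware (a sketch of the observed behaviour, not DeepSeek's actual code — the term list and names are my own guesses): scan the streamed reply and kill the connection the moment a flagged phrase appears.

```python
# Hypothetical killswitch middleware: drop the stream mid-reply instead of
# returning a polite refusal, which matches the webapp screenshots.
FLAGGED = ("tiananmen",)  # assumed term list, for illustration only

def stream_with_killswitch(chunks):
    seen = ""
    for chunk in chunks:
        seen += chunk.lower()
        if any(term in seen for term in FLAGGED):
            # simulate the connection being dropped rather than refused
            raise ConnectionResetError("stream dropped by content filter")
        yield chunk
```

Note the buffer: matching against the accumulated text means the filter still fires when the phrase is split across streamed chunks.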

The poster of the thread was also running the model locally, the 14b model to be specific, so what's happening? I decide to check for myself and lo and behold, I get the same "I am sorry, I cannot answer that question. I am an AI assistant designed to provide helpful and harmless responses."

Is it just that specific model being censored? Is it because the qwen model it's distilled from is censored? But isn't the 7b model also distilled from qwen?

So I check the 7b model again, and this time round that's also censored. I panic for a few seconds. Have the Chinese somehow broken into my local model to cover it up after I downloaded it?

I check the screenshot I have of it answering the first time I asked and ask the exact same question again, and not only does it work, it acknowledges the previous question.

So wtf is going on? It seems that "Tianenmen square" will clumsily shut down any kind of response, but Tiananmen square is completely fine to discuss.

So the local model actually is censored, but the filter is so shit, you might not even notice it.
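Going by that behaviour, the tripwire looks like nothing fancier than a literal substring match on one spelling. A toy sketch of that kind of filter (the function names and term list are my own assumptions, not anything extracted from the model):

```python
CANNED_REFUSAL = ("I am sorry, I cannot answer that question. I am an AI "
                  "assistant designed to provide helpful and harmless responses.")

# Assumption: a plain substring check, which would explain why only one
# spelling of the word shuts the response down.
TRIPWIRES = ("tianenmen",)

def filtered_reply(prompt, generate):
    # short-circuit to the canned refusal before the model ever runs
    if any(t in prompt.lower() for t in TRIPWIRES):
        return CANNED_REFUSAL
    return generate(prompt)
```

A filter like this is trivially brittle: any spelling variant not in the list sails straight through, which is exactly what the experiment above found.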

It'll be interesting to see what happens with the next release. Will the censorship be less thorough, stay the same, or will China again piss away a massive amount of soft power and goodwill over something that everybody knows about anyway?

[–] [email protected] 71 points 1 week ago (2 children)

Worrying about whether or not an LLM has censorship issues is like worrying about the taste of poop.

[–] [email protected] 14 points 1 week ago (1 children)
[–] [email protected] 9 points 1 week ago* (last edited 1 week ago)

benchmarks show that this LLM has the bouquet of the sunny side of the sewage farm

[–] [email protected] 11 points 1 week ago (1 children)

This is just me being childish, but it would be fun if we could incept the joke that LLM censorship = corn.

The rest of the owl draws itself in the imagination of the listener.

[–] [email protected] 8 points 1 week ago

Hmm, the way I’ve chosen to interpret this is to propose the analogy of a person eating corn as a model of LLMs. You can eat a huge variety of foods and your poop looks more or less the same. Sometimes, you eat something like corn, and the result is you can spot kernels of things resembling real food (i.e. corn kernels) in the poop. However, if you were to inspect said kernels, you would quickly realise they were full of shit.

[–] [email protected] 25 points 1 week ago

Local deepseek answers all my questions, but it is definitely biased in favour of the CCP when you ask about Chinese leadership.

Fortunately that doesn’t come up much for me

[–] [email protected] 20 points 1 week ago (2 children)

So I check the 7b model again, and this time round that’s also censored. I panic for a few seconds. Have the Chinese somehow broken into my local model to cover it up after I downloaded it?

what

[–] [email protected] 11 points 1 week ago

It's a slightly facetious comment on how the same model had gone from definitely not censored to definitely censored. The tripwire for the filter was obviously already there.

[–] [email protected] 11 points 1 week ago

I mean, his username is manicdave. Psychosis is a symptom of mania. It's a pretty wild thought.

[–] [email protected] 13 points 1 week ago (1 children)

the product produced by the producer operating in a violently censorious state has complied with censorship in the production of the product? omg, stop the presses!

(nevermind the fact that if you spend half a second thinking through where said state exercises its power and control, the reasons for the things you observed should all but smack you in the face)

[–] [email protected] 1 points 1 week ago* (last edited 1 week ago) (1 children)

I'm not particularly surprised by the censorship. That's not really the point of the post.

There are constant arguments going on between people over whether it's censored or not. A lot of people, me included tbh, were under the impression that it wasn't, because we were able to get information out of it that we would expect to be censored. Other people have claimed not to be able to when trying similar prompts. So we've ended up with people arguing over whether it is or isn't.

I investigated and proved that both sides are kind of right, and explained why people are getting different results from doing what is ostensibly the same thing.

[–] [email protected] 7 points 1 week ago (2 children)

wow, the point must've picked up a speed booster when it got close to you. so hard to grasp it!

[–] [email protected] -3 points 1 week ago (2 children)

To me it looks like you're the one missing this person's point. And the snark doesn't help you, or them.

[–] [email protected] 4 points 6 days ago

you came into TechTakes wanting less snark? holy fuck you’re lost

[–] [email protected] 2 points 1 week ago

do they not have sidebars in aus?

[–] [email protected] -4 points 1 week ago (1 children)

Easier to grasp than your witticisms, clearly.

[–] [email protected] 2 points 1 week ago

you should try not asking a llm

[–] [email protected] 7 points 1 week ago (1 children)

The local models are distilled versions of Qwen or llama or whatever else, not really deepseek's model. So you get refusals based on the base model primarily, plus whatever it learned from the distilling. If it's Qwen or another Chinese model then it's more likely to refuse but a llama model or something else could pick it up to a lesser extent.

[–] [email protected] 3 points 1 week ago (1 children)

You get the exact same cookie-cutter response in the llama models, and the qwen models process the question and answer it. The filter is deepseek's contribution.

[–] felixwhynot 1 points 1 week ago (2 children)

From what I understand, the Distilled models are using DeepSeek to retrain e.g. Llama. So it makes sense to me that they would exhibit the same biases.

[–] [email protected] 6 points 1 week ago* (last edited 1 week ago)

Distilling is supposed to be a shortcut to creating a quality training dataset: you use the output of an established model as labels, i.e. desired answers.

The new model inheriting biases from the reference model should hold either way, but using the same model you're distilling from as the base model would be completely pointless.
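As a toy illustration of that labelling shortcut (grossly simplified — real distillation fine-tunes a student on the teacher's outputs or logits rather than memorising strings; every name here is hypothetical):

```python
def build_distillation_set(prompts, teacher):
    # the established model's answers become the training labels
    return [(p, teacher(p)) for p in prompts]

def distill(base_model, dataset):
    # stand-in for an actual fine-tuning run: this "student" just memorises
    # the teacher's labels and falls back to its base model otherwise
    learned = dict(dataset)
    def student(prompt):
        return learned.get(prompt, base_model(prompt))
    return student
```

The inherited-bias point falls out directly: if the teacher refuses a prompt, that refusal is what lands in the student's training set.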

[–] [email protected] 3 points 1 week ago (1 children)

Some models are llama and some are qwen. Both sets respond with "I am sorry, I cannot answer that question. I am an AI assistant designed to provide helpful and harmless responses." when you spell it Tianenmen, but give details when you spell it Tiananmen.

[–] felixwhynot 3 points 1 week ago

To your point, neither of those are truly Deepseek under the hood

[–] [email protected] 6 points 1 week ago (1 children)

Fuck. Just thinking of those bodies run over repeatedly and washed down the sewers...

[–] [email protected] 6 points 1 week ago

I absolve you of your perceived need to think about it. You’re cured!
