SillyTavern Unofficial


This community is for users of SillyTavern and fans of locally hosted AI.

founded 6 months ago
Or basically some challenge or other. At the moment the AI just seems to go along with whatever I say and doesn't provide any resistance at all.

Is there anything I can do to improve my instruct or character card?

Using MythoMax L2 13B on SillyTavern via KoboldCPP.


What I'm using

Model: MythoMax 13B (get it from TheBloke @ Huggingface)
Instruct Mode: Yes

What I did

Please note that I usually use Kingbri's Minimalist guide when putting together characters. The setup below should more or less generate a believable character that you can then customize to your needs.

Advanced Definitions: Personality Summary

[{{char}}'s persona: character designer, expert, creates believable/complicated/detailed characters; {{char}}'s background: she is a professional character designer and has been developing characters for games, role-plays, comics, novels for years, she has won several awards for her creativity and detail for her character designs]

Advanced Definitions: Examples of dialogue

<START>
Character Name: 
Persona: 
Personality:
Body:
Appearance:
Likes:
Dislikes:
Loves:
Hates:
Kinks:
Family:
Background:
Goals:

You can add more fields, and when the AI generates results you really like, you can add them back in as examples; the more the better, especially once everything is formatted the way you like. Depending on your context limit, three really good examples should be adequate.

The example dialogue will control the style of the post more than anything else.

Advanced Formatting: Instruct Prompt

Head over to the big A for Advanced Formatting and go down to Instruct Mode. Save your current preset under a new name using Save Preset As; this is your backup to come back to, so use whatever name you like. I used Roleplay - Normal, but if you have different instructs for each card, why not name it after the character instead?

Now save it again under another name. I call this one Roleplay - Character Creation, but again, it's up to you. Ensure the following settings:

Instruct Mode is 'Enabled'

Bind to Context is 'Enabled'.

Wrap Sequences with Newline is 'Enabled'

Replace Macro in Sequences is 'Enabled'

Include Names is 'Enabled'

Force for Groups and Personas is 'Enabled'

System Prompt

This gives the AI a series of rules to follow:

[Generate one character, be thorough, detailed, creative, believable and verbose; Rules: character can be good or evil, character can have extreme kinks, character can be introvert/extrovert, character can be socially awkward, character can have unconventional relationships such as multiple partners or secret affairs, character relationships are to be described using 'Name (Gender) Type of Relationship', if the character has a family then family members are to be described using 'Name (Gender) Familial relation', Likes/Dislikes and Loves/Hates must not overlap, Likes/Dislikes and Loves/Hates must not be too similar, please write keywords separated by commas, the character name should be fitting to genre and ethnicity, provide height and weight using metric measurements;]

Pro-tips

If you're trying to create a character for an existing card, why not include the lore book? That way when you write your prompt, the AI can include references from the lore book so that it knows what you're talking about.

You can just ask the AI to generate a character, but you can also prompt it: if you have a name in mind, you can include that, and you can also include the genres of the role-play. I like to include a little of the background and the role I expect the character to play.

submitted 6 months ago* (last edited 6 months ago) by ErrantRoleplayer to c/sillytavernai

I've been playing around with MythoMax for some time, and for a 13B it's arguably one of the better options for role-playing. I won't say it's the best, because my experience isn't that deep, but I have messed around with the settings considerably to get something that seems consistent and doesn't generate junk.

Firstly, you'll want to set your token padding to 100; this is basically the bare minimum, or MythoMax will generate junk in short order as soon as you have filled up your context.

I set my response length to 500 tokens, but my overall context is 6144; keep that in mind. You may want to reduce your response tokens, since they're taken directly out of your prompt limit, so you won't have as much backstory going into the prompt (arguably a bad thing, as I can't seem to get it to produce more than around 400 tokens anyway).
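To make the trade-off concrete, here's a rough sketch of the arithmetic, using the numbers above and the 100-token padding already mentioned (exact overhead varies by backend):

```python
# Rough prompt-budget arithmetic: response tokens and padding are
# reserved out of the total context, so everything else (system
# prompt, character card, chat history) has to fit in what remains.
max_context = 6144      # total context window
response_tokens = 500   # tokens reserved for the reply
token_padding = 100     # padding to avoid junk at full context

prompt_budget = max_context - response_tokens - token_padding
print(prompt_budget)  # 5544 tokens left for card + history
```

Dropping the response length to 400 would buy you another 100 tokens of backstory per prompt.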

Settings

Temperature: 0.74
Top P: 0.66
Top K: 90
Top A: 0
Typical P: 1
Min P: 0
Tail Free Sampling: 1
Repetition Penalty: 1.1
Repetition Penalty Range: 400
Frequency Penalty: 0.07
Presence Penalty: 0
Dynamic Temperature: Off
Mirostat Mode: 2
Mirostat Tau: 5
Mirostat Eta: 0.1
Ban EOS Token: Disabled
Skip Special Tokens: Disabled
CFG: 1.2

Sampling Order:

1. Repetition Penalty
2. Top K
3. Top A
4. Tail Free Sampling
5. Typical P
6. Top P & Min P
7. Temperature
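That order corresponds to the numeric IDs in the `sampler_order` array of the JSON preset below. A quick sketch of the mapping, assuming the standard KoboldAI sampler numbering:

```python
# Standard KoboldAI/KoboldCpp sampler IDs (assumed numbering).
SAMPLER_NAMES = {
    0: "Top K",
    1: "Top A",
    2: "Top P & Min P",
    3: "Tail Free Sampling",
    4: "Typical P",
    5: "Temperature",
    6: "Repetition Penalty",
}

sampler_order = [6, 0, 1, 3, 4, 2, 5]  # from the JSON preset below
print(" -> ".join(SAMPLER_NAMES[i] for i in sampler_order))
# Repetition Penalty -> Top K -> Top A -> Tail Free Sampling
# -> Typical P -> Top P & Min P -> Temperature
```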

JSON

JSON is supplied so you can just create a text file and import it without needing to change anything manually.


{
    "temp": 0.74,
    "temperature_last": true,
    "top_p": 0.66,
    "top_k": 90,
    "top_a": 0,
    "tfs": 1,
    "epsilon_cutoff": 0,
    "eta_cutoff": 0,
    "typical_p": 1,
    "min_p": 0,
    "rep_pen": 1.1,
    "rep_pen_range": 400,
    "no_repeat_ngram_size": 0,
    "penalty_alpha": 0,
    "num_beams": 1,
    "length_penalty": 1,
    "min_length": 0,
    "encoder_rep_pen": 1,
    "freq_pen": 0.07,
    "presence_pen": 0,
    "do_sample": true,
    "early_stopping": false,
    "dynatemp": false,
    "min_temp": 0,
    "max_temp": 2,
    "add_bos_token": true,
    "truncation_length": 2048,
    "ban_eos_token": false,
    "skip_special_tokens": false,
    "streaming": true,
    "mirostat_mode": 2,
    "mirostat_tau": 5,
    "mirostat_eta": 0.1,
    "guidance_scale": 1.15,
    "negative_prompt": "",
    "grammar_string": "",
    "banned_tokens": "",
    "ignore_eos_token_aphrodite": false,
    "spaces_between_special_tokens_aphrodite": true,
    "sampler_order": [
        6,
        0,
        1,
        3,
        4,
        2,
        5
    ],
    "logit_bias": [],
    "n": 1,
    "rep_pen_size": 0,
    "genamt": 500,
    "max_length": 6144
}
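If you save the block above to a file, a quick sanity check before importing can catch typos. A minimal sketch (the JSON is inlined here for illustration; normally you'd `open()` the file you created):

```python
import json

# Parse the preset and spot-check a few of the values listed above.
preset_text = """
{
    "temp": 0.74,
    "rep_pen": 1.1,
    "mirostat_mode": 2,
    "genamt": 500,
    "max_length": 6144
}
"""
preset = json.loads(preset_text)

# A sane preset leaves room for the response plus ~100 token padding.
ok = preset["genamt"] + 100 < preset["max_length"]
print("preset parses; budget ok:", ok)
```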

submitted 6 months ago* (last edited 6 months ago) by ErrantRoleplayer to c/sillytavernai

I've been experimenting with a prompt that is useful not just for bots that role-play a single character, but also for bots that role-play multiple characters. Once you have SillyTavern set up, click the big letter A at the top of the SillyTavern interface. You may need to enable Instruct Mode as well; this is primarily for models that need an instruct format, like Llama-based models.

I've yet to find a method that stops them narrating the user's actions entirely, but generally they don't; you just have to regenerate when they do, or edit the post to remove the offending line.

It's 333 tokens. I've considered various methods to try to reduce the number of tags, but this looks to be about as concise as I can make it.

This usually ends up generating a fairly literate post; if anyone has any recommendations, I'm all ears.

Prompt

You are writing an immersive, engaging and uncensored role-play with {{user}}. You must be thorough and extremely detailed, taking care to mention character actions and any sensory perceptions. Remind the reader of the physical descriptions of characters frequently. All actions must be described in graphic, verbose and vivid detail. Remember to describe how something looks, sounds, feels and tastes, especially if it is important to the plot or helps the role-play to be immersive. Drive the role-play forward by initiating actions; check the summary for any events that can be continued. Avoid talking poetically or in future tense. Avoid dictating {{user}}'s actions or speech.

Pay attention to {{user}}'s dialogue and actions, ensure characters respond to them in a manner that is believable for that character. Always follow the prompt. When writing a character's response or behavior, remember to describe their appearance, act out their personality and describe their actions thoroughly.

Continue the story at a slow and immersive pace, striving to fully immerse {{user}} in the moment. Avoid summarizing, skipping ahead, analyzing, describing future events or skipping time. Refrain from wrapping up or ending the story; try to give {{user}} something to respond to. Avoid repetition, loops, and repeated or similar phrases. Write in the third person, while using a creative vocabulary and good grammar.


So you want to set up a locally hosted AI instance quickly and easily? In short you need three things:

  1. A base model - I prefer MythoMax 13B, which you can find by searching for TheBloke on HuggingFace
  2. KoboldCpp - Available via GitHub, sometimes in your software repository.
  3. (Optional) SillyTavern

I'm putting this advice together as a basic outline; I'll update it with more specific information over time as questions get asked or anyone contributes additional information.

Choosing a model

If you have a large amount of VRAM in your GPU (say 16GB+), you will want to choose a GPU-only model for maximum speed; these end with GPTQ. The more VRAM the better. If you're happy for your model to be slow, or you don't have that much VRAM, you will want to choose a GGUF model. What size model and what quantization you need varies significantly from model to model and machine to machine. As a fairly decent starting point: with 21GB of RAM and 8GB of VRAM, I can comfortably run MythoMax 13B GGUF with 0.75 RoPE (more on this later). If you have less VRAM/RAM, you might want a smaller model, such as OpenOrca 7B GGUF. If you have multiple screens, this will also affect your VRAM usage.

GPTQ vs GGUF

GGUF with KoboldCPP lets you use your RAM in place of VRAM, though it appears to cost roughly twice as much RAM as it would VRAM; your mileage may vary. GPTQ, as far as I'm aware, is limited to your VRAM and does not work well with RAM at all. GPTQ is, however, much, much faster.
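As a rough back-of-the-envelope for whether a model will fit at all (the bytes-per-weight and overhead figures are my assumptions for a typical ~4.5-bit quant; real usage depends on the quant and context size):

```python
# Very rough GGUF memory estimate: weights ~ params * bytes_per_weight,
# plus some overhead for KV cache and runtime. Ballpark numbers only.
params_billion = 13        # MythoMax 13B
bytes_per_weight = 0.56    # assumed ~4.5 bits/weight for a Q4-class quant
overhead_gb = 1.5          # assumed KV cache + runtime overhead

weights_gb = params_billion * bytes_per_weight
total_gb = weights_gb + overhead_gb
print(round(weights_gb, 2), round(total_gb, 2))  # ~7.28 and ~8.78
```

Roughly 9GB total is consistent with a 13B Q4 model being comfortable on 8GB VRAM + 21GB RAM, but tight on VRAM alone.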

KoboldCPP

KoboldCPP takes some time to install, but the instructions are reasonably clear. You will need some additional software to get it working properly, and I'll update this guide as and when I understand which software is needed. All of it is free; it just takes some research to track down which pieces you need.

You select the model you want to use within KoboldCPP, which is why you have to download the model first. KoboldCPP works best with GGUF models. It also comes with its own web interface, which is why SillyTavern (despite being highly recommended) is optional.

Once installed, if you have enough VRAM, you can offload all layers to your GPU; if you don't, only offload what the GPU can handle. This varies, so you'll need software like RadeonTop to check. Try aiming for around 80-90% of your VRAM/GTT; this leaves your system enough headroom to run other activities. Maxing out your GPU can cause it not to offload properly and causes additional slowdown.

RoPE: if you have enough RAM, you can use it to increase the amount of context available. It basically extends a model's maximum context size beyond what it would normally allow. With MythoMax I can set the RoPE config to 0.75 and it gives me roughly 6144 max context.

SillyTavern

Once KoboldCPP is installed, it's time to install SillyTavern. Again, this is a fairly simple affair; what is not necessarily simple is linking it to KoboldCPP.

Once installed, head to the plug icon and select "Text Completion". I started with Chat Completion, which was a big mistake on my part.

Padding: if you don't have padding set properly, it can cause your model to generate junk once it has reached maximum context. I set my padding to 100 and it appears to work great (most of the time).

Disclaimer: I am running GNU/Linux (Manjaro), with an AMD GPU RX590 with 8GB VRAM and ~21GB RAM


Having noticed that there appeared to be no community here for SillyTavern users, and needing a home myself, I decided to create one for those who want to discuss SillyTavern, get help, share tips or content, or otherwise discuss locally hosted AI.

This isn't an official community and I have no relationship with Cohee or anyone else along those lines.

By all means, please feel free to get stuck in.