[–] polystruct 1 points 11 months ago (1 children)

Nice answer! Is there, in your knowledge/experience, a number of concepts beyond which a LoRA no longer works? For instance, if I want to make a model that understands all car brands and types (assuming the base model doesn't, of course), would a LoRA still be sensible here?

Most LoRAs I find have a narrow, specialized focus, and I don't know whether I should just start with multiple LoRAs dealing with individual concepts (a LoRA for a "1931 Minerva 8 AL Rollston Convertible Sedan", a LoRA for a "Maybach SW 42, 1939", etc.) or try to train one LoRA to ~~rule~~ know them all...
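
For reference, my mental model of what a LoRA adds to a layer, as a rough PyTorch sketch (the class name, rank, and scaling here are just illustrative choices, not from any particular trainer). The rank caps how much new information the adapter can encode, which is why I wonder whether one adapter can absorb a very broad concept set:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen base layer plus a trainable low-rank update B @ A."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False            # base weights stay frozen
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x):
        # base(x) + scale * (B @ A) x -- only A and B are trained
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(768, 768), rank=8)
out = layer(torch.randn(2, 768))               # behaves like the original layer
```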

[–] polystruct 2 points 1 year ago

Thanks! I initially didn't think such extensions were what I was looking for (they show which parts of the image are heavily influenced by the prompt), but they do give a visual clue as to which parts are definitely important to keep in the prompt, whereas image areas that get hardly any attention might tell me which parts of the prompt are less impactful.

And it gives me a few repos to dig through for the code. Older ones (archived/read-only) are fine; they are often easier to read than the more optimized (but also more feature-rich) code that exists nowadays.

 

I understand that, when we generate images, the prompt is first split into tokens, after which those tokens are used by the model to nudge the image generation in a certain direction. I have the impression that some tokens have a higher impact on the model than others (although I don't know if I can call it a weight). I mean internally, not as part of the prompt, where we can also force a higher weight on a token.

Is it possible to know how much a certain token was 'used' in the generation? I could deduce it empirically by taking a generation, sticking to the same prompt, seed, sampling method, etc., and gradually removing words to see what the impact is, but perhaps there is a way to just ask the model? Or to adjust the Python code a bit and retrieve it there?
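
To make the empirical route concrete, something like this loop is what I have in mind, sketched with diffusers (the model id, prompt, and step count are placeholders I picked for illustration):

```python
import torch
from diffusers import StableDiffusionPipeline

# Compare generations with individual words removed, keeping the seed
# and sampler fixed, to gauge each word's impact on the image.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

prompt = "a red 1931 Minerva convertible parked on a cobblestone street"
words = prompt.split()

for i in range(len(words)):
    ablated = " ".join(words[:i] + words[i + 1:])        # drop the i-th word
    generator = torch.Generator("cuda").manual_seed(42)  # same seed each run
    image = pipe(ablated, generator=generator, num_inference_steps=30).images[0]
    image.save(f"ablate_{i:02d}_{words[i]}.png")
```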

I'd like to know which parts of my prompt hardly impact the image (or don't at all).
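
And for the 'adjust the code' route, I assume the number to pull out would be the cross-attention weights between image positions and prompt tokens. A toy sketch of the idea with random tensors (made-up sizes, not the real UNet shapes):

```python
import torch

# Toy cross-attention: image positions attend to prompt token embeddings.
torch.manual_seed(0)
num_pixels, num_tokens, dim = 64, 8, 32   # illustrative sizes

queries = torch.randn(num_pixels, dim)    # from the image latents
keys = torch.randn(num_tokens, dim)       # from the text encoder output

# Scaled dot-product attention: each row sums to 1 over the prompt tokens.
attn = torch.softmax(queries @ keys.T / dim**0.5, dim=-1)

# Per-token "usage": attention mass a token receives, averaged over positions.
token_usage = attn.mean(dim=0)            # shape: (num_tokens,)
for i, usage in enumerate(token_usage):
    print(f"token {i}: {usage:.3f}")
```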

[–] polystruct 1 points 1 year ago (1 children)

Currently Confluence. We do have a split documentation policy: long-lived and broadly communicated information should be on M365 (SharePoint and affiliated services), whereas more technical or short-lived (project) documentation goes on Confluence.

But even certain broad-use information is showing up on Confluence more and more, given its ease of use (the wiki format and plugins like draw.io support).