this post was submitted on 28 Sep 2023
Technology
I'm not angry at all. I'm simply doing a very poor job of helping you see what this really is. I keep trying because I really care to help people see the potential and where this is inevitably going in the near future. I don't care about the AI, I care about you.
AI has been marketed as a product because there was a large investment into proprietary AI as a product. The majority of articles and media are created based on these corporate interests and are not well grounded, or are outright mis/disinformation. I am absolutely against proprietary AI, but open source offline AI is a completely different thing. Offline AI is a framework, not a product. It is only limited by the creativity of developers and hobbyists.
Even with Automatic1111/Stable Diffusion, like I used for the image yesterday, it is quite easy to do specific tasks in specific areas. The base prompt alone is somewhat limited in what it can do. To make this image, I started with a wide-frame 768 × 512 image and batched the generation to get 60 images to choose from, setting how much variation each image would have, how closely the prompt would be followed overall, and how much creativity, aka randomness, was allowed. This made for a weak overall composition, but I chose one image that didn't have very well defined features. Then I captured the seed and prompt I used to make that image and moved over to the Image to Image tab, which has a bunch of extra tools.

I tried a few mods in the wide aspect format, but I didn't really like them, and some errors were creeping in because the model I used was trained on 512 × 512 images. I tried cropping to 512, but an automatic compression made the image more abstract, and I liked that, so I went with it. I used a couple of tools that basically ask the AI what it sees in the image using words from its vocabulary, and then I started playing with eccentric prompt words, their locations, and a bunch of things that didn't work like I wanted.

While I was messing with the prompt wording, I was generating from the base image. Instead of starting each prompt from mathematically random noise like the tools you have likely seen, I use the base image and decide how much noise is added back into it before the AI starts iterating over it. If I add only around 20% noise to the base image, the structure of the image stays intact. This is super easy to do in Automatic1111, though it does regenerate the whole image with some minor variation. It is just the easiest and quickest way I can make something.
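The "add ~20% noise and regenerate" idea above can be sketched with a plain numpy toy. This is purely illustrative, not what Automatic1111 actually runs internally; the real img2img pipeline noises a latent representation with a diffusion schedule, but the intuition is the same: blending in a small amount of noise preserves the base image's structure, while a large amount mostly erases it.

```python
import numpy as np

def partially_noise(image: np.ndarray, strength: float, seed: int = 0) -> np.ndarray:
    """Blend an image with random noise; strength=0.0 keeps the image,
    strength=1.0 replaces it entirely. A crude stand-in for the
    forward-noising step that img2img starts from."""
    rng = np.random.default_rng(seed)
    noise = rng.random(image.shape)
    return (1.0 - strength) * image + strength * noise

def similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Pearson correlation between two images, as a cheap structure metric."""
    return float(np.corrcoef(a.ravel(), b.ravel())[0, 1])

# A fake 512x512 grayscale "base image" with obvious structure (a gradient).
base = np.linspace(0.0, 1.0, 512 * 512).reshape(512, 512)

low = partially_noise(base, strength=0.2)   # ~20% noise: structure survives
high = partially_noise(base, strength=0.9)  # ~90% noise: mostly random

print(similarity(base, low), similarity(base, high))
```

At strength 0.2 the noised image stays highly correlated with the base, which is why a low denoising strength gives "minor variation" on the same composition, while a high strength effectively starts a new image.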
Even with this, there is an option to paint a color over an area, or to mask/erase something and replace it with random noise so that it gets remade. These tools are not super effective for me in A1111, but some people really dial them in well; I just haven't taken the time to figure them out. ComfyUI takes generating to a whole different level of control, where people produce some really interesting and detailed images. The latest tool, which I haven't tried yet, is Open Dream, and it has full masking and layering capabilities like Gimp/Photoshop.

Even with A1111, I can take an image of a girl and import it into Image to Image as my base to generate from. I set my noise to something like 35%, use the Inpainting tool, and draw a teal blob around her neck. Then I can prompt something like "A girl wearing a teal knit scarf" and the AI will turn that teal blob into a scarf. I can also mask the face before generating so that I don't alter the girl herself. This generates everything needed for lighting and realism, so it appears the girl was photographed wearing the teal scarf. It may take a good bit of trial and error, just like with Gimp, but it is entirely doable even with the most basic generation tools. There are still some limits, like water droplets not generating well, but new models and techniques are coming out of academia weekly right now. The tools I generated with are already quite deprecated. The newer SDXL models are ten times more powerful in what they can do with prompts alone, before the software tools even get involved.
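The inpainting step above (regenerate only the masked blob, keep everything else untouched) comes down to compositing the new generation back through the mask. A toy numpy sketch of just that merge, where `generated` is a stand-in for what the actual diffusion model would produce for "a teal knit scarf":

```python
import numpy as np

def inpaint_merge(image: np.ndarray, mask: np.ndarray, generated: np.ndarray) -> np.ndarray:
    """Keep the original pixels where mask == 0 and take the freshly
    generated pixels where mask == 1 — how masked inpainting merges
    its result back into the base image."""
    return mask * generated + (1.0 - mask) * image

# Toy 8x8 "photo" and a blob mask around the "neck" region.
image = np.zeros((8, 8))
mask = np.zeros((8, 8))
mask[5:7, 2:6] = 1.0               # the teal blob the prompt will repaint

generated = np.full((8, 8), 0.7)   # stand-in for the model's scarf output

result = inpaint_merge(image, mask, generated)
```

Everything outside the blob is bit-for-bit the original photo, which is why masking the face guarantees the girl herself is never altered.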
Don't think of this tech as a product; it simply isn't one. It is a tool, and it is a productivity multiplier. Yann LeCun, one of the leading researchers in AI, said it best: open source LLM AI is the opportunity for everyone to have the kind of individualized education enjoyed only by the super rich class of society, and the opportunity for everyone to surround themselves with a group of experts who can give them information at a moment's notice, the way a CEO hires and draws on a team of experts available at any time.
These are tools only. They don't create products; we do. Much like a CEO is a generalist at the center who makes decisions by taking in information from people with deeper niche expertise, the tool is very capable, but you need to be a good CEO: put together a good team and learn to trust them. Does this mean that other experts will be obsolete? Absolutely not, but the shift in possibilities means it is very VERY important for everyone to be aware of, and dabble in, the art of being their own generalist CEO. This is a pronounced technological leap forward, which means it will have a broad impact. The figurative CEO will be much more efficient in their output. The current ratio of output to value will be adjusted. Things that were impossible due to time constraints and cost are now accessible. This is what will change values.
If people like yourself fail to adopt and adapt to this technology, the current output quality and expectations will stay the same while using far fewer people, and all the profits will be consolidated in the process. However, if a sizable population adopts it and raises the bar for expectations and quality, the state of the art moves forward. The corporate propaganda machine is pushing public opinion into naïveté in an attempt to profit. Do what you will with this; fight it if you want. I have no motivation except to try to make a better future for myself. All this bla bla bla is because I care about you, stranger.
-Jake
You seem to be misunderstanding my position entirely. I suggest you read my first comment again, because you're using a lot of words and details to explain irrelevant stuff.
Edit: I don't disagree that it's a tool. I disagree that it's the same as a person being inspired by others. And I am against the claim that they should freely use whatever they want without credit.
There is no such thing as a truly unique idea. Everything is a result of combining concepts and experiences in different and novel ways.
AI doesn't freely use anything in a reproduction type of format. It can't regenerate a copy of a work. It is no different from human awareness of a subject. Restricting this type of awareness is the same as the thought-policing nonsense of humanity's past eras of stupidity. No court will ever restrict this type of information and access. If they did, I could sue someone for their thoughts. This is draconian nonsense.