@AdComfortable1514
@lemmy.world
Simple and cool.
Florence 2 image captioning sounds interesting to use.
Do people know of any other image-to-text models (apart from CLIP)?
Wow, yeah, I found a demo here: https://huggingface.co/spaces/Qwen/Qwen2.5
A whole host of LLMs seems to have been released. Thanks for the tip!
I'll see if I can turn them into something useful 👍
Hmm. I mean, the FLUX model looks good, so maybe there is some magic with the T5?
I have no clue, so any insights are welcome.
T5 Huggingface: https://huggingface.co/docs/transformers/model_doc/t5
T5 paper: https://arxiv.org/pdf/1910.10683
Any suggestions on what LLM I ought to use instead of T5?
I get it. I hope you don't interpret this as arguing against your results, etc.
What I want to say is:
If implemented correctly, the same seed does give the same output for a given prompt.
If there is variation, then something in the pipeline must be approximating things.
That may be good (for performance), or it may be bad.
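A tiny sketch of what I mean, using Python's stdlib `random` as a stand-in for the actual latent-noise sampler (that's my assumption for illustration, not Perchance's real pipeline):

```python
import random

def sample_noise(seed, n=4):
    # Stand-in for the noise sampler: a seeded PRNG.
    # Same seed -> identical sequence -> identical "image".
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]

# Same seed reproduces the exact same values; a different seed diverges.
assert sample_noise(42) == sample_noise(42)
assert sample_noise(42) != sample_noise(7)
```

If a real pipeline breaks this property, it's usually non-deterministic GPU kernels or other approximations downstream of the seeded sampler, not the seed itself.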
You are 100% correct in highlighting this issue to the dev.
Though it's not a legal document or a science paper.
Just a guide to explain seeds to newbies.
Omitting non-essential information, for the sake of making the concept clearer, can be good too.
The Perchance dev is correct here, Allo;
the same seed will generate the exact same picture.
If you see variety, it will be due to factors outside the SD model. That stuff happens.
But it's good that you fact check stuff.
Do you know where I can find documentation on the Perchance API?
Specifically createPerchanceTree ?
I need to know which functions there are, and what inputs/outputs they take.
Thanks! I appreciate the support. Helps a lot to know where to start looking ( ; v ;)b!