Collaborative Generative AI: Integrating GPT-k for Efficient Editing in Text-to-Image Generation

W Zhu, X Wang, Y Lu, TJ Fu, XE Wang, M Eckstein, WY Wang

arXiv preprint arXiv:2305.11317, 2023

Text-to-image (T2I) generation has garnered significant attention both within the research community and among everyday users. Despite advances in T2I models, users commonly have to edit their input prompts repeatedly before receiving a satisfactory image, which is time-consuming and labor-intensive. Given the demonstrated text generation power of large-scale language models such as GPT-k, we investigate the potential of using such models to improve the prompt editing process for T2I generation. We conduct a series of experiments to compare the common edits made by humans and by GPT-k, evaluate the performance of GPT-k in prompting T2I models, and examine factors that may influence this process. We find that GPT-k models focus more on inserting modifiers, while humans tend to replace words and phrases, including changes to the subject matter. Experimental results show that GPT-k models are more effective at adjusting modifiers than at predicting spontaneous changes in the primary subject matter. Adopting the edits suggested by GPT-k models may reduce the percentage of remaining edits by 20-30%.
