Google Scholar

[PDF] Language-based video editing via multi-modal multi-level transformer

TJ Fu, XE Wang, ST Grafton… - arXiv preprint arXiv …, 2021 - alvr-workshop.github.io

Video editing tools are widely used nowadays for digital design. Although the demand for
these tools is high, the prior knowledge required makes it difficult for novices to get started.
Systems that could follow natural language instructions to perform automatic editing would
significantly improve accessibility. This paper introduces the language-based video editing
(LBVE) task, which allows the model to edit, guided by text instruction, a source video into a
target video. LBVE contains two features: 1) the scenario of the source video is preserved …

Cited by 5

[PDF] thecvf.com

M3L: Language-based video editing via multi-modal multi-level transformers

TJ Fu, XE Wang, ST Grafton… - Proceedings of the …, 2022 - openaccess.thecvf.com

Video editing tools are widely used nowadays for digital design. Although the demand for
these tools is high, the prior knowledge required makes it difficult for novices to get started.
Systems that could follow natural language instructions to perform automatic editing would
significantly improve accessibility. This paper introduces the language-based video editing
(LBVE) task, which allows the model to edit, guided by text instruction, a source video into a
target video. LBVE contains two features: 1) the scenario of the source video is preserved …

Cited by 20 · All 6 versions

