Language-based video editing via multi-modal multi-level transformer

TJ Fu, XE Wang, ST Grafton… - arXiv preprint arXiv …, 2021 - alvr-workshop.github.io
Video editing tools are widely used nowadays for digital design. Although the demand for
these tools is high, the prior knowledge required makes it difficult for novices to get started.
Systems that could follow natural language instructions to perform automatic editing would
significantly improve accessibility. This paper introduces the language-based video editing
(LBVE) task, which allows the model to edit, guided by text instruction, a source video into a
target video. LBVE contains two features: 1) the scenario of the source video is preserved …

M3L: Language-based video editing via multi-modal multi-level transformers

TJ Fu, XE Wang, ST Grafton… - Proceedings of the …, 2022 - openaccess.thecvf.com
Video editing tools are widely used nowadays for digital design. Although the demand for
these tools is high, the prior knowledge required makes it difficult for novices to get started.
Systems that could follow natural language instructions to perform automatic editing would
significantly improve accessibility. This paper introduces the language-based video editing
(LBVE) task, which allows the model to edit, guided by text instruction, a source video into a
target video. LBVE contains two features: 1) the scenario of the source video is preserved …