Research Note
2023.07.28
This note records research content and ideas.
Multimodal Summarization
There are three possible ways to improve model performance on the multimodal summarization task:
- Better encoding and decoding models
- Better modality fusion methods
- Better loss design
For the first method, state-of-the-art encoders and decoders can be used to improve performance.
For the second method, I should read not only MSMO papers but also general multimodal feature fusion papers.
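As a reference point for what a fusion method looks like, here is a minimal sketch of scalar gated fusion, a common baseline in feature fusion work. It is not taken from any MSMO paper. The function name `gated_fusion`, the single scalar gate, and the parameters `gate_weights` and `gate_bias` are assumptions for illustration; in a real model the gate parameters would be learned.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(text_vec, image_vec, gate_weights, gate_bias=0.0):
    """Fuse two equal-length feature vectors with one scalar gate.

    Hypothetical sketch: gate = sigmoid(w . [text; image] + b),
    fused = gate * text + (1 - gate) * image.
    """
    if len(text_vec) != len(image_vec):
        raise ValueError("text and image vectors must have the same length")
    concat = list(text_vec) + list(image_vec)
    if len(gate_weights) != len(concat):
        raise ValueError("gate_weights must match the concatenated length")
    # The gate decides how much weight the text features get.
    gate = sigmoid(sum(w * x for w, x in zip(gate_weights, concat)) + gate_bias)
    return [gate * t + (1.0 - gate) * v for t, v in zip(text_vec, image_vec)]
```

With all-zero gate weights and zero bias the gate is 0.5, so the output is the plain average of the two vectors. That makes simple averaging a special case to compare against.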
For the third method, the loss function has been refined iteratively. At first, only text coverage and image coverage were considered. Recent research also considers the relation between text and image, as well as text quality.
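One simple way to picture how these terms combine is a weighted sum. The sketch below assumes each term has already been computed as a scalar; the term names, the weights, and the function `combined_loss` are hypothetical and not taken from any specific paper.

```python
def combined_loss(terms, weights):
    """Weighted sum of named loss terms.

    Hypothetical sketch: `terms` maps a term name to its scalar value,
    and `weights` maps the same names to their coefficients.
    """
    missing = set(terms) - set(weights)
    if missing:
        raise KeyError(f"no weight given for: {sorted(missing)}")
    return sum(weights[name] * value for name, value in terms.items())


# Illustrative values only, not results from any experiment.
terms = {
    "text_coverage": 2.0,
    "image_coverage": 1.0,
    "text_image_relation": 0.5,
    "text_quality": 0.25,
}
weights = {
    "text_coverage": 1.0,
    "image_coverage": 1.0,
    "text_image_relation": 0.5,
    "text_quality": 2.0,
}
total = combined_loss(terms, weights)
```

Setting the weights for `text_image_relation` and `text_quality` to zero gives the earlier coverage-only design, so the newer terms can be ablated one at a time.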