Research Note

2023.07.28

This note records what I have researched and the ideas it inspired.

Multimodal Summarization

There are three possible ways to improve model performance on the multimodal summarization task:

Better encoding and decoding model
Better modal fusion method
Better loss design

For the first method, swapping in state-of-the-art encoder and decoder models should give better performance.

For the second method, I should probably read beyond MSMO papers and also cover general multimodal feature fusion papers.
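
As a reference point for what a fusion method does, here is a minimal sketch of one common option, cross-modal attention, in which each text token attends over image region features. This is only an illustration and not a method taken from any specific MSMO paper. The function name `cross_modal_attention`, the plain-list feature format, and the assumption that text and image features share the same dimension are all hypothetical choices made for the example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_modal_attention(text_feats, image_feats):
    """Fuse text and image features with scaled dot-product attention.

    Assumption (hypothetical setup): text_feats and image_feats are lists
    of vectors with the same dimension. For each text token, the image
    regions are weighted by similarity, and the weighted image context is
    concatenated to the token vector.
    """
    dim = len(image_feats[0])
    fused = []
    for token in text_feats:
        scores = [dot(token, region) / math.sqrt(dim) for region in image_feats]
        weights = softmax(scores)
        context = [
            sum(w * region[i] for w, region in zip(weights, image_feats))
            for i in range(dim)
        ]
        fused.append(list(token) + context)
    return fused
```

Each fused vector has twice the input dimension: the original token features followed by the image context that token attended to. Other fusion schemes (simple concatenation, gating, and so on) would replace only the body of the loop.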

The loss function has been refined iteratively. At first, only text and image coverage were considered. More recent research also considers the relation between text and image, as well as text quality.
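
One way to picture this progression is as a weighted sum of loss terms, where the earlier designs correspond to setting the newer terms' weights to zero. The sketch below is an assumption for illustration only: the function name `combined_loss`, the four scalar inputs, and the weighting scheme are hypothetical and are not taken from any particular paper.

```python
def combined_loss(text_cov, image_cov, relation, quality,
                  weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of four scalar loss terms (hypothetical formulation).

    text_cov  : loss for how well the summary covers the source text
    image_cov : loss for how well the selected images cover the source images
    relation  : loss for the match between output text and output image
    quality   : loss for the quality of the generated text
    weights   : one weight per term, in the order above (assumed values)
    """
    terms = (text_cov, image_cov, relation, quality)
    return sum(w * t for w, t in zip(weights, terms))


# Early design: only the two coverage terms contribute.
early = combined_loss(1.0, 2.0, 3.0, 4.0, weights=(1.0, 1.0, 0.0, 0.0))

# Recent design: all four terms contribute.
recent = combined_loss(1.0, 2.0, 3.0, 4.0)
```

Framing the loss this way makes it easy to run ablations: zeroing one weight at a time shows how much each term contributes.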