Sicheng Liu

Research Note

2023.07.28


This note records my research progress and the ideas that come out of it.

Multimodal Summarization

There are three possible ways to improve model performance on the multimodal summarization task:

  • Better encoding and decoding model
  • Better modal fusion method
  • Better loss design

For the first approach, stronger state-of-the-art encoders and decoders can be swapped in to improve performance.

For the second approach, I should probably read not only MSMO (Multimodal Summarization with Multimodal Output) papers but also papers on multimodal feature fusion more generally.
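As a reference point for that reading, here is a minimal sketch of one common fusion pattern: a text vector attends over image region features with scaled dot-product attention, and the resulting image context is concatenated to the text vector. This is not taken from any specific MSMO paper. The function names (`attend`, `fuse`) are hypothetical, and the sketch assumes text and image features have already been projected to the same dimension.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attend(query, image_feats):
    """Scaled dot-product attention of one text vector over image regions.

    Assumes `query` and every vector in `image_feats` share one dimension.
    Returns the attention-weighted image context and the attention weights.
    """
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, f) / scale for f in image_feats])
    dim = len(image_feats[0])
    context = [sum(w * f[i] for w, f in zip(weights, image_feats))
               for i in range(dim)]
    return context, weights


def fuse(text_vec, image_feats):
    """Fuse by concatenating the text vector with its image context."""
    context, _ = attend(text_vec, image_feats)
    return list(text_vec) + context
```

Concatenation is only the simplest choice here. Gated or bilinear fusion would replace the last line of `fuse` while leaving the attention step unchanged.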

Loss functions have been refined iteratively. Early work considered only text and image coverage; more recent research also accounts for the relation between text and image and for text quality.
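One way to picture this progression is as a weighted sum in which the newer terms are switched on by non-zero weights. The sketch below is illustrative only: the term names, the `LossWeights` class, and the default weights are assumptions, not the formulation of any particular paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LossWeights:
    """Hypothetical weights for each loss term."""
    text: float = 1.0      # text coverage (e.g. summary generation loss)
    image: float = 1.0     # image coverage (e.g. image selection loss)
    relation: float = 0.0  # text-image relation term, off by default
    quality: float = 0.0   # text quality term, off by default


def combined_loss(text_loss, image_loss, relation_loss=0.0,
                  quality_loss=0.0, weights=LossWeights()):
    """Weighted sum of the individual loss terms."""
    return (weights.text * text_loss
            + weights.image * image_loss
            + weights.relation * relation_loss
            + weights.quality * quality_loss)
```

With the default weights this reduces to the early setting (text and image coverage only). Passing, for example, `LossWeights(relation=0.5, quality=0.5)` adds the two more recent terms.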