inputting image data; inputting text data; converting the inputted text data into voice data; connecting the obtained voice data and the inputted image data to each other; and creating a file including the image data and the voice data connected to each other.
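
By way of illustration only, the steps recited above might be sketched as follows. The sketch rests on several assumptions that do not come from the claim: the voice conversion is a placeholder that emits one short tone per character rather than real synthesized speech, the "connection" is modeled as a small manifest naming both parts, and the resulting file is a ZIP archive. The names `text_to_voice`, `create_linked_file`, `manifest.json`, `image.bin`, and `voice.wav` are hypothetical.

```python
import io
import json
import math
import struct
import wave
import zipfile

SAMPLE_RATE = 8000  # assumed sample rate for the placeholder voice data


def text_to_voice(text: str) -> bytes:
    """Placeholder converter (assumption): one 50 ms tone per character, as WAV bytes.

    A real system would call a speech synthesizer here.
    """
    frames = bytearray()
    for ch in text:
        freq = 200 + (ord(ch) % 64) * 10  # arbitrary pitch derived from the character
        for n in range(SAMPLE_RATE // 20):
            sample = int(12000 * math.sin(2 * math.pi * freq * n / SAMPLE_RATE))
            frames += struct.pack("<h", sample)  # 16-bit little-endian PCM
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(SAMPLE_RATE)
        w.writeframes(bytes(frames))
    return buf.getvalue()


def create_linked_file(image_data: bytes, text: str) -> bytes:
    """Convert text to voice data, connect it to the image data, and build one file."""
    voice_data = text_to_voice(text)
    # The manifest is the assumed form of the "connection" between the two parts.
    manifest = {"image": "image.bin", "voice": "voice.wav"}
    out = io.BytesIO()
    with zipfile.ZipFile(out, "w") as z:
        z.writestr("manifest.json", json.dumps(manifest))
        z.writestr("image.bin", image_data)
        z.writestr("voice.wav", voice_data)
    return out.getvalue()
```

A caller would pass the inputted image bytes and the inputted text, for example `create_linked_file(image_bytes, "caption text")`, and write the returned bytes to storage. Other container layouts, such as embedding the voice data in a metadata segment of the image file itself, would fit the same steps equally well.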