"Developing evaluation measures for resources is even more complex than evaluating applications, such as summarization and machine translation."