Ehud Reiter's Blog

Ehud's thoughts and observations about Natural Language Generation

If we need to present some data to a person, when does it make sense to use a textual summary, when does it make sense to use some kind of graphic (ie, information visualisation), and when is the best option to use a combination of text and graphics?

I have been working on text and graphics for decades; my first paper on the topic was published in 1990, and my first opinion piece in 1998. I just reread this opinion piece; I still believe in most of what I said in it, but it is striking how little experimental data was presented in this paper. Maybe this was a sign of the times?

Basic Principles

Many of the basics haven’t changed much in the 19 years since I wrote my 1998 opinion piece.

Practical constraints: Graphics are not possible for information delivered by radio, SMS text message, or other words-only media; they may also be inappropriate for many visually-impaired users. Hence text is preferred in such contexts. On the other hand, text is not appropriate for users who don't know the language the text is written in, or for some language-impaired individuals (eg, non-verbal autistic people).

Type of information: Abstract information such as causality, provenance, and data quality concerns may best be communicated in words; on the other hand graphics may be best for identifying specific components in a complex visual object (eg, a location on a map or a widget in a complex user interface).

Users: People vary widely in whether they are “visual thinkers” or “verbal thinkers”; this obviously impacts their preferred media. This is especially important when users are expected to remember the information. Also, in many contexts domain experts are better at interpreting complex graphics than domain novices (see Babytalk discussion below).

Use of information: I have mostly worked on decision support, that is, providing information to help people make decisions (eg, how to treat a patient). Text and graphics can also be used for persuasion, including behaviour change (eg, encouraging people to stop smoking). Another important use case is exploratory data analysis, where a human analyst explores a data set looking for general insights; this plays an important role in many scientific discoveries.

Text and Graphics for Decision Support: BabyTalk

The Babytalk project focused on developing an NLG system which summarised clinical data about babies in neonatal intensive care (NICU). In Babytalk, as well as in some predecessor projects, we did several studies which compared the effectiveness (for decision support) of textual and visual presentations of Babytalk clinical data. To summarise the key findings (I give references below to the actual research papers, which I encourage interested people to read):

Human-written text summaries were the most effective presentation mechanism. Computer-generated texts and visualisations were equally effective on average, but on some data sets computer texts were better, and on others visualisations were better. This suggests that a combination of textual summary and data visualisation would be the best approach.
Experienced clinicians were much better at interpreting complex visualisations than junior clinicians. There is a suggestion that this is partially because experienced clinicians are better at focusing on what’s important, while junior clinicians may get distracted by noise or clinically unimportant data.
Clinicians *preferred* visualisations, even if they made better decisions when presented with a textual summary. This could partially be because they are used to looking at visualisations, and hence are more comfortable with them.

I believe these findings generalise; I have observed similar things in other projects and in my work for Arria. However, I don't have formal experimental data about text/graphics from these other projects.

Text and Graphics for Persuasion: SaferDriver

A recent research student at Aberdeen, Daniel Braun, looked at using textual and graphical feedback to encourage drivers to drive more safely. His system analysed driving data (acquired using GPS tracking), identified occasions of inappropriate behaviour (such as speeding), and provided feedback reports to drivers about these problems. Daniel has not yet published all of his findings, and it is not for me to “steal his thunder” by revealing unpublished findings here. But one of his papers does discuss users’ reactions to different presentations, including a textual feedback report and a map showing where inappropriate behaviour occurred. He found that if forced to choose, participants preferred the textual feedback report to the map; however, what they really wanted was to see both.

I have informally observed similar reactions in other behaviour change projects I have worked on, such as STOP (smoking cessation), but again I don't have experimental data about text/graphics from these projects. In general I think graphics-only presentations are unusual for persuasion and behaviour change, since text is needed to present rationale and logical arguments, give encouragement and highlight success, etc. However, text+graphics is often more powerful than text on its own.

Text and Graphics for Exploratory Data Analysis

I personally have not worked on exploratory data analysis (EDA), where analysts spend large amounts of time (sometimes weeks or months) investigating large data sets and looking for general insights. This is a very different task from decision support (where a subject matter expert looks at the data in order to make a specific decision), especially since many decision-making tasks are expected to be done in minutes or hours, not weeks or months. I am not aware of any research on using textual summaries of data to support exploratory data analysis, although there have been a few studies which looked at using texts to summarise results of statistical analyses, simulation runs, what-if scenarios, etc. But for the core task of directly examining the data, I think interactive visualisation workbenches are probably superior to any textual alternative I am aware of. That said, it is very difficult to scientifically evaluate the effectiveness of visualisation tools in this context, as Plaisant and others have pointed out. For example, it is difficult to run a controlled experiment comparing different approaches to supporting EDA, because EDA takes a long time (days, weeks, months) and there is huge variability in the skill and effectiveness of analysts doing EDA.

Summary

The clearest message from the above is that the best approach is usually to combine text and graphics when presenting information. In other words, the best answer to the question “text or graphics?” is to use text *and* graphics. The two media have different strengths and weaknesses, so they work best together, preferably integrated into a single combined presentation. This is the approach Arria has taken, for example by automatically producing annotated graphs.

Key relevant research papers

Babytalk

F Portet, E Reiter, A Gatt, J Hunter, S Sripada, Y Freer, C Sykes (2009). Automatic Generation of Textual Summaries from Neonatal Intensive Care Data. Artificial Intelligence 173:789-816. (DOI)

S Cunningham, S Deere, A Symon, R Elton, N McIntosh (1998). A randomized, controlled trial of computerized physiologic trend monitoring in an intensive care unit. Critical Care Medicine 26:2053-2059. (PubMed)

A.S. Law, Y. Freer, J. Hunter, R.H. Logie, N. McIntosh, J. Quinn (2005). A comparison of graphical and textual presentations of time series data to support medical decision making in the neonatal intensive care unit. Journal of Clinical Monitoring and Computing 19(3):183-194. (PubMed)

van der Meulen, M., R.H. Logie, Y. Freer, C. Sykes, N. McIntosh, J. Hunter (2010). When a graph is poorer than 100 words: A comparison of computerised Natural Language Generation, human generated descriptions and graphical displays in neonatal intensive care. Applied Cognitive Psychology 24:77-89. (DOI)

E Alberdi, J Becher, K Gilhooly, J Hunter, R Logie, A Lyon, N McIntosh, J Reiss (2001). Expertise and the interpretation of computerized physiological data: implications for the design of computerized monitoring in neonatal intensive care. International Journal of Human-Computer Studies, 55(3), 191-216. (DOI)

SaferDriver

D Braun (2016). Creating Textual Driver Feedback from Telemetric Data. Master’s Thesis, University of Aberdeen. (Aberdeen Uni library).

D Braun, E Reiter, A Siddharthan (2015). Creating Textual Driver Feedback from Telemetric Data. In Proceedings of ENLG-2015, pages 156-165. (ACL Anthology)

Exploratory Data Analysis

C Plaisant (2004). The Challenge of Information Visualisation Evaluation. Proc of AVI 2004. (PDF)

Text or Graphics?
