Focused attention in focus:
Crossing Micro-Analytical Boundaries1

Sig­rid Norris


1. Introduction

This article sets out to discuss an issue that discourse analysts examining everyday interactions often do not worry about: focused attention. The article homes in on the theoretical issue of focused attention and exemplifies it with an empirical example from a study of 82 participants interacting with family members via Skype. The issue presented in this article is two-fold:

  1. The artic­le argues that we can­not deter­mi­ne focu­sed atten­ti­on when we cut our data pie­ces too small; and
  2. The artic­le argues that focu­sed atten­ti­on can­not be deter­mi­ned on the sole ground that a par­ti­ci­pant is using lan­guage to communicate.

Figure 1 shows a participant in New Zealand Skyping with his sister and niece in Australia. The connection was established less than 15 seconds earlier. The child’s mother, who is the sister of the participant (the man sitting in front of the laptop), prompts the child with ›you got to look at the screen‹, and once the child looks at the screen, the participant reacts to seeing the child’s face with ›there you are‹.

Figu­re 1: Mother ins­truc­ting child whe­re to look at the begin­ning of a Sky­pe call.2

In images 42–45, we see the participant sitting in front of the laptop, looking at the screen, listening to his sister speak with the child and then reacting to the child’s face appearing on the screen. In this brief excerpt, it seems that we can easily determine that the participant is focused upon the Skype call with his sister and niece. He ostensibly demonstrates through his posture (which is positioned towards the laptop), his gaze (which is focused upon the screen), and his language use (listening and speaking) that he is focused upon the interaction. His use of language in particular appears to indicate clearly that he must be focused upon the conversation.

Howe­ver, this artic­le demons­tra­tes that:

  1. Focused attention can only be analyzed correctly when crossing micro-analytical boundaries. This means that if researchers cut the micro data pieces that they are investigating too short (as in the example in Figure 1), they are unable to detect and correctly analyze what participants are actually focused upon.
  2. A par­ti­ci­pant can uti­li­ze lan­guage wit­hout pay­ing focu­sed atten­ti­on to an inter­ac­tion. Thus, if rese­ar­chers assu­me that lan­guage use means focu­sed atten­ti­on, the assump­ti­on may in fact be incorrect.

The article builds upon a discussion of attention literature (Norris, forthcoming), which is not repeated here for space reasons, and revisits part of an excerpt that has been written about in Norris (2016), where an interaction around minute 4 of the video discussed here is analyzed. This article discusses the first two minutes in detail and illustrates how the focus of the participant is detected when taking a look at a data piece from the beginning of a recording. In Norris (2016, 152), it was demonstrated that it is not language which gives away what participants are focused upon, but rather modal density (of which language is a part) that comes about through modal intensity and/or modal complexity (Norris 2004). Here, the earlier part of that example, the very beginning of the Skype call, is illustrated with multimodal transcripts (Figures 2–6), homing in on the language that the participant uses (Audio Transcripts 1–4). I show that the participant’s use of language synchronization (not in a time-synchronized manner, but in a repetitive manner) allows the participant to fully function verbally even though he is not focused upon the interaction. Because I wish to demonstrate that a participant can smoothly interact verbally without being focused upon the interaction, I have chosen to represent these sequences in the form of both multimodal transcripts (Figures 2–6) and audio transcripts (Audio Transcripts 1–4). A list of higher-level actions and the video excerpt discussed in this article can be found in Norris (2019, 190ff.; Video 5.2).


2. Data and data analysis

The data discussed in this article is part of a larger study of 17 New Zealand families and 82 participants, from infants to an 80+ year-old woman, (mostly) Skyping with family members in Australia, Britain or Canada. Data collection occurred in the New Zealand participants’ homes by one to three researchers at a time (depending upon availability) with a research laptop that had screen recording software installed and one to two tripod-standing video cameras (depending on need and possibility) recording the interactions of the family members in the homes around the Skype interactions. A few weeks after the recording, a follow-up phone interview with at least one of the New Zealand adult family members was (usually) conducted (Norris 2019). The data in this article, however, comes from the video-recorded data. This particular data piece was then multimodally transcribed following Norris’s transcription conventions in order to ensure replicability and reliability and further analyzed in detail (Norris 2004, 2011, 2019).

Through a systematic and detailed analysis (Norris 2019), it becomes evident, on the one hand, that specific micro data pieces selected by researchers from a large amount of data can lead to incorrect or partial findings when a larger point of view is disregarded. When, on the other hand, micro-analytical boundaries are crossed, groundbreaking findings can be discovered and exact shifts in a participant’s focused attention can be determined (Pirini 2014, 2015, 2017). In Norris (2016, 154f.), it is shown that the participant shifts his focus to the Skype call between minute 3:54 and 3:57. Thus, in addition to the argument in Norris (2011) that we need to employ a multimodal lens in order to gain greater insight into everyday interaction, this article demonstrates that it is also the scale of a data piece chosen (Norris 2017) that reveals lesser or greater insight into everyday interaction.


3. Beginning a Skype call

The data pie­ce sel­ec­ted here shows a New Zea­land par­ti­ci­pant during the initi­al Sky­pe call to his sis­ter in Aus­tra­lia. The data is recor­ded in his home on a rese­arch lap­top and a tri­pod-stan­ding came­ra. Two rese­ar­chers are pre­sent and are inter­ac­ting with the par­ti­ci­pant, the participant’s part­ner and each other. Both rese­ar­chers and the part­ner of the par­ti­ci­pant are out of came­ra view at this point. The part­ner is not audi­ble at the very begin­ning, but then beco­mes audi­ble and also par­ti­al­ly visi­ble in the video as she picks up a pho­ne next to the par­ti­ci­pant in order to lea­ve the room to call her mother (Figu­re 2). Then, later in the Sky­pe con­ver­sa­ti­on (not shown here), she also inter­acts with all Sky­pe par­ti­ci­pan­ts and beco­mes a par­ti­ci­pant herself.

Figure 2 is a multimodal transcript of the very beginning of the research session. The multimodal transcript follows multimodal transcription conventions (Norris 2002, 2004, 2011, 2019) and the utterances are color-coded to illustrate speaker changes. In Figure 2, images 1–5, we see the laptop screen as it is changing during the beginning of a Skype call. In image 4, we see the participant’s utterance ›gonna go on‹ with the intonation pattern displayed as an approximate curve. Image 5 then shows that the first researcher says ›I’m gonna start recording‹ when the external tripod-standing camera begins to record the participant as he is sitting at a desk in front of the research laptop, trying to establish a connection with his sister. The second researcher responds to the first with ›yep‹ (image 6). A very brief moment later, the participant says ›ahm‹ (image 6) and continues (images 7–9) with ›am I sitting up straight‹. His voice starts out low and increases slightly in volume as he straightens up his posture (images 7–10) and as he turns to and looks at the researchers and his partner (images 9–10). In images 11–14, we see the participant turning back towards the laptop and, with this turn, shifting his gaze and his head, slouching slightly forward, and relaxing his arms. Throughout, the participant displays a wide smile, showing the humor in the question.

Figure 2: Dialing up.

What we see here is that the participant produces high modal density in his interaction with the researchers, showing his interactional focus. The modal density foreground-background continuum and, as mentioned before, the analysis of a slightly later excerpt are discussed in detail in Norris (2016). Here, the participant reacts to the researchers’ utterances about starting the recording and makes a joke about sitting up straight for the camera, indicating that this is meant to be funny by demonstratively sitting up straight and smiling widely towards the researchers and the camera. Thus, the participant uses the mode of language, the mode of posture, the mode of hand-arm movement, the mode of head movement, the mode of gaze and the mode of facial expression, building up high modal density through both intensity (of language and facial expression) and complexity (through modal interconnectedness).

Simultaneously, however, as explicated in detail in Norris (2016, 152ff.), the participant is not unaware of the Skype call that he has initiated. Rather, he is paying medium attention to the call by sitting in front of the laptop with his torso turned toward the screen so that he can easily be seen once the connection is established. He hears the ringing of the Skype call and is doubtless listening to it. Further, we can surmise that he has not forgotten that his partner is in the room. They were engaged in interaction before the research session began, are engaged when she gets her phone and during technology breakdowns, and are later interacting with his relatives together. Thus, we can say that even when the partner is not in the same room, the participant is aware of his partner, if merely through proxemics (the partner being at home), paying some interactive attention to her.


4. Problematizing data piece selection

As discussed above, the participant is clearly focused upon the interaction with the researchers while he initiates his Skype call. However, we can only determine this focus when we include this segment in our analysis and transcribe it as illustrated in Figure 2. When a researcher dismisses this very segment as irrelevant and begins the analysis at a point when the participant is actually interacting with his sister and his niece(s) via Skype, the participant’s actual interactional focus becomes obscured. In other words, when a researcher focuses only upon the actual Skype interaction as exemplified in Figure 1, the researcher is inclined to view the Skype conversation without hesitation as the participant’s focused interaction. A participant’s focus, I would like to argue, is most often not analyzed; rather, it is usually presupposed by researchers in two respects:

  1. The researcher’s focus may be a Skype conversation or particular instances in Skype conversations such as an adult directing a child to look at the screen. A researcher may therefore be interested in interactions where participants on both sides of the screen are interacting with each other, and presupposes that a participant focuses upon the interaction that the researcher is interested in.
  2. The researcher presupposes that if a participant is engaged verbally with other participants, then the participant must unquestionably be focused upon the interaction.

Here, I would like to argue that both of the­se pre­sup­po­si­ti­ons can be fal­se and may lead to a mis­re­a­ding of focu­sed inter­ac­tions. As dis­cus­sed in detail by Pash­ler (1998, 38), eye move­ment does not neces­s­a­ri­ly indi­ca­te a social actor’s focu­sed atten­ti­on. Simi­lar­ly, as dis­cus­sed in detail by Nor­ris (2011), lan­guage pro­duc­tion does not neces­s­a­ri­ly indi­ca­te a social actor’s focu­sed attention.

Accor­ding to the ana­ly­sis in Nor­ris (2016, 155ff.), the participant’s shift in inter­ac­tion­al focus occurs clo­se to minu­te 4 in the data. Here is what hap­pens: The pie­ce tran­scri­bed abo­ve ends at 00:00:16:01. At 00:00:20:25, the Sky­pe con­ver­sa­ti­on beg­ins with the adults gree­ting and an inter­ac­tion emer­ges bet­ween the par­ti­ci­pant, his sis­ter and one of her two daugh­ters as dis­cus­sed in detail below (see also Nor­ris 2019, 190ff.). For about 105 seconds, the inter­ac­tion runs smooth­ly, then a tech­no­lo­gy cut-off occurs. The par­ti­ci­pant and his part­ner inter­act during this cut-off. At minu­te 2, the con­nec­tion is re-estab­lished, and the Sky­pe con­ver­sa­ti­on con­ti­nues until a new tech­no­lo­gy glitch occurs around minu­te 3. Throug­hout the­se three minu­tes, the par­ti­ci­pant is focu­sed upon eit­her the rese­ar­chers or his part­ner. Yet, he speaks with his sis­ter in Aus­tra­lia and with two of her child­ren (only inter­ac­tions with one of them are dis­cus­sed here). First, a mul­ti­mo­dal tran­script is pre­sen­ted and this is fol­lo­wed by an audio tran­script. Audio tran­scripts use some con­ven­ti­ons from Tan­nen (1984) so that: ›?‹ means strong rising into­na­ti­on, a com­ma means slight rising into­na­ti­on, and a peri­od means lowe­red into­na­ti­on. Over­lap is indi­ca­ted with squa­re bra­ckets. The par­ti­ci­pant shown in Figu­re 1 is cal­led Part (for par­ti­ci­pant), his part­ner in New Zea­land is cal­led Part­ner, Rese­ar­cher 1 and 2 are R1 and R2 respec­tively, and the two child­ren are cal­led Child 1 and Child 2 in the audio tran­scripts. Fur­ther, the children’s mother is here cal­led Sis­ter sin­ce she is the sis­ter of the par­ti­ci­pant and our focus here is the participant.

Figure 3 is a direct continuation of the multimodal transcript in Figure 2, and Figure 1 is taken from the very last segment in Figure 3. The transcript (Figure 3) is then followed by the audio transcript (Audio Transcript 1), which shows the language used by all in Figures 2 & 3. The language in the multimodal transcripts is color-coded (see footnote 2).

Figu­re 3: Con­nec­ting and begin­ning the interaction.

The first part (Audio Tran­script 1) beg­ins at the same point as the mul­ti­mo­dal tran­script shown in Figu­re 2 and ends with the end of the mul­ti­mo­dal tran­script in Figu­re 3.

Audio Tran­script 1: Begin­ning a Sky­pe call.

As illustrated in Audio Transcript 1, lines 1 through 20, the participant is calling his sister in Australia, his partner in New Zealand is telling him that she is going to call her own mother and the two researchers are speaking quietly in the background (lines 2 & 3 and 8–11). As soon as the call goes through and the participant’s sister picks up, the interactants greet each other (lines 9 & 10) and the sister immediately inquires about being seen. As soon as visual connection is assured, the sister asks one of her daughters to ›say hi‹ (line 14) and the child does as she has been asked. Now the child greets the participant and the participant greets the child (lines 15–17). Then, the sister of the participant directs the child’s gaze to the screen and the participant reacts to seeing the little girl’s face on screen (lines 19 & 20; also Figure 1). Between the time the Skype call begins and the end of this excerpt, of which Figure 1 is a part, the participant fills five lines (Audio Transcript 1, lines 9–20). However, what he says only requires medium attention on his part. The reason is that he uses Skype often and, according to our findings in the larger study, it is a very common opening to first greet each other and then inquire about whether one can be seen. Thus, here the participant speaks, going through the everyday motions when beginning a Skype call. His focus is still on being recorded for a research project even though he is engaged verbally with his sister and her young daughter. In Figure 3, images 26–29, we see a researcher placing his bottle of beer on the desk and in images 46 & 47 (which directly follow and actually overlap with Figure 1), we see a researcher looking over the participant’s shoulder. The proxemics between the participant, the researchers and his partner allow us to analyze the strong modal density that is produced between participant, researchers and partner. Thus, even though the participant appears to focus upon the Skype interaction through his posture, gaze and language use, he in fact is focused upon the interaction with the researchers (and at times with his partner). Here, in image 47 (Figure 3), we see the researcher smiling and right after (not shown here), the participant responds to the researcher’s outbreath.

Next, the sister tells her brother that her daughter is dressed up for him and the conversation continues about what she is wearing. Here, the participant utilizes common social etiquette, asking what the girl is wearing and then commenting that ›it is lovely.‹ Again, the participant’s focused attention is not needed to continue the conversation and appear fully engaged. Here, I say »appear engaged« because later, almost at the end of the three minutes, we see that his sister in fact knows that he is not fully focused upon the conversation as she inquires ›are you alright?‹ But there are ways other than social etiquette and regular openings of a Skype conversation that he utilizes to engage in a conversation without being fully focused upon it: he synchronizes his utterances with those of his interlocutors through repetition.

For some time, the con­ver­sa­ti­on is dri­ven by the litt­le girl, who speaks with the par­ti­ci­pant about coming to his house (Figu­re 4 and Audio Tran­script 2).

Figu­re 4: Three-year old con­ver­sing with the par­ti­ci­pant (her uncle).

At her young age of 3, she tri­es to express hers­elf as shown in Figu­re 4 and in Audio Tran­script 2 (lines 33–37).

Audio Tran­script 2: Syn­chro­ni­zing utteran­ces with tho­se of the child.

But what is remar­kab­le here is that the par­ti­ci­pant uses the child’s utteran­ces, refor­mu­la­tes them slight­ly and par­rots back to the child what she has said with only very slight chan­ges (Figu­re 4 and Audio Tran­script 2).

Then, when his sister, for example, comes to the rescue to clarify what the little girl was telling the participant (Figure 5, image 7 and Audio Transcript 3, line 54), he again uses repetition to continue the conversation without much attentional effort. Here, the child tells her uncle what she wants to get, her mother corrects the word ›Mohawk‹ and the participant repeats part of the word, ending in OK.

Figu­re 5: Mother cor­rec­ting speech.

In all nine images of the mul­ti­mo­dal tran­script (Figu­re 5), we see a rese­ar­cher stan­ding or moving behind the par­ti­ci­pant. Cle­ar­ly, the par­ti­ci­pant, who can see the rese­ar­cher on the lap­top screen, is high­ly awa­re of being recor­ded and obser­ved. Howe­ver, he enga­ges con­ti­nuous­ly with his fami­ly mem­bers via Sky­pe. He does this easi­ly through the use of repe­ti­ti­on (Audio Tran­script 3).

Audio Tran­script 3: Syn­chro­ni­zing utteran­ces with the child and the sister.

Thus, the participant skillfully repeats what the child and his sister say by using their utterances and forming them into his own (Bakhtin 1981). By doing so, he appears to be listening and paying close attention without a need to actually pay focused attention. Social etiquette and synchronization of his utterances with those of the interlocutors thus enable him to verbally engage in the mid-ground of his attention, while he is simultaneously still focused upon the research session. Synchronization in interaction has been shown to produce connection (Breyer et al. 2017). Verbally synchronizing, not at the time when the same thing is being said, but synchronizing what is being said, allows the participant in the above example to establish and display connection while he mid-grounds this interaction. As shown earlier (Norris 2011), in close relationships, an attention constellation in which one interlocutor pays focused attention to the other (here, it would be the little girl, for example) while the other attends to the conversation in the mid-ground (here, it would be our participant) enables the focused interlocutor to speak freely. Thus, depending upon the interaction, such an attention constellation, where one interlocutor pays focused attention and the other mid-grounds the interaction, can be experienced as comfortable by the person paying focused attention.

The fact that the par­ti­ci­pant is not ful­ly focu­sed upon the Sky­pe inter­ac­tion can also be seen when his part­ner chi­mes in and comm­ents on the litt­le girl as soon as a tech­no­lo­gy break­down occurs. At that point, this brief exch­an­ge occurs (Mul­ti­mo­dal Tran­script (Figu­re 6) and Audio Tran­script 4, lines 61–67).

Figu­re 6: Con­nec­tion is lost.

Audio Tran­script 4: Con­ver­sa­ti­on bet­ween par­ti­ci­pant and his partner.

However, in Figure 6 (images 5 & 6 and 8 & 9) and Audio Transcript 4 (lines 65–67), the participant sounds thoughtful rather than speaking with the hint of humor that one would expect when a person speaks about something being funny. As demonstrated, for example, in Helmholtz (1896) or Cherry (1953), attention is selective, and here it appears that the participant is actively selecting to shift his focus to the Skype interaction. This shift, however, does not occur immediately. As shown in Bernad-Mechó (2017), a shift in focus often goes through an intermediate stage in which the interlocutor is neither fully focused upon one interaction nor on the other. When the participant reconnects, he becomes more active as an interlocutor, asking about the other girl and speaking about her hair. But in the next few utterances, he again reverts to synchronizing his utterances with his interlocutors’ previous utterances and then again becomes more active as he asks questions of the girls. The point here is that he seems to be going through a transition from focusing upon taking part in a research project and mid-grounding the Skype interaction to focusing upon the Skype interaction and mid-grounding the research session. His sister reacts to his interactional attention when she asks ›are you alright?‹ and shortly after, as discussed in detail in Norris (2016, 160), he refocuses completely upon the Skype session. As illustrated there, the participant displays a clear multimodal focus so that he uses his posture, his facial expression, hand/arm movements and gaze as well as language to interact via Skype. His engagement has changed not only multimodally, but also in the rhythm of his speech. Thus, we find a change in rhythm once he has refocused. While the rhythm in the first three minutes of the Skype call is slow, the rhythm increases in intensity as soon as the participant has refocused.


5. Conclusion

This article problematizes analytical approaches to discourse in which history, memory, and overall context are disregarded, in which minute samples are cut from much larger data pieces without taking a broader view of the data, and in which researchers or cameras recording the participants are assumed to be irrelevant to the participants’ attention. Participants’ attention can neither be presupposed based on micro-data piece selection by a researcher, nor can it be presupposed based on language use by the interlocutors. In order to make this point, the article began by first showing a micro-data piece (Figure 1), where the focused interaction of the participant seems to be apparent. Then, by analyzing the few images from Figure 1 in their context (Figure 3, images 42–45), it is demonstrated that micro-data pieces can easily be misleading. Figure 1, as shown above, was a micro excerpt taken from the end of the multimodal transcript (Figure 3), which clearly demonstrates that, when looking at the longer excerpt, the participant is actually not focused upon the Skype call in Figure 1.

Current and past attention literature gives insight into aspects of attention and is explicated in Norris (forthcoming). But briefly, scholars differ in their assumptions of what happens during interaction. Gundel et al. (1993), Brennan (1995), Levelt (1989), or Clark and Marshall (1981), for example, work with the theoretical assumption of ›speakers’ models of listeners’ knowledge‹ (Bard et al. 2000, 3). At a later moment in the interactions (which I can only touch upon here), we find that the sister (in that case the speaker) reacts to the displayed mid-grounded attention by the participant (in that case the hearer) when she asks whether he is all right. Thus, here, we find a case of the speaker’s model of the listener’s attention. Simultaneously, however, we find the opposite viewpoint posited by scholars such as Chafe (1994), Arnold and Lao (2015), or Bard et al. (2000), who claim that speakers focus more on their own speech than on the interlocutor. This may be particularly evident when the participant selects to pay focused attention to the Skype call rather than to the research session. Even though this selection takes time to fully perform, he changes his speaking style to ask questions and become more actively involved.

This article thus demonstrates interactional attention (Norris 2002, 2004, 2006, 2008, 2011, 2016, 2019), which is an aspect of a phenomenal conception of attention. Interactional attention (Norris 2004) brings together the two points of view on attention discussed above. In other words, with this framework, we see that interlocutors do judge and react to the interactional attention of others. At the very same time, with this framework, we can determine when and how speakers focus more on their own than on others’ actions. In order to properly analyze interactional attention, we have to analyze each interlocutor individually and analyze the interlocutors together, as suggested in Norris (2011). The reason for this is that one interlocutor may pay a different level of attention to an interaction than another (see also Norris 2006).

This article furthermore demonstrates that synchronization of utterances, not at the same time, but in the form of repetition, can allow an interlocutor to interact verbally without having to pay focused attention. This kind of synchronization can have two effects (depending upon the situation): 1. It can demonstrate that the one synchronizing their utterances with those of the other is listening; and 2. It can produce a connection. However, while this is the case in the interaction analyzed above, more research is needed to determine under which circumstances such synchronization functions in this way. Further, it has been suggested above that the rhythm of interaction changes when a person focuses upon it after having previously mid-grounded the interaction. Here too, more research is needed to discover if a change of rhythm is always present after a change in focus.

The artic­le thus shows that when theo­ri­zing atten­ti­on as inter­ac­ti­ve atten­ti­on, we can deter­mi­ne the atten­ti­on level of par­ti­ci­pan­ts in inter­ac­tion in a theo­re­ti­cal­ly groun­ded man­ner. This way of working ent­ails a broa­der point of view of stu­dy­ing inter­ac­tion, both in the direc­tion of modal use bes­i­des lan­guage (i.e. mul­ti­mo­da­li­ty) and in the direc­tion of delinea­ting what it is that we are exami­ning (moving the rese­arch inte­rest bey­ond the ins­tance that rese­ar­chers may find rele­vant). Modal den­si­ty, it is shown, is achie­ved through eit­her inten­se or com­plex usa­ges of modes, resul­ting in rhythm and pace of speech as well as rhythm and pace of other modes (Nor­ris 2009).

This artic­le cri­ti­cal­ly asses­ses the assump­ti­on that lan­guage use by par­ti­ci­pan­ts neces­s­a­ri­ly leads the rese­ar­cher to the par­ti­ci­pan­ts’ focu­sed inter­ac­tion. As shown with the mul­ti­mo­dal tran­scripts in Figu­res 1–5 and in con­nec­tion with this in Nor­ris (2016), a true focus of atten­ti­on by a par­ti­ci­pant can only be deter­mi­ned if we take a broa­der view of our data. If we are sim­ply picking and choo­sing brief excerp­ts that we as rese­ar­chers are for wha­te­ver reason focu­sed upon, we can­not make any claims about the focus of the par­ti­ci­pan­ts. Fur­ther, we can make no cla­im about the rele­van­ce or importance of the lan­guage that is being used. Lan­guage may be used by a par­ti­ci­pant in the focus, but lan­guage may also be used in the mid-ground and even in the back­ground of a participant’s atten­ti­on. The stu­dy of inter­ac­tion­al atten­ti­on pro­mi­ses to help us dis­co­ver a who­le host of new fin­dings that, as long as we as rese­ar­chers insist that lan­guage pro­duc­tion always occurs in the focus of a par­ti­ci­pant, we may actual­ly miss. Thus, we need to stop picking minu­te inter­ac­tion­al sequen­ces which dis­re­gard the big­ger data pie­ces that the minu­te ones are a part of.

Here, it is necessary to realize that whether language is utilized in the focus, the mid-ground or the background of somebody’s always multimodally displayed attention does not make language less important! Rather, it is specifically the fact that social actors can and do use language on a range of attentional levels that moves us into a new and highly promising direction of research. Norris (2011), for example, showed that a participant who was focusing upon a conversation with her friend was delighted that her friend did not reciprocate the attention that she herself paid to the conversation. In fact, the participant who was focusing upon the conversation felt safe, un-judged and taken seriously by her friend, who was mid-grounding the conversation. Similarly, school children may wrongfully be told to pay focused attention and look at their teacher, when in fact they might be learning much better by not displaying interactional focus to what is being said or done. Likewise, other interactions where the one in a lower power position is asked to give information, such as some manager-worker interactions or some parent-child interactions and possibly even some doctor-patient interactions, may proceed much more smoothly if the one in power does not pay focused interactional attention to the interlocutor, thereby demonstrating that the one giving the information is safe, un-judged and taken seriously. However, this is only a suggestion and much research is needed in order to determine how and when interactional focus is helpful and when it is not. But one thing is certain: Research into interactional attention, which is research that crosses micro-analytical boundaries, will have social ramifications with practical dimensions.

References

Bakhtin, Mikhail Mikhai­lo­vich 1981: The Dia­lo­gic Ima­gi­na­ti­on. Aus­tin, TX: Uni­ver­si­ty of Texas Press.

Ber­nad-Mechó, Edgar 2017: Meta­dis­cour­se and Topic Intro­duc­tions in an Aca­de­mic Lec­tu­re: A Mul­ti­mo­dal Insight. In: Mul­ti­mo­dal Com­mu­ni­ca­ti­on 6/1, 39–60.

Brey­er, Thiemo/Buchholz, Michael/Hamburger, Andreas/Pfänder, Ste­fan (eds.) 2017: Reso­nanz, Rhyth­mus & Syn­chro­ni­sie­rung: Inter­ak­tio­nen in All­tag, The­ra­pie und Kunst. Bie­le­feld: transcript.

Chafe, Wallace L. 1994: Discourse, consciousness, and time. Chicago: University of Chicago Press.

Clark, Her­bert H./Marshall, Cathe­ri­ne R. 1981: Defi­ni­te refe­rence and mutu­al know­ledge. In: Ara­vind K. Joshi/Bonnie L. Webber/Ivan A. Sag (eds.): Ele­ments of dis­cour­se under­stan­ding. Cam­bridge: Cam­bridge Uni­ver­si­ty Press, 10–63.

Cherry, E. Colin 1953: Some experiments on the recognition of speech, with one and with two ears. In: Journal of the Acoustical Society of America 25, 975–979.

Helmholtz, Hermann von 1896: Handbuch der physiologischen Optik. L. Voss.

Levelt, Willem J. M. 1989: Speaking. Cambridge, MA: MIT Press.

Nor­ris, Sig­rid 2002: A theo­re­ti­cal frame­work for mul­ti­mo­dal dis­cour­se ana­ly­sis pre­sen­ted via the ana­ly­sis of iden­ti­ty con­s­truc­tion of two women living in Ger­ma­ny. Dis­ser­ta­ti­on. Depart­ment of Lin­gu­i­stics, George­town University.

Nor­ris, Sig­rid 2004: Ana­ly­zing Mul­ti­mo­dal Inter­ac­tion: A Metho­do­lo­gi­cal Frame­work. Lon­don: Routledge.

Norris, Sigrid 2006: Multiparty interaction: a multimodal perspective on relevance. In: Discourse Studies 8/3, 401–421.

Norris, Sigrid 2008: Some thoughts on personal identity construction: A multimodal perspective. In: Vijay Bhatia/John Flowerdew/Rodney H. Jones (eds.): New Directions in Discourse. London: Routledge, 132–149.

Norris, Sigrid 2009: Tempo, Auftakt, levels of actions, and practice: rhythms in ordinary interactions. In: Journal of Applied Linguistics 6/3, 333–356.

Nor­ris, Sig­rid 2011: Iden­ti­ty in (Inter)action: Intro­du­cing Mul­ti­mo­dal (Inter)action Ana­ly­sis. Berlin/Boston: Mouton.

Norris, Sigrid 2017: Scales of action: An example of driving & car talk in Germany and North America. In: Text & Talk 37/1, 117–139.

Nor­ris, Sig­rid 2019: Sys­te­ma­ti­cal­ly working with mul­ti­mo­dal data: Rese­arch methods in mul­ti­mo­dal dis­cour­se ana­ly­sis. Hobo­ken, NJ: John Wiley and Sons.

Nor­ris, Sig­rid (Forth­co­ming): Mul­ti­mo­dal Theo­ry and Metho­do­lo­gy: for the Ana­ly­sis of (Inter)action and Iden­ti­ty. New York: Routledge.

Pashler, Harold E. 1998: The psychology of attention. Cambridge, MA: MIT Press.

Piri­ni, Jes­se 2014: Pro­du­cing Shared Attention/Awareness in High School Tuto­ring. In: Mul­ti­mo­dal Com­mu­ni­ca­ti­on 3/2, 163–179.

Piri­ni, Jes­se 2015: Tuto­ring as Know­ledge Com­mu­ni­ca­ti­on: A mul­ti­mo­dal (Inter)action Ana­ly­sis. Unpu­blished PhD the­sis. Auck­land Uni­ver­si­ty of Tech­no­lo­gy, New Zealand.

Pirini, Jesse 2017: Agency and Co-production: A Multimodal Perspective. In: Multimodal Communication 6/2, 1–20.

Tan­nen, Debo­rah 1984: Con­ver­sa­tio­nal Style: Ana­ly­zing Talk Among Fri­ends. Nor­wood, NJ: Ablex.


Online sources

Arnold, Jennifer E./Lao, Shin-Yi C. 2015: Effects of psychological attention on pronoun comprehension. In: Language, Cognition and Neuroscience 30, 832–852. DOI: 10.1080/23273798.2015.1017511.

Bard, Ellen Gurman/Anderson, Anne H./Sotillo, Catherine/Aylett, Matthew/Doherty-Sneddon, Gwyneth/Newlands, Alison 2000: Controlling the intelligibility of referring expressions in dialogue. In: Journal of Memory and Language 42 (1), 1–22. DOI: 10.1006/jmla.1999.2667.

Brennan, Susan E. 1995: Centering attention in discourse. In: Language and Cognitive Processes 10 (2), 137–167. DOI: 10.1080/01690969508407091.

Gundel, Jeanette K./Hedberg, Nancy/Zacharski, Ron 1993: Cognitive status and the form of referring expressions in discourse. In: Language 69, 274–307. DOI: 10.2307/416535.

Norris, Sigrid 2016: Concepts in multimodal discourse analysis with examples from video conferencing. In: Yearbook of the Poznań Linguistic Meeting 2 (1), De Gruyter Open, 141–165. ISSN (Online): 2449-7525, DOI: 10.1515/yplm-2016-0007.



1 I would like to thank the Freiburg Institute for Advanced Studies (FRIAS), University of Freiburg, Germany and the People Programme (Marie Curie Actions) of the European Union's Seventh Framework Programme (FP7/2007–2013) under REA grant agreement no. [609305] for making the writing of this article possible. I would also like to thank the Faculty of Design and Creative Technologies, the School of Communication Studies, and the AUT Multimodal Research Centre at Auckland University of Technology in New Zealand for funding the project that this article is based upon. Further, I would like to thank the participants in the Family Video Conferencing Interactions Project.

2 Utteran­ces: par­ti­ci­pant = white, part­ner = green, sis­ter = yel­low, rese­ar­chers = pink, child = blue. All images are published with per­mis­si­on of the participants.