How Can Single-Case Data Be Analyzed? Software Resources, Tutorial, and Reflections on Analysis

The present article aims to present a series of software developments in the quantitative analysis of data obtained via single-case experimental designs (SCEDs), as well as the tutorial describing these developments. The tutorial focuses on software implementations based on freely available platforms such as R and aims to bring statistical advances closer to applied researchers and help them become autonomous agents in the data analysis stage of a study. The range of analyses dealt with in the tutorial is illustrated on a typical single-case dataset, relying heavily on graphical data representations. We illustrate how visual and quantitative analyses can be used jointly, giving complementary information and helping the researcher decide whether there is an intervention effect, how large it is, and whether it is practically significant. To help applied researchers in the use of the analyses, we have organized the data in the different ways required by the different analytical procedures and made these data available online. We also provide Internet links to all free software available, as well as all the main references to the analytical techniques. Finally, we suggest that appropriate and informative data analysis is likely to be a step forward in documenting and communicating results and also for increasing the scientific credibility of SCEDs.

Keywords: single-case designs, data analysis, effect size, software, tutorial

The current article aims to bring analytical developments closer to applied researchers conducting single-case studies by presenting and illustrating free software available for carrying out the analysis; this software is accompanied by a free 270-plus-page tutorial. First, we offer a brief presentation of single-case experimental designs (SCEDs). Second, we stress the need to focus on data analysis, including both visual and statistical analyses, and we explain why the current article represents a step forward in this topic. Third, we provide an illustration, applying several analytical techniques to a real dataset selected as representative of SCED studies.

Sustained and Increased Attention to SCEDs
The three main characteristics of an SCED (focus on one entity, repeated measures across time, and experimental control) can be used to define SCEDs in a broader research context, which might help in understanding how these designs differ from closely related designs. In terms of focus, unlike group-comparison studies, which deal with average treatment effect estimates, SCED studies focus on a limited number of preselected individuals, and subject-specific treatment effect estimates are obtained (Barlow, Nock, & Hersen, 2009). Moreover, unlike the most common between-group studies, SCEDs make it possible to identify within-subject trends. In terms of experimental control, SCEDs do not always entail random assignment of measurement occasions to treatments. However, an SCED should not be confused with a (qualitative) case study (Blampied, 2000) or observational case study research, as in these latter types of studies there is neither a purposeful manipulation of an independent variable nor necessarily repeated measures.
From an applied perspective, the relevance and recognition of SCEDs are evident from the number of papers recently dedicated to the topic in disciplines as varied as speech-language pathology (Byiers, Reichle, & Symons, 2012), pediatric psychology (Cohen, Feinstein, Masuda, & Vowles, 2014), education (Plavnick & Ferreri, 2013), technology-based medical interventions (Dallery, Cassidy, & Raiff, 2013), sport psychology (Gorczynski, 2013), rehabilitation (Graham, Karmarkar, & Ottenbacher, 2012), and group work (Macgowan & Wong, 2014). This is well aligned with the recognition of SCEDs as a means of obtaining evidence about interventions (Howick et al., 2011). Accordingly, the conclusion of Smith's (2012) review of 409 studies was that "recently published SCED research is largely in accordance with contemporary criteria for experimental quality" (p. 510).
From an academic perspective, the salience of SCEDs is also illustrated by the publication in the last decade of revised editions of classical and major reference books on SCED methodology and analysis, such as the ones by Barlow et al. (2009), Kazdin (2011), Gast and Ledford (2014), and Kratochwill and Levin (2014). Moreover, a considerable number of journal special issues have been dedicated to SCEDs from a methodological and/or analytical point of view (e.g., Journal of Behavioral Education, Journal of Applied Sport Psychology, Remedial and Special Education in 2013, Journal of School Psychology in 2014, Neuropsychological Rehabilitation). Accordingly, the U.S. Institute of Education Sciences (2014) continues to show interest in funding research related to single-case methodology.

SCED Researchers' Data-Analytical Practices
The salience and wider acceptance of SCEDs as a valid methodology for obtaining scientific evidence have been translated into methodological advances such as standards (Smith, 2012), methodological quality scales (Tate et al., 2013), and further recommendations (Horner et al., 2005). Concrete recommendations about reporting SCED studies are also available (Tate et al., 2016). In terms of data analysis, it has been recommended that objective summary measures be used for documenting results, communication across researchers, and meta-analysis (Busse, Kratochwill, & Elliott, 1995; Jenson, Clark, Kircher, & Kristjansson, 2007; Kromrey & Foster-Johnson, 1996), but this is not always the case, given the strong predominance of visual analysis (Kratochwill & Brody, 1978; Parker et al., 2005; Perdices & Tate, 2009). In that sense, it is possible that, apart from methodological improvements such as randomization, the scientific credibility of SCEDs could be boosted by improving the analytical practice of applied researchers and practitioners in their everyday work.
Several reviews have been performed on the way in which SCED data are analyzed. Parker and Brossart (2003) performed an informal review of SCED articles in counseling, clinical, and school psychology journals in the period 1987-2002, reporting that visual analysis was used in the absence of statistical analysis in more than 65% of articles. The review of Perdices and Tate (2009) in the field of neuropsychological rehabilitation in the period 1991-2008 showed that 78% of the articles reported graphed data, 64% used some kind of statistical analysis, and 26% used visual analysis alone. Smith (2012) reviewed articles from peer-reviewed journals for the period 2000-2010 and reported that visual analysis alone was used in 21.6% of the studies using multiple-baseline designs, 17.1% of the studies using a reversal design, and 23.1% of alternating treatment designs, whereas the corresponding percentages for statistical analysis alone were 13.4, 12.9, and 7.7 and those for combined visual and statistical analysis were 6.4, 5.7, and 19.2. Considering the repeated calls to use visual and statistical analysis jointly (Fisch, 2001; Franklin, Gorman, Beasley, & Allison, 1996; Harrington & Velicer, 2015; Houle, 2009), apparently it is still necessary to stress that point, given the relatively infrequent combined use of these two types of analyses.
To complement visual analysis with quantitative analysis, the following steps could be followed: (a) create sound techniques, test their statistical properties with simulated data, and field test them with real behavioral data; (b) illustrate their application in a step-by-step fashion in the same way as some of the papers published in the abovementioned special issues (e.g., Heyvaert & Onghena, 2014); (c) present the developments at conferences and workshops (e.g., Manolov, Krasny-Pacini, Evans, & Chevignard, 2014); (d) develop software and tutorials; and (e) write papers presenting the software and tutorials (e.g., Bulté & Onghena, 2009; Onghena & Van Damme, 1994). The current article corresponds to the fifth step, whereas a description of the tutorial is provided in the appendix. In the following, we present the breadth and limitations of the usefulness of the current article.

Usefulness of the Software, the Tutorial, and Current Article
We consider that there are two basic ways in which guidance can be provided to applied researchers conducting SCEDs and willing to analyze their data quantitatively. The first option is to review the evidence and discussions available on the performance and the characteristics of all/many analytical techniques, and recommend which techniques should be used when. This option would deal with the question "What is the use of a good analytical technique, if applied researchers are not aware of its existence and/or qualities?" We do mention here most of the recent developments and show how they can be applied, but we do not compare their quality or appropriateness for establishing recommendations for choosing among them in different situations. This latter topic is dealt with in Manolov and Moeyaert (2016) and it is important for future research.
The second option is to make the application of all/many analytical techniques a feasible task by providing free software and a tutorial explaining how to use it. We have been creating such software and reviewing the software created by other authors, and we have also created such a tutorial. Therefore, in the current article we aim to illustrate the capabilities of a variety of analytical techniques with a typical SCED dataset. Moreover, the current article helps to address the question "What is the use of a myriad of analytical proposals if these cannot be applied easily and with no additional cost?" Thus, the purpose is to bring recent SCED analytical developments closer to applied researchers and to bridge the gap between statistical advances and actual analytical practice by making applied researchers aware that multiple user-friendly software resources are available. However, we do not claim that such information is sufficient for improving the analytical skills of researchers.
In summary, the need for software, a tutorial, and an article presenting them is based on the following points: (a) the software presented is freely available; (b) the tutorial guides the application of the analytical techniques in a step-by-step fashion, relying heavily on screenshots, commenting on the results obtained, and referring the interested reader to the original literature presenting the techniques; (c) many different quantitative techniques are included in the software and tutorial and illustrated here, given that SCED data are complex, and that the necessary visual analysis deals with several aspects of the data such as assessing baseline stability, within-phase level and trend, changes in level, trend and variability, overlap, immediacy of effects, consistency of the patterns, and comparing observed and projected data patterns (Lane & Gast, 2014), apart from taking into account whether the predictions make sense or not (Parker, Cryer, & Byrns, 2006); (d) visual analysis is also part of the software, tutorial, and current illustration, given that it can be useful for selecting an analytical technique and for evaluating whether the quantitative results obtained are intuitively meaningful (Parker et al., 2006); actually, there is a graphical representation of the data accompanying almost every quantitative procedure, to ensure that the assessment of intervention effectiveness takes into account both types of information as suggested repeatedly (Davis et al., 2013; Fisch, 2001; Smith, 2012); (e) the techniques included and illustrated show a wide variety of complexities, from simple graph rotation to building multilevel models; (f) one of the main complexities of the software is the different data structures required by the different pieces of R code/packages, but we include here an Excel file, as complementary online material, with separate worksheets representing all the data structures required for all analytical techniques included in the tutorial (even if they are not illustrated here); (g) by illustrating the different uses and types of information provided by the different techniques, we are also implicitly giving some indications regarding the crucial questions "Which techniques should be used when?" and "How should I analyze my data?"
Regarding previous papers on single-case data analysis (SCDA) software tools, most have a narrower focus than the current article. In chronological order, the following documents have been made public: Nagler, Rindskopf, and Shadish (2008) illustrated how UnGraph can be used for retrieving data from graphs, and they also showed how multilevel models can be applied with SPSS and HLM software; Dixon et al. (2009) explained how to create graphs with Microsoft Excel; Bulté and Onghena (2013) presented the SCDA plugin for R; Parker, Vannest, and Davis (2014a) mentioned the WinPepi free software (http://www.brixtonhealth.com/pepi4windows.html) for computing Tau-U for designs beyond the basic AB (baseline phase followed by an intervention) and also for meta-analytical use; Levin, Ferron, and Gafurov (2014) and Levin, Evmenova, and Gafurov (2014) described the use of the Excel-based ExPRT for randomization tests; de Vries, Hartogs, and Morey (2015) presented R code for Bayesian effect size estimation and hypothesis testing; Maric, de Haan, Hogendoorn, Wolters, and Huizenga (2015) explained the use of SPSS for piecewise regression analysis; and Busse, McGill, and Kennedy (2015) mentioned a webpage (http://www.interventioncentral.org/teacher-resources/graph-maker-free-online) that graphs the data and allows computing trend, the percentage of nonoverlapping data, and the standardized mean difference using the standard deviation of the baseline data in the denominator. The most general article on SCED analysis software is by Chen, Peng, and Chen (2015), published after the submission of the current article and dealing with several different types of software and analyses.
Chen et al. (2015) provided less detail regarding use and interpretation of the output than the one available in the current article and in our tutorial. In that sense, Chen et al. (2015) focused more on whether the software tools function properly, whereas we deal with how to interpret their outcomes. In terms of interpretation, we rely heavily on graphical representations to aid the interpretation of the results of the techniques, given the importance of visual analysis as a way of validating quantitative results (Parker et al., 2006). In contrast, Chen et al. (2015) paid more attention to the formulaic expression of the analytical techniques. Finally, we present a tutorial created by us, which describes the use of many R scripts also developed by us and useful for implementing a variety of procedures (created or suggested by a variety of authors), apart from describing software developed by other researchers. In summary, we consider that the Chen et al. (2015) article, together with the current text, and the tutorial we are presenting here offer sufficient information for applied researchers to know how to implement virtually any SCED analytical technique, once it has been chosen.

Illustrative Data
To choose an appropriate dataset for illustrating the analytical techniques and the output of the software, we took into account the characteristics of SCED data as reported in recent reviews (Shadish & Sullivan, 2011; Solomon, 2014) and also the design requirements for making possible the demonstration of intervention effects (Tate et al., 2013). As a result, we chose the data collected by Singh and colleagues (2007) on mindfulness training for controlling aggressive behavior in people diagnosed with several mental disorders such as depression, schizoaffective disorder, borderline personality, and antisocial personality. These data have the following characteristics: (a) a multiple-baseline design is used, which is the most common design structure (present in 54% of studies), including three cases, which represent the median and modal number of cases per study (Shadish & Sullivan, 2011), and it meets the design requirements from the What Works Clearinghouse Standards; (b) the numbers of data points for each comparison between baseline and treatment condition are 16 (for Jason), 18 (for Michael), and 22 (for Tim), matching well the median number of 20 measurements found by Shadish and Sullivan (2011); the baselines have lengths of 3, 4, and 6, which matches well the finding that 54% of the baselines have fewer than 5 measurements, and meets current standards of a minimum of 3 (Tate et al., 2013); (c) the number of outcomes measured per case is 2 (verbal and physical aggression), which corresponds well to the finding that most SCEDs (60%) include more than one outcome per case, and these outcomes represent the number of aggressive behaviors, which is well aligned with the fact that 48% of the SCEDs use total counts as measures (Shadish & Sullivan, 2011); (d) some of the AB comparisons present baseline trend, but these trends are heterogeneous (some improving, some deteriorating), as found in the Solomon (2014) review.
Thus, we can consider that the dataset analyzed here is typical for SCEDs, although the complexity of possible SCED data patterns is impossible to illustrate with a single dataset and impossible to represent in a single study. The reader interested in learning more about SCED data is referred to scholarly texts by Barlow et al. (2009), Gast and Ledford (2014), Johnston and Pennypacker (2009), Kazdin (2011), Kennedy (2005), and Kratochwill and Levin (2014).

Data Analysis
We use the free software (mainly R packages and R code, but also web pages) that is described in the tutorial. Practically all analyses performed require only entering the data in a specific way. We have included as supplementary material an Excel file containing the data organized in all necessary ways. The only technique requiring more than data input is the multilevel model. We provide the exact code used here as supplementary material in a text file.

Results
The results will be presented in the following order: First, we focus on the within-phase levels and comparisons of the levels in different conditions. We start by visually inspecting the whole dataset, including two outcomes for each of the three participants, before moving on to comment on specific AB comparisons according to the data characteristics of interest and according to the aspects that visual aids and quantitative analysis deal with. Second, we take a look at within-phase trends and changes in trend across conditions, once again beginning the analysis with all the data, before moving to some comparisons between pairs of conditions (baseline and treatment data for one of the outcomes for one of the participants). Third, we present procedures that quantify changes in both level and slope. The reader will see that some of the analyses also entail a comparison between projected and actually obtained (intervention phase) data. Fourth, we comment on procedures for quantifying data overlap. All these analytical techniques (and several more) can be implemented with the free software discussed in the tutorial we present here.
In the process of looking at the different aspects of the data and quantifying any changes taking place, we will show that some analytical techniques (e.g., the d-statistic by Hedges and colleagues and multilevel models) combine results from separate AB comparisons in a more direct way than others (e.g., nonoverlap indices). For the latter, we will illustrate how such integrations can easily (although slightly more laboriously) be obtained from the separate quantifications of behavioral change.
Figure 1 shows that for both outcomes of all three participants, there is a change in level consistent with the desired intervention effect. This effect is visually clearer for physical aggression as an outcome and for Jason as a participant.

Level
The naked-eye visual inspection can be complemented with visual aids (Visual tools section of the tutorial), such as the standard deviation bands (Callahan & Barisa, 2005; Pfadt & Wheeler, 1995), which are more appropriate when the baseline does not show a clear trend. When looking into Michael's data (Figure 2), we see that the physical aggressions are lower than what can be expected by projecting the baseline mean and considering the baseline variability; for verbal aggression, this visual aid suggests lack of effect. This is consistent with our general impression about effect according to the outcome stated above.
Note (Figure 1). Obtained using the SCDA plug-in for R (http://cran.r-project.org/web/packages/RcmdrPlugin.SCDA/index.html) described in the Visual tools section of the tutorial. The dashed line represents the within-phase median. SCDA = single-case data analysis.
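As a rough illustration of the standard deviation bands logic, the following Python sketch projects the baseline mean into the intervention phase and counts how many intervention measurements fall outside the band. The data, the function name `sd_band_summary`, the use of the sample standard deviation, and the band width of two standard deviations are all assumptions made here for illustration; this is not the R code referred to in the tutorial.

```python
from statistics import mean, stdev

def sd_band_summary(baseline, treatment, n_sd=2.0):
    """Project baseline mean +/- n_sd * SD and count treatment points outside the band."""
    center = mean(baseline)
    spread = stdev(baseline)  # sample SD (n - 1 denominator); an assumption of this sketch
    lower = center - n_sd * spread
    upper = center + n_sd * spread
    below = sum(1 for y in treatment if y < lower)
    above = sum(1 for y in treatment if y > upper)
    return {"lower": lower, "upper": upper, "below": below, "above": above}

# Hypothetical counts of a target behavior (not the Singh et al. data)
baseline = [8, 9, 10, 9]
treatment = [7, 5, 3, 2, 1, 0]
summary = sd_band_summary(baseline, treatment)
print(summary)
```

When the aim is to reduce a behavior, intervention points below the lower band are the ones suggesting an effect beyond baseline variability.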
If a researcher is willing to quantify the difference in level, she or he can turn to the procedures included in the following sections of the tutorial: percentage indices not quantifying overlap, unstandardized indices and their standardized versions, or application of two-level multilevel models for analyzing data (see the appendix). One option is to use the mean baseline reduction (Campbell, 2004) or the percentage change index (Hershberger, Wallace, Green, & Marquis, 1999; also called percentage reduction data, Wendt, 2009). Focusing on the verbal aggressions by Jason and Michael (Figure 3), we can see that the mean baseline reduction taking all the data into account is similar (73.7% vs. 76.8%), whereas if the quantification is based only on the last 3 points of each phase, due to some substantive reason, the effect is clearer for Jason (percentage reduction data 94.4% vs. 77.8%), confirming our visual impression and attaching an objective quantitative summary to it. Note that both indices convert the raw measures into percentages and thus make the results comparable (e.g., despite the fact that Michael initially showed fewer verbal aggressions). However, in some cases it may be justified to expect the effect to accumulate at the last (three) intervention data points, whereas the choice of the last baseline measurements may be due to the stability (if the researcher waited for the baseline to stabilize before intervening). All values of the percentage change index (100% and 94.4% for Jason; 100% and 77.8% for Michael; 100% and 68.8% for Tim for physical and verbal aggressions, respectively) indicate that the reduction in target behavior is substantial and potentially clinically relevant. Moreover, this index confirms our visual impression that the change for physical aggression is greater.
Note (Figure 2). Obtained using R code (https://dl.dropboxusercontent.com/s/elhy454ldf8pij6/SD_band.R) and described in the Visual tools section of the tutorial.
Note (Figure 3). Obtained using R code (https://www.dropbox.com/s/wt1qu6g7j2ln764/MBLR.R?dl=0) and described in the Percentage indices not quantifying overlap section of the tutorial.
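The two percentage indices can be sketched in a few lines of Python. The sketch assumes that both indices express the drop from the baseline mean to the intervention mean as a percentage of the baseline mean, with the second version restricted to the last three points of each phase, as described in the text. The data and function names are hypothetical, and the code is not the MBLR.R script mentioned in the tutorial.

```python
from statistics import mean

def mean_baseline_reduction(baseline, treatment):
    """Percentage reduction of the treatment mean relative to the baseline mean."""
    mean_a = mean(baseline)
    if mean_a == 0:
        raise ValueError("Index is undefined when the baseline mean is zero.")
    return 100.0 * (mean_a - mean(treatment)) / mean_a

def percentage_reduction_last(baseline, treatment, k=3):
    """Same percentage, computed only on the last k points of each phase."""
    return mean_baseline_reduction(baseline[-k:], treatment[-k:])

# Hypothetical behavior counts (not the Singh et al. data)
baseline = [10, 12, 11, 9]
treatment = [8, 6, 4, 3, 2, 1, 1, 1]

mblr = mean_baseline_reduction(baseline, treatment)
prd = percentage_reduction_last(baseline, treatment)
print(f"Mean baseline reduction: {mblr:.1f}%")
print(f"Percentage reduction (last 3 points): {prd:.1f}%")
```

With data that improve gradually, as here, the index based on the last three points is larger than the one based on all data, which mirrors the pattern described for Jason.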
Another option is to compute a standardized mean difference, as proposed by Hedges, Pustejovsky, and Shadish (2012, 2013). This index allows obtaining a single quantification (controlling for small sample size and for autocorrelation) for a multiple-baseline design or an (AB)^k design if there are at least three cases present. For physical aggression, we obtained d = −2.25 (SE = 0.55) and for verbal aggression d = −1.44 (SE = 0.55), once again reflecting the clearer effect for the former type of behavior. In comparison with the previous indices expressed in percentages, this one is expressed in standard deviations: The variability within and between cases is taken into account. Actually, it should be noted that while the numerator of the d-statistic deals with differences in level, its denominator takes variability (another data feature that is the object of visual analysis) into account. Moreover, as the d-statistic by Hedges et al. (2012, 2013) is comparable with the classical d-statistic for group designs, it could be argued that this index can be interpreted in terms of overlap, assuming normality of the distributions (Vacha-Haase & Thompson, 2004). It is also possible to combine the d-statistic values meta-analytically. For merely illustrative purposes, we will proceed here as if the physical and verbal aggression outcomes were independent (a requisite for meta-analysis), although they are not, as the data are obtained from the same participants. Figure 4 shows the forest plot, from which we see that the weighted average is d = −1.85 and its 95% confidence interval ranges from −2.64 to −1.05, indicating the relatively low precision of the point estimate and its statistical significance at the .05 level. The true heterogeneity observed in the effects is very small (I² = 7.18%), as it is smaller than the usual cutoff for small heterogeneity, 25%. The results obtained so far indicate that the effect of the intervention is large, at least in quantitative terms.
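To make the meta-analytic integration more concrete, the following Python sketch combines the two reported d values with inverse-variance weights. The text does not state which estimator was used; the DerSimonian-Laird random-effects estimator used here is an assumption, and the function name is hypothetical. Because the inputs are the rounded values reported above, quantities such as I² can differ slightly from the published ones.

```python
from math import sqrt

def random_effects_summary(effects, std_errors, z=1.959964):
    """Pool effect sizes with DerSimonian-Laird random-effects weights (assumed estimator)."""
    weights = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(weights)
    fixed = sum(w * d for w, d in zip(weights, effects)) / sum_w
    # Cochran's Q and the between-study variance tau^2
    q = sum(w * (d - fixed) ** 2 for w, d in zip(weights, effects))
    df = len(effects) - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c)
    i2 = max(0.0, 100.0 * (q - df) / q) if q > 0 else 0.0
    # Random-effects weights add tau^2 to each sampling variance
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(w * d for w, d in zip(w_star, effects)) / sum(w_star)
    se_pooled = sqrt(1.0 / sum(w_star))
    return pooled, pooled - z * se_pooled, pooled + z * se_pooled, i2

# d and SE for physical and verbal aggression, as reported in the text (rounded)
pooled, ci_low, ci_high, i2 = random_effects_summary([-2.25, -1.44], [0.55, 0.55])
print(f"d = {pooled:.3f}, 95% CI [{ci_low:.2f}, {ci_high:.2f}], I2 = {i2:.1f}%")
```

Under these assumptions, the rounded inputs give a pooled value and interval in line with those described for Figure 4; the heterogeneity percentage is of the same small magnitude, though not identical to the reported one.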

Immediate Change
When looking at and quantifying the level in each condition, it is possible to focus not only on the average but also on the immediacy of the effect, which is one of the aspects assessed in visual analysis. Most of the datasets in Figure 1 show that the main reduction takes place after 1 or 2 weeks, suggesting that a change in slope might describe the type of effect better than a change in level. However, the data for verbal aggression for Jason seem to suggest an immediate change. Immediate change can be quantified via piecewise regression analysis (Center, Skiba, & Casey, 1985-1986), from the "Unstandardized indices and their standardized versions" section of the tutorial; this analysis also quantifies change in slope, which is why it is also discussed in the "Level and Trend" subsection later. The graphical representation and quantification for Jason provided in Figure 9 show that the immediate effect of the intervention consists of a reduction of 7.15 behaviors. In contrast, for Michael, the decrease is estimated, according to piecewise regression, as half a behavior (−0.5). However, if we look at the data actually obtained and not at the regression lines fitted, the immediate effect of the intervention is actually a deterioration: an increase from two to three physically aggressive behaviors. Moreover, the regression trend fitted to the intervention phase data does not seem to represent them well. Thus, this quantification should also be interpreted with caution.
Note (Figure 4). The values of the d-statistic are obtained using the "scdhlm" package for R (http://blogs.edb.utexas.edu/pusto/software/), whereas the forest plot is obtained using R code (https://www.dropbox.com/s/41gc9mrrt3jw93u/Across%20studies_d.R?dl=0) and described in the Integrating results of several studies section of the tutorial.
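The idea of an immediate change estimated from fitted regression lines can be sketched as follows. A piecewise model with a separate intercept and slope for each phase produces the same fitted lines as two separate ordinary least squares regressions, so the sketch fits one line per phase and compares them at the first intervention occasion. The time coding, the function names, and the data are assumptions made for illustration; the parameterization in the tutorial's code may differ.

```python
def ols_line(xs, ys):
    """Return (intercept, slope) of the ordinary least squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def piecewise_changes(baseline, treatment):
    """Immediate change (at the first treatment occasion) and change in slope."""
    n_a = len(baseline)
    t_a = list(range(1, n_a + 1))
    t_b = list(range(n_a + 1, n_a + len(treatment) + 1))
    int_a, slope_a = ols_line(t_a, baseline)
    int_b, slope_b = ols_line(t_b, treatment)
    t0 = t_b[0]  # first intervention occasion (assumed reference point)
    immediate = (int_b + slope_b * t0) - (int_a + slope_a * t0)
    return immediate, slope_b - slope_a

# Hypothetical, noise-free data: baseline rises by 1 per session,
# then drops by 3 at the first intervention session and falls by 1 per session.
baseline = [9, 10, 11, 12]
treatment = [10, 9, 8, 7, 6, 5]
immediate, slope_change = piecewise_changes(baseline, treatment)
print(immediate, slope_change)
```

As the text cautions, the estimate compares fitted lines rather than observed values, so it can disagree with the raw difference between the last baseline and first intervention measurements.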

Trend
As the previously presented graphical representations and quantifications do not take trend into account, further analyses are necessary, as the consideration of trend might change our initial conclusions. If we turn our attention to trend (Figure 5), we can see that in all AB comparisons there seems to be a change in (ordinary least squares regression) trend with the introduction of the intervention. For Jason and Tim, flat and worsening trends give way to improving trends after the intervention (indicating intervention effectiveness), whereas for Michael the trend stops being as improving as it was before the intervention. The latter finding is related to the fact that counts lower than 0 are impossible. Note how looking at the trends makes the evaluation of the data an easier task, as complexity is reduced. However, at the same time, one should assess to what extent the trends match the measurements obtained. Actually, we illustrated here the simplest linear model, although the type of data (with an achievable minimum of 0) may require a logistic model. Moreover, we used ordinary least squares estimation, which is appropriate for continuous data (i.e., not counts); more complex and potentially more appropriate options can be consulted in Shadish, Zuur, and Sullivan (2014).
Apparently, the only baseline trends that are cause for concern are the ones observed for Michael. One way of dealing with trends (included in the Visual tools section of the tutorial) is to physically rotate the graph (Figure 6) after a trisplit trend has been fitted, so that this trend is now perfectly horizontal (Parker, Vannest, & Davis, 2014b). Afterward, a quantification of choice can be computed on the rotated data (Parker et al., 2014b).

Comparing Observed and Projected Data Patterns
It is also possible to fit a split-middle (i.e., bisplit) trend to the data and project it into the next phase to explore whether the projected and actual data are similar, taking baseline data variability into account. The illustration provided for Michael's verbal aggression (Figure 7) shows that no intervention phase measurements improve what could already be predicted from the baseline. If a quantification is desired, the percentage of data points exceeding the split-middle trend (Wolery, Busick, Reichow, & Barton, 2010) can be computed. Figure 7 shows graphically and numerically that few (28.6%) of the intervention phase data points improve the projected baseline trend. These results suggest that the effect on Michael's verbal aggressive behavior is not clear.
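A minimal Python sketch of the split-middle projection and the percentage of intervention points exceeding it is given below. It assumes a simplified version of the split-middle method: the baseline is divided into two halves (dropping the middle point when the number of points is odd), and the line passes through the median time and median value of each half, without any further adjustment of the line. Data and function names are hypothetical, and this is not the PEM-T.R script from the tutorial.

```python
from statistics import median

def split_middle_line(baseline):
    """Return (intercept, slope) of a line through the medians of the two baseline halves."""
    n = len(baseline)
    half = n // 2
    times = list(range(1, n + 1))
    # For odd n the middle point is left out of both halves (an assumption of this sketch)
    x1, y1 = median(times[:half]), median(baseline[:half])
    x2, y2 = median(times[n - half:]), median(baseline[n - half:])
    slope = (y2 - y1) / (x2 - x1)
    return y1 - slope * x1, slope

def percent_exceeding_trend(baseline, treatment, lower_is_better=True):
    """Percentage of treatment points that improve on the projected baseline trend."""
    intercept, slope = split_middle_line(baseline)
    n_a = len(baseline)
    improved = 0
    for i, y in enumerate(treatment):
        projected = intercept + slope * (n_a + 1 + i)
        better = (y < projected) if lower_is_better else (y > projected)
        if better:
            improved += 1
    return 100.0 * improved / len(treatment)

# Hypothetical counts with an already improving baseline (not the Singh et al. data)
baseline = [6, 5, 5, 4]
treatment = [4, 3, 3, 2, 2, 1]
pem_t = percent_exceeding_trend(baseline, treatment)
print(f"{pem_t:.1f}% of intervention points improve on the projected trend")
```

When the baseline is already improving, as in this hypothetical case, many intervention points merely continue the projected trend, so the percentage stays moderate even though the raw counts decrease.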
Another possible quantification is the mean phase difference (MPD; Manolov & Rochat, 2015; Manolov & Solanas, 2013b) from the "Unstandardized indices and their standardized versions" section of the tutorial. In this case, similar to the percentage of data points exceeding the split-middle trend, a trend line is extended, but there are two differences: (a) in MPD, trend is based on differencing, not on the split-middle method, and (b) the difference between projected and actual intervention phase measurements is computed, instead of focusing on overlap. As can be seen from Figure 8, the projections for Michael (into impossible negative values, as was the case using split-middle trend) and for Tim differ from the actual measurements obtained.
Note (Figure 5). Obtained using the SCDA plug-in for R (http://cran.r-project.org/web/packages/RcmdrPlugin.SCDA/index.html) and described in the Visual tools section of the tutorial. The dashed line represents the within-phase trend estimated by ordinary least squares regression. SCDA = single-case data analysis.
Figure 6. Graphical representation of the use of the graph rotation technique (Parker et al., 2014b) on the data obtained by Singh et al. (2007) for the verbal aggressions by Michael.
Note. The graphs are obtained using R code (https://www.dropbox.com/s/jxfoik5twkwc1q4/GROT.R?dl=0) and described in the Visual tools section of the tutorial.
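The two features that distinguish MPD, a trend estimated by differencing and a comparison of projected with actual intervention measurements, can be sketched as follows. The baseline trend is taken as the mean of the first differences of the baseline data. Anchoring the projection at the last baseline observation is an assumption of this sketch; the published versions of MPD may define the starting point of the projection differently. Data and function names are hypothetical.

```python
from statistics import mean

def mean_phase_difference(baseline, treatment):
    """Mean difference between observed treatment data and the projected baseline trend."""
    # Baseline trend: average of the first differences of the baseline data
    diffs = [later - earlier for earlier, later in zip(baseline, baseline[1:])]
    trend = mean(diffs)
    # Projection anchored at the last baseline value (assumption of this sketch)
    last = baseline[-1]
    projected = [last + trend * (i + 1) for i in range(len(treatment))]
    mpd = mean([obs - pred for obs, pred in zip(treatment, projected)])
    return mpd, projected

# Hypothetical behavior counts (not the Singh et al. data)
baseline = [10, 9, 9, 7]
treatment = [5, 4, 2, 1, 0]
mpd, projected = mean_phase_difference(baseline, treatment)
print(mpd, projected)
```

The sketch does not truncate the projection at zero, so with a long intervention phase and an improving baseline it would project into impossible negative counts, which is the situation described for Michael.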

Level and Trend
Figure 7. Graphical representation of the use of the split-middle trend and percentage of data points exceeding it, as well as a projection taking baseline data variability into account.
Note. The data are the verbal aggressions by Michael, as collected by Singh et al. (2007). The graphs are obtained using R code (https://www.dropbox.com/s/rlk3nwfoya7rm3h/PEM-T.R?dl=0, described in the Nonoverlap indices section of the tutorial, and https://dl.dropboxusercontent.com/s/5z9p5362bwlbj7d/ProjectTrend.R described in the Visual tools section of the tutorial). Note the different ways in which the y-axis is represented and its possible effect on visual inspection.
Note to Figure 8. The data are the verbal aggressions by Michael and Tim, as collected by Singh et al. (2007). The graphs are obtained using R code (https://www.dropbox.com/s/nky75oh40f1gbwh/MPD.R?dl=0, described in the Unstandardized indices and their standardized versions section of the tutorial).
Note to Figure 9. The average verbal and physical behaviors across the three participants are obtained first (using https://www.dropbox.com/s/74lr9j2keclrec0/Within-study_SLC_std.R, and described in the Unstandardized indices and their standardized versions section of the tutorial), before obtaining the global weighted average. This graph is obtained using R code (https://www.dropbox.com/s/wtboruzughbjg19/Across%20studies.R) and described in the Integrating results of several studies section of the tutorial.
According to what we have seen from Figure 5, it might be interesting to quantify the changes in both level and slope, given that both types of effects are present. However, these effects are not always in the same direction. Here, we will use tools from the tutorial sections "Unstandardized indices and their standardized versions" and "Application of two-level multilevel models for analyzing data" (see the appendix). Using the slope and level change procedure (Solanas, Manolov, & Onghena, 2010), we obtain the following quantifications of change in slope for physical aggression: −0.25 for Jason, 0.10 for Michael, and −0.60 for Tim, indicating that the improving change in slope is stronger for Tim than for Jason, and that Michael's physically aggressive behavior is being reduced at a slightly slower rate after the intervention than before. Looking at the net level change (once slope change is controlled for), we obtain −0.86 for Jason, 0.37 for Michael, and −1.44 for Tim. We once again observe an apparent deterioration for Michael, but it is due to the strong effect of the correction of baseline trend. Thus, in this case, the numerical result for the net level change does not seem to agree with the visual impression and we will stick to the latter, as it seems to represent the data better. The average of the slope change estimates for physical aggression, weighted by the number of measurements in each AB comparison, is −0.27 (see Figure 9). This value can be interpreted as an average progressive decrease of almost three physically aggressive behaviors per 10 intervention phase measurements. The weighted average level change for physical aggression is −0.86, that is, an average difference of less than one behavior. When we look at the same quantifications for verbal aggressions, we see a weighted average slope change of −0.30 (similar to physical aggression) and a weighted average level change of −2.47 (indicating a much larger change than for physical aggression). The latter estimate for the average net level change diverges from our initial visual impression that the intervention effect is clearer for physical aggression. This difference is due to the strong influence of the baseline trend estimated and controlled for in the case of Jason, leading to a level change of −6.85 for this participant (apart from the change in slope of −1.67). Thus, the numerical values in this case have to be interpreted with caution.
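The weighting by number of measurements mentioned above amounts to a simple weighted mean. The following Python sketch (not the R code of the tutorial) shows the arithmetic; the slope change values are the ones reported in the text, whereas the series lengths are hypothetical placeholders, so the result is not expected to reproduce the reported −0.27.

```python
def weighted_average(effects, n_measurements):
    """Average of the effects, weighted by the number of measurements."""
    total = sum(n_measurements)
    return sum(e * n for e, n in zip(effects, n_measurements)) / total


# Slope change for physical aggression (Jason, Michael, Tim), from the text.
slope_changes = [-0.25, 0.10, -0.60]
# Hypothetical number of measurements in each AB comparison.
series_lengths = [18, 22, 25]

average_slope_change = weighted_average(slope_changes, series_lengths)
print(round(average_slope_change, 2))
# Interpretation per 10 intervention-phase measurements, as in the text.
print(round(10 * average_slope_change, 1))
```

Multiplying the average slope change by 10 gives the accumulated change over 10 intervention phase measurements, which is how the reported −0.27 translates into a decrease of almost three behaviors.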
To continue exploring the changes in level and in slope, it is possible to use piecewise regression analysis (Center et al., 1985-1986). We will turn our attention to the verbally aggressive behaviors for Jason, for whom we previously mentioned an immediate reduction of 7.15 behaviors. Moreover, the change in slope is negative, suggesting a reduction of 1.6 behaviors per measurement occasion, as compared with the baseline trend. In this case, the results are very similar to the ones provided by the slope and level change procedure, and in both cases it has to be considered whether a trend can be fitted reliably to only three baseline data points and whether it can be expected to continue in the same way (to very high values) throughout the intervention phase.
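To make the two quantities concrete, the following Python sketch (not the Piecewise.R code of the tutorial) fits one ordinary least squares line per phase, which gives the same fitted values as a single regression with a phase indicator and a phase-by-time interaction. It assumes independent errors and defines the immediate change at the first intervention occasion; the data and function names are hypothetical.

```python
from statistics import mean


def ols_line(times, values):
    """Return (intercept, slope) of the least squares line."""
    t_bar, y_bar = mean(times), mean(values)
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, values))
    slope = sxy / sxx
    return y_bar - slope * t_bar, slope


def piecewise_changes(baseline, intervention):
    """Immediate level change and slope change between two phases."""
    n_a = len(baseline)
    t_a = list(range(1, n_a + 1))
    t_b = list(range(n_a + 1, n_a + len(intervention) + 1))
    a0, a1 = ols_line(t_a, baseline)
    b0, b1 = ols_line(t_b, intervention)
    first_b = t_b[0]
    # Fitted intervention value minus projected baseline value at the
    # first intervention occasion.
    level_change = (b0 + b1 * first_b) - (a0 + a1 * first_b)
    slope_change = b1 - a1
    return level_change, slope_change


# Hypothetical data with only three baseline points, as for Jason.
print(piecewise_changes([2, 4, 6], [5, 4, 3, 2]))
```

With only three baseline points, the baseline slope is determined by very little information, which is the reliability concern raised in the text.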
Next, we turn our attention to the verbal aggression measurements for Michael, as we said that the level change estimate of the slope and level change procedure seemed to disagree with the visual impression. From Figure 10, we observe that the change in slope (0.19) as estimated through piecewise regression is consistent with the slope and level change procedure and with the visual impression of small change.
Finally, another analytical technique that makes possible quantifying change in level and in slope is a multilevel model. Multilevel models (also referred to as hierarchical linear models and mixed linear models) allow modeling the dependencies in SCED studies (i.e., the measurements belonging to the same individuals are autocorrelated, see the reviews by Shadish & Sullivan, 2011; Solomon, 2014) and also when combining several SCED studies (i.e., the outcomes from the same study are not assumed to be independent). Moreover, these models permit modeling both fixed effects (e.g., the same baseline level for all individuals) and random effects (e.g., different effects of the intervention on time trend for the different individuals). In that sense, what are obtained are both average estimates across cases (e.g., an average level change for verbal aggression and for physical aggression) and estimates of the variation between cases in these changes. Moreover, individual-level (shrunken) estimates are provided using Empirical Bayes estimation. Other modeling capabilities include taking autocorrelation and heterogeneous variance into account. However, one of the potential limitations of the technique is that it requires many Level 2 units, unless data series of at least 20 measurements are available, to ensure the precision of the estimates of the average effects and especially of the variance components, which should be interpreted with caution in datasets such as the current one (Ferron, Bell, Hess, Rendina-Gobioff, & Hibbard, 2009). For more details about how models can be made increasingly more complex to model different types of effects (e.g., change in level and change in slope) and to account for autocorrelation and within-case or between-cases variability, see , whereas for more information about how to specify design matrices for a variety of SCEDs, Moeyaert, Ugille, Ferron, Beretvas, and Van den Noortgate (2014) could be consulted.
Note to Figure 10. The numerical results and the graphs are obtained using R code (https://www.dropbox.com/s/bt9lni2n2s0rv7l/Piecewise.R?dl=0), and described in the Unstandardized indices and their standardized versions section of the tutorial.
Given that two outcomes per participant were measured, we apply two two-level models (measurements nested into individuals): one for verbal aggression and one for physical aggression. In the current example, with only three participants, it was not possible to model all data aspects that we wanted to model as random effects (i.e., allowing for variation between participants): baseline trend, change in level, change in trend, and autocorrelation. Thus, we did not model baseline trend. The graphical representation of the results (Figure 11) shows that for physical aggression, all participants start from a very similar level (2.19; SE = 0.33) and they also show similar changes in level (−0.56; SE = 0.43) and in slope (−0.16; SE = 0.04). First, considering that baseline trend was not modeled, the average estimate of the initial baseline value (i.e., the intercept) may not be a good representation of the actual measurements. Second, these values for the two types of effects are somewhat smaller than the ones obtained by the slope and level change procedure, but both are consistent with the general visual impression and offer a quantification of the amount of change. The amount of difference should be assessed taking into account the fact that the number of physically aggressive behaviors observed ranges from 0 to 4. Such information is potentially useful for stating whether the effect observed is rather small or large.
For verbal aggression, there is greater variability in the initial levels (an average of 6.97, SE = 2.50, with individual estimates equal to 11.81, 4.02, and 5.09), in the change in level (an average of −2.58, SE = 1.46, with individual estimates equal to −4.99, −2.15, and −0.62), and in slope (an average of −0.30, SE = 0.16, with individual estimates equal to −0.61, −0.13, and −0.18). The greater variation is consistent with the visual inspection and makes clear how much larger the effect is for Jason. When such an amount of variability (which is also statistically significant) is observed, multilevel models allow for including moderator variables to account for it, but we will not complicate the analysis further here. Another interesting result is that the average effects (Table 1) are very similar to the ones provided by the slope and level change procedure (−2.58 vs. −2.47 for change in level; −0.30 according to both for change in slope).

Overlap
Figure 11. Graphical representation of the multilevel (two-level) models (modeling random immediate effect and change in slope) applied separately on data obtained by Singh et al. (2007) for physical and verbal aggressions.
Note. The graphs are obtained adapting slightly the R code (https://www.dropbox.com/s/slakgbeok8x19of/Two-level.R?dl=0) and described in the application of two-level multilevel models for analyzing data section of the tutorial. The slim lines are the predicted values for each participant. The thick line for the model for verbal aggression represents the average predicted values for all three participants.
Note to Table 1. SAS proc mixed is a commercial alternative for carrying out multilevel analysis. NA = result not available from the output of the R package "nlme"; NP = result not provided for this model by the R package "nlme"; CI = 95% confidence interval.
Note to Figure 12. The numerical results and the graphs are obtained using R code (https://dl.dropboxusercontent.com/u/2842869/Tau_U.R) described in the Nonoverlap indices section of the tutorial.
We have already shown a progression from simpler analyses (i.e., visual) to more complex ones (i.e., multilevel models). However, there is one other type of information usually taken into account by visual analysts: data overlap. From Figure 1, we see that there is very little overlap between the measurements pre- and postintervention for Jason and Michael, and thus, according to this criterion, the intervention effect is clearer for them than for Tim. In terms of quantification, there have been many nonoverlap indices proposed (see , and the "Nonoverlap indices" section of the tutorial), but we will focus here on two relatively recent and promising proposals (NAP by ; and Tau-U by Parker, Vannest, Davis, & Sauber, 2011) instead of the classical Percentage of nonoverlapping data (Scruggs, Mastropieri, & Casto, 1987). The main difference between the two indices is the possibility to control for baseline trend. Figure 12 focuses on the verbal aggressive behaviors by Jason and Michael. For Jason, there is no data overlap (NAP ≡ A vs. B = −1.00), and there is also a trend that is deteriorating in general. However, due to the fact that the second baseline point is actually an improvement compared with the first one, Tau-U (A vs. B − trendA) controls for this improvement and yields a value of −.95. For Michael, the general
An index sometimes used jointly with nonoverlap indices (e.g., Wehmeyer et al., 2006) is the percentage of zero data (Wolery et al., 2010): a quantification that could be useful for the current data, given that the aim is to eliminate the aggressive behavior. According to this index, the majority of measurement occasions after the first 0 remain at 0 (90.9% and 80% for Jason; 100% and 63.6% for Michael; 84.6% for Tim's physically aggressive behavior), except for Tim's verbal aggression (28.6%). Taking this result into account, together with the fact that there is only one 0 value in all six baseline phases, the effect of the intervention seems to be a practically relevant one.
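The overlap quantifications discussed in this section can be illustrated with a short Python sketch (not the Tau_U.R code of the tutorial). It computes the nonoverlap of all pairs (NAP), the signed A vs. B comparison without baseline trend correction, and the percentage of zero data. The data are hypothetical, the function names are ours, and counting the first zero itself within the percentage of zero data is an assumption of this sketch.

```python
def nap(baseline, intervention):
    """Nonoverlap of all pairs when a decrease is the desired change."""
    score = 0.0
    for a in baseline:
        for b in intervention:
            if b < a:        # improvement
                score += 1.0
            elif b == a:     # ties count as half
                score += 0.5
    return score / (len(baseline) * len(intervention))


def tau_a_vs_b(baseline, intervention):
    """Signed A vs. B comparison: -1 means every B value is below every A value."""
    pairs = [(a, b) for a in baseline for b in intervention]
    increases = sum(1 for a, b in pairs if b > a)
    decreases = sum(1 for a, b in pairs if b < a)
    return (increases - decreases) / len(pairs)


def percentage_zero_data(intervention):
    """Percentage of zeros from the first zero onward (None if no zero occurs)."""
    if 0 not in intervention:
        return None
    tail = intervention[intervention.index(0):]
    return 100.0 * tail.count(0) / len(tail)


# Hypothetical series with complete nonoverlap.
baseline = [8, 9, 10]
intervention = [4, 2, 0, 0, 1, 0]
print(nap(baseline, intervention))
print(tau_a_vs_b(baseline, intervention))
print(percentage_zero_data(intervention))
```

The sign convention of `tau_a_vs_b` matches the value of −1.00 reported in the text for complete nonoverlap when the behavior decreases. The baseline trend correction of Tau-U is deliberately left out, as its exact computation is not described here.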

Variance
Variance is important when analyzing SCED data as stable baselines are traditionally considered necessary for further comparisons (Kazdin, 1978). Moreover, data variability reduces the degree to which estimates of average level and trend are meaningful and representative of the data. Therefore, data variability is considered when constructing standard deviation bands around average levels (Figure 2) and when projecting trend (Figure 7): see the "Visual tools" section of the tutorial. Despite the importance of variance, it is not usually the object of the intervention or the focus of the data analysis (but see Winkens, Ponds, Pouwels-van den Nieuwenhof, Eilander, & van Heugten, 2014, for whom the increased variability of behavior after the intervention was an important reason for deeming it unsuccessful). Nevertheless, the information about data variability is incorporated into the standardized mean difference indices (Glass et al., 1981; Hedges et al., 2012, 2013), and the information about the variability around the baseline trend lines is suggested as part of the weighting strategy for the MPD (Manolov & Rochat, 2015). Finally, multilevel models allow modeling heterogeneous unexplained data variability in the baseline and the intervention phases (see Table 4 in .
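As an illustration of how baseline variability enters a standard deviation band, the following Python sketch (not the SD_band.R code of the tutorial) builds a band around the baseline mean and lists the intervention measurements falling outside it. The multiplier of 2 and the use of the sample standard deviation are assumptions of this sketch, and the data are hypothetical.

```python
from statistics import mean, stdev


def sd_band(baseline, k=2.0):
    """Lower and upper limits: baseline mean -/+ k sample standard deviations."""
    centre = mean(baseline)
    spread = stdev(baseline)
    return centre - k * spread, centre + k * spread


def points_outside_band(baseline, intervention, k=2.0):
    """Intervention measurements lying outside the baseline band."""
    lower, upper = sd_band(baseline, k)
    return [y for y in intervention if y < lower or y > upper]


# Hypothetical data: a fairly stable baseline followed by a reduction.
baseline = [4, 6, 5, 5]
intervention = [3, 4, 2, 0]
print(sd_band(baseline))
print(points_outside_band(baseline, intervention))
```

A more variable baseline widens the band, so the same intervention measurements would be less likely to fall outside it.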

Consistency of the Data Across Phases
The degree to which the data pattern is similar across replications, that is, for all three participants and for both types of behaviors, can be assessed visually. The data for Michael and Jason (as per Figure 1) are similar, with a rapid and progressive reduction of aggressive behavior and elimination for physical aggression. Nevertheless, the data for Michael include an improving trend in the baseline, which makes causal attributions more difficult. For Tim, the data are more variable, but the intervention effect is also clearer for physical aggression. Tim is the only person for whom there is a deteriorating trend before the intervention.
The degree to which the effect is similar across cases has been incorporated as an output of the new version of the MPD (Manolov & Rochat, 2015), and it was also suggested to be used as part of the weight assigned to the average intervention effect for the whole study. Variability is represented via a strip chart, for the raw version of MPD expressed in number of behaviors, and for the percentage change and standardized versions (see Figure 13 for an example with the data for verbal aggression).
Moreover, multilevel models offer the possibility to model the intervention effect as fixed (i.e., the same for all AB comparisons in the multiple-baseline design) or as random (i.e., varying across cases). The information for verbal aggression from Table 1 suggests that the amount of variation in the intervention effect on level and on slope is statistically significant at the .05 level, as the 95% confidence intervals do not include 0 as a plausible value for the variance. This information concurs with the visual impression from Figure 11, in which both the immediate effect and the slopes differ across individuals.

General Assessment of the Intervention Effect
Figure 13. Graphical representation of the use of the mean phase difference procedure.
Note. The data are the verbal aggressions by all three participants, as collected by Singh et al. (2007). The graphs are obtained using R code (https://www.dropbox.com/s/ll25c9hbprro5gz/Within-study_MPD_percent.R and https://www.dropbox.com/s/g3btwdogh30biiv/Within-study_MPD_std.R), described in the Unstandardized indices and their standardized versions section of the tutorial.
Both visual analysis and the quantifications indicate that there is, in general, an effect of the intervention reducing the target behaviors (an average of two to three behaviors; almost 2 standard deviations) or even eliminating them. Nevertheless, there are differences across the participants (differences that are shown to be statistically significant). In view of the general results and the graphical information, we encourage researchers not to interpret any quantification in an isolated way, for instance, in relation to cases such as Michael's data, where estimating and projecting trends may obscure the fact that the problematic behavior is progressively reduced to elimination during the intervention phases, whereas the relatively short baseline does not allow predicting such an outcome with sufficient certainty. Thus, a sensitivity analysis is called for to explore whether similar conclusions are obtained using different modeling options (see , for a detailed example).

A Remark on Formulaic Representations
Regression and multilevel analyses require specifying a model to be used as a representation of the data. For fitting and comparing ordinary least squares regression trend lines, such as the ones represented in Figure 5, the R code described in the tutorial can be used by only entering the data. The same is the case for the piecewise regression model represented in Figure 10, for which the formulae are presented in Center et al. (1985-1986). Multilevel modeling is somewhat more complex, as the model is built according to the data features that the researcher considers relevant. Onghena (2003, 2008) and  provide indications and formulaic representations for multilevel models. Moreover, general indications about data modeling (e.g., how to model change in level and in trend in different design structures) are available in Huitema and McKean (2000) and Moeyaert, Ugille, et al. (2014). We preferred to offer verbal instead of formulaic expressions in the current article.

A Remark on Meta-Analysis
To perform a meta-analysis of SCED studies, there are several options. First, one could use a three-level model taking into account the nested structure of the data: measurements within cases within studies (Moeyaert, Ugille, et al., 2014). Second, one could use classical meta-analytical techniques with the values of the d-statistic obtained from each study , using inverse variance as a weight and obtaining confidence intervals, heterogeneity tests, and assessment of potential publication bias. Third, using other quantifications initially proposed for comparing a pair of conditions (e.g., nonoverlap indices, slope and level change, percentage change index) it is possible to obtain the average or the mean of the effects observed for each comparison. In the latter case, among the possibilities for using weights, the number of measurements seems to be a parsimonious solution (Manolov, Guilera, & Sierra, 2014; Shadish, Rindskopf, & Hedges, 2008). Once an average effect is obtained to represent the whole study (or selected due to a substantive criterion), the effects for several studies can again be combined in the same way. Fourth, although it is not exactly meta-analysis, the results of different studies can be integrated by combining probabilities (Darlington & Hayes, 2000; Rosenthal, 1978), such as the ones obtained via randomization tests (Edgington & Onghena, 2007). All four options can be performed using free code and are illustrated in the tutorial; see the section "Integrating results of several studies".
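The second option, inverse variance weighting, can be sketched as follows in Python (not the R code of the tutorial, which relies on the metafor package). The sketch assumes a fixed-effect model and a normal approximation for the 95% confidence interval; the d values and their variances are hypothetical, and heterogeneity tests and publication bias assessment are not covered.

```python
from math import sqrt


def inverse_variance_average(effects, variances, z=1.96):
    """Fixed-effect pooled estimate and its confidence interval."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    standard_error = sqrt(1.0 / sum(weights))
    return pooled, (pooled - z * standard_error, pooled + z * standard_error)


# Hypothetical d-statistics and sampling variances from three studies.
d_values = [-1.8, -0.9, -1.2]
d_variances = [0.20, 0.35, 0.15]
pooled, interval = inverse_variance_average(d_values, d_variances)
print(round(pooled, 2), tuple(round(limit, 2) for limit in interval))
```

Studies with smaller sampling variance receive larger weights, in contrast to the third option, in which the weight is simply the number of measurements.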

Discussion
In this article, the aim was to bring analytic developments within the field of SCEDs closer to the applied SCED researcher. Another possible step that would potentially improve analytical practice would be for statisticians and methodologists to collaborate with applied researchers, sharing their knowledge. Such collaborations can also bring analytical developments to real-world practice, but we consider that the efforts (such as the current article) to make practitioners autonomous are justified, as the latter are the people who have the most intimate knowledge of the client, the context (and the data in general) and should be the main actors in data analysis, instead of being detached from it.

Implications and Recommendations for Applied Researchers
The current article is intended to inform applied researchers about an easy-to-use set of analytical tools, so that they can obtain as much information as possible beyond what visual analysis can offer, always keeping in mind their objectives and the characteristics of the data. We consider that informing applied researchers about tools for visual and quantitative analyses is justified, especially because the free availability of the techniques means that the software implementations come with no additional economic cost. Moreover, the current illustration alongside the ones included in the tutorial can guide researchers in their data-analytical process.
Papers such as the current one, together with journal special issues on the topic, are intended to increase the awareness of applied researchers about the existence of different analytical techniques, their usefulness, and the way in which they can actually be applied. In that sense, it would be important to know in which research areas it is most necessary to provide such information. To deal with this topic, we first review the research areas in which SCED studies appear most frequently, and then compare these areas with areas in which special issues about SCED methodology and analysis are available. This comparison can lead to a tentative idea about the domains in which statistical analysis is lacking visibility.
On one hand, Shadish and Sullivan (2011) identified 113 studies in the fields of psychology and education from 2008. The most frequently represented journals in this sample of studies deal with behavior modification/intervention, developmental disorders (including autism spectrum disorders), and education. The review performed by Smith (2012) identified 409 articles in peer-reviewed journals in the period 2000-2010 reporting an SCED study, with the most frequent research areas being behavioral modification/interventions, developmental disorders, and school psychology. (In both reviews, special education articles are generally included in journals on developmental disorders or school psychology and education.) Moreover, area-specific reviews show that SCEDs are common in special education (Hammond & Gast, 2010), school-based intervention (Solomon, 2014), and neuropsychological rehabilitation (Perdices & Tate, 2009).
On the other hand, the areas in which special issues on SCED methodology and data have been offered include ( If we compare the areas in which SCED studies are most frequently published with the areas in which special issues on SCED analysis are available, apparently the field of behavior modification/intervention has been the object of less systematic efforts to cover a broad range of possibilities for statistical analysis and present them jointly in a synthesized form. In that sense, the current summary is timely for a journal focused on behavior modification. Nevertheless, identifying areas needing more emphasis on data analysis requires a proper review and, moreover, the references of the current article show that Behavior Modification has published articles related to SCED statistical analysis, with the current article presenting a broad overview. In summary, although it is difficult to know whether special issues are a result of an increased awareness of the importance of quantitative analyses or whether they are created due to the detection of the continuous omission of any quantitative techniques, efforts such as the special issues, as well as articles offering a broad overview such as the current one (see also Manolov & Moeyaert, 2016), along with conference presentations and workshops, are deemed to be the way to promote the use of appropriate analytical techniques.
As our aim is not only to increase awareness and provide a broad overview in a research area which would apparently benefit from such a synthesis but also to show how analytical techniques can actually be applied, we consider that, with the information available here, in Chen et al. (2015), and in Manolov and Moeyaert (2016), applied researchers are able to meet the criteria proposed by Tate et al. (2013) for obtaining the maximum score on the data analysis item in the Methodological Quality scale: structured visual analysis as detailed by Kratochwill and colleagues (2010) or Lane and Gast (2014), or visual analysis together with quasi-statistical quantifications (e.g., nonoverlap or other percentage-based indices) or statistical techniques accompanied by the justification of their choice. Improved analysis makes the publication of the results of an SCED study more likely, although methodological rigor and the interest of the results are also necessary for achieving this aim.
We consider that any potential improvement in the way in which data are analyzed or the way in which results are documented and communicated across researchers is useful for establishing the evidence basis of interventions. This in turn can contribute to increasing the scientific credibility of SCEDs. Moreover, the tutorial presented here might prompt methodologists and statisticians to improve the usability of their software or to enhance the way in which the techniques they have authored are presented in that document. The result of this interest of basic researchers could be twofold. On the one hand, software implementations, illustrations, and tutorials can lead to methodological papers having impact on the real world (not only academic impact factor), when their proposals are being used and are useful for making decisions about the (degree of) effectiveness of an intervention. On the other hand, large-scale collaborations may take place, leading to the development of a single software package including all major analytical proposals for SCEDs and providing quantifications as well as a graphical representation of their meaning. Such software would help applied researchers use their time more efficiently as, for instance, they would not need to organize their data in different ways according to the analysis to be used.

Limitations and Future Research
Several potential limitations need to be made explicit. First, in the current illustration of the possibilities for SCED data analysis, we included several but not all possible analytical techniques. A detailed illustration and discussion of each of the techniques would have taken a lot of space: the tutorial that we are presenting here has more than 270 pages. In the appendix, we provide a list of analytical techniques for which free software is available, including references, URLs where to find them, and a brief specification of the type of comparison performed with the software. One of the procedures we did not include in the current illustration, but which is included in the tutorial (as there is free R software for it), is the randomization test. Randomization tests were not included for two reasons. First, we did not want to illustrate such an analysis in the absence of random assignment in the design, as it is necessary for ensuring the validity of the analysis (Edgington, 1980) and for the adequate performance of the test (Ferron, Foster-Johnson, & Kromrey, 2003). Thus, an illustration would have been potentially misleading. Second, the SCDA plug-in for R (Bulté & Onghena, 2012) requires that all AB comparisons have the same length, which is not the case here. Nevertheless, this omission should not be understood as an inadequacy of randomization tests for all types of SCED data (see Heyvaert & Onghena, 2014;, regarding the importance of randomization in the design; Levin, Ferron, & Kratochwill, 2012, for a discussion on randomization tests).
Second, part of the R code developed is not available via the CRAN website (http://cran.r-project.org), given that it is more difficult to develop and maintain the packages when R versions change. We have also not used any website (other than Dropbox URLs and ResearchGate and Academia personal web pages), as the current work is developed without external funding and we do not wish to bother the users with unnecessary advertisements (by companies which we do not endorse) from free websites.
Third, no new technique was proposed. However, we consider that given the myriad of possibilities, applied researchers first need to get acquainted with existing alternatives and how they can actually be used, before conducting further basic methodological research. Such methodological research should also be connected to (and made useful for) actual professional practice.
Fourth, the presentation of statistical techniques according to the criteria used in visual analysis was chosen due to the common use of the latter. Nevertheless, it should be noted that the quantifications may focus mainly on a single data feature but may be affected by other data features as well. For instance, a greater difference in level will lead to greater nonoverlap, depending on the amount of variability in the data. Also, the importance of a difference in means is subject to how well these means represent the data, which is related to the amount of variability. Finally, some techniques may yield different results according to decisions made by the researcher: in multilevel analysis, the point at which the comparison in level is made is relevant (e.g., at the beginning of the intervention phase or at the end); when using Tau-U, the decision to control or not for monotonic baseline trend and to quantify or not the intervention monotonic trend has an effect on the quantification of overlap. Therefore, although there is no technique taking into account absolutely all data features on which visual analysts focus, several data features are relevant for the quantifications obtained. This further illustrates the need for joint use of visual and statistical analysis.
Finally, although dealing with the question "Which techniques should be used when?" was not our aim here, the indications provided are restricted to data that have characteristics similar to the ones collected by Singh et al. (2007). Accordingly, the instances in which we detected that a procedure was not as helpful as we would have desired (e.g., a multilevel model including all interesting data aspects) or that some procedures (e.g., slope and level change procedure, piecewise regression) may entail interpretative challenges when integrating the information with visual analysis are also restricted to the current dataset. Therefore, more detailed and comprehensive discussion is necessary (e.g., Manolov & Moeyaert, 2016) on that topic, and this is one of the possible lines of future research.
Another potentially interesting task for the future is to combine all the pieces of software into the same package, just as all explanations are available in the same tutorial. The tutorial itself is continuously updated and can certainly be improved, especially if several experts in different analytical techniques collaborate on this task. A study of the acceptability of several analytical techniques and also the acceptability and perceived usability of the software implementations and the tutorial is also potentially useful (e.g., consulting the tens of users that have already downloaded the tutorial from ResearchGate and Academia).
procedures for which software implementations were not available). In most cases, this R code provides as output both numerical results and graphical representations, as illustrated in the main text of the article. The software generally requires either that (a) the user locates an already prepared data file (e.g., a data matrix in Excel or .txt file) or (b) the measurements are introduced separated by commas. The results are obtained with a couple of clicks in the SCDA package (http://cran.r-project.org/web/packages/RcmdrPlugin.SCDA/index.html), or by copying and pasting the corresponding R code. The tutorial includes the details about downloading and using the packages and code. It is not necessary to learn programming languages but only to follow the examples with the datasets available online on how to use the techniques and afterward repeat the analyses with one's own data. Interpretation is also given, although applied researchers are still advised to read the primary papers presenting the techniques for more detailed information.
The current version of the tutorial (https://www.dropbox.com/s/boc-saj8yvnrlyvq/Tutorial.pdf?dl=0; also available from the www.researchgate.net and www.academia.edu profiles of the first author) covers the procedures listed below. We also provide here the main references and a URL to the freely available software for each procedure. All the R code we created (but not the one created by other authors) is available in the same pdf file of the tutorial (see the Dropbox link presented before and from the www.researchgate.net and www.academia.edu profiles of the first author).

Tools for Visual Analysis
• Visual analysis with the SCDA package
Comparison: whole data pattern (e.g., multiple-baseline design; reversal ABAB design, in which the initial baseline is followed by intervention introduction, withdrawal of the intervention, and re-introduction of the intervention; alternating treatments design; apart from AB)
References: Bulté and Onghena (2012), Gast and Spriggs (2010)
URL: http://cran.r-project.org/web/packages/RcmdrPlugin.SCDA/index.html

• Using standard deviation bands as visual aids
Comparison: a pair of phases (e.g., AB)
References: Callahan and Barisa (2005), Pfadt and Wheeler (1995)
URL: https://dl.dropboxusercontent.com/s/elhy454ldf8pij6/SD_band.R

Regarding the example used in the current article, the Excel file containing the data, organized in different ways according to the analytical technique, is available at https://www.dropbox.com/s/atkly341lo1b503/Singh%20data.xlsx?dl=0. This file is also available as supplementary online material to the article.
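As an illustration of the standard deviation bands listed above, the following base R sketch computes bands from the baseline phase and projects them onto the intervention phase. It is not the code in SD_band.R: the data are hypothetical, and the choice of two standard deviations around the baseline mean is an assumption of this example.

```r
# Hypothetical AB data (not the data analyzed in the article).
baseline     <- c(7, 6, 8, 7, 9)
intervention <- c(4, 3, 3, 2, 1, 2)

# Band limits computed from the baseline phase only.
# The multiplier of 2 SD is an assumption of this sketch.
k          <- 2
base_mean  <- mean(baseline)
base_sd    <- sd(baseline)
upper_band <- base_mean + k * base_sd
lower_band <- base_mean - k * base_sd

# Intervention-phase points falling outside the projected bands.
outside <- intervention < lower_band | intervention > upper_band

# Plot both phases with the baseline mean and the bands as visual aids.
scores <- c(baseline, intervention)
plot(seq_along(scores), scores, type = "b",
     xlab = "Session", ylab = "Count of behavior",
     ylim = range(c(scores, lower_band, upper_band)))
abline(v = length(baseline) + 0.5, lty = 2)     # phase change
abline(h = base_mean)                           # baseline mean
abline(h = c(lower_band, upper_band), lty = 3)  # SD bands

# Number of intervention points outside the bands.
n_outside <- sum(outside)
print(n_outside)
```

The bands serve here only as a visual aid for judging whether intervention-phase points depart from the baseline pattern; the count of points outside them is descriptive rather than a formal test.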

Note on the Availability and Updating of R Tools
Regarding the R software resources described in the tutorial, some of them have been formalized as R packages (i.e., the SCDA plug-in for the R-Commander package, and the nlme and scdhlm packages), whereas others consist of code that has to be copied, in some cases slightly modified to specify the data to be analyzed, and pasted into the R console (i.e., all remaining tools not mentioned in the previous parentheses). Moreover, only the SCDA plug-in and the nlme package are available in the CRAN repository (https://cran.r-project.org/web/packages/), and thus only these can be installed directly using the R menus. Given that R is continuously evolving, packages remain in the CRAN repository only if their maintainers ensure that they still function properly with more recent versions of R. Regarding the R software tools that are not included in the repository, if they depend on other packages (i.e., the R code for Tau-U depends on the Kendall package, the R code for generalized least squares regression depends on the lmtest package, the R code for graph rotation depends on the rgl package, and the R code for meta-analysis depends on the metafor package), the creator of those tools needs to check whether these other packages are still available for more recent versions of R, although their importance for data analysis in general suggests that this is almost certainly so. If the R code does not depend on any other packages (i.e., all remaining tools not mentioned in the previous parentheses), it is expected to function properly regardless of the R version. The tutorial presented here describes how the packages and code existing at the time of its creation (the end of 2015) work. The discussion provided in this article regarding the use of the single-case analytical techniques is still applicable (albeit representing our own view on the topic) regardless of the version or the existence of the software tools.
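A user can check the dependencies named above in their own installation before running the stand-alone code. The following base R sketch is one possible way to do so; the labels attached to each package are ours, added only for readability, and nothing here is taken from the tutorial.

```r
# Add-on packages that the text names as dependencies of the stand-alone code.
# The labels (vector names) are descriptive only.
deps <- c(
  "Tau-U"                     = "Kendall",
  "Generalized least squares" = "lmtest",
  "Graph rotation"            = "rgl",
  "Meta-analysis"             = "metafor"
)

# Check which of them are installed in the current R library.
# Nothing is loaded or downloaded at this step.
installed <- deps %in% rownames(installed.packages())
names(installed) <- deps

# Report the R version alongside the result, since availability depends on it.
cat("Running", R.version.string, "\n")
print(installed)

# Packages that are not yet installed. They could be installed with
# install.packages(not_installed), which requires an Internet connection
# and is therefore left commented out here.
not_installed <- unname(deps[!installed])
# install.packages(not_installed)
```

If `not_installed` is empty, the dependent code can be pasted into the console as described in the tutorial; otherwise the missing packages have to be installed first.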

Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.

Notes
1. All the necessary links are provided in the appendix.
2. The ordinate axes of all figures represent counts of the corresponding type of aggressive behavior.
3. We do not suggest using standard deviation bands as a formal statistical tool, given that the data are not likely to be normally distributed, as assumed in the procedure.
4. We only perform this analysis to show how d-statistic values can be meta-analyzed using classical meta-analytical techniques. For the type of data collected by Singh et al. (2007), with two outcomes per participant, it is possible to carry out multivariate analysis or to use a multilevel model (Van den Noortgate, López-López, Marín-Martínez, & Sánchez-Meca, 2013).
5. Actually, multilevel models can be used to model only change in level or only change in slope; they are flexible enough to be adapted to the data aspects that the researcher considers relevant to be modeled.
6. Although the design used by Singh et al. (2007) is a multiple baseline, it is relevant to mention that with piecewise regression it is possible to compare whether effects are similar in the different comparisons (A1B1 and A2B2) involved in a reversal design using design matrices 5 and 8 from Moeyaert, Ugille, Ferron, Beretvas, and Van den Noortgate (2014).
7. It is also possible to select at random one of the outcomes reported in a study or to perform this selection on a substantive basis (i.e., because it is the outcome of interest for the meta-analysis).
8. Commercial software is available for multilevel analysis, including options in SAS and SPSS as well as the specific programs HLM and MLwiN.