Introduction
The current technological disruption prompts a reevaluation of teaching models rooted in epistemological traditions. Various authors have addressed the challenge of adapting education to the technological era through a student-centered approach focused on the development of 21st-century skills. For example, Siemens (2004) proposes connectivism, which emphasizes networking, knowledge management, and the ability to connect and access diverse sources of information. Prensky (2013), for his part, presents the pedagogy of co-association, which suggests how, where, and when teachers should employ technology.
Technologies such as Artificial Intelligence, the metaverse, robots, 3D printing, augmented reality and virtual reality are transforming how students learn and access knowledge. The integration of these disruptive technologies into classrooms must be supported by sound learning theories or approaches to ensure their effectiveness in achieving educational goals (Lui et al., 2023; Qiu et al., 2023). In addition to the previously mentioned connectivism and co-association, there are other pedagogical approaches that can also underpin the integration of technologies such as experiential learning (Kolb, 1984), multimedia learning (Mayer, 2005) and mobile learning (Sharples et al., 2010).
This study focuses on immersive technologies such as augmented reality (AR) and virtual reality (VR) in primary education. These disruptive technologies are renewing the way students experience physical and virtual environments, from observation to immersion. AR visualizes the physical environment overlaid with digital content in real time and with three-dimensional registration (Cabero-Almenara et al., 2022), while VR displaces the user to a fully synthetic environment, which can mimic real-world properties. However, it can also exceed the limits of physical reality by creating a world in which the laws governing space, time and mechanics are no longer valid (Sandoval-Henríquez & Badilla-Quintana, 2021).
There is some empirical evidence regarding the application of immersive technologies and their characteristics in primary education. For example, Huang et al. (2023) explored the effect of AR on computational thinking and programming skills. The results demonstrated the technology’s effectiveness when used with game-based learning. Abdullah et al. (2022) examined the impact of AR on academic achievement, interest, and science skills. Their findings showed that technology integration grounded in inquiry-based learning was effective on all three measured variables. Sandoval-Henríquez and Badilla-Quintana (2022) described experiences of interactivity, presence, and flow after students interacted with AR and VR. The results confirmed that the integration of immersive technologies, based on experiential learning, allows students to experience reciprocal interaction with the resources, a sense of being present in the virtual world, and high levels of concentration.
Systematic literature reviews highlight the educational advantages of immersive technologies. Regarding AR, Buchner and Kerres (2023) acknowledge that the technology can be used to design effective and engaging learning environments. However, they caution that its effectiveness depends on the educational context, prior knowledge, and learning objectives. Mystakidis et al. (2022) argue that AR supports learning in STEM subjects, but emphasize that the integration of technologies must be supported by a learning theory. Regarding VR, Lui et al. (2023) indicate that the technology promotes science learning, although they highlight the importance of reducing the cognitive load imposed by immersive systems, as well as considering student characteristics when designing VR-based content. Hamilton et al. (2021), for their part, state that VR enables the exploration of complex and realistic content, unlike other traditional strategies such as computers and digital presentations. The authors also emphasize the importance of using appropriate instruments to measure learning, since these tools can affect the interpretation of the usefulness of the technology.
Despite being essential for the advancement of educational research, systematic reviews present a relevant limitation by not including quantitative measurements that allow for comparative evaluation of the impact of immersive technologies. Consequently, meta-analyses emerge as a more effective and comprehensive alternative, as they combine results from multiple independent studies (Fau & Nabzo, 2020).
Among recent meta-analyses in this area, Cao and Yu (2023) studied the effect of AR on attitudes, motivation, and academic achievement at all educational levels. The analysis of 28 studies reported that the technology fosters better attitudes and academic achievement compared to traditional methods. Chang et al. (2022) examined the impact of AR on learning at all levels of education. The analysis of 134 studies shows a medium effect size, as well as positive responses, with a stronger impact in language learning and the social sciences. Villena-Taranilla et al. (2022) explored the effect of VR on academic achievement in K-6 education. According to 21 studies, findings indicate that the technology promotes greater learning compared to control conditions. Additionally, brief interventions (less than two hours) are more effective than those of longer duration. Coban et al. (2022) analyzed the impact of VR on learning in K-12 and higher education. Based on 48 studies, the results show a small effect size in the experimental condition.
Meta-analyses have predominantly focused on higher education, particularly in fields such as medicine, nursing, and rehabilitation (Guo et al., 2023; Hsieh et al., 2025; Kim & Kim, 2023; Liu et al., 2023; Neher et al., 2025). At this educational level, students typically specialize in specific areas of study, and immersive technologies are employed to practice technical skills within safe and controlled simulation environments. Although the evidence in this context is encouraging, further research is needed in primary education settings (Sandoval-Henríquez et al., 2024).
Different levels of education are associated with different stages of cognitive and psychosocial development. In primary education, which ranges from 6 to 12 years of age, students are in the stage of concrete operations in cognitive development. According to Piaget (1974), this stage is characterized by the ability to think logically about objects and events. Students can perform mental operations, such as conserving quantity and classifying objects into categories. In terms of psychosocial development, they are typically in the latency period. According to Erikson (1985), at this stage, students are eager to learn and demonstrate skills in various areas. It is a stage in which support and positive recognition can foster confidence, self-esteem, and autonomy.
The integration of immersive technologies at this level should consider students’ characteristics and be used to create playful experiences that foster cognitive stimulation, curiosity, and creativity (Baba et al., 2022; Demircioglu et al., 2022; Tsai & Yu-Cheng, 2022). The scarce empirical background in primary education, the contradictory findings regarding the effectiveness of these technologies on learning, and the specific characteristics of this educational level are the motivation behind this research, which poses the following questions:
RQ1. What methodological characteristics are considered in immersive technology-based interventions with respect to the sample (nationality and age), treatment (educational content, duration of exposure to the technology, and learning model), and measurement instrument (psychometric properties)?
RQ2. What is the effect of interventions based on immersive technologies on learning, compared to traditional interventions?
The following hypothesis emerges from this second question and the reviewed background information: interventions based on immersive technologies result in significantly higher learning outcomes compared to traditional interventions.
Materials and Methods
Identification and Selection of the Study Sample
A meta-analysis was conducted following PRISMA statement guidelines to ensure a relevant and accurate search of the study topic. The process consisted of three phases: identification, screening, and inclusion (Page et al., 2021). In addition, recommendations for the appropriate reporting of meta-analysis results were followed (Rubio-Aparicio et al., 2018).
Phase 1: Identification
A search was conducted in August 2023 using the Web of Science, Scopus, and ERIC databases. The keywords and search syntax were adapted from a previous systematic review on immersive technologies in primary education (Sandoval-Henríquez et al., 2024). The keywords used were: "virtual reality" OR "augmented reality" AND "primary school" OR "elementary school" OR "primary education" OR "elementary education" AND "academic performance" OR "academic achievement" OR "educational performance" OR "learning" AND "quasi-experiment" OR "quasiexperimental" OR "experiment" OR "experimental" OR "intervention".
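As an illustration, the keyword list above can be assembled into a single Boolean string. This is a minimal sketch: the terms are copied from the text, but the parenthesised grouping (OR within each concept, AND between concepts) and the names `CONCEPT_GROUPS` and `build_query` are assumptions, since the source does not state how the operators were nested in each database.

```python
# Concept groups copied from the keyword list above. The explicit
# parenthesised grouping is an assumption, not stated in the source.
CONCEPT_GROUPS = [
    ["virtual reality", "augmented reality"],
    ["primary school", "elementary school", "primary education", "elementary education"],
    ["academic performance", "academic achievement", "educational performance", "learning"],
    ["quasi-experiment", "quasiexperimental", "experiment", "experimental", "intervention"],
]


def build_query(groups):
    """Join synonyms with OR inside each group and join the groups with AND."""
    blocks = []
    for terms in groups:
        quoted = " OR ".join(f'"{term}"' for term in terms)
        blocks.append(f"({quoted})")
    return " AND ".join(blocks)


query = build_query(CONCEPT_GROUPS)
print(query)
```

Making the grouping explicit avoids relying on each database's default operator precedence, which may differ between search engines.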
Filters were applied for year of publication (studies published between 2018 and 2023), language (Spanish and English), and access type (open access studies). After applying these filters, 42 publications were retrieved from Web of Science, 55 from Scopus, and 48 from ERIC. Of the 145 articles identified, 48 duplicates were removed.
Phase 2: Screening
A review of the titles and abstracts of the 97 studies identified in the previous phase was conducted. The aim of this process was to eliminate studies that were not directly related to the central topic. Subsequently, a full-text reading of the selected articles was performed, and inclusion and exclusion criteria were applied to determine which studies met the established requirements.
The inclusion criteria comprised quantitative studies with an experimental design that used AR or VR technologies and focused on primary education. Conversely, the exclusion criteria led to the removal of studies with qualitative or non-experimental designs, those employing technologies other than augmented or virtual reality, and those focusing on educational levels other than primary education. Additionally, studies that did not report statistics required for meta-analysis, such as mean and standard deviation, were excluded.
After the full-text review and application of the inclusion and exclusion criteria, a total of 79 studies were excluded.
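The record counts reported across the identification and screening phases can be checked with a few lines of arithmetic. The sketch below only restates the figures given in the text; the variable names are illustrative.

```python
# Record counts as reported in the text above.
retrieved = {"Web of Science": 42, "Scopus": 55, "ERIC": 48}
duplicates_removed = 48
excluded_after_screening = 79

identified = sum(retrieved.values())            # records retrieved from all databases
screened = identified - duplicates_removed      # records left after removing duplicates
included = screened - excluded_after_screening  # studies retained for the meta-analysis

print(f"identified={identified}, screened={screened}, included={included}")
```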
Phase 3: Inclusion
Two investigators independently assessed the previous phases without discrepancies. The bias assessment also involved a third investigator, who used the PRISMA digital checklist (Page et al., 2021) to assess the information incorporated in the manuscript sections. The management of bibliographic references and the elimination of duplicates were carried out using EndNote 21 software. Figure 1 shows the flow diagram with the phases followed.
Extraction of Information from the Studies
To extract information from the 18 articles, a protocol was established that included the following elements: ID, citation, study objective, country, age, educational level, sample size, educational content, duration of exposure to technology, learning model, instrument to measure learning, and descriptive statistics. The principal investigator performed the initial extraction of information, collecting the relevant data from each article according to the established protocol. The rest of the team then reviewed and verified the extraction to ensure the consistency and accuracy of the data. Table 1 presents the extraction matrix with the information associated with each study. The results from this table will be presented to answer RQ1.
Table 2 presents the descriptive statistics (mean, standard deviation, and number of cases) associated with each study. The results from this table will be presented to answer RQ2.
Data Analysis
The analysis was performed using the standardized mean difference as the outcome measure, which allows for the comparison of effects across studies using different outcome scales. A random-effects model was fitted to the data, assuming that the true effects vary between studies due to differences in context, intervention design, or population characteristics (Rubio-Aparicio et al., 2018).
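To make the outcome measure concrete, the following sketch computes a standardized mean difference from the descriptive statistics of two groups. It assumes the bias-corrected form (Hedges' g) with a large-sample variance approximation; the source does not state which variant the software applied, and the function name and input values are hypothetical.

```python
import math


def hedges_g(mean_exp, sd_exp, n_exp, mean_ctl, sd_ctl, n_ctl):
    """Return (g, variance) for two independent groups."""
    df = n_exp + n_ctl - 2
    # Pooled standard deviation across both groups.
    sd_pooled = math.sqrt(((n_exp - 1) * sd_exp**2 + (n_ctl - 1) * sd_ctl**2) / df)
    d = (mean_exp - mean_ctl) / sd_pooled
    # Approximate small-sample correction factor (assumed here, see lead-in).
    j = 1 - 3 / (4 * df - 1)
    g = j * d
    # Large-sample approximation of the sampling variance.
    variance = (n_exp + n_ctl) / (n_exp * n_ctl) + g**2 / (2 * (n_exp + n_ctl))
    return g, variance


# Hypothetical post-test scores; these are not values from any included study.
g, v = hedges_g(mean_exp=78.0, sd_exp=10.0, n_exp=30, mean_ctl=70.0, sd_ctl=10.0, n_ctl=30)
print(f"g = {g:.3f}, SE = {math.sqrt(v):.3f}")
```

Because the difference is divided by the pooled standard deviation, tests scored on different scales yield comparable effect sizes.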
Heterogeneity (i.e., tau²) was obtained using the restricted maximum likelihood estimator (Viechtbauer, 2005). In addition to the estimation of tau², the Q-test for heterogeneity and the I² statistic were calculated. If some degree of heterogeneity is detected (i.e., tau² > 0, regardless of the Q-test results), a prediction interval for the true effect is also provided. Studentized residuals and Cook's distances are used to examine whether the studies may be outliers and/or influential in the context of the model.
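The quantities named above can be illustrated with a short random-effects computation. Note that this sketch swaps in the closed-form DerSimonian-Laird estimator of tau² instead of the restricted maximum likelihood estimator used in the study, because REML requires iterative fitting. It also uses Higgins' formula for I² and a normal approximation for both intervals. All names and data are hypothetical, and the output will generally differ from what the study's software reports.

```python
import math
from statistics import NormalDist


def random_effects_dl(effects, variances, level=0.95):
    """Pool effect sizes with a DerSimonian-Laird random-effects model."""
    k = len(effects)
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)
    # Higgins' I^2: share of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # Random-effects weights add tau^2 to each sampling variance.
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    ci = (pooled - z * se, pooled + z * se)
    # Prediction interval for the true effect in a new study (normal approximation).
    half = z * math.sqrt(se**2 + tau2)
    pi = (pooled - half, pooled + half)
    return {"pooled": pooled, "se": se, "tau2": tau2, "Q": q, "I2": i2, "ci": ci, "pi": pi}


# Hypothetical effect sizes and sampling variances, for illustration only.
result = random_effects_dl([0.2, 0.9, 1.5, 0.6], [0.05, 0.08, 0.10, 0.06])
print(result)
```

The prediction interval is wider than the confidence interval whenever tau² > 0, because it adds between-study variance to the uncertainty of the pooled estimate.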
Publication bias was assessed using the Fail-Safe N, which estimates the number of unpublished or missing studies with null or non-significant results that would need to be included for the overall result of the meta-analysis to lose statistical significance.
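One common formulation of the Fail-Safe N is Rosenthal's, sketched below. The source does not specify which variant was used, so the choice of Rosenthal's method, the one-tailed alpha of .05, the function name, and the data are all assumptions made for illustration.

```python
import math
from statistics import NormalDist


def rosenthal_fail_safe_n(effects, variances, alpha=0.05):
    """Number of null studies needed to bring the combined one-tailed p above alpha."""
    k = len(effects)
    # Per-study z scores: effect size divided by its standard error.
    z_scores = [y / math.sqrt(v) for y, v in zip(effects, variances)]
    z_sum = sum(z_scores)
    if z_sum <= 0:
        return 0  # the combined effect is not positive, so nothing needs to be offset
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # about 1.645 for alpha = .05
    # Solve z_sum / sqrt(k + N) = z_alpha for N.
    n_fs = (z_sum / z_alpha) ** 2 - k
    return max(0, math.ceil(n_fs))


# Hypothetical data: four studies, each with effect 0.5 and variance 0.04.
n_missing = rosenthal_fail_safe_n([0.5, 0.5, 0.5, 0.5], [0.04, 0.04, 0.04, 0.04])
print(n_missing)
```

A Fail-Safe N that is large relative to the number of included studies suggests that the pooled result is unlikely to be overturned by unpublished null findings alone.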
In addition, bias was assessed with Egger's regression test, which examines the association between effect sizes and their precision. A non-significant p-value (e.g., p > 0.05) indicates no evidence of publication bias. However, a significant p-value (e.g., p < 0.05) suggests the presence of publication bias but may also mean that the sample size is too small, or that there is substantial heterogeneity among the included studies (Egger et al., 1997).
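The logic of the test can be sketched in its classic formulation: each standardized effect (effect divided by its standard error) is regressed on precision (the inverse of the standard error), and an intercept far from zero signals funnel plot asymmetry. Software implementations may use a different variant of the regression, so this is an illustration of the idea and not a reproduction of the study's analysis. The function name and data are hypothetical.

```python
import math


def egger_intercept(effects, variances):
    """Return (intercept, standard error, t statistic) of Egger's regression."""
    se = [math.sqrt(v) for v in variances]
    x = [1 / s for s in se]                    # precision
    y = [e / s for e, s in zip(effects, se)]   # standardized effect
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # Ordinary least squares fit of y on x.
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sigma2 = residual_ss / (n - 2)
    se_intercept = math.sqrt(sigma2 * (1 / n + mean_x**2 / sxx))
    return intercept, se_intercept, intercept / se_intercept


# Hypothetical data in which less precise studies report larger effects.
effects = [0.2, 0.4, 0.7, 1.1, 1.6]
variances = [0.01, 0.04, 0.09, 0.16, 0.25]
intercept, se_int, t_stat = egger_intercept(effects, variances)
print(f"intercept = {intercept:.3f}, t = {t_stat:.2f} on {len(effects) - 2} df")
```

The p-value is obtained by comparing the t statistic with a t distribution on n - 2 degrees of freedom, which requires a statistics library and is omitted here. The function needs at least three studies with differing precision.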
Data analysis was performed using the MAJOR module of JAMOVI software version 2.3.13.0.
Results
The results are presented according to the research questions.
General Characteristics of the Studies (RQ1)
The studies come from Turkey (39%; n = 7), Taiwan (28%; n = 5), Saudi Arabia (17%; n = 3), Malaysia (6%; n = 1), China (6%; n = 1), and Denmark (6%; n = 1). The educational levels with the highest prevalence are 6th grade (50%; n = 9) and 7th grade (28%; n = 5). Sample sizes range from 22 to 102 participants, distributed across control (absence of technologies) and experimental (presence of immersive technologies) groups.
The interventions address educational content in science (67%; n = 12), English (28%; n = 5), and programming (6%; n = 1). The reported learning models include mobile learning (17%; n = 3), experiential learning (11%; n = 2), inquiry-based learning (6%; n = 1), the SAMR model (6%; n = 1), creative situated learning (6%; n = 1), and collaborative learning (6%; n = 1). Fifty percent of the studies do not report a specific learning theory guiding the integration of immersive technologies.
Exposure time is reported in two main formats: by number of weeks (61%; n = 11), ranging from 3 to 7 weeks, or by number of sessions (39%; n = 7), ranging from 1 to 6 sessions.
Regarding the instruments, the studies generally employ ad hoc content tests to measure academic achievement, mostly based on multiple-choice items (94%; n = 17). Only one study reports using a test previously developed by other authors.
Effectiveness of the Integration of Immersive Technologies (RQ2)
Table 3 presents the results of the heterogeneity test, showing significant heterogeneity among the 18 studies (Q(17) = 128.745, p < 0.0001), considerable variability among the effects of the individual studies (tau² = 0.7406), and a high proportion of observed variability attributable to heterogeneity (I² = 89%), suggesting that the studies originate from different populations.
Table 4 presents the results of the publication bias analysis. The Fail-Safe N indicates that at least 1,043 unretrieved or unpublished studies with null results would be necessary for the results of the current meta-analysis to become non-significant. Egger's regression test, for its part, indicates statistically significant evidence of asymmetry in the data. This suggests the possible presence of publication bias, where studies with significant results are more likely to be published than those with non-significant results.
An examination of the studentized residuals revealed that none of the studies had a value greater than ±2.9913, indicating no outliers in the context of this model. According to Cook’s distances, none of the studies could be considered overly influential. Figure 2 presents the funnel plot.
Both the rank correlation test and the regression test indicate possible funnel plot asymmetry (p = .0022 and p = .0005, respectively), suggesting the presence of publication bias. This asymmetry may reflect a tendency to publish studies with significant or positive results, while non-significant or negative findings remain unpublished or are less accessible.
Table 5 presents the results of the random-effects model for the 18 studies. The analysis estimates a large effect size of 1.02 (Sawilowsky, 2009). However, given the variability across studies, the 95% confidence interval indicates that the true population effect may range from 0.591 to 1.443.
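As a rough plausibility check, the standard error and test statistic implied by the reported estimate and interval can be recovered with simple arithmetic. This sketch assumes a symmetric, normal-based (Wald) confidence interval; if the software applied a different adjustment, the implied values would differ. The variable names are illustrative.

```python
from statistics import NormalDist

estimate, ci_low, ci_high = 1.02, 0.591, 1.443   # values reported above

z_crit = NormalDist().inv_cdf(0.975)             # about 1.96
std_error = (ci_high - ci_low) / (2 * z_crit)    # implied standard error
z_value = estimate / std_error                   # Wald z statistic
p_two_sided = 2 * (1 - NormalDist().cdf(z_value))

print(f"SE = {std_error:.3f}, z = {z_value:.2f}, p = {p_two_sided:.2g}")
```

The midpoint of the reported interval, (0.591 + 1.443) / 2 = 1.017, agrees with the rounded estimate of 1.02, which is consistent with a symmetric interval.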
Figure 3 displays the forest plot, which includes 18 effect sizes, each represented by an asterisk in the central column, with horizontal lines indicating their corresponding 95% confidence intervals.
The dashed vertical line denotes the line of no effect, while the diamond at the bottom represents the overall pooled effect size. The position of the diamond to the right of the vertical line indicates that the experimental group achieved significantly better learning outcomes compared to the control group.
Discussion
General Characteristics of the Studies (RQ1)
Consistent with previous findings and systematic reviews (Altinpulluk, 2019; Garzón et al., 2019; Qiu et al., 2023), there is greater scientific production on the use of immersive technologies in Asia. This suggests that certain countries in that region are showing increasing interest and investment in integrating technologies to enhance the learning experience in primary education.
Regarding the age of the participants, the studies were conducted with students aged 11 and 12, corresponding to sixth and seventh grades. At this stage of cognitive and psychosocial development, students are undergoing changes in how they think, process information, and understand the world, as well as in how they interact socially (Erikson, 1985; Piaget, 1974). The visualization features of AR and VR can have a significant impact on students' learning and development at this age, due to several factors identified in the literature: the exploration of concepts and scenarios in an interactive way; increased engagement in learning, as they are at an age when interest may fluctuate due to personal and contextual factors; development of cognitive and social skills; and sensory stimulation that supports different learning styles (Akçayır & Akçayır, 2017; Hamilton et al., 2021; Villena-Taranilla et al., 2022).
Concerning the educational interventions, the studies mainly address science-related content. This finding is consistent with other reviews that highlight the widespread use of immersive technologies for learning about cells, human body systems, the planetary system, flora, and fauna (Garzón et al., 2019; Pellas et al., 2021; Mystakidis et al., 2022; Oyelere et al., 2020). As for the time of exposure to the technology, it varies in how it is reported (e.g., weeks, sessions, hours), which constitutes a limitation. The lack of consistency in reporting exposure time makes it difficult to replicate interventions or draw generalizable conclusions. For example, the study by Hashim et al. (2022), which reports improvements in learning after several weeks of exposure, may not be comparable to the study by Tsai and Yu-Cheng (2022), which only reports minutes of use in a single session.
The studies apply different learning theories to the design of technology-based interventions, with mobile learning and experiential learning being the most common. However, the lack of an explicit theoretical framework in 50% of the studies analyzed is a recurring limitation in educational research (Buchner & Kerres, 2023; Mystakidis et al., 2022; Qiu et al., 2023). Learning theories provide the conceptual basis for selecting and designing pedagogical strategies aligned with educational goals and student needs. When theories are not referenced, it becomes difficult to understand the rationale behind the integration of AR and VR in the classroom (Sandoval-Henríquez & Badilla-Quintana, 2021).
Regarding assessment, the studies employed content tests to measure academic achievement. None of the reviewed studies used indirect measures such as perceived learning. This aligns with the findings of previous reviews (Hamilton et al., 2021). The inclusion of indirect measures alongside traditional assessments could offer a broader understanding of the motivational, emotional, and engagement-related dimensions associated with immersive technologies and their effect on learning. Pedagogical models such as co-association also promote the use of alternative forms of assessment, such as self-assessment and peer assessment, which help students become aware of their own progress and develop self-regulation skills (Prensky, 2013).
Effectiveness of the Integration of Immersive Technologies (RQ2)
Individual studies report that the integration of immersive technologies has a positive impact on academic achievement compared to control conditions. In this regard, learning gains are a commonly reported educational benefit in previous systematic reviews (Buchner & Kerres, 2023; Garzón et al., 2019; Mystakidis et al., 2022). The results of the meta-analysis indicate an effect size of 1.02, which is considered large (Sawilowsky, 2009). However, given the variability across the 18 studies, the 95% confidence interval indicates that the true population effect may range from 0.59 to 1.44, that is, a moderate to large effect.
These findings are consistent with those of other meta-analyses. Villena-Taranilla et al. (2022) reported a moderate effect size (g = 0.64), noting that shorter interventions (less than two hours) were associated with greater learning effects. Similarly, Chang et al. (2022) found a moderate effect size (g = 0.65) for the impact of immersive technologies on academic achievement. Garzón et al. (2019) also reported a moderate effect size (g = 0.65), with higher effectiveness observed in science, arts, and humanities. In contrast, Cao and Yu (2023) reported a significant difference between experimental and control groups, with the former achieving a large effect size (g = 0.85).
However, some meta-analyses have found more modest or inconsistent results. For example, Coban et al. (2022) reported a small effect size (g = 0.38) for the impact of VR on learning outcomes. Interestingly, their study also found that immersive technologies had a significantly larger effect in elementary education compared to higher education. These contrasting results underscore the need for further research that considers factors such as educational level, intervention duration, subject area, and the degree of immersion (fully immersive, semi-immersive, or non-immersive) to better understand the impact of these technologies on learning.
Conclusion
This research allows us to draw several conclusions and recommendations.
First, the hypothesis that the integration of immersive technologies leads to significantly higher learning outcomes compared to traditional strategies is supported. However, it is important to acknowledge that the effect of AR and VR may depend on additional factors, such as the quality of educational content, instructional design, and teacher training.
Second, several studies did not explicitly report the theoretical underpinnings of their interventions. It is essential for researchers to ground their work in established learning theories to ensure methodological soundness and promote the continuous improvement of educational practices. In this review, only a few theories were mentioned, such as experiential learning (Kolb, 1984), multimedia learning (Mayer, 2005), mobile learning (Sharples et al., 2010), and the pedagogy of co-association (Prensky, 2013). Future studies should aim to compare the effectiveness of a given learning theory when applied in both control and experimental groups. This would enable a more nuanced understanding of how the presence or absence of technology interacts with specific pedagogical frameworks in primary education.
Third, some studies showed inconsistencies in how exposure time to immersive technologies was reported. Future research should establish clearer guidelines and standards for the consistent reporting of exposure time and other key variables in experimental studies. Providing detailed information about study design enhances transparency, improves the comparability of findings, and contributes to a more robust understanding of how technology affects learning across educational contexts.
Finally, this meta-analysis has several limitations that should be taken into account. Due to the variability in how intervention duration was reported, it was not possible to compare whether longer or shorter exposure times had differential effects on academic achievement. Another limitation concerns publication bias: studies with positive results are more likely to be published than those with non-significant or negative findings. This may lead to an overestimation of the impact of immersive technologies on learning and should be carefully considered when interpreting the overall conclusions of this meta-analysis.