Journal article · 대한경영학회지 (Korean Journal of Business Administration) · Published 2026.01

An Analysis of the Synergy Effects of Generative AI vs. Traditional AI in Human-AI Combinations

Generative AI and Traditional AI in Human-AI Combination

윤우제 (National Human Resources Development Institute); 유희재 (Soongsil University); 현동열 (Soongsil University); 정원준 (Soongsil University)

Vol. 39, No. 1, pp. 125-159

Abstract (Korean)

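This study seeks to identify which types of AI produce synergy in human–AI combinations, and when. According to a recent meta-analysis, human–AI combinations on average outperform humans alone, but strong synergy, in which the combination outperforms the better of the human and the AI, is rare. This raises a theoretical and practical question: under what conditions do human–AI combinations outperform both the human alone and the AI alone? This study distinguishes generative AI from traditional AI (models centered on prediction and classification) and analyzes the moderating effect of AI type on human–AI synergy. To this end, we re-analyze 305 effect sizes from the human–AI team meta-analysis dataset of Vaccaro, Almaatouq, & Malone (2024) and compute four synergy metrics (strong synergy, human augmentation, AI augmentation, and negative synergy). We classify AI type as generative or traditional and, according to whether AI outperforms humans on the task (AI>Human) or humans outperform AI (Human>AI), test the four synergy metrics using Welch's t-tests with Holm–Bonferroni correction. The results show that in tasks where AI outperforms humans, generative AI teams exhibit significantly higher strong synergy than traditional AI teams, and that across all tasks generative AI teams show a positive (+) mean synergy while traditional AI teams show a negative (–) mean synergy. Conversely, in tasks where humans are superior, traditional AI teams improve on the human baseline, whereas generative AI teams perform below humans alone, indicating a hindering effect. In addition, in AI-dominant tasks generative AI teams improve on the AI baseline, but in terms of negative-synergy risk, traditional AI teams more stably maintain a "better than the worst" level. These results show that AI type is a key moderator of human–AI synergy. Generative AI elicits synergy in tasks where AI outperforms humans, but it may be riskier than traditional AI in expert domains where humans are superior or in domains where the cost of failure is high. Theoretically, the findings suggest that research on human–AI combinations needs a multidimensional collaboration model that adds a "generative vs. traditional AI" axis to the existing "relative human vs. AI advantage" axis.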
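The testing procedure named in this abstract (Welch's t-test followed by Holm–Bonferroni correction) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `welch_t` and `holm_adjust` and every number in the example are hypothetical, and the step from the t statistic to a p-value is left out because it needs a t-distribution CDF that the standard library does not provide.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(x, y):
    """Return (t, df) for Welch's unequal-variance t-test on two samples."""
    vx = variance(x) / len(x)  # squared standard error of group x
    vy = variance(y) / len(y)  # squared standard error of group y
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


def holm_adjust(p_values):
    """Return Holm-Bonferroni adjusted p-values in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1);
        # the running maximum keeps the adjusted values monotone.
        running_max = max(running_max, (m - rank) * p_values[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted


# Hypothetical effect sizes for two groups of teams (illustration only).
generative_g = [0.42, 0.10, 0.35, -0.05, 0.51]
classical_g = [-0.20, 0.05, -0.31, -0.12, 0.02, -0.25, 0.08]
t_stat, df = welch_t(generative_g, classical_g)
print(f"t = {t_stat:.3f}, df = {df:.2f}")

# Hypothetical raw p-values, one per comparison (illustration only).
raw_p = [0.004, 0.030, 0.041, 0.200, 0.650]
print(holm_adjust(raw_p))
```

Welch's variant fits comparisons between groups of very different sizes because it does not pool the two variances, and the Holm step-down rule controls the family-wise error rate while being less conservative than a plain Bonferroni correction.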

Abstract

This study investigates when and how different types of AI systems generate synergy in human–AI teams. Prior work shows a puzzling pattern: on average, human–AI combinations outperform humans alone but rarely exceed the better of the human or the AI partner. Strong synergy, in which a team does better than both humans alone and AI alone, appears only in a minority of tasks, while mere augmentation of human performance is common. Against this backdrop, we ask whether AI system type (generative vs. classical) moderates synergy and under what conditions generative AI (large language models, LLMs) is beneficial or harmful in human–AI collaboration. We re-analyze the meta-analytic dataset of human–AI team experiments compiled by Vaccaro et al. (2024), which aggregates 370 effect sizes from 106 studies comparing human-only, AI-only, and human–AI team performance on the same tasks. After excluding Wizard-of-Oz (WoZ) systems, we use 305 effect sizes (8 generative AI, 297 classical AI). For each experimental condition, we compute standardized effect sizes (Hedges' g) for four synergy metrics: strong synergy (g(HAI,max): team vs. best solo performer), human augmentation (g(HAI,H): team vs. human), AI augmentation (g(HAI,AI): team vs. AI), and negative synergy (g(HAI,min): team vs. worst solo performer). We then classify tasks by the relative advantage of AI over humans (RA>0: AI>Human; RA<0: Human>AI) and test five research questions comparing generative vs. classical AI using Welch's unequal-variance t-tests with Holm–Bonferroni correction. The results reveal a clear asymmetry. When AI outperforms humans (AI>Human), human–LLM teams achieve significantly higher strong synergy than human–classical AI teams: on average, they surpass the best solo performer, whereas classical AI teams fall below AI-only performance, indicating a loss from adding a human teammate. Generative AI teams also show positive AI-augmentation effects (team>AI) in these AI-dominant tasks, while classical AI teams do not. Aggregated across all tasks, human–LLM teams display a positive mean strong synergy, whereas human–classical AI teams show a negative mean, suggesting that generative AI can raise the upper bound of human–AI performance. However, when humans outperform AI (Human>AI), the pattern reverses. Classical AI teams exhibit positive human-augmentation effects (team>human), but generative AI teams, on average, reduce performance below the human baseline. Finally, in terms of negative-synergy risk (a team performing worse than the worst solo performer), classical AI teams more reliably avoid catastrophic failures than generative AI teams, whose performance is more volatile and occasionally falls below that of the weakest individual. These findings make three contributions. First, they provide empirical evidence that AI system type is a critical moderator of human–AI synergy: generative AI enables strong and AI-augmenting synergy primarily when AI is already better than humans, but it may harm performance in expert-dominant tasks and does not systematically reduce worst-case outcomes. Second, they extend existing models of human–AI collaboration by highlighting a two-dimensional design space defined by relative human–AI advantage and AI type (generative vs. classical), rather than treating "AI" as a single category. Third, they offer practical guidance for organizations: generative AI is best deployed for open-ended, creative, or high-complexity tasks where AI outperforms humans and can provide rich drafts or explanations, while classical AI is more suitable for supporting experts in narrow, safety-critical domains and for serving as a robust safeguard against catastrophic failure.
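The four synergy metrics described in the abstract can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: it assumes higher scores mean better performance, identifies the best and worst solo performer by condition mean, uses the common small-sample approximation for the Hedges' g correction factor, and computes the relative advantage as a Hedges' g of AI vs. human, which the abstract does not specify. The names `Condition`, `hedges_g`, and `synergy_metrics` and all example numbers are hypothetical.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class Condition:
    """Summary statistics for one condition (assumes higher score = better)."""
    mean: float
    sd: float
    n: int


def hedges_g(a: Condition, b: Condition) -> float:
    """Bias-corrected standardized mean difference of a relative to b."""
    df = a.n + b.n - 2
    pooled_sd = sqrt(((a.n - 1) * a.sd ** 2 + (b.n - 1) * b.sd ** 2) / df)
    d = (a.mean - b.mean) / pooled_sd       # Cohen's d
    correction = 1 - 3 / (4 * df - 1)       # small-sample approximation
    return correction * d


def synergy_metrics(human: Condition, ai: Condition, team: Condition) -> dict:
    """Compare the human-AI team against each solo baseline."""
    best = max(human, ai, key=lambda c: c.mean)
    worst = min(human, ai, key=lambda c: c.mean)
    return {
        "strong_synergy": hedges_g(team, best),        # g(HAI,max)
        "human_augmentation": hedges_g(team, human),   # g(HAI,H)
        "ai_augmentation": hedges_g(team, ai),         # g(HAI,AI)
        "negative_synergy": hedges_g(team, worst),     # g(HAI,min)
        # Assumption: RA > 0 means AI > Human, measured here as g(AI, H).
        "relative_advantage": hedges_g(ai, human),
    }


# Hypothetical accuracy scores for one experimental condition.
human = Condition(mean=0.70, sd=0.10, n=30)
ai = Condition(mean=0.80, sd=0.10, n=30)
team = Condition(mean=0.85, sd=0.10, n=30)

for name, value in synergy_metrics(human, ai, team).items():
    print(f"{name}: {value:.3f}")
```

In this made-up condition the AI is the better solo performer, so strong synergy coincides with AI augmentation and negative synergy coincides with human augmentation; when the human is better, the pairing flips.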

Publisher: 대한경영학회 (Korean Academic Association of Business Administration)
Category: Business Administration
