Journal article · 재산법연구 · Published 2026.02

생성형 AI 학습 데이터 무단 이용의 위법성 판단과 민법 제750조의 보충적 적용 - 저작권법, 부정경쟁방지법의 검토와 민법 제750조의 보충적 적용을 중심으로 -

Determination of Illegality Regarding Unauthorized Use of Generative AI Training Data and Supplementary Application of Article 750 of the Civil Act: Focusing on the Review of the Copyright Act and the Unfair Competition Prevention Act, and the Supplementary Application of Article 750 of the Civil Act

이성용 (Korea University); 김상중 (Korea University)

Vol. 43, No. 1, pp. 131-169


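Abstract (translated from the Korean)

As generative artificial intelligence advances, vast amounts of data from news, blogs, social media, and similar sources are automatically collected and processed for model training. In the process, a range of legal issues has come under discussion, including whether data providers have consented, the possibility of infringing copyright and database rights, and the maintenance of fair competition. The Copyright Act, the Unfair Competition Prevention Act, the Data Industry Act, and the Framework Act on Artificial Intelligence have each established a framework of protection and use control through rules on works, databases, limited-provision data, and high-impact AI. However, the large-scale web crawling and accumulation of unstructured data that occur in the training phase of generative AI presuppose new forms of use that existing norms did not anticipate, and it has been pointed out that the current system of special statutes alone cannot clearly delineate the scope of unlawfulness and liability.
Focusing on non-personal data that does not constitute personal information, this article first examines the extent to which these special statutes protect data use in the training phase of generative AI, and then considers what supplementary role the general tort provision of Article 750 of the Civil Act can play in the areas that remain. In particular, it analyzes the scope of application of the Copyright Act and the Unfair Competition Prevention Act to distinguish areas where special statutes already provide sufficient protection from those where they do not, and, focusing on the latter, presents the applicability of Article 750 and criteria for judging unlawfulness. In doing so, it draws on the trend in Korean law of designing data protection around conduct regulation rather than the grant of property-like rights, as well as the EU database and TDM rules and Japan's limited-provision data regime, in order to establish theoretically what supplementary function general tort doctrine can perform under such a regime.
The Copyright Act protects incentives for creation and investment by granting exclusive rights in original works and in databases built through substantial investment. The Unfair Competition Prevention Act likewise regulates free-riding on data provided on a limited basis in the course of business and on competitors' achievements, through its provision on unfair use of data and its general clause on misappropriation of achievements. While taking as a premise that existing special statutes already substantively protect data investment and fair competition to a considerable extent, the article notes that simple factual data published on the web, unstructured data lacking creativity, and materials that can hardly be regarded as having been built into an independent database through arrangement or structuring may still fall outside the scope of protection. In these residual areas, the article examines whether the "fruits of substantial investment and effort" devoted to collecting, selecting, updating, and managing data, rather than an exclusive ownership right in the data itself, can be understood as an interest worthy of legal protection under Article 750 of the Civil Act.
Building on the correlation theory (상관관계설), the article argues that general tort liability may be recognized under certain conditions, such as where ① the explicit restrictions of technical protection measures (robots.txt, IP blocking, CAPTCHA, etc.) are circumvented, ② excessive crawling causes system failure, ③ the results of training substantially substitute for the market for the original content, or ④ no records at all are kept of data sources, acquisition routes, or processing, so that transparency and explainability are markedly lacking. It also notes that unlawfulness may be excluded or mitigated for non-profit academic research, for uses with the data holder's express or implied approval, and for uses consistent with fair use under Article 35-5 of the Copyright Act and with the purpose of the TDM exemptions in the EU and Japan. In conclusion, the article affirms the significance of the existing protective structure of special statutes on the use of generative AI training data, while presenting criteria under which the tort doctrine of Article 750 of the Civil Act, as general law, can apply in a supplementary manner to the normative gaps left in the protection of certain non-personal, non-creative data. It thereby explores a balanced reconciliation among the protection of data investment, the development of AI technology, and greater transparency and predictability in data governance.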

Abstract

With the rapid development of generative artificial intelligence, massive volumes of data from news, blogs, social media and other sources are automatically collected and processed for model training. In this process, numerous legal issues have been raised, including the requirement for consent from data providers, the risk of copyright and database rights infringement, and the maintenance of fair competition. In response, the Copyright Act, the Unfair Competition Prevention and Trade Secret Protection Act, the Framework Act on the Promotion of Data Industry and Utilization, and the Framework Act on Artificial Intelligence have each established frameworks for protection and use control through regulations addressing, respectively, works of authorship, databases, data provided on a limited basis, and high-impact AI systems. However, large-scale web crawling and the accumulation of unstructured data in the training phase of generative AI presuppose new forms of use that were not contemplated when these statutory provisions were enacted. Accordingly, it has been observed that the existing framework of special statutes alone faces limitations in clearly delineating the scope of unlawfulness and responsibility. Focusing on non-personal data that does not constitute personal information under applicable law, this article first examines the extent to which these special statutes provide protection for data use in the training phase of generative AI, and subsequently considers what supplementary role the general tort liability provision of Article 750 of the Civil Act can perform in addressing the remaining gaps.
In particular, by analyzing the scope of application of the Copyright Act and the Unfair Competition Prevention and Trade Secret Protection Act, the article distinguishes between areas where these special laws already provide adequate protection and those where they do not, and, only with respect to the latter category, proposes conditions for the applicability of Article 750 of the Civil Act and a framework for assessing unlawfulness. In doing so, it situates the analysis within the broader legal framework that has designed data protection primarily through a conduct-based regulatory approach rather than through the conferral of proprietary rights, and, with reference to EU regulations on databases and text-and-data mining (TDM) as well as the Japanese “limited data provision” regime, seeks to establish theoretically how the general law of torts can perform supplementary functions within this paradigm. The Copyright Act protects incentives for creativity and investment by conferring exclusive rights in original works of authorship and in databases created through substantial investment. The Unfair Competition Prevention and Trade Secret Protection Act, through provisions addressing unlawful use of data and the general provision against misappropriation of achievements, regulates both free-riding on data made available on a limited basis as part of a commercial enterprise and the appropriation of a competitor's accumulated achievements. While the article takes as a premise that much of this existing special law framework already provides substantive protection for data investment and the maintenance of fair competition, it observes that certain categories of data remain outside the scope of protection: simple factual data publicly available on the web, unstructured data lacking in creativity, and materials that are difficult to characterize as constituting an organized database by virtue of their selection or structure.
In these residual areas, the article examines whether one can identify, instead of exclusive property rights in data itself, the “fruits of substantial investment and effort” expended in the collection, selection, updating and management of data as constituting an “interest worthy of legal protection” under Article 750 of the Civil Act. Building on the correlational approach to unlawfulness, the article argues that general tort liability may be recognized under certain conditions: ① deliberate circumvention of explicit technical protection measures (such as robots.txt, IP blocking, or CAPTCHA), ② system disruption resulting from excessive crawling, ③ substitution of the market for the original content by the training outputs, and ④ a significant absence of transparency and explainability owing to the failure to maintain records concerning data sources, acquisition methods or processing procedures. Simultaneously, the article indicates that unlawfulness may be excluded or mitigated in cases of non-profit academic research, uses conducted with the express or implied consent of the data holder, and uses consistent with the fair use provision in Article 35-5 of the Copyright Act and with the rationale underlying TDM exceptions in the EU and Japan. In conclusion, this article positively evaluates the existing protective framework established by special statutes governing the use of training data for generative AI, while proposing criteria under which the general tort liability provision of Article 750 of the Civil Act can be applied in a supplementary manner to address remaining normative gaps concerning certain non-personal and non-creative data. Through this approach, the article seeks to explore a balanced accommodation between the protection of data-related investments and the advancement of AI technology, and, more broadly, the enhancement of transparency and predictability in data governance.

Publisher: 한국재산법학회
DOI: http://dx.doi.org/10.35142/prolaw.43.1.202602.004
Classification: Civil law
