Academic paper, Journal of the Korean Society for Information Management, published June 2010

An Extraction Method of Bibliographic Information from the US Patents: Using an HTML Parsing Technique

한유진 (Sookmyung Women's University); 오승우 (Seoul National University)

Vol. 27, No. 2, pp. 7-20

Abstract

This study proposes a method for extracting the most recent information from US patent documents. It adopts an HTML parsing technique that connects directly to the US Patent and Trademark Office (USPTO) Web pages. After obtaining a list of 50 documents through a keyword search, the study presents an algorithm that uses HTML parsing to extract the patent number, the applicant, and the US patent class information. It also presents an algorithm for extracting both patents and their subsequent patents by exploiting the closely connected relationships among them, a distinctive characteristic of US patent documents. Although the proposed method has several limitations, it can effectively supplement existing databases in terms of timeliness and comprehensiveness.
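The field-extraction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not describe the USPTO page layout, so the sample markup, the field labels, and the names `SAMPLE_HTML`, `LabelValueParser`, and `extract_fields` are all assumptions made for illustration. The sketch uses only Python's standard `html.parser` module and does not cover fetching pages or following links to subsequent patents.

```python
from html.parser import HTMLParser

# Hypothetical sample markup (an assumption): the real USPTO page
# structure is not given in the abstract.
SAMPLE_HTML = """
<html><body>
<table>
  <tr><th>Patent No.</th><td>7,654,321</td></tr>
  <tr><th>Assignee</th><td>Example Corp.</td></tr>
  <tr><th>Current U.S. Class</th><td>707/3; 707/5</td></tr>
</table>
</body></html>
"""


class LabelValueParser(HTMLParser):
    """Collect <th>label</th><td>value</td> pairs from table rows."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._tag = None    # cell tag currently open: "th", "td", or None
        self._label = None  # most recent <th> text awaiting its value
        self._buf = []      # text fragments of the open cell

    def handle_starttag(self, tag, attrs):
        if tag in ("th", "td"):
            self._tag = tag
            self._buf = []

    def handle_data(self, data):
        # Only keep text that appears inside an open <th> or <td>.
        if self._tag is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag != self._tag:
            return
        text = "".join(self._buf).strip()
        if tag == "th":
            self._label = text
        elif self._label is not None:
            self.fields[self._label] = text
            self._label = None
        self._tag = None


def extract_fields(html):
    """Return a dict mapping each label cell to its value cell."""
    parser = LabelValueParser()
    parser.feed(html)
    return parser.fields


fields = extract_fields(SAMPLE_HTML)
# Split the class string into individual class codes.
classes = [c.strip() for c in fields["Current U.S. Class"].split(";")]
```

An event-driven parser is used here so the sketch has no third-party dependencies; a tree-based HTML library would serve equally well for pages of this size.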


Publisher:: Korean Society for Information Management
DOI:: http://dx.doi.org/10.3743/KOSIM.2010.27.2.007
Classification:: Library and Information Science
