An Analysis of Korean characters’ Frequency Survey from the 21st Century Sejong Plan for Type Design in Hangeul
Park Jae-hong (Myongji University)
Vol. 21, No. 3, pp. 292-299
Abstract
This study analyses the modern Korean data of the 21st Century Sejong Project corpus, comprising approximately 370,000 words of written (literary) Korean and about 800,000 words of spoken Korean. The Park Jae-hong typeface research institute at YoonDesign Group collaborated with Shin Hyo-pil's research team at Seoul National University to survey and analyse character frequency in the corpus. Frequencies were computed with Python, a high-level programming language released by Guido van Rossum in 1991. Of the 19 initial consonants, four (ㅇ, ㄱ, ㄷ, ㅅ) accounted for over 54% of total initial-consonant frequency. Of the 21 vowels, three (ㅏ, ㅣ, ㅡ) accounted for 50% or more of total vowel frequency. Of the 28 final-consonant categories, six (no final consonant, ㄴ, ㄹ, ㅇ, ㄱ, ㅁ) accounted for over 52% of total final-consonant frequency. Among horizontally composed syllables without a final consonant, ‘이’ was the most frequent at 9.351%. Among horizontally composed syllables with a final consonant, ‘한’ was the most frequent at 4.252%. Among vertically composed syllables without a final consonant, ‘고’ was the most frequent at 12.389%. Among vertically composed syllables with a final consonant, ‘는’ was the most frequent at 15.511%. Among syllables composed both vertically and horizontally without a final consonant, ‘의’ was the most frequent at 42.506%. Among syllables composed both vertically and horizontally with a final consonant, ‘원’ was the most frequent at 19.898%. In total, 3,025 distinct characters were found in the approximately 45,000,000 words examined. The syllables 이, 다, and 는 accounted for over 9% of total character frequency. The top 822 syllables accounted for 99% of total use. The survey suggests that the top 181 syllables, which account for 80% of total use, could serve as a representative character set for Hangeul type design.
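The abstract says only that the frequencies were computed with Python; it does not describe the code. The following is a minimal sketch of how such a count could be done, not the authors' implementation. It relies on the Unicode layout of precomposed Hangul syllables (19 initials × 21 vowels × 28 finals starting at U+AC00). The function names, the grouping of vowels into horizontal, vertical, and mixed layouts, and the sample sentence are assumptions made for illustration; the sample sentence is not corpus data.

```python
from collections import Counter

# Jamo in the order used by precomposed Hangul syllables (U+AC00..U+D7A3).
INITIALS = "ㄱ ㄲ ㄴ ㄷ ㄸ ㄹ ㅁ ㅂ ㅃ ㅅ ㅆ ㅇ ㅈ ㅉ ㅊ ㅋ ㅌ ㅍ ㅎ".split()        # 19
MEDIALS = ("ㅏ ㅐ ㅑ ㅒ ㅓ ㅔ ㅕ ㅖ ㅗ ㅘ ㅙ ㅚ ㅛ ㅜ ㅝ ㅞ ㅟ ㅠ ㅡ ㅢ ㅣ").split()  # 21
FINALS = [""] + ("ㄱ ㄲ ㄳ ㄴ ㄵ ㄶ ㄷ ㄹ ㄺ ㄻ ㄼ ㄽ ㄾ ㄿ ㅀ "
                 "ㅁ ㅂ ㅄ ㅅ ㅆ ㅇ ㅈ ㅊ ㅋ ㅌ ㅍ ㅎ").split()              # 28, "" = no final

# Assumed layout grouping: vowel to the right, below, or both.
HORIZONTAL = set("ㅏㅐㅑㅒㅓㅔㅕㅖㅣ")
VERTICAL = set("ㅗㅛㅜㅠㅡ")


def decompose(ch):
    """Return (initial, vowel, final) for a Hangul syllable, else None."""
    code = ord(ch) - 0xAC00
    if not 0 <= code < 19 * 21 * 28:
        return None
    initial, rest = divmod(code, 21 * 28)
    medial, final = divmod(rest, 28)
    return INITIALS[initial], MEDIALS[medial], FINALS[final]


def structure(ch):
    """Return (layout, has_final) for a Hangul syllable, else None."""
    parts = decompose(ch)
    if parts is None:
        return None
    _, vowel, final = parts
    if vowel in HORIZONTAL:
        layout = "horizontal"
    elif vowel in VERTICAL:
        layout = "vertical"
    else:
        layout = "mixed"
    return layout, final != ""


def count_frequencies(text):
    """Count syllables, jamo by position, and layout types in a text."""
    freq = {name: Counter() for name in
            ("syllables", "initials", "medials", "finals", "structures")}
    for ch in text:
        parts = decompose(ch)
        if parts is None:
            continue  # skip spaces, punctuation, non-Hangul characters
        freq["syllables"][ch] += 1
        freq["initials"][parts[0]] += 1
        freq["medials"][parts[1]] += 1
        freq["finals"][parts[2]] += 1
        freq["structures"][structure(ch)] += 1
    return freq


def syllables_for_coverage(counts, share):
    """Smallest number of top-ranked items whose counts reach `share` of the total."""
    total = sum(counts.values())
    running = 0
    for rank, (_, n) in enumerate(counts.most_common(), start=1):
        running += n
        if running >= share * total:
            return rank
    return len(counts)


# Toy input for illustration only.
sample = "이 원고는 한글의 글자이다"
freq = count_frequencies(sample)
for syllable, n in freq["syllables"].most_common(3):
    print(syllable, n)
print(syllables_for_coverage(freq["syllables"], 0.8))
```

Run over a full corpus, `syllables_for_coverage(freq["syllables"], 0.8)` and `syllables_for_coverage(freq["syllables"], 0.99)` would give the kind of figures the abstract reports as 181 and 822 syllables, and grouping `freq["syllables"]` by `structure` would give the per-layout rankings.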
- Publisher:
- 한국일러스아트학회
- Category:
- Other arts