Journal article · 경영과학 · Published 2022.06 · KCI citations: 1

예산제약이 존재하는 다기간, 다제품 재고관리문제에서 강화학습 기법의 효율성 개선 방안 연구

Improving the Efficiency of Reinforcement Learning for a Multi-period, Multi-item Inventory Control Problem with a Budget Constraint

김지헌 (Ewha Womans University, Interdisciplinary Program in Big Data Analytics); 민대기 (Ewha Womans University)

Vol. 39, No. 2, pp. 17-28

Abstract

This paper considers the use of reinforcement learning for a multi-period, multi-item inventory control problem with a budget constraint. In this problem, we decide the order quantities of multiple items subject to the budget constraint so as to minimize the total inventory cost, which consists of inventory holding cost and backlog cost. The previous literature proposed a modified Q-learning that includes an optimization model in the Q-learning procedure to handle budget-constrained actions, but it lacks scalability. To address this issue, this paper proposes a two-stage method: in the first stage, Q-learning learns actions without considering the budget constraint, and in the second stage, an optimization model adjusts the learned actions so that they satisfy the budget constraint. A numerical study compares the performance of the proposed two-stage method with that of other methods, namely a conventional Q-learning without the budget constraint and the modified Q-learning from the literature. The numerical experiments reveal that the proposed method significantly reduces the computation time without increasing the total inventory cost.
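The second stage described in the abstract can be illustrated with a small sketch. The abstract does not specify the optimization model, so everything here is an assumption for illustration only: the function name `adjust_to_budget`, the objective (cut as few units as possible from the learned orders), the rule that orders are only reduced and never increased, and the exhaustive search, which is a stand-in suited only to tiny instances.

```python
from itertools import product


def adjust_to_budget(learned_orders, unit_costs, budget):
    """Return a budget-feasible order vector close to the learned one.

    Hypothetical stand-in for the paper's second-stage optimization model.
    Assumed objective: minimize the total number of units cut.
    """
    best, best_cut = None, None
    # Enumerate every vector that orders at most the learned quantity per item.
    ranges = [range(q + 1) for q in learned_orders]
    for candidate in product(*ranges):
        spend = sum(c * q for c, q in zip(unit_costs, candidate))
        if spend > budget:
            continue  # violates the budget constraint
        cut = sum(a - q for a, q in zip(learned_orders, candidate))
        if best_cut is None or cut < best_cut:
            best, best_cut = list(candidate), cut
    return best


# Stage one (not shown) would supply orders learned without the budget.
learned = [3, 2]   # assumed unconstrained order quantities for two items
costs = [2, 5]     # assumed unit purchase costs
adjusted = adjust_to_budget(learned, costs, budget=10)
print(adjusted)
```

Because the adjustment runs once on the learned action rather than inside every Q-learning update, a scheme of this shape keeps the learning loop free of any optimization call, which is the efficiency argument the abstract makes for the two-stage method.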


Publisher:
한국경영과학회 (Korean Operations Research and Management Science Society)
Classification:
Business Administration
