Determining the optimal selling price for commodities has long been a central topic of scientific and industrial research. Perishable products have a short shelf life, and because they deteriorate over time, they cause substantial losses if they are not managed well. Many industries, retailers, and service providers can increase their revenue through optimal pricing of perishable products
that must be sold within a limited period. In this pricing problem, a seller must set the price for a number of units of a perishable or seasonal product that can be sold only for a limited time. This article examines pricing policies that increase revenue from the sale of a given inventory with an expiration date. Reinforcement learning algorithms are used to analyze how companies can simultaneously learn and optimize their pricing strategy in response to buyers. It is also shown that reinforcement learning can be used to model a demand-dependent problem. This paper presents an optimization method for a model-free environment in which demand is learned and pricing decisions are updated in real time. We compare the performance of the learning algorithms using Monte Carlo simulations.
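
To make the setting concrete, the following is a minimal sketch of model-free pricing for a fixed inventory with an expiration date. It uses tabular Q-learning as an illustrative stand-in, not necessarily the algorithms compared in this paper. The price grid, horizon, inventory size, and linear purchase probability are all hypothetical assumptions chosen for illustration. The learner observes only sales outcomes, never the demand function.

```python
import random

# Hypothetical problem sizes and price grid (assumptions, not from the paper).
PRICES = [4.0, 6.0, 8.0, 10.0]
HORIZON = 10      # selling periods before the product expires
INVENTORY = 5     # units in stock at the start of the season


def buy_probability(price):
    """Assumed linear demand: chance that the single customer of a period buys."""
    return max(0.0, 1.0 - price / 12.0)


def step(inventory, price, rng):
    """Simulate one period with at most one sale. Returns (reward, new inventory)."""
    if inventory > 0 and rng.random() < buy_probability(price):
        return price, inventory - 1
    return 0.0, inventory


def train(episodes=20000, alpha=0.1, epsilon=0.1, seed=0):
    """Tabular Q-learning over states (period, inventory) with epsilon-greedy exploration."""
    rng = random.Random(seed)
    # q[t][inv][a]; row HORIZON stays zero because unsold stock is worthless at expiry.
    q = [[[0.0] * len(PRICES) for _ in range(INVENTORY + 1)]
         for _ in range(HORIZON + 1)]
    for _ in range(episodes):
        inv = INVENTORY
        for t in range(HORIZON):
            if inv == 0:
                break  # sold out, nothing left to price
            if rng.random() < epsilon:
                a = rng.randrange(len(PRICES))
            else:
                a = max(range(len(PRICES)), key=lambda i: q[t][inv][i])
            reward, nxt = step(inv, PRICES[a], rng)
            # Undiscounted finite-horizon target: reward plus best next-state value.
            target = reward + max(q[t + 1][nxt])
            q[t][inv][a] += alpha * (target - q[t][inv][a])
            inv = nxt
    return q


def evaluate(policy, runs=5000, seed=1):
    """Monte Carlo estimate of mean revenue per selling season under a policy."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        inv = INVENTORY
        for t in range(HORIZON):
            if inv == 0:
                break
            reward, inv = step(inv, policy(t, inv), rng)
            total += reward
    return total / runs


if __name__ == "__main__":
    q = train()

    def greedy(t, inv):
        # Price with the highest learned value in state (t, inv).
        return PRICES[max(range(len(PRICES)), key=lambda i: q[t][inv][i])]

    print("learned policy:", evaluate(greedy))
    print("fixed lowest price:", evaluate(lambda t, inv: PRICES[0]))
```

Running the script prints Monte Carlo revenue estimates for the learned policy and for a fixed-price baseline, which mirrors the kind of simulation-based comparison described above. The state is kept to the pair (period, remaining inventory) so that the table stays small; a richer demand model would need a different state representation.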