The agent’s goal is to learn a policy (i.e. Each action puts the agent in a different environmental state, usually according to some probability distribution, where the agent then has the possibility of receiving some reward. That being said, I am by no means an expert and any and all feedback is much appreciated!Ī Markov Decision Process captures how an agent takes actions in an environment. For certain concepts, I will try to go into as much detail as possible about my task specific implementations, so some familiarity with probability, machine learning, RL, and DL is recommended.Solving a Rubik’s Cube with reinforcement learning is not a new problem and I will be basing most of my work off this paper by Stephen McAleer et al., with some modifications. Here, I have documented my (ongoing) attempt to do just that, by training an agent to solve a Rubik’s Cube. While the coursework was very informative, I wanted to take it a step further. One topic that particularly caught my eye was reinforcement learning (RL), which we approached from both the traditional direction of Markov Decision Processes (MDP) and from the direction of Deep Learning (DL). Last year, I started my journey into machine learning through a Master’s program at Cornell Tech. The task of actually solving the cube was done using Kociemba’s Algorithm. OpenAI made headlines in Fall 2019 for solving the related but more difficult RL task of teaching a physical hand to manipulate the cube.
0 Comments
Leave a Reply. |
AuthorWrite something about yourself. No need to be fancy, just an overview. ArchivesCategories |