CS doctoral student explores partial monitoring in online learning
Can you teach an old computer new tricks? Computing science PhD student Gabor Bartok is certainly trying.
Machine learning is about developing algorithms that allow computers to learn and make intelligent decisions based on data. In online learning, a subfield of machine learning, researchers develop sequential decision-making (or learning) algorithms. The learner uses a reward/loss system to evaluate its hypotheses, and the algorithm is then adjusted according to its performance.
"We receive datapoints in a timely manner," says Gabor. "In each round, we receive a datapoint, update our decision maker based on some feedback, and use the new decision maker for the next datapoint. This way, if we have a stream of data, the learning never stops."
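The round-by-round loop Gabor describes can be sketched in a few lines of Python. This is only an illustration, not his method: the function name `online_learner`, the one-dimensional linear model, the squared loss, and the learning rate are all assumptions chosen to keep the example small.

```python
def online_learner(stream, learning_rate=0.1):
    """Illustrative online gradient descent for a 1-D linear predictor."""
    weight = 0.0   # the current "decision maker" (hypothetical model)
    losses = []
    for x, y in stream:                      # one datapoint per round
        prediction = weight * x              # decide with the current model
        loss = (prediction - y) ** 2         # feedback: squared loss (assumed)
        losses.append(loss)
        gradient = 2 * (prediction - y) * x  # derivative of the loss w.r.t. weight
        weight -= learning_rate * gradient   # update before the next round
    return weight, losses


# A toy stream in which the target is always twice the input.
stream = [(1.0, 2.0)] * 50
weight, losses = online_learner(stream)
```

On this toy stream the loss shrinks from round to round and the weight approaches 2, which is the sense in which "the learning never stops" as long as data keeps arriving.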
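Sometimes learning is even more challenging because the learner receives only limited feedback about each decision, as when a system chooses which ad to show a user. This setting is known as partial monitoring.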
"For example, we would know if they did or didn’t click our ad, but that’s it," says Gabor. "We have no knowledge of the other possible outcomes like what they might have clicked instead."
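A simple way to picture limited feedback is an ad selector that only ever learns whether the ad it showed was clicked. The sketch below uses epsilon-greedy, a standard textbook strategy swapped in for illustration and not taken from Gabor's research; the function name `epsilon_greedy_ads`, the click probabilities, and every parameter are hypothetical. This "click or no click" case is one simple instance of limited feedback, and the article does not say which models his work covers.

```python
import random


def epsilon_greedy_ads(click_probs, rounds=1000, epsilon=0.1, seed=0):
    """Pick one ad per round and observe only that ad's click outcome."""
    rng = random.Random(seed)
    n = len(click_probs)
    shows = [0] * n    # how often each ad was shown
    clicks = [0] * n   # how often each shown ad was clicked
    for _ in range(rounds):
        if rng.random() < epsilon or 0 in shows:
            ad = rng.randrange(n)  # explore: pick an ad at random
        else:
            # exploit: pick the ad with the best observed click rate
            ad = max(range(n), key=lambda i: clicks[i] / shows[i])
        # Simulated user (assumption): clicks with the ad's hidden probability.
        clicked = rng.random() < click_probs[ad]
        # Only the shown ad's outcome is recorded; the learner never
        # finds out what would have happened with the other ads.
        shows[ad] += 1
        clicks[ad] += int(clicked)
    return shows, clicks


shows, clicks = epsilon_greedy_ads([0.1, 0.5], rounds=200)
```

Because the other ads' outcomes stay hidden, the learner has to spend some rounds exploring, which is one reason limited feedback makes learning slower.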
Gabor’s work is not specific to any of his examples – in fact, it is highly theoretical.
"My goal is to figure out what makes learning easy and what makes it hard," he says. "Find out how quickly we can learn and what algorithms can be used to achieve it. With time, the decision making gets better, but the question is how much better and in how much time."
Photos, 2011; article 2012.