UTSC stream teaches how to tame Big Data

Advances in computer and communications technology have left the world awash in huge amounts of information. UTSC is introducing a new stream to teach undergraduates how to deal with it.

The new statistical machine learning and data mining stream is one of the first of its kind in Canada. It is being offered by the Department of Computer and Mathematical Sciences and will focus on tools and techniques for working with massive amounts of electronic information.

Data mining and machine learning refer to finding patterns in that data that let people predict behaviors they could not have predicted in the past.

These techniques are being used for everything from analyzing speech or recognizing people’s faces to predicting customer behavior. All are based on understanding and manipulating huge quantities of data so that people can build systems with very high reliability.

“Statistics traditionally was a field that allowed you to try and make confident inferences from relatively small amounts of data,” said David Fleet, head of computer and mathematical sciences. “Big data and data analytics today are really about building models where we have massive amounts of heterogeneous data, so much data that we don’t know how to visualize it and make sense of it.”

This new option has already received acclaim from companies that work in large-scale data collection and IT, such as Google, Yahoo and Microsoft.

Techniques from data mining and machine learning are used to allow cameras to recognize and focus on faces; they allow computers to understand human speech and transcribe it; they allow Netflix to recommend movies to you based on your previous selections and other customers’ selections.

Students interested in the option should have reasonably good mathematical skills, very strong computer science skills and a keen interest in working with data at the cutting edge of technology.

