The University of Texas at Dallas

Erik Jonsson School of Engineering and Computer Science

Assistant Professor Explores a New Dimension of Data Analysis

When you think of geometry, you might think of mathematical equations involving the size, shape and position of objects, and the properties of space. And you’d be right.

But what if you could take these geometric principles, feed them into a computer and apply them to large amounts of data, so you could “see” the data expressed as a shape that helps you understand some of the data’s properties?

Dr. Benjamin Raichel, assistant professor of computer science, recently received a National Science Foundation Early Career Development (CAREER) Award for a project that does just that.

Raichel said his research will advance a geometric approach to data analysis, one that makes large amounts of information both faster and easier to work with.

He admits it’s an abstract concept.

“One of the main benefits of the grant is trying to get researchers to use these geometric insights in handling their data,” he said. “If you understand the geometry of your data, that can improve performance, running time and classification.”

The five-year grant, totaling nearly $500,000, will support his project called Giving Form to Data with a Geometric Scaffold. Raichel said using geometry to structure data and the world around us sounds like a new idea, but it actually dates back to the time of Plato and Aristotle.

“Geometry is something deeply ingrained in us as humans,” said Raichel, who joined the faculty of the Erik Jonsson School of Engineering and Computer Science in 2015. “This is why it’s one of the oldest branches in mathematics. It’s something we can reason about. We understand geometry at an intuitive level.”

Today’s data sets, especially in areas such as machine learning, are often massive and high-dimensional. For example, when trying to classify news articles, each article might be represented as a point in which the frequency of each word is a separate dimension. The result is a high-dimensional space that is hard to understand and where low-dimensional tools break down, a problem that noted mathematician Richard Bellman dubbed the “curse of dimensionality.”
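To make the news-article example concrete, here is a minimal sketch in Python of how articles might be turned into points, with one dimension per distinct word. The function name, the whitespace tokenization and the two sample headlines are assumptions made for illustration; they are not drawn from Raichel's project.

```python
from collections import Counter


def word_frequency_vectors(articles):
    """Map each article to a point with one coordinate per vocabulary word."""
    # Count how often each word occurs in each article (naive whitespace split).
    counts = [Counter(text.lower().split()) for text in articles]
    # The vocabulary is every distinct word seen in any article.
    vocabulary = sorted(set().union(*counts))
    # Each article becomes a vector of word counts, in vocabulary order.
    vectors = [[c[word] for word in vocabulary] for c in counts]
    return vocabulary, vectors


# Hypothetical sample data, for illustration only.
articles = [
    "stocks rally as markets rise",
    "team wins as fans rally",
]
vocabulary, vectors = word_frequency_vectors(articles)
print(len(vocabulary))  # number of dimensions
print(vectors)
```

Even these two short headlines produce an eight-dimensional space, and most coordinates are zero. A realistic collection of articles would have tens of thousands of dimensions, which is where the difficulty Bellman described sets in.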

Raichel’s approach is to map the data to a smaller, simpler subset of points that keeps the same geometric structure. He said this allows for more efficient computation on the data and can improve results by removing extraneous information.
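The article does not describe Raichel's algorithms, so the sketch below uses a well-known stand-in to illustrate the general idea of a small subset that preserves overall shape: greedy farthest-point sampling, which repeatedly picks the point farthest from everything chosen so far. The function name, the starting point and the sample coordinates are all assumptions for illustration.

```python
import math


def farthest_point_sample(points, k):
    """Greedily pick up to k points that spread out over the data set."""
    if not points or k <= 0:
        return []
    chosen = [0]  # start from the first point (an arbitrary choice)
    # Distance from every point to its nearest chosen point so far.
    nearest = [math.dist(p, points[0]) for p in points]
    while len(chosen) < min(k, len(points)):
        # The next representative is the point farthest from all chosen ones.
        nxt = max(range(len(points)), key=nearest.__getitem__)
        chosen.append(nxt)
        nearest = [
            min(d, math.dist(p, points[nxt])) for d, p in zip(nearest, points)
        ]
    return [points[i] for i in chosen]


# Hypothetical data: two tight clusters and one outlying point.
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
    (10.0, 0.0),
]
sample = farthest_point_sample(points, 3)
print(sample)
```

With three representatives, the sample keeps one point from each cluster plus the outlier, so the coarse layout of the data survives while the near-duplicate points are dropped.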

“The goal is to identify the geometric structure of the data, summarize it and embed it into a simpler space where computations can be done more efficiently,” Raichel said. “The ultimate purpose is to develop better algorithms for handling data, whether it be for clustering, classification or any number of other computational tasks.”

Raichel will also use his CAREER grant to support student research in the field, develop new courses and organize seminars to raise the profile of using geometry to give form to data.