Willie Jones covers transportation for IEEE Spectrum and the history of technology for The Institute.
Stop for a second and think about the Internet without digital images or video. There would be no faces on Facebook. Instagram and TikTok probably wouldn’t exist. Those Zoom meetings that took the place of in-person gatherings for school or work during the height of the COVID-19 pandemic? Not an option.
Digital audio’s place in our Internet-connected world is just as important as still images and video. It has changed the music business—from production to distribution to the way fans buy, collect, and store their favorite songs.
What do those millions of profiles on LinkedIn, dating apps, and social media platforms (and the inexhaustible selection of music available for download online) have in common? They rely on a compression algorithm called the discrete cosine transform, or DCT, which played a major role in allowing digital files to be transmitted across computer networks.
“DCT has been one of the key components of many past image- and video-coding algorithms for more than three decades,” says Touradj Ebrahimi, a professor at Ecole Polytechnique Fédérale de Lausanne, in Switzerland, who currently serves as chairman of the JPEG standardization committee. “Only a few image-compression standards not using DCT exist today,” he adds.
The Internet applications people use every day but largely take for granted were made possible by scientists and engineers who, for the most part, toiled in anonymity. One such “hidden figure” is Nasir Ahmed, the Indian-American engineer who figured out an elegant way to cut down the size of digital image files without sacrificing their most critical visual details.
In 1974, Ahmed published his seminal paper on the discrete cosine transform compression algorithm he had invented, at a time when the fledgling Internet was exclusively dial-up and text-based. There were no pictures accompanying the words, nor could there have been, because Internet data was transmitted over standard copper telephone landlines, which severely limited speed and bandwidth.
These days, with the benefit of superfast chips and optical-fiber networks, data download speeds for a laptop with a fiber connection reach 1 gigabit per second. So, a music lover can download a 4-minute song to their laptop (or more likely a smartphone) in a second or two. In the dial-up era, when Internet users’ download speeds topped out at 56 kilobits per second (and were usually only half that fast), pulling down the same song from a server would have taken nearly all day. Getting a picture to appear on a computer’s screen was a process akin to watching grass grow.
Ahmed was convinced there had to be a way to cut down the size of digital files and speed up the process. He set off on a quest to represent with ones and zeros what is critical to an image being legible, while tossing aside the bits that are less important. The answer, which built on the earlier work of mathematician and information-theory pioneer Claude Shannon, took a while to come into focus. But because of Ahmed’s determination and unwavering belief in the value of what he was doing, he persevered even after others told him that it was not worth the effort.
Raised to Love Technology
It seemed almost preordained that Ahmed would have a career in one of the STEM fields. Nasir, who was born in Bengaluru, India, in 1940, was raised by his maternal grandparents. Ahmed’s grandfather was an electrical engineer who told him that he had been sent to the United States in 1919 to work at General Electric’s location in Schenectady, N.Y. He shared tales of his time in the United States with his grandson and encouraged young Nasir to emigrate there. In 1961, after earning a bachelor’s degree in electrical engineering at the University Visvesvaraya College of Engineering, in Bengaluru, Ahmed did just that, leaving India that fall for graduate school at the University of New Mexico, in Albuquerque. Ahmed earned a master’s degree and a Ph.D. in electrical engineering in 1963 and 1966, respectively.
During his first year in Albuquerque, he met Esther Parente, a graduate student from Argentina. They soon became inseparable and were married while he was working toward his doctorate. Sixty years later, they are still together.
The Seed of an Idea
In 1966, Ahmed, fresh out of grad school with his Ph.D., was hired as a principal research engineer at Honeywell’s newly created computer division. While there, Ahmed was first exposed to Walsh functions, a technique for analyzing digital representations of analog signals. The fast algorithms that could be created based on Walsh functions had many potential applications. Ahmed focused on using these signal-processing and analysis techniques to reduce the file size of a digital image without losing too much of the visual detail in the uncompressed version.
That research focus remained his primary interest when he returned to academia, taking a job as a professor in the electrical and computer engineering department at Kansas State University, in 1968.
Ahmed, like dozens of other researchers around the globe, was obsessed with finding the answer to a single question: How do you create a mathematical formula for deciphering which of the ones and zeros that represent a digital image need to be kept and which can be thrown away? The things he’d learned at Honeywell gave him a framework for understanding the elements of the problem and how to attack it. But the majority of the credit for the eventual breakthrough has to go to Ahmed’s steely determination and willingness to take a gamble on himself.
In 1972, he sought grant funding that would let him afford to spend the months between Kansas State’s spring and fall semesters furthering his ideas. He applied for a U.S. National Science Foundation grant, but was denied. Ahmed recalls the moment: “I had a strong intuition that I could find an efficient way to compress digital signal data. But to my surprise, the reviewers said the idea was too simple, so they rejected the proposal.”
Undaunted, Ahmed and his wife worked to make the salary he earned during the nine-month school year last through the summer so he could focus on his research. Money was tight, the couple recalls, but that moment of financial belt-tightening only seemed to heighten Ahmed’s industriousness. They persevered, and Ahmed’s long days and late nights in the lab eventually yielded the desired result.
DCT Compression Comes Together
Ahmed took a technique for turning the array of image-processing data representing an image’s pixels into a waveform, effectively rendering it as a series of waves with oscillating frequencies, and combined it with cosine functions that were already being used to model phenomena such as light waves, sound waves, and electric current. The result was a long string of numbers with values bounded by 1 and –1. Ahmed realized that by quantizing this string of values and performing a Fourier transformation to break the function into its constituent frequencies, each pixel’s data could be represented in a way that was helpful for deciding what data points must be kept and what could be omitted. Ahmed observed that the lower-frequency waves corresponded to the necessary or “high information” regions of the image, while the higher-frequency waves represented the bits that were less important and could therefore be approximated. The compressed-image files he and his team produced were one-tenth the size of the originals. What’s more, the process could be reversed, and a shrunken data file would yield an image that was sufficiently similar to the original.
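The idea in the paragraph above, keeping the low-frequency terms and dropping the rest, can be illustrated with a minimal sketch. It is not Ahmed's original program, and it makes several simplifying assumptions: it works on a single one-dimensional row of pixels rather than a two-dimensional image, it uses the orthonormal form of the DCT, and it simply zeroes the high-frequency coefficients instead of quantizing them as standards such as JPEG do. The function names and the sample pixel values are hypothetical.

```python
import math


def dct(signal):
    """Orthonormal DCT-II: express the samples as a sum of cosine waves."""
    n = len(signal)
    coeffs = []
    for k in range(n):
        total = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                    for i, x in enumerate(signal))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        coeffs.append(scale * total)
    return coeffs


def idct(coeffs):
    """Inverse of dct(): rebuild the samples from the cosine coefficients."""
    n = len(coeffs)
    signal = []
    for i in range(n):
        total = coeffs[0] * math.sqrt(1 / n)
        total += sum(c * math.sqrt(2 / n) * math.cos(math.pi * (i + 0.5) * k / n)
                     for k, c in enumerate(coeffs) if k > 0)
        signal.append(total)
    return signal


def compress(signal, keep):
    """Keep only the `keep` lowest-frequency coefficients; zero the rest."""
    coeffs = dct(signal)
    return coeffs[:keep] + [0.0] * (len(coeffs) - keep)


# Hypothetical row of eight grayscale pixel values (a smooth gradient).
pixels = [52, 55, 61, 66, 70, 73, 76, 79]

kept = compress(pixels, keep=3)   # 3 of 8 coefficients survive
restored = idct(kept)             # approximate reconstruction
max_error = max(abs(a - b) for a, b in zip(pixels, restored))
print([round(v) for v in restored], round(max_error, 2))
```

For this smooth row, discarding five of the eight coefficients leaves every reconstructed pixel within two gray levels of the original, because nearly all of the signal's energy sits in the first few coefficients. A row with sharp edges would lose more detail under the same cutoff.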
After another two years of laborious testing, with him and his two collaborators running computer programs written on decks of data punch cards, the trio published a paper in IEEE Transactions on Computers titled “Discrete Cosine Transform” in January 1974. Though the paper’s publication did not make it immediately clear, the worldwide search for a reliable method of doing the lossy compression that Claude Shannon had postulated in the 1940s was over.
JPEGs, MPEGs, and More
It wasn’t until 1983 that the International Organization for Standardization (ISO) began working on the technology that would allow photo-quality images to accompany text on the screens of computer terminals. To that end, ISO established the Joint Photographic Experts Group, better known by the ubiquitous acronym JPEG. By the time the first JPEG standard was published in 1992, DCT and advances made by a cadre of other researchers had come to be recognized by the group as basic elements of their method for the digital compression and coding of still images. “This is the beauty of standardization, where several dozen bright minds are behind the success of advances such as JPEG,” says Ebrahimi.
And because video can be described as a succession of still images, Ahmed’s technique was also well suited to making video files smaller. DCT was the compression technique of choice when, in 1988, ISO and the International Electrotechnical Commission (IEC) established the Moving Picture Experts Group, or MPEG, for the compression and coding of audio, video, graphics, and genomic data. When the first MPEG standard was published in 1993, the World Wide Web that now includes Google Maps, dating apps, and e-commerce businesses was just four years old.
The ramping up of computer speeds and network bandwidth during that decade—along with the ability to transmit pictures and video via much smaller files—quickly transformed the Internet before anyone knew that Amazon would eventually let readers judge millions of books by their covers.
Having solved the problem that had monopolized his time and attention for several years, Ahmed resumed his career in academia. In 1983, Ahmed left Kansas State and returned to the University of New Mexico. There he was a presidential professor of electrical and computer engineering until 1989, when he was promoted to chair of the ECE department. Five years after that, he became dean of UNM’s school of engineering. Ahmed held that post for two years until he was named associate provost for research and dean of graduate studies. He stayed in that job until he retired from the university in 2001 and was named professor emeritus.