January 7, 2020

The Internet at 50 and Its Influence on ICS

On Oct. 29, 2019, the internet turned 50. How have its five decades of existence changed the field of computing, and what comes next? These and other questions are considered by the following three professors from the Donald Bren School of Information and Computer Sciences (ICS):

Crista Lopes, professor of informatics and faculty director of the master of software engineering (MSWE) program, whose research focuses on software engineering, programming languages and distributed virtual environments;
Annie Qu, professor of statistics, who recently joined the ICS faculty and who works on cutting-edge statistical methods and theory in machine learning — for example, natural language processing, recommender systems, imaging and network data; and
Gene Tsudik, Distinguished Professor of computer science, who is an expert in security and privacy and who recently returned from a Fulbright Specialist visit to Myanmar. He is also the vice chair for graduate studies for the Department of Computer Science.

Lopes, Qu and Tsudik talk here about how the internet has influenced — and will continue to influence — their fields, discussing a wide range of topics, from artificial intelligence to “zombie” IoT devices.

How has the internet changed or defined your field?

Lopes: How didn’t the internet change our field? When I started college in the 1980s, the internet was still very small and limited to email, file transfers, and bulletin board systems (BBSs). There was no web and there were no search engines. I remember applying to graduate schools in the U.S. by filling out paper forms and sending them by snail mail. I also remember discovering some professors’ email addresses printed on some papers, and the excitement of emailing them and receiving replies! And how could I forget the early days of messaging, with the “talk” program in UNIX! And fax machines. Looking back, everything felt so different. Mainly, it was much harder to find information. You had to read through paper manuals to find out how to program this or that. You had to request papers from libraries or directly from the authors. News of the world came only in printed newspapers and on TV. The hyperconnectivity we live in today feels like we moved to another planet!

Qu: The internet has changed the field of statistics in a very fundamental way. The most significant changes are caused by the size of data and the dimensionality of variables. We used to work on low-dimensional data, and now we have entered the big data era with very high-dimensional data, as the collection of very large-scale data is much more feasible due to the internet. This imposes a strong demand to revolutionize the statistics field by developing cutting-edge methods, theory and computing tools to handle high-dimensional data. For example, traditional variable selection is sufficient when the number of variables is small, but it can no longer handle high-dimensional variables. In statistical hypothesis testing, the concept of false discovery rate was introduced due to the large scale of multiple hypothesis testing.

Tsudik: I’m a security and privacy researcher. For me, the internet has been changing the game tremendously since my graduate school days in the late 1980s. Just about the time I began grad school in ’86, the internet was getting popular on the academic side. In fact, my dissertation was internet-focused. It was my ticket into security and privacy research. Back then, people used the internet for the most part to access scarce resources like supercomputers that cost millions of dollars. Or you connected to the internet to transfer files, print documents or read/send email. The internet was compute-centric. Since then, it has gradually become information-centric. Gazillions of mongo-bytes of data are being produced and consumed every single moment. That’s the biggest game changer in my opinion, as is, of course, the scale, with billions of devices (of varying computing intelligence) and people connected to the internet at any given time. What gets really interesting is that most of these devices, like a Ring doorbell or a Nest thermostat, are not traditional computers. These IoT, embedded or cyber-physical devices are special-purpose gadgets; they’re the primitive life forms of the computing world. They’re not fully developed (as are general-purpose computers) but are now suddenly connected to the internet, and they can sense and/or actuate the environment. This creates numerous security and privacy (and safety!) problems.

What changes do you expect to see over the next few decades thanks to continued advances stemming from the internet?

Lopes: It’s really hard to predict the future. I don’t think we’re in a good place right now, with all the lies and disinformation that are being spread through the internet, and with the malicious use of technology. I hope we, somehow, get vaccinated against this, and find a way of restoring the sense of truth.

Qu: I would expect more development in artificial intelligence, and statistical thinking will be very useful, especially for machine learning and data science. The explosive growth of large volumes of data with complex structures, arising from many different application domains, has led to the data science revolution. This revolution has led to extraordinary advances in artificial intelligence and machine intelligence. Most importantly, data science has created opportunities for new directions in statistical research such as NLP, reinforcement and generative learning, recommender systems and information retrieval.

Tsudik: I expect lots of progress in all types of transportation, fueled in part by the potential blending of the cellular and internet ecosystems (the two are still largely separate today). I also anticipate significant advances in automating and instrumenting home/office/public spaces, which will undoubtedly bring much improvement as well as exacerbate privacy problems. Finally, I think that rapidly dwindling storage costs and increasing communication speeds will fuel changes in social networking: people will begin recording/storing/exposing more of their everyday lives — for example, recording and casting 24/7.

How can we better harness the power and potential of the internet to improve our daily lives?

Lopes: I think it’s time to bring regulation to the table. The internet has been largely the Wild West so far. It needs to grow up.

Qu: The power and potential of harnessing the internet to improve our daily lives are tremendous. Examples include using mobile health data to monitor our health daily, or incorporating personal history data and other users’ information to make better recommendations for reading news articles, purchasing, dining and entertainment. In addition, using mobile apps and social media, we can share resources, for example through transportation sharing. We can also better monitor and avoid traffic, use driverless cars and shop in stores without waiting in line. All these revolutions require advanced technology to process imaging, text, speech and video data with instantaneous speed and high accuracy, and require us to train next-generation data scientists with multiple skills in data processing, computing and analytics.

Tsudik: In the last 15 years, we’ve started to see nontraditional computing devices connecting to the internet. In the past, connected devices looked like computers: they were laptops or desktops. Again, what gets interesting is that all these other devices used to be analog and mechanical; suddenly, they’ve acquired a digital personality. The doorbell senses a finger and actuates by producing a sound and taking a video. Or consider medical devices: for example, if you are a diabetes sufferer, you can wear an insulin pump that injects insulin when needed, communicates wirelessly and senses/produces information. It clearly offers real benefits while prompting many security and privacy headaches.

What issues have we overlooked in terms of dealing with the influence of the internet?

Lopes: The spread of lies was largely overlooked. We heard about kids’ addiction to their phones, about the potential harmful effects of video games, about online porn, even about our data being tracked for nefarious purposes. But we completely overlooked the danger of well-planned psychological operations (aka propaganda) on mass media platforms. We assumed the Cold War was over. We were wrong.

Qu: Data privacy and reliability are the major issues. Recent Facebook data privacy breaches remind us that we should never take data privacy for granted. The internet facilitates identity theft and credit fraud. People have lost trust in social media platforms such as Facebook because personal information has been shared for political purposes. In addition, data distortion through biased sampling and missing data could lead to misinforming people, while the internet can make unverified or misleading information spread quickly.

Tsudik: The social networking of today encourages both voyeurism and exhibitionism. It exacerbates the “look at me, I’m bored, please feed me more information” mindset. Like billions of other people, when I’m sitting around for a second, I catch myself whipping out my phone to look at the news instead of maybe looking at the sky or at nature. Most of the time, there’s nothing interesting happening, but I am an information addict. The information-centered nature of the internet, and of social networking in particular, is encouraging this kind of behavior. Information is like a drug. The other important problem is that the masses are simply not educated enough about security. If you have a child who goes to middle school, they take health education; it’s mandatory. Why are we not teaching them internet health and hygiene? We teach kids about stranger danger when they’re 4 years old, but what about stranger danger on the internet? We don’t have much internet literacy for adults either. They make simple mistakes and fall victim to scams and phishing because they’re unaware of the basics of internet privacy and security. This is something the federal (or at least state) government should address.

It has been reported that fake videos could be the next big problem in the 2020 elections. What role, if any, will your field play in terms of helping address this issue or the need for election security?

Lopes: I think we are severely underprepared for what’s coming, in terms of legislation, infrastructure and technological defenses. I expect the worst, honestly.

Qu: Data could be manipulated and distorted in political campaigns. Statisticians and data scientists should play active roles to develop improved statistics tools for data verification and validation to ensure data integrity and reproducibility. Scalable computing technology is highly needed to achieve real-time data checking.

Tsudik: Deep fakes are already a giant problem. My hope lies with the AI or machine-learning side of things, and image processing. Election security is a different story. Most people tend to be concerned with privacy, because we want to know that our votes are private. Security means the integrity of votes, and while it’s not a pervasive problem, it’s a problem in some places because we don’t have a uniform voting system. The other thing with voting and elections is fault-tolerance. What if things crash? Would the election go haywire? Would it have to be redone? That’s something I personally worry about more than security and privacy — somebody causing massive denial-of-service, or pervasive crashes of voting machines. What if, instead of hacking into a voting machine, I send it some specially crafted packets or messages so that it cannot reboot, or just starts spinning on empty? There’s no security problem, there’s no privacy problem, but you can’t vote, or you’ve voted but then I erased everything. There are FIPS [federal information processing standards] that govern these things, but I think there have been demonstrated cases where machines that were approved and stamped with the highest marks still did not provide the desired level of fault-tolerance.

What will be the biggest challenge we as a society need to address in the next 10 years, and how will the internet and your field of study help?

Lopes: The biggest challenge: how to preserve the sense of truth in a world where realities are shaped by words and images instantly available everywhere via the internet. I don’t think technology alone will be able to tackle this. Education is paramount, and so is regulation.

Qu: The biggest challenges we will face in the next 10 years are the global environment, climate change and disparities in social justice. The internet can help us to be more aware of these challenges so we can make more efforts to solve or mitigate these critical problems. In the statistics field, we need to be more active in using historical and current data to show that these are serious problems, and provide accurate trajectory predictions on the sustainability of the earth in the next decade.

Tsudik: One of the biggest challenges in my field is longevity of security. For example, how do we protect information today? We put it in the cloud and we encrypt it. This includes sensitive stuff, like private photos or medical records or DNA. What happens 10 years from now? Encryption that is secure today may no longer be secure. So, if somebody gets a hold of that old data, it becomes trivially decryptable. Sensitivity of some private data doesn’t dissipate over time. What’s the alternative? The alternative is to say, “Well, I’m going to think into the future and instead of encrypting it using something that is good enough for today, I’m going to do something that I think will be good enough for 30 years from today.” Then it’s going to take you a year to encrypt it, or every time you want to access it, you have to spend a lot of time and effort, so you’re not going to do it. That’s one of the bigger problems we need to address.

What research projects will you be focused on in the coming year?

Lopes: My group is working on three projects at the moment. One is rethinking the programming models we use for replicated objects in distributed systems, and how to best serve the needs of those applications. Another one is to collect, curate and distribute software artifacts for research. The third is to build bridges between neural network programming and traditional programming.

Qu: I will focus on research in differential privacy to ensure that we can share data without leaking confidential information. This will be a very important problem that we need to address in the future. In addition, my students and I are currently working on optical imaging data to identify hidden, spatially distributed microvesicles, which are biomarkers for invasive breast cancer at an early stage. We are also working on dynamic network data to identify changes in community structure over time and perform hyperlink predictions. Furthermore, we are developing a novel active learning method for clustering analysis when the amount of unlabeled data is extremely large.

Tsudik: In the very near future, the average citizen will be able to, for very little money, obtain a fully digitized copy of their own genome, which is essentially 3.2 billion letters (ACGT). Some of our own bioinformatics faculty (notably, Pierre Baldi) work on this topic on the computational genomics side. Once this happens, you could go to a doctor and be prescribed a treatment based on your genome. But what about security and privacy? If someone or something (e.g., malware) surreptitiously modifies your digitized genome, you could get administered the wrong medicine. This could literally be a life-or-death situation, which motivates my group’s research on genomic security.

Another area is cyberphysical/IoT/embedded system security, where we are focusing on designing formally verified, provably secure techniques so that you can actually trust that a device is not compromised or infected with malware. We’re focusing on the low-end devices, like smart doorbells or lightbulbs, which have no ability to run any kind of anti-malware. Yet, they are accessible and can be compromised, as was the case in 2016 with the Mirai botnet, which launched a massive, worldwide DDoS [distributed denial of service] attack using home/consumer DVRs and cameras. This clever malware infected a camera and would sit there like a zombie. Then, on command from some far-away control center, it would wake up and send internet packets to some specified target(s). It would say, “Oh, today we are going to target ICS at UCI,” and a gazillion of these cameras from all over the world would start sending traffic. Imagine, it’s like being hosed from all directions at the same time by these infected zombies that don’t know what they’re doing. This poses a real danger. Our research results would let people disinfect their devices to get rid of the malware; that’s a step forward and kind of exciting.

— Shani Murray