In this research, we develop a privacy-aware federated learning framework that allows participants to use edge devices to continuously monitor and detect abnormal psychological states such as drowsiness while driving and suicidal thoughts. The framework is designed to achieve the best trade-off between the resource constraints of edge devices and the quality of the trained model. A reinforcement-learning-based adaptive sampling technique is applied to address the non-IID data distribution challenge that is characteristic of health data.
Federated Learning Reinforcement Learning Smart Health Privacy-aware Machine Learning Distributed Systems
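As a rough illustration of the federated setting described in this project, the sketch below shows plain federated averaging, where a server combines client model weights in proportion to each client's local sample count. This is a generic baseline rather than the framework proposed above; the function name `federated_average` and the toy values are hypothetical.

```python
def federated_average(client_weights, client_sizes):
    """Average client weight vectors, weighted by local sample count.

    client_weights: list of equal-length lists of floats (one per client).
    client_sizes:   number of local training samples held by each client.
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Toy example (assumed values): two edge devices with 1 and 3 samples.
clients = [[1.0, 2.0], [3.0, 4.0]]
sizes = [1, 3]
global_weights = federated_average(clients, sizes)
```

Only the weight vectors and sample counts leave the devices in this sketch; the raw data stays local, which is the basic privacy argument for federated training.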
2019 - Present
The detection of misleading content on social media has become a critical undertaking as online social media grows in popularity. This project focuses on an important but largely unsolved problem: detecting fauxtography (i.e., social media posts with misleading images). We develop FauxBuster, an end-to-end supervised learning scheme that tracks down fauxtography by exploring the valuable clues in users' comments on a social media post. FauxBuster is content-free in that it does not rely on analyzing the actual content of the images, and hence is robust against malicious uploaders who intentionally modify the presentation and description of the images. We evaluate FauxBuster on real-world data collected from two mainstream social media platforms: Reddit and Twitter. Results show that our scheme is both effective and efficient in addressing the fauxtography problem.
Machine Learning Supervised Learning Network Embedding Natural Language Processing
Modern Internet of Things (IoT) systems increasingly leverage deep neural networks (DNNs) to enable intelligence at the edge of the network. While applying DNNs can greatly improve the accuracy of autonomous decisions and inferences, a significant challenge is that DNNs are traditionally designed and developed for advanced hardware (e.g., GPU clusters) and cannot easily meet real-time requirements when deployed in a resource-constrained edge computing environment. While many systems have been proposed to facilitate deep learning at the edge, a key limitation lies in the under-utilization of the parallelizable GPU resources of edge nodes (e.g., IoT devices). In this project, we propose EdgeBatch, a collaborative intelligent edge computing framework that minimizes the delay and energy consumption of executing DNN tasks at the edge by sharing idle GPU resources among privately owned IoT devices. EdgeBatch develops 1) a stochastic task batching mechanism that identifies the optimal batching strategy for the GPUs on IoT devices given uncertain task arrival times, and 2) a dynamic task offloading scheme that coordinates the collaboration among edge nodes to optimize the utilization of idle GPU resources in the system. We implemented EdgeBatch on a real-world edge computing testbed that consists of heterogeneous IoT devices (Jetson TX2, TX1, TK1, and Raspberry Pi 3s).
Deep Learning Edge+AI Distributed Systems Constrained Optimization Online Learning Task Batching
This project focuses on designing new resource management techniques to address unique challenges in real-time task assignment to privately owned edge devices (e.g., smartphones and IoT devices). We address two of the most challenging issues: 1) heterogeneity of the edge, where the edge devices owned by individuals often have diverse computational power, runtime environments, network interfaces, and hardware equipment. Such heterogeneity poses significant challenges for the resource management of edge computing systems. 2) non-cooperativeness, where the owners of the devices are rational actors and may refuse to provide computing power or share device context due to concerns such as privacy and energy. In light of these challenges, we develop new resource management frameworks, such as CoGTA and HeteroEdge, to address the heterogeneity and non-cooperativeness of edge computing systems by 1) developing a middleware that masks the heterogeneous execution environment and hardware details to provide a uniform interface to the applications; and 2) proposing a series of game-theoretic task allocation models that map tasks to the heterogeneous edge devices in a way that not only maximizes the owners' utilities but also satisfies the QoS of the application (e.g., energy efficiency and deadline hit rate).
Game Theory Task Mapping Dynamic Supply Chain Model Online Learning Lyapunov Control Container Technology
In this project, we develop AI-based damage assessment applications where deep neural network approaches are used to automatically identify the damage severity of impacted areas from imagery reports in the aftermath of a disaster (e.g., earthquakes, hurricanes, landslides). We propose CrowdLearn, a crowd-AI hybrid system that leverages a crowdsourcing platform to troubleshoot, tune, and eventually improve black-box AI algorithms by combining crowd intelligence with machine intelligence. We also develop an interactive Disaster Scene Assessment (iDSA) scheme that allows AI algorithms to interact directly with humans to identify the salient regions of disaster images in damage assessment applications. New incentive designs and active learning techniques are proposed to ensure reliable, timely, and cost-efficient responses from the crowdsourcing platforms.
Human-AI Interactive Machine Learning Convolutional Neural Network Computer Vision Attention Mechanism Active Learning Multi-armed Bandit
Can you imagine that every time you take a beautiful landscape picture, your mobile phone automatically finds a poem to describe it? We have developed iPoemRec, the first classical poetry recommender system using visual inputs. The iPoemRec system can explicitly model the artistic conception (e.g., metaphors) of poems and images and effectively recommend poetry that matches the sentiment and theme of the image. Using real-world datasets and a user study, we have demonstrated that iPoemRec recommends classical poems with high relevance and receives significantly higher user ratings than the state-of-the-art baselines.
Knowledge Graph Recommender System Network Embedding Natural Language Processing
Have you noticed that there are free live streams of Football/NBA events or Game of Thrones on YouTube? These videos are often uploaded without the permission of the content owners and must be correctly identified and removed from the system to protect the owners' copyright. The problem is difficult because i) streamers can be sophisticated and modify the title or tweak the presentation of the video to bypass the detection system; ii) legal videos and copyright-infringing ones may have very similar visual content and descriptions. We found that current commercial copyright detection systems have critical flaws: a large amount of copyrighted content bypasses the detection system while legal streams are taken down by mistake. In this research, we develop copyright infringement detection systems that exploit the linguistic cues in live chat messages from the audience. Experiments on YouTube show that our tools are effective and efficient in identifying copyright-infringing videos.
Machine Learning Supervised Learning Bayesian Network Natural Language Processing
The proliferation of online Location-Based Social Networks (LBSN) has offered unprecedented opportunities for understanding the fine-grained spatio-temporal behaviors of users and developing new location-aware recommender systems. In this work, we focus on the problem of POI prediction, where the goal is to predict the next venue people will likely visit by exploiting their online check-in traces and latent intents. We develop a Context-aware Sparse Check-in Venue Prediction (CSCVP) scheme inspired by natural language processing techniques. CSCVP predicts the venue category information and explores the similarity between users to address the data sparsity challenge by significantly reducing the prediction space. It also leverages the Probabilistic Latent Semantic Analysis (PLSA) model to incorporate the latent user intent into the prediction model. Finally, we develop a novel Temporal Adaptive Ngram (TA-Ngram) model in CSCVP to capture the dynamic and non-deterministic dependency between check-ins.
Recommender System Topic Modeling Ngram Predictive Model Latent Intent Analysis
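To make the n-gram idea behind venue category prediction concrete, here is a minimal sketch of a plain bigram model with unigram backoff over check-in category sequences. It is not the TA-Ngram model (it ignores time and user similarity entirely); the function names and the toy check-in traces are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train(sequences):
    """Count unigrams and bigrams over sequences of venue categories."""
    unigram = Counter()
    bigram = defaultdict(Counter)
    for seq in sequences:
        unigram.update(seq)
        for prev, nxt in zip(seq, seq[1:]):
            bigram[prev][nxt] += 1
    return unigram, bigram


def predict_next(prev, unigram, bigram):
    """Most frequent follower of `prev`; back off to the most frequent
    category overall when `prev` was never seen with a follower."""
    if prev in bigram:
        return bigram[prev].most_common(1)[0][0]
    return unigram.most_common(1)[0][0]


# Hypothetical check-in traces, already mapped to venue categories.
traces = [
    ["cafe", "office", "gym"],
    ["cafe", "office", "bar"],
    ["home", "cafe", "office", "gym"],
    ["cafe", "home"],
]
unigram, bigram = train(traces)
after_office = predict_next("office", unigram, bigram)   # seen context
after_museum = predict_next("museum", unigram, bigram)   # unseen context
```

Predicting categories rather than individual venues keeps the count tables small, which is the same reason the project description gives for reducing the prediction space under sparse check-in data.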
This project aims to combat misinformation and malicious users on social media. We have developed new Truth Discovery algorithms that jointly estimate the reliability of social media users and identify truthful information on social media platforms during critical disaster events. We are building scalable and efficient distributed system platforms to process massive social media streams.
Truth Discovery Sentiment Analysis Hidden Markov Model Natural Language Processing Spatio-temporal Inference Expectation Maximization (EM)
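The joint estimation idea in truth discovery can be sketched with a simple iterative scheme: claims are decided by a reliability-weighted vote, and each source's reliability is then re-estimated as the fraction of its reports that agree with the current decisions. This is a simplified heuristic under assumed inputs, not the project's algorithms and not a full Expectation Maximization procedure; the function name, the initial reliability of 0.8, and the toy reports are hypothetical.

```python
from collections import defaultdict


def truth_discovery(reports, iterations=10, initial_reliability=0.8):
    """reports: list of (source, claim, value) tuples with boolean values.

    Returns (truth, reliability): the estimated truth value of each claim
    and the estimated reliability of each source.
    """
    sources = {s for s, _, _ in reports}
    reliability = {s: initial_reliability for s in sources}
    truth = {}
    for _ in range(iterations):
        # Step 1: decide each claim by a reliability-weighted vote.
        score = defaultdict(float)
        for s, c, v in reports:
            score[c] += reliability[s] if v else -reliability[s]
        truth = {c: sc >= 0 for c, sc in score.items()}
        # Step 2: reliability = share of a source's reports matching truth.
        agree = defaultdict(int)
        count = defaultdict(int)
        for s, c, v in reports:
            count[s] += 1
            agree[s] += int(v == truth[c])
        reliability = {s: agree[s] / count[s] for s in sources}
    return truth, reliability


# Hypothetical reports: sources "a" and "b" agree, "c" contradicts them.
reports = [
    ("a", "bridge_closed", True), ("b", "bridge_closed", True),
    ("c", "bridge_closed", False),
    ("a", "power_outage", False), ("b", "power_outage", False),
    ("c", "power_outage", True),
]
truth, reliability = truth_discovery(reports)
```

In this toy run the contradicting source ends up with zero reliability, so its reports stop influencing later votes.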
An interesting observation about social media platforms is that users fail to behave in accordance with their privacy awareness even when they have already perceived potential risks. For example, we found that many people, even when aware of the privacy risks that social media platforms pose through information aggregation, do not carefully read the privacy policies or terms of agreement on these sites. In this research, we survey over 1200 people to measure social network users’ privacy attitudes, privacy perceptions, and actual behavior when using social networking sites. The study targeted three populations with different cultural contexts: U.S. college students, Chinese students in the U.S., and Chinese students in China. It also covered 6 popular social network sites. Some interesting findings: 1) People have different privacy concerns toward different types of information. 2) Cultural differences are indeed an important factor in social network users’ privacy perceptions and behaviors. For example, real names are considered more private by Chinese users than by American users.
Statistical Testing Security and Privacy User Study