Prospective Students

I am looking for highly motivated PhD-track students to work on multiple funded research projects starting in Fall 2018. Preferred qualifications include prior research experience in machine learning as an undergraduate or MS-level RA; a strong course background in multivariate calculus, probability and statistics, linear algebra, and numerical analysis; and a strong interest in working on interdisciplinary research bridging machine learning, mobile and distributed systems, and health and behavioral science. Application instructions can be found here: https://www.cics.umass.edu/admissions/application-instructions.

Research Interests

My research interests lie at the intersection of artificial intelligence, machine learning, and statistics. I am particularly interested in hierarchical graphical models and approximate inference/learning techniques including dynamic programming, Markov Chain Monte Carlo and variational Bayesian methods. My current research has a particular emphasis on models and algorithms for multivariate time series data and explores both probabilistic and neural network-based models and their combination.

Thanks to awards from ARL, IARPA, NSF and NIH, my current application focus is on machine learning-based analytics for mobile and wearable sensor data, as well as electronic health records data. I am also interested in large-scale, real-time, heterogeneous distributed machine learning systems that bridge mobile and embedded computing with cloud-based systems, including distributed prediction cascades and distributed real-time active learning. My research group collaborates widely with researchers in mobile and distributed computing, mobile health, behavioral science, and medicine.

In the past, I have worked on a broad range of applications including collaborative filtering and ranking, unsupervised structure discovery and feature induction, object recognition and image labeling, and natural language processing, and I continue to consult on projects in these areas.

Recent Publications

Natarajan, Annamalai, Gustavo Angarita, Edward Gaiser, Robert Malison, Deepak Ganesan, and Benjamin Marlin. "Domain Adaptation Methods for Improving Lab-to-field Generalization of Cocaine Detection using Wearable ECG." 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing. 2016.

Mobile health research on illicit drug use detection typically involves a two-stage study design where data to learn detectors is first collected in lab-based trials, followed by a deployment to subjects in a free-living environment to assess detector performance. While recent work has demonstrated the feasibility of wearable sensors for illicit drug use detection in the lab setting, several key problems can limit lab-to-field generalization performance. For example, lab-based data collection often has low ecological validity, the ground-truth event labels collected in the lab may not be available at the same level of temporal granularity in the field, and there can be significant variability between subjects. In this paper, we present domain adaptation methods for assessing and mitigating potential sources of performance loss in lab-to-field generalization and apply them to the problem of cocaine use detection from wearable electrocardiogram sensor data.

Li, Steven Cheng-Xian, and Benjamin M. Marlin. "A Scalable End-to-end Gaussian Process Adapter for Irregularly Sampled Time Series Classification." Advances in Neural Information Processing Systems. 2016.

We present a general framework for classification of sparse and irregularly-sampled time series. The properties of such time series can result in substantial uncertainty about the values of the underlying temporal processes, while making the data difficult to deal with using standard classification methods that assume fixed-dimensional feature spaces. To address these challenges, we propose an uncertainty-aware classification framework based on a special computational layer we refer to as the Gaussian process adapter that can connect irregularly sampled time series data to any black-box classifier learnable using gradient descent. We show how to scale up the required computations based on combining the structured kernel interpolation framework and the Lanczos approximation method, and how to discriminatively train the Gaussian process adapter in combination with a number of classifiers end-to-end using backpropagation.
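As a rough illustration of the idea of mapping an irregularly sampled series onto a fixed-dimensional representation, the sketch below computes an exact Gaussian process posterior mean on a fixed grid of time points. It is a minimal sketch under stated assumptions and not the method from the paper: the squared-exponential kernel, the function names, and the default hyperparameters are all illustrative choices, it uses a dense linear solve rather than structured kernel interpolation or the Lanczos approximation, and it returns only the posterior mean rather than propagating uncertainty to a classifier.

```python
import math
from typing import List, Sequence


def rbf(a: float, b: float, length_scale: float) -> float:
    """Squared-exponential kernel (an assumed choice of covariance function)."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(A: List[List[float]], b: Sequence[float]) -> List[float]:
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [A | b] so row operations update both sides.
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def gp_posterior_mean(
    times: Sequence[float],
    values: Sequence[float],
    grid: Sequence[float],
    length_scale: float = 1.0,
    noise_var: float = 1e-2,
) -> List[float]:
    """Zero-mean GP posterior mean at `grid`, given observations at `times`."""
    n = len(times)
    # Kernel matrix of the observed times plus observation noise on the diagonal.
    K = [
        [rbf(times[i], times[j], length_scale) + (noise_var if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
    alpha = solve(K, values)  # alpha = (K + noise_var * I)^{-1} y
    return [
        sum(rbf(g, times[i], length_scale) * alpha[i] for i in range(n))
        for g in grid
    ]


# Hypothetical irregularly sampled series with four observations.
obs_times = [0.0, 0.7, 2.5, 3.1]
obs_values = [1.0, 0.2, -0.5, 0.3]
# A fixed grid yields a fixed-length feature vector however many observations exist.
features = gp_posterior_mean(obs_times, obs_values, grid=[0.0, 1.0, 2.0, 3.0])
```

The output length depends only on the grid, so series with different numbers of observations all map to vectors of the same dimension that a standard classifier can consume.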

Sadasivam, Rajani Shankar, Erin M. Borglund, Roy Adams, Benjamin M. Marlin, and Thomas K. Houston. "Impact of a Collective Intelligence Tailored Messaging System on Smoking Cessation: The Perspect Randomized Experiment." Journal of Medical Internet Research. 18.11 (2016): e285:1-13.

Background

Outside health care, content tailoring is driven algorithmically using machine learning compared to the rule-based approach used in current implementations of computer-tailored health communication (CTHC) systems. A special class of machine learning systems (“recommender systems”) are used to select messages by combining the collective intelligence of their users (ie, the observed and inferred preferences of users as they interact with the system) and their user profiles. However, this approach has not been adequately tested for CTHC.

Objective

Our aim was to compare, in a randomized experiment, a standard, evidence-based, rule-based CTHC (standard CTHC) to a novel machine learning CTHC: Patient Experience Recommender System for Persuasive Communication Tailoring (PERSPeCT). We hypothesized that PERSPeCT will select messages of higher influence than our standard CTHC system. This standard CTHC was proven effective in motivating smoking cessation in a prior randomized trial of 900 smokers (OR 1.70, 95% CI 1.03-2.81).

Methods

PERSPeCT is an innovative hybrid machine learning recommender system that selects and sends motivational messages using algorithms that learn from message ratings from 846 previous participants (explicit feedback), and the prior explicit ratings of each individual participant. Current smokers (N=120) aged 18 years or older, English speaking, with Internet access were eligible to participate. These smokers were randomized to receive either PERSPeCT (intervention, n=74) or standard CTHC tailored messages (n=46). The study was conducted between October 2014 and January 2015. By randomization, we compared daily message ratings (mean of smoker ratings each day). At 30 days, we assessed the intervention’s perceived influence, 30-day cessation, and changes in readiness to quit from baseline.

Results

The proportion of days when smokers agreed/strongly agreed (daily rating ≥4) that the messages influenced them to quit was significantly higher for PERSPeCT (73%, 23/30) than standard CTHC (44%, 14/30, P=.02). Among less educated smokers (n=49), this difference was even more pronounced for days strongly agree (intervention: 77%, 23/30; comparison: 23%, 7/30, P<.001). There was no significant difference in the frequency with which PERSPeCT-randomized smokers agreed or strongly agreed that the intervention influenced them to quit smoking (P=.07) and use nicotine replacement therapy (P=.09). Among those who completed follow-up, 36% (20/55) of PERSPeCT smokers and 32% (11/34) of the standard CTHC group stopped smoking for one day or longer (P=.70).

Conclusions

Compared to standard CTHC with proven effectiveness, PERSPeCT outperformed in terms of influence ratings and resulted in similar cessation rates.

Hiatt, Laura, Roy Adams, and Benjamin Marlin. "An Improved Data Representation for Smoking Detection with Wearable Respiration Sensors." IEEE Wireless Health. 2016.

Late breaking extended abstract.

Sadasivam, Rajani Shankar, Sarah L. Cutrona, Rebecca L. Kinney, Benjamin M. Marlin, Kathleen M. Mazor, Stephenie C. Lemon, and Thomas K. Houston. "Collective-Intelligence Recommender Systems: Advancing Computer Tailoring for Health Behavior Change Into the 21st Century." Journal of Medical Internet Research. 18.3 (2016).

What is the next frontier for computer-tailored health communication (CTHC) research? In current CTHC systems, study designers who have expertise in behavioral theory and mapping theory into CTHC systems select the variables and develop the rules that specify how the content should be tailored, based on their knowledge of the targeted population, the literature, and health behavior theories. In collective-intelligence recommender systems (hereafter recommender systems) used by Web 2.0 companies (eg, Netflix and Amazon), machine learning algorithms combine user profiles and continuous feedback ratings of content (from themselves and other users) to empirically tailor content. Augmenting current theory-based CTHC with empirical recommender systems could be evaluated as the next frontier for CTHC.

Adams, Roy, Nazir Saleheen, Edison Thomaz, Abhinav Parate, Santosh Kumar, and Benjamin Marlin. "Hierarchical Span-Based Conditional Random Fields for Labeling and Segmenting Events in Wearable Sensor Data Streams." International Conference on Machine Learning. 2016.

The field of mobile health (mHealth) has the potential to yield new insights into health and behavior through the analysis of continuously recorded data from wearable health and activity sensors. In this paper, we present a hierarchical span-based conditional random field model for the key problem of jointly detecting discrete events in such sensor data streams and segmenting these events into high-level activity sessions. Our model includes higher-order cardinality factors and inter-event duration factors to capture domain-specific structure in the label space. We show that our model supports exact MAP inference in quadratic time via dynamic programming, which we leverage to perform learning in the structured support vector machine framework. We apply the model to the problems of smoking and eating detection using four real data sets. Our results show statistically significant improvements in segmentation performance at the p=0.005 level relative to a hierarchical pairwise CRF.

Jacek, Nicholas, Meng-Chieh Chiu, Benjamin Marlin, and Eliot J. B. Moss. "Assessing the Limits of Program-Specific Garbage Collection Performance." Programming Language Design and Implementation. 2016.

Distinguished Paper Award

We consider the ultimate limits of program-specific garbage collector performance for real programs. We first characterize the GC schedule optimization problem using Markov Decision Processes (MDPs). Based on this characterization, we develop a method of determining, for a given program run and heap size, an optimal schedule of collections for a non-generational collector. We further explore the limits of performance of a generational collector, where it is not feasible to search the space of schedules to prove optimality. Still, we show significant improvements with Least Squares Policy Iteration, a reinforcement learning technique for solving MDPs. We demonstrate that there is considerable promise to reduce garbage collection costs by developing program-specific collection policies.

Dadkhahi, Hamid, Nazir Saleheen, Santosh Kumar, and Benjamin Marlin. "Learning Shallow Detection Cascades for Wearable Sensor-Based Mobile Health Applications." ICML On Device Intelligence Workshop. 2016.

The field of mobile health aims to leverage recent advances in wearable on-body sensing technology and smart phone computing capabilities to develop systems that can monitor health states and deliver just-in-time adaptive interventions. However, existing work has largely focused on analyzing collected data in the off-line setting. In this paper, we propose a novel approach to learning shallow detection cascades developed explicitly for use in real-time wearable-phone or wearable-phone-cloud systems. We apply our approach to the problem of cigarette smoking detection from a combination of wrist-worn actigraphy data and respiration chest band data using two- and three-stage cascades.
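To make the notion of a shallow detection cascade concrete, the sketch below shows only how a trained cascade might be evaluated at prediction time: each stage scores the input, and an instance is rejected as soon as one stage's score falls below its threshold, so later (presumably costlier) stages run only on promising inputs. This is a minimal sketch under stated assumptions, not the learning approach proposed in the paper; the scoring functions, thresholds, and feature layout are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

# A stage pairs a scoring function with the threshold its score must reach.
Stage = Tuple[Callable[[Sequence[float]], float], float]


def cascade_predict(x: Sequence[float], stages: List[Stage]) -> Tuple[bool, int]:
    """Return (detected, number_of_stages_evaluated) for one input."""
    for index, (score, threshold) in enumerate(stages, start=1):
        if score(x) < threshold:
            # Early exit: later stages are never evaluated for this input.
            return False, index
    return True, len(stages)


def cheap_score(x: Sequence[float]) -> float:
    """Hypothetical first stage that looks at a single inexpensive feature."""
    return x[0]


def costly_score(x: Sequence[float]) -> float:
    """Hypothetical second stage that combines both features."""
    return 0.5 * x[0] + 0.5 * x[1]


# A two-stage cascade with illustrative thresholds.
two_stage: List[Stage] = [(cheap_score, 0.3), (costly_score, 0.6)]

rejected_early = cascade_predict([0.1, 0.9], two_stage)  # stops after stage 1
accepted = cascade_predict([0.5, 0.9], two_stage)        # passes both stages
rejected_late = cascade_predict([0.5, 0.2], two_stage)   # stops at stage 2
```

In a wearable-phone setting, one could imagine the first stage running on the wearable and the second on the phone, so that an early rejection also avoids transmitting data; that placement is an assumption of this sketch.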

Funded Projects

[2017-2020] Enhancing Context-Awareness and Personalization for Intensively Adaptive Smoking Cessation Messaging Interventions. See NSF award listing.

[2017-2022] Alliance for IoBT Research on Evolving Intelligent Goal-driven Networks (IoBT-REIGN) (with Prashant Shenoy, UMass PI. UIUC prime to ARL.). See ARL and UMass Amherst press releases, and the IoBT website.

[2017-2020] mPerf: A Theory-driven Approach to Model and Predict Everyday Job Performance Using Mobile Sensors (with Deepak Ganesan, UMass PI. U. Memphis prime to IARPA). See project website.

[2014-2018] Center of Excellence for Mobile Sensor Data to Knowledge (with Santosh Kumar, U. Memphis, PI). See center website.

[2014-2019] NSF CAREER: Machine Learning for Complex Health Data Analytics.

[2013-2016] Accurate and Computationally Efficient Predictors of Java Memory Resource Consumption (with Eliot Moss, PI).

[2012-2015] SensEye: An Architecture for Ubiquitous, Real-Time Visual Context Sensing and Inference (with Deepak Ganesan, PI).

[2012-2015] Patient Experience Recommender System for Persuasive Communication Tailoring (with Tom Houston, UMMS, PI).

[2012-2014] Foresight and Understanding from Scientific Exposition (with Andrew McCallum, PI and Raytheon BBN Technologies).