Honors Thesis
Computer science majors of junior or senior standing with at least a 3.3 grade point average in CS courses are eligible to join the Departmental Honors Program.
Requirements
In order to graduate with the Departmental Honors designation, a student must:
- Maintain at least a 3.3 grade point average in CS courses
- Have a Morrissey Contract Form approved by a faculty advisor and by the Honors committee by the end of their junior year (submit the forms to Mary Mulkeen in the CS main office)
- Complete two sections of CSCI4961 Honors Thesis during their senior year
- Submit a written honors thesis by the last day of class in the second semester of their senior year
- Make an oral presentation of their thesis at the end of their senior year
Recent Honors Theses
Author: Jeremy Chan
Title: Modelling the Spread of Complaints Among Police Officers
Abstract: Recent research has proposed that police behavior, including excessive use of force and misconduct, can spread through a contagion process on police social interaction networks. However, the literature is mixed on this topic: positive findings of contagion in some cities have been followed by null findings in others. Robust and easy-to-implement statistical tests are needed to help police agencies detect the contagion of unwanted police behavior in policing networks. In this thesis, we contribute to this literature on whether excessive police use of force is socially contagious in officer networks and present several models of excessive use of force complaints. Our proposed algorithms do not require as many parametric modeling assumptions as existing models. We propose a set-based Resampling Test variation as a baseline model to test the hypothesis that the use of force arises through a random, independent event process. We then compare this baseline to two other algorithms we adapt for modeling police use of force: the Pólya Urn Model and Spillover Pólya Urn Model. Finally, we implement these three algorithms using police use of force data from Chicago. Our results indicate that contagion plays a role in the excessive use of force in Chicago police networks.
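For readers unfamiliar with the Pólya Urn Model named in this abstract, the following is a minimal sketch of the classic textbook urn process, not the adapted or spillover variants developed in the thesis. The function name `simulate_polya_urn` and its parameters are hypothetical and chosen only for illustration. Each draw picks a color with probability proportional to its current count and then adds more balls of that color, so early events make similar later events more likely.

```python
import random


def simulate_polya_urn(initial_counts, num_draws, reinforcement=1, seed=None):
    """Simulate a classic Polya urn (illustrative sketch only).

    initial_counts: positive ball counts per color (assumed > 0).
    reinforcement: balls of the drawn color added after each draw.
    Returns the final counts and the sequence of drawn color indices.
    """
    rng = random.Random(seed)
    counts = list(initial_counts)
    history = []
    for _ in range(num_draws):
        # Draw a color with probability proportional to its current count.
        color = rng.choices(range(len(counts)), weights=counts, k=1)[0]
        # Reinforce the drawn color: this is the self-exciting step.
        counts[color] += reinforcement
        history.append(color)
    return counts, history


if __name__ == "__main__":
    final_counts, draws = simulate_polya_urn([1, 1], num_draws=100, seed=0)
    print("final counts:", final_counts)
```

Setting `reinforcement=0` turns the process into independent draws with fixed probabilities, which is the kind of no-contagion behavior a baseline test would assume.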
Author: Jerry Hou
Title: Passive Acoustic Eavesdropping
Abstract: The goal of this project is to demonstrate a potential eavesdropping hazard associated with 60GHz wireless communications, where acoustic signals can be detected by an eavesdropper by observing the wireless communications signals. Unlike recent works that use radar waveforms to perform eavesdropping by ranging, our system uses binary phase-shift keying (BPSK) signals, exploiting variations in the multipath environment between a transmitter and receiver caused by vibrations due to an acoustic excitation. As such, this system demonstrates a passive eavesdropping hazard, where an eavesdropper can passively observe the wireless communications signal transmitted by a legitimate source to listen in on acoustic signals, without requiring the eavesdropper to transmit any signal of its own. Using a custom deep learning framework built with PyTorch, we show that we can obtain a word classification accuracy of 30%.
Author: Joy Kondo
Title: DataGarden: Exploring Our Community in a VR Data Visualization
Abstract: With the ever-increasing need to enhance data literacy, methods and approaches to effectively represent and analyze data are being developed. Traditional representations of data such as charts and abstract visuals detract from the humanity represented by datasets.
This thesis presents DataGarden, a system that supports embodied interactions with humane representations of data in an immersive VR environment for users to think about the people behind the data. In this interactive system, users are able to interact with their personal visualizations in our virtual garden, as well as with the visualizations created using the data of the Boston College community.
Author: John Marangola
Title: An Efficient Solver for the Perfect Phylogeny Model with Time Difference Regularization
Abstract: Many inference tools use the Perfect Phylogeny Model (PPM) to learn trees from noisy variant allele frequency (VAF) data. The PPM often leads to inference ambiguity, as many different trees provide equally adequate explanations for the observed VAF data. Furthermore, the PPM does not capture any notion of time. While irrelevant for spatial sampling, this is problematic when samples are taken in time; randomly reordering time samples does not change the best tree found under the PPM, a result which is counter-intuitive. To resolve this ambiguity, we augment the PPM such that it makes sense of the growth of the mutants over time. In particular, our augmented PPM assumes that sampled mutant growth paths, which are sampled coarsely in time, proceed in spurts, and are well modeled by paths with small L1-norm derivatives. For settings where the trees are small, we advocate for an inference algorithm that conducts a (semi) exhaustive search over the entire space of trees. This task requires a new, extremely fast inference engine. We develop a homotopy algorithm that extends the solution of the original PPM to the solution of the novel augmented model. Key to its efficiency are several computational insights that make our implementation significantly outperform competitor solvers.
Author: James Noonan
Title: Visible Light Positioning with Unmodified LEDs
Abstract: As industry and academia become increasingly automated, the need for cost-effective, scalable indoor positioning algorithms increases. Current approaches in the new research field of Visible Light Positioning (VLP) utilize modulated Light-Emitting Diodes (LEDs) for localization. In these works, luminous intensity levels of LEDs are varied rapidly, and the received light levels are used to find the position of a receiver. In this paper, we propose an alternative positioning technique that utilizes visible light, but with unmodified LEDs. Our proposal includes a rudimentary approach, as well as a proof of concept for a cost-effective, easily replicable and robust technique for indoor localization. We leverage the inherent manufacturing variations in LED bulbs that result in uniquely identifiable light emission to localize two-dimensional position directly through a simple multiple-output regression neural network trained on light samples taken in the target environment, eliminating the need for component decomposition, hardware modification, or triangulation.
Author: Tyler Osborne
Title: Cognitive States: Belief State Inference via Deep Learning
Abstract: The Cognitive States project is an ongoing investigation of how well a pre-trained deep learning model, fine-tuned with various corpora of annotated texts, can infer belief states. Overall, the goal of this project is to make incremental progress towards more advanced belief state and sentiment detection capabilities. Previous research has focused on achieving state-of-the-art F1 results on classification tasks as well as end-to-end generative tasks defined on two corpora annotated for belief, sentiment, or both, Factbank and MPQA, using two models, BERT and Flan-T5 (Murzaku et al.). We use the same models to define similar tasks on the Language Understanding (LU) corpus, in order to corroborate insights gained from previous work. Furthermore, we present a novel database representation for fine-tuning data, allowing for the unification of Factbank, MPQA, LU, and additional annotation-based belief/sentiment corpora into a single dataset for seamless use in multi-task learning contexts, requiring unifying data transformations such as converting unigram head words to n-gram spans. Our results for LU's majority class align with those of Murzaku et al. on all tasks, whereas our approaches performed less well on minority classes. Plans to improve minority performance include leveraging a few-shot approach or generating synthetic data by swapping out words in existing examples with close synonyms.
Author: Qiyuan Zhou
Title: Exploratory Analysis and Predictive Modeling for Electrocardiogram (ECG) and Photoplethysmogram (PPG) Human Heart Activity Data
Abstract: Electrocardiography (ECG) and Photoplethysmography (PPG) are two widely used techniques for monitoring cardiovascular activity. ECG is a well-established method for detecting the electrical activity of the heart, while PPG utilizes optical technology to measure variations in blood volume in peripheral tissues. This thesis explores two applications of PPG and ECG signals, utilizing a PPG dataset with Human Activity Recognition labels and an ECG dataset labeled with various cardiac conditions. Preprocessing was carried out on the raw time-series data, through detrending, bandpass filtering, and outlier exclusion. Two reduced versions of the data were also considered, one using Heart Rate Variability (HRV) summary measures, and the other a spectral representation based on the Fast Fourier Transform (FFT). Exploratory Data Analysis and predictive data modeling using machine learning techniques were then performed on the preprocessed datasets. We comment on the predictive performance of the models, try to understand the results from a physiological perspective, and suggest possible directions for future work.
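As an illustration of what Heart Rate Variability (HRV) summary measures can look like, the sketch below computes two widely used time-domain statistics, SDNN and RMSSD, from a list of beat-to-beat (RR) intervals in milliseconds. The abstract does not say which HRV measures the thesis uses, so the choice of statistics, the function name `hrv_summary`, and the sample intervals are all assumptions made for illustration.

```python
import math
import statistics


def hrv_summary(rr_intervals_ms):
    """Compute simple time-domain HRV measures (illustrative sketch only).

    rr_intervals_ms: successive RR intervals in milliseconds (at least 2).
    Returns a dict with mean heart rate (bpm), SDNN (ms), and RMSSD (ms).
    """
    if len(rr_intervals_ms) < 2:
        raise ValueError("need at least two RR intervals")
    mean_rr = statistics.fmean(rr_intervals_ms)
    # SDNN: sample standard deviation of the RR intervals.
    sdnn = statistics.stdev(rr_intervals_ms)
    # RMSSD: root mean square of successive differences between intervals.
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return {
        "mean_hr_bpm": 60000.0 / mean_rr,  # 60,000 ms per minute
        "sdnn_ms": sdnn,
        "rmssd_ms": rmssd,
    }


if __name__ == "__main__":
    # Hypothetical RR intervals, not taken from the thesis datasets.
    print(hrv_summary([800, 810, 790, 805]))
```

Summaries like these reduce a long time series to a handful of numbers per window, which is one way a reduced version of the data could feed a machine learning model.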