Projects — Big Data and AI Trends Market (Spring 2026)
- Team 1: Clinical Note Intelligence: An Agentic Hybrid Retrieval Framework Combining Structured Search and Retrieval-Augmented Generation
- Team 2: Karen – AI Complaint Assistant
- Team 3: PotholeVision – Automating Pothole Detection and GeoMapping
- Team 4: From Clicks to Actions: Spark-Powered Funnel Analysis with LLM-Driven Recommendations
- Team 5: PersonaPath: Personalized Travel & Dining Recommendation Engine (Behavioral Profiling)
- Team 6: Data Quality Remediation Assistant: AI-Driven Anomaly Detection & ETL Fix Generation at Scale
- Team 7: Demand Sense: An AI-Backed Driver Nudge System for Demand-Aware Repositioning
- Team 8: NFL Contract Prediction and Evaluation with LLM-Based Recommendations
- Team 9: InsideInsight: Agentic AI for Airbnb Pricing Strategy and Performance Optimization
- Team 10: An AI Copilot for Detecting Delayed Market Reactions to Corporate Disclosures
- Team 11: Detect Hidden Drug Safety Risks Faster with AI — FDA FAERS Analytics
- Team 12: TheaterIQ: AI-Driven Scheduling and Promotional Intelligence for Movie Theater Operations
Team 1: Clinical Note Intelligence: An Agentic Hybrid Retrieval Framework Combining Structured Search and Retrieval-Augmented Generation
Members: Ethan Armstrong, Ankit (Ziqi) Cao, Ko Jung Hsu, Cole Johnson, Mashhood Khan, Wenyu Zhong
Abstract: An AI chatbot for querying large-scale clinical datasets with natural language while keeping answers grounded in real patient data. It combines structured cohort/statistical retrieval with a RAG pipeline (e.g., LangChain/LlamaIndex) to surface trends like readmission patterns and treatment outcomes in an interactive experience.
Team 2: Karen – AI Complaint Assistant
Members: Mohameddeq Ali, Cora Goodwin, Midori Neaton, Raja Sori, Xupei Ye, Kyle Zhu
Abstract: An AI assistant that turns high-volume financial complaint narratives into prioritized, analyzable insights. It structures complaint text into topics (e.g., BERTopic) and ranks them with a custom priority score, then provides a natural-language interface for querying metrics and generating visualizations.
Team 3: PotholeVision – Automating Pothole Detection and GeoMapping
Members: Chunfang Wang, James Pashek, Joseph Sheehan, Madhu Damani, Moses Effah Akoto, Tao Fang
Abstract: A big data pipeline that detects potholes from video/images using a CNN model and publishes positive detections to an ArcGIS dashboard. City planners can see pothole counts, exact locations, and traffic context to prioritize repairs and reduce safety risk.
Team 4: From Clicks to Actions: Spark-Powered Funnel Analysis with LLM-Driven Recommendations
Members: Shang Chi Hsu, Xiang Li, Ashwini Manokar, Meenakshi Narendra, Isabel O'Grady
Abstract: A Spark-based clickstream analytics pipeline that measures conversion funnels (view → cart → purchase) and pinpoints drop-off drivers like repeated views or cart abandonment. It prioritizes high-impact products and behaviors, and can translate findings into structured recommendations via an LLM.
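As a rough illustration of the funnel measurement described above, the sketch below counts how many user/product pairs reach each stage and derives stage-to-stage conversion rates. It uses plain Python rather than Spark so that it runs anywhere; the event tuple layout, the function names, and the rule that a pair must pass every earlier stage to count are all assumptions, not the team's implementation.

```python
from collections import defaultdict

# Funnel stages in order, taken from the abstract (view -> cart -> purchase).
FUNNEL = ["view", "cart", "purchase"]


def funnel_counts(events):
    """Count (user, product) pairs that reached each funnel stage.

    `events` is an iterable of (user_id, product_id, event_type) tuples
    (a hypothetical layout). A pair counts for a stage only if it also
    has every earlier stage.
    """
    seen = defaultdict(set)  # (user, product) -> set of event types observed
    for user_id, product_id, event_type in events:
        seen[(user_id, product_id)].add(event_type)

    counts = [0] * len(FUNNEL)
    for types in seen.values():
        for i, stage in enumerate(FUNNEL):
            if stage not in types:
                break  # pair dropped off before this stage
            counts[i] += 1
    return dict(zip(FUNNEL, counts))


def conversion_rates(counts):
    """Return the conversion rate between each pair of adjacent stages."""
    rates = {}
    for prev, curr in zip(FUNNEL, FUNNEL[1:]):
        rates[f"{prev}->{curr}"] = counts[curr] / counts[prev] if counts[prev] else 0.0
    return rates
```

A low rate between two adjacent stages marks where drop-off is concentrated; in a Spark setting the same grouping would typically be expressed as a group-by on user and product.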
Team 5: PersonaPath: Personalized Travel & Dining Recommendation Engine (Behavioral Profiling)
Members: Esther Baumgartner, Hsin Kuei Chang, Saloni Jain, Fu Lee, Dhairya Lunia
Abstract: A Yelp-based recommendation prototype that builds behavior-driven user and business profiles from review text. Topic modeling and contextual tags create “personas” and experience clusters; a RAG-enabled LLM then retrieves and ranks recommendations with explanations grounded in reviewer language.
Team 6: Data Quality Remediation Assistant: AI-Driven Anomaly Detection & ETL Fix Generation at Scale
Members: Sean Cabaniss, Yung Hsuan Hsieh, Ching-Fen Hung, Yonghui Kim, Omkar Thombare
Abstract: A scalable assistant that detects data quality issues (e.g., null anomalies and statistical outliers) with Spark, then uses a 2-step LLM agent flow to diagnose root causes and generate runnable PySpark remediation code. A human-in-the-loop Streamlit UI lets reviewers approve fixes before execution, with auditability and before/after tracking.
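The two kinds of checks named above can be sketched for a single column as follows. This is a minimal plain-Python stand-in, not the team's Spark code: the abstract does not say which outlier rule is used, so the interquartile-range (IQR) fence with multiplier `k=1.5` is an assumption, as are the function names.

```python
from statistics import quantiles


def null_rate(values):
    """Fraction of entries in a column that are missing (None)."""
    values = list(values)
    if not values:
        return 0.0
    return sum(v is None for v in values) / len(values)


def iqr_outliers(values, k=1.5):
    """Return non-null values outside the IQR fences.

    The IQR rule is an assumed choice; k=1.5 is the conventional default.
    """
    present = [v for v in values if v is not None]
    if len(present) < 4:
        return []  # too few points for meaningful quartiles
    q1, _, q3 = quantiles(present, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in present if v < low or v > high]
```

Flagged columns and values would then be the structured input handed to the diagnosis step, rather than raw data.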
Team 7: Demand Sense: An AI-Backed Driver Nudge System for Demand-Aware Repositioning
Members: Davey Johnson, Hengrui Li, Huiguo Liu, Mansi Malpani, Mounika Polamreddy
Abstract: A demand analytics pipeline over NYC TLC trips that identifies high-opportunity zones by time window using Spark-based aggregation and normalized scoring. An LLM layer converts the structured demand signals into concise, explainable “nudges” that tell drivers where demand is strong and why, with grounding and confidence filtering.
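One possible reading of "normalized scoring" by time window is min-max scaling of trip counts within each hour, sketched below in plain Python. The abstract does not specify the formula, so the scaling choice, the `(zone, hour)` key layout, the `min_score` threshold, and the function names are all assumptions.

```python
from collections import defaultdict


def normalized_scores(trip_counts):
    """Min-max scale trip counts within each hour.

    `trip_counts` maps (zone, hour) -> number of pickups (hypothetical
    layout). Scores fall in [0, 1]; an hour where every zone has the same
    count scores 0.0 everywhere.
    """
    by_hour = defaultdict(list)
    for (_zone, hour), count in trip_counts.items():
        by_hour[hour].append(count)

    scores = {}
    for (zone, hour), count in trip_counts.items():
        lo, hi = min(by_hour[hour]), max(by_hour[hour])
        scores[(zone, hour)] = (count - lo) / (hi - lo) if hi > lo else 0.0
    return scores


def top_zones(scores, hour, min_score=0.5):
    """Zones for one hour with score >= min_score, strongest first."""
    ranked = sorted(
        ((s, z) for (z, h), s in scores.items() if h == hour and s >= min_score),
        reverse=True,
    )
    return [z for _s, z in ranked]
```

Filtering on a minimum score before generating text is one simple way to keep an LLM layer from producing nudges for weak signals.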
Team 8: NFL Contract Prediction and Evaluation with LLM-Based Recommendations
Members: Adam Getzkin, Mallika Kommera, Jay Pederson, Ariel Zhan, Zhen Zhang
Abstract: A predictive analytics system that estimates NFL contract value from performance and historical contract data, then uses an LLM-driven recommendation agent to answer natural-language questions like “undervalued RBs under $6M.” The agent rewrites queries into structured filters, compares predicted vs. market values, and returns ranked recommendations with grounded explanations.
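The "structured filters" step can be pictured with a small sketch: once a question such as "undervalued RBs under $6M" has been rewritten into a position and a salary cap, ranking reduces to a filter and a sort on the gap between predicted and market value. The record fields, the function name, and the definition of "undervalued" as predicted value exceeding market value are assumptions for illustration only.

```python
def undervalued(players, position, max_salary):
    """Rank players at `position` whose predicted value exceeds market value.

    `players` is a list of dicts with hypothetical keys "name", "position",
    "market_value", and "predicted_value" (same currency units).
    Results are sorted by the predicted-minus-market gap, largest first.
    """
    candidates = [
        p for p in players
        if p["position"] == position
        and p["market_value"] <= max_salary
        and p["predicted_value"] > p["market_value"]
    ]
    return sorted(
        candidates,
        key=lambda p: p["predicted_value"] - p["market_value"],
        reverse=True,
    )
```

Keeping the ranking deterministic and leaving only query rewriting and explanation to the LLM is one way to keep answers grounded in the model's predicted values.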
Team 9: InsideInsight: Agentic AI for Airbnb Pricing Strategy and Performance Optimization
Members: Bhavisha Chafekar, Jyothirmai Sri Peesapati, Phoenix Ferrari, Stephen Weiler, Tzu-Yu Chen
Abstract: An end-to-end big data pipeline on Inside Airbnb data to analyze pricing, availability, and guest feedback at scale. It produces structured insights (e.g., neighborhood-level drivers of occupancy and satisfaction) and uses an agentic AI layer to convert analytics into actionable recommendations for hosts and property managers.
Team 10: An AI Copilot for Detecting Delayed Market Reactions to Corporate Disclosures
Members: Kristina Dennise Paraiso, Evelyn Lai, Zhichen Yang, Parul Chaudhary, Shivanshu Dagur
Abstract: A disclosure intelligence system that ingests SEC filings, uses a grounded LLM to score importance, and analyzes delayed stock price reactions to identify “underinterpreted” events. It surfaces a prioritized watchlist with traceable explanations so analysts can focus on disclosures the market may be slow to price in.
Team 11: Detect Hidden Drug Safety Risks Faster with AI — FDA FAERS Analytics
Members: Amogha Yalgi, Austin Ganje, Hannah Huang, Hayden Herstrom, Rachel Le
Abstract: An analyst-facing application that cleans and standardizes the FDA FAERS dataset at scale to surface emerging high-risk drug–event patterns. It enables smarter search, risk detection, and LLM-assisted summarization so users can review safety signals quickly and drill into trends interactively.
Team 12: TheaterIQ: AI-Driven Scheduling and Promotional Intelligence for Movie Theater Operations
Members: Sam Benson Devine, Jack Halverson, Tobias Knight, Qiqi Li, Yehan Wang
Abstract: A scheduling and promotions copilot for independent theaters that scores each film × audience × timeslot with a Match Score, then proposes weekly schedules and targeted promotional briefs. It combines large-scale data processing with a grounded agent workflow so recommendations are explainable and auditable before a human approves.